
All HF Hub posts

SeaWolf-AI 
posted an update about 23 hours ago
AI Is Training on Your Content Without Permission — Fight Back with Invisible Watermarks

FINAL-Bench/security-scan

Most generative AI training data is crawled without consent. Your text gets summarized, images reprocessed, videos clipped — with no way to prove you're the original creator. Existing watermarks are either visible or wiped out by a single AI preprocessing pass.

Detect Before, Track After

Pre-embed — Detect theft without any watermark. Text plagiarism check, image similarity analysis (perceptual hash, SSIM, color histogram, feature matching), and video temporal matching catch copies, edits, and excerpts.
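
The perceptual-hash idea above can be sketched in a few lines. This is an illustrative average-hash (aHash), not the project's actual implementation, which combines several metrics:

```python
# Minimal average-hash (aHash) sketch on a grayscale image given as a
# 2D list of pixel intensities (0-255).

def average_hash(pixels, size=8):
    """Downscale to size x size by block averaging, then threshold
    against the mean to get a 64-bit fingerprint."""
    h, w = len(pixels), len(pixels[0])
    bh, bw = h // size, w // size
    blocks = []
    for by in range(size):
        for bx in range(size):
            total = 0
            for y in range(by * bh, (by + 1) * bh):
                for x in range(bx * bw, (bx + 1) * bw):
                    total += pixels[y][x]
            blocks.append(total / (bh * bw))
    mean = sum(blocks) / len(blocks)
    return sum((1 << i) for i, v in enumerate(blocks) if v > mean)

def hamming(h1, h2):
    """Number of differing bits: small distance => likely the same image."""
    return bin(h1 ^ h2).count("1")

# A synthetic 16x16 gradient and a slightly brightened copy hash alike.
img = [[(x * 16 + y) % 256 for x in range(16)] for y in range(16)]
bright = [[min(255, p + 10) for p in row] for row in img]
assert hamming(average_hash(img), average_hash(bright)) <= 4
```

A small Hamming distance between hashes flags a likely copy even after mild edits such as brightness shifts or recompression.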

Post-embed — Embed invisible multi-layer watermarks. If one layer is destroyed, others survive independently. Even full removal leaves forensic traces as evidence.

Text: 4 Independent Layers

Four mechanisms work simultaneously: zero-width Unicode characters at morpheme/word boundaries (Korean Kiwi + English NLP), style fingerprinting via synonym-ending-connective substitution, SHA-256 timestamped evidence packages, and punctuation-anchored micro-marks. Each layer uses a different Unicode category, so attacks on one cannot eliminate the others. Full bilingual support, zero readability impact.
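
A minimal, illustrative sketch of the zero-width layer (the actual tool uses morpheme-aware placement via Kiwi plus the three further layers described above):

```python
# Hide a bit string at word boundaries using two invisible code points.

ZW0, ZW1 = "\u200b", "\u200c"  # zero-width space / non-joiner encode 0 / 1

def embed(text, bits):
    """Append one invisible mark per word until the payload runs out."""
    words = text.split(" ")
    out = []
    for i, w in enumerate(words):
        mark = (ZW1 if bits[i] == "1" else ZW0) if i < len(bits) else ""
        out.append(w + mark)
    return " ".join(out)

def extract(text):
    """Recover the payload by scanning for the two marker code points."""
    return "".join("1" if c == ZW1 else "0" for c in text if c in (ZW0, ZW1))

marked = embed("the quick brown fox jumps", "1011")
assert extract(marked) == "1011"
# Rendered text is visually unchanged:
assert marked.replace(ZW0, "").replace(ZW1, "") == "the quick brown fox jumps"
```

Because the marks are zero-width code points, the marked text is byte-different but visually identical to the original.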

34-Attack Defense

7 categories, 34 attacks simulated: Unicode normalization, invisible character removal, homoglyph substitution (9,619 confusables), and AI rewriting. Each scored on Signal (watermark survival) + Trace (forensic evidence of attack) — proving deliberate removal even when watermarks are destroyed.

Image & Video

Images: DCT frequency-domain watermarks surviving JPEG compression and resize. Videos: keyframe watermarking with temporal propagation and majority-vote extraction. Both support pre-embed similarity detection.
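
The majority-vote extraction step for video can be sketched as follows (an illustrative stand-in, not the project's code): each keyframe yields a possibly corrupted bit string, and the final payload takes the per-bit majority.

```python
from collections import Counter

def majority_vote(readings):
    """readings: equal-length bit strings decoded from keyframes.
    Returns the per-position majority bit."""
    return "".join(
        Counter(col).most_common(1)[0][0] for col in zip(*readings)
    )

# Two of five keyframes were damaged in different positions;
# the majority still recovers the original payload.
assert majority_vote(["1010", "1010", "1110", "1000", "1010"]) == "1010"
```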

Who Is This For

Creators, rights holders needing legal evidence, media companies, and organizations tracking document leaks. Korean/English bilingual, open source, Gradio-based.
sergiopaniego 
posted an update 1 day ago
What happens when you make an LLM drive a car where the physics is real and actions can't be undone?

I ported CARLA, the autonomous driving simulator, to OpenEnv and added training support via TRL + Hugging Face Spaces.

The model interacts with the simulator through tool calls (observe, brake, change lane) and learns from a reward signal.

In 50 training steps, Qwen 0.6B learns to swerve and brake to avoid pedestrians in emergency situations.

The project supports text and vision (VLMs can see through a camera sensor), open-world driving with traffic, and multiple driving scenarios.
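
The observe/act/reward loop can be sketched with a toy stand-in policy; all names here are illustrative, since the real project routes tool calls through OpenEnv and trains the policy with TRL:

```python
def policy(observation):
    """Stand-in for the LLM: brake when a pedestrian is close."""
    if observation["pedestrian_distance_m"] < 10:
        return "brake"
    if observation["obstacle_in_lane"]:
        return "change_lane"
    return "observe"

def reward(observation, action):
    """Rubric-style reward: heavily penalize not braking near a pedestrian."""
    if observation["pedestrian_distance_m"] < 10:
        return 1.0 if action == "brake" else -1.0
    return 0.1  # small reward for safe, uneventful driving

obs = {"pedestrian_distance_m": 6.0, "obstacle_in_lane": False}
action = policy(obs)
assert action == "brake" and reward(obs, action) == 1.0
```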

This builds on the carla-env project by sinatras, which originally placed LLMs inside CARLA for evaluation. We extended it with vision, new scenarios, rubric-based rewards, and made it trainable end-to-end.

Blog: https://huggingface.co/blog/sergiopaniego/bringing-carla-to-openenv-trl/
CARLA env in OpenEnv: https://github.com/meta-pytorch/OpenEnv/tree/main/envs/carla_env
Training script: https://github.com/huggingface/trl/blob/main/examples/scripts/openenv/carla.py
YatharthS 
posted an update 1 day ago
Just open-sourced LavaSR v2: a model that can enhance 5,000 seconds of audio in 1 second while delivering higher quality than giant, slow 6 GB diffusion models!

It works with any sampling rate from 8 to 48 kHz and is nearly 5,000x faster than the competition while remaining superior on objective benchmarks.

LavaSR v2 is Perfect for
- Enhancing TTS models.
- Fixing old audio datasets.
- Restoring low quality recordings.

You can check out the examples and run it locally or online:

Repo: https://github.com/ysharma3501/LavaSR.git
Demo: YatharthS/LavaSR
Model: YatharthS/LavaSR
marksverdhei 
posted an update 3 days ago
🤔 Many cultures penalize or look down upon self-celebratory behavior. One such example is liking your own post. So why do I do it? Two reasons:
1. I disagree that self-celebratory behavior is inherently bad.
2. On the Hugging Face Hub, if your post has 0 reactions, it takes TWO whole clicks to react instead of one. So liking my own post is actually a UI hack that lowers the bar to engage.

So if you see me reacting to my own post and think 'Ugh, this guy is so full of himself', you are only half correct 😆

Now behold as I perform this magic trick called "Exhausting all reaction options for increased visual engagement" so you don't have to click twice to react. You're welcome!
Follow this aspiring 🤗 HF Hub influencer for more half-serious bloat in your feed 😜
OzTianlu 
posted an update 1 day ago
Scaling UP in Kai! 🌊
NoesisLab/Kai-3B-Instruct
Introducing NoesisLab/Kai-3B-Instruct. What happens when you force a 3B model to reason entirely in its latent space?
Meet Kai-3B, our latest industrial-grade reasoning model fine-tuned using the Adaptive Dual Search (ADS) algorithm.
GSM8K (0-shot, Direct Answer): 39.27% 🤯 (Llama-2-7B is ~14.6%)
HumanEval (Pass@1): 39.02% 💻 (Overtakes Gemma-2-2B's 30%)
MMLU (5-shot): 53.62% 📚 (Crushing the 50% barrier)
ARC-Challenge: 51.88%🎯
PIQA: 77.53%
HellaSwag: 69.53%
Kai-3B proves that reasoning density doesn't strictly require parameter bloat or verbose generation. It acts as a precise, deterministic agent action engine: ideal for JSON routing, SWE-bench patch generation, and anywhere you need reliable structured output without token waste.
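
As a hedged illustration of the JSON-routing use case (hypothetical host-side code, not part of Kai-3B): the model emits a strict JSON action and the host validates it before dispatch.

```python
import json

# Hypothetical tool registry for the hosting agent.
TOOLS = {"search", "open_file", "run_tests"}

def route(model_output):
    """Parse a model's JSON action and reject anything malformed."""
    action = json.loads(model_output)
    if action.get("tool") not in TOOLS:
        raise ValueError(f"unknown tool: {action.get('tool')!r}")
    return action["tool"], action.get("args", {})

tool, args = route('{"tool": "run_tests", "args": {"path": "tests/"}}')
assert tool == "run_tests" and args == {"path": "tests/"}
```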
albertvillanova 
posted an update 1 day ago
🚀 TRL v0.29.0 introduces trl-training: an agent-native training skill.

This makes the TRL CLI a structured, agent-readable capability, allowing AI agents to reliably execute training workflows such as:
- Supervised Fine-Tuning (SFT)
- Direct Preference Optimization (DPO)
- Group Relative Policy Optimization (GRPO)
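
As a rough sketch of what an agent would execute, a supervised fine-tuning run via the TRL CLI looks something like this (model and dataset names are placeholder example values, not a recommendation):

```shell
# Placeholder model/dataset; see `trl sft --help` for the full flag list.
trl sft \
  --model_name_or_path Qwen/Qwen2.5-0.5B \
  --dataset_name trl-lib/Capybara \
  --output_dir ./sft-output
```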

We’re excited to see what the community builds on top of this.

If you’re working on AI agents, alignment research, or scalable RL training infrastructure: give TRL v0.29.0 a try! 🤗

The future of ML tooling is agent-native.
🔗 https://github.com/huggingface/trl/releases/tag/v0.29.0
GVA21q2 
posted an update 1 day ago
# π.Guy.AI — AI-Powered Neuropedagogy Math Lessons

Students with math anxiety, ADHD, dyslexia, or low working memory need different learning experiences — but teachers can't create individualized materials for every student.

**π.Guy.AI** generates interactive HTML math lessons adapted to 7 cognitive profiles, using a multi-agent AI pipeline:

1. **Neuro-Interpreter** — enriches prompts with profile-specific adaptations
2. **Creative Agent** — generates a 12-slide lesson with SVG visualizations
3. **Quality Control** — validates against 8 neuropedagogy principles
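
The three-stage pipeline above can be sketched as plain function composition; everything here is an illustrative stand-in (stubbed generation, toy validation rules), not the project's code:

```python
def neuro_interpreter(prompt, profile):
    """Stage 1: enrich the prompt with profile-specific adaptations."""
    adaptations = {
        "ADHD": "short steps, frequent interaction, minimal distractions",
        "dyslexia": "large readable fonts, audio cues, low text density",
    }
    return f"{prompt}\nAdapt for {profile}: {adaptations[profile]}"

def creative_agent(enriched_prompt):
    """Stage 2: generate a 12-slide lesson (stubbed as slide titles)."""
    topic = enriched_prompt.splitlines()[0]
    return [f"Slide {i + 1}: {topic}" for i in range(12)]

def quality_control(slides):
    """Stage 3: validate against simple structural rules."""
    return len(slides) == 12 and all(s.startswith("Slide") for s in slides)

lesson = creative_agent(neuro_interpreter("Fractions basics", "ADHD"))
assert quality_control(lesson)
```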

Each lesson is a standalone HTML file with inline CSS/JS/SVG — works offline, no dependencies.

## The Model

Fine-tuned **Qwen2.5-7B-Instruct** with LoRA on 313 curated Hebrew math lessons.

- Model: [GVA21q2/piguyai-lessons-v2-enhanced](https://huggingface.co/GVA21q2/piguyai-lessons-v2-enhanced)
- Dataset: [GVA21q2/pi-guy-ai-lessons](https://huggingface.co/datasets/GVA21q2/pi-guy-ai-lessons)
- Demo: [GVA21q2/pi-guy-ai-demo](https://huggingface.co/spaces/GVA21q2/pi-guy-ai-demo)
- Web app: [gva21q2.github.io/pi.guy.ai](https://gva21q2.github.io/pi.guy.ai/)

7 profiles: math anxiety, ADHD, dyslexia, dysgraphia, low working memory, visual processing, weak inhibition.

Built by [Guy Assal](https://www.guyassal.education)
prithivMLmods 
posted an update 3 days ago
FireRed-Image-Edit-1.0 (Rapid) Fast Experimental Demo is Out! 🚀🤗

Demo: prithivMLmods/FireRed-Image-Edit-1.0-Fast

-> Paired the EditPlusPipeline with the Diffusers-compatible transformer weights of Rapid AIO from Qwen-Image-Edit. (experimental)
-> This fusion delivers more accurate instruction following, higher image quality, and consistent visual coherence at 4-step fast inference.
-> Better maintains text styles with high fidelity, along with high-quality old photo restoration, enhancement, and best-in-class virtual try-on.

robtacconelli 
posted an update 4 days ago
🏆 Nacrith: a 135M model that out-compresses everything on natural language

What if a tiny LM could compress english text better than _every_ compressor out there — classical or neural, small or large?

Nacrith pairs SmolLM2-135M with an ensemble of online predictors and high-precision arithmetic coding.

What's inside

The standard LLM+arithmetic coding approach wastes ~75% of CDF precision on large vocabularies. Our CDF-24 fix alone recovers 0.5 bpb. On top: a token N-gram that skips the GPU on predictable tokens, an adaptive bias head, llama.cpp backend (7× faster than PyTorch), multi-GPU parallel compression, and a binary file format (NC06) — the first LLM-based binary compressor we know of.
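
The CDF-precision point can be illustrated with a toy quantizer (our own illustrative numbers, not Nacrith's code): every symbol must receive at least one integer count, so on a 50k-token vocabulary a 16-bit table hands most of its budget to that mandatory floor, while a 24-bit table barely notices it:

```python
def quantize_cdf(probs, total):
    """Give each symbol floor(p * total) counts, at least 1, then
    hand the rounding remainder to the most probable symbol."""
    counts = [max(1, int(p * total)) for p in probs]
    counts[counts.index(max(counts))] += total - sum(counts)
    cdf = [0]
    for c in counts:
        cdf.append(cdf[-1] + c)
    return cdf  # cdf[i+1] - cdf[i] is symbol i's coding weight

# A peaked distribution over a large vocab: one likely token, many rare ones.
vocab = 50_000
probs = [0.9] + [0.1 / (vocab - 1)] * (vocab - 1)
cdf24 = quantize_cdf(probs, 1 << 24)
cdf16 = quantize_cdf(probs, 1 << 16)
# Share of the coding budget the likely token keeps under each precision:
share24 = (cdf24[1] - cdf24[0]) / (1 << 24)
share16 = (cdf16[1] - cdf16[0]) / (1 << 16)
assert share24 > share16  # wider CDF wastes less on the 1-count floor
```

In this toy setup, the 0.9-probability token keeps only about a quarter of a 16-bit budget (the 49,999 one-count floors eat roughly three quarters of it) but close to its true share of a 24-bit budget, which is exactly the kind of loss the CDF-24 fix targets.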

Runs on a GTX 1050 Ti. ~500 MB weights, ~1.2 GB VRAM per worker.

💻 Code: https://github.com/robtacconelli/Nacrith-GPU
⭐ Space: robtacconelli/Nacrith-GPU
📄 Paper: Nacrith: Neural Lossless Compression via Ensemble Context Modeling and High-Precision CDF Coding (2602.19626)

Try it, break it, share your results — all feedback welcome. ⭐ on the repo appreciated!

Results across all systems we tested:
- alice29.txt → 0.918 bpb (−44% vs CMIX, −20% vs ts_zip) — below the 2nd-order Shannon entropy bound
- enwik8 (100 MB) → 0.9389 bpb (−8% vs FineZip/LLMZip's 8B model, −15% vs ts_zip)
- Unseen text → 0.723 bpb on a doc published after training cutoff — no memorization, 26% better than FineZip/LLMZip on the same model

SmolLM2-135M by
HuggingFaceTB
branikita 
posted an update 1 day ago
Our engineer Alan Subin from Robonine has begun preparing to test the manipulator on the mobile two-wheeled platform.