Why is the file size 2 GB?
I wonder why the file is so large when the underlying model is a 0.5B Qwen. I had expected at most 1 GB for the model in 8-bit plus the backend/runtime.
Hi @phi0112358! Thanks for your comment. I will verify this in the next few days, but my guess is that the data type was silently upcast from bf16 (which is what Qwen 0.5B/DuoGuard was trained with) to fp32 during the ONNX export step. This would explain the nearly 2x size increase (the original safetensors file is 988 MB). bf16 support is spotty among accelerator/CPU providers, so I'm hesitant to make the canonical encoderfile in anything but fp32, but I will look into whether we can decrease the size. At the very least, weight compression is already on the docket ;) Hope to have a response for you soon!
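For anyone who wants to check the arithmetic behind that guess, here is a rough back-of-the-envelope sketch. It assumes the 988 MB safetensors file is essentially all bf16 weights (ignoring the small header), uses decimal megabytes, and leaves out the size of the runtime bundled into the encoderfile. The helper names `estimate_params` and `estimate_size_mb` are made up for illustration and are not part of any library.

```python
# Bytes needed to store one parameter in each weight dtype.
BYTES_PER_PARAM = {"int8": 1, "bf16": 2, "fp32": 4}


def estimate_params(file_bytes: int, dtype: str) -> int:
    """Approximate parameter count, assuming the file holds only weights."""
    return file_bytes // BYTES_PER_PARAM[dtype]


def estimate_size_mb(n_params: int, dtype: str) -> float:
    """Approximate weight size in decimal megabytes for a given dtype."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e6


# 988 MB is the safetensors size quoted in the reply above (assumed bf16).
safetensors_bytes = 988_000_000
n_params = estimate_params(safetensors_bytes, "bf16")

for dtype in ("int8", "bf16", "fp32"):
    print(f"{dtype}: ~{estimate_size_mb(n_params, dtype):.0f} MB")
```

Under these assumptions the fp32 estimate comes out at roughly twice the bf16 size, which is consistent with a file of about 2 GB, while an 8-bit export would be about half of the bf16 size before adding the runtime.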