RLAIF/dpo_thinking_reddit_judge_position_bias_extra_1e-6_0.02_4B_4B
Safetensors
Branch: main
Size: 8.84 GB
Contributors: 1
History: 3 commits
Latest commit: AngelRaychev, "Upload folder using huggingface_hub" (13c23fa, verified), 4 months ago
Files:
global_step_208: Upload folder using huggingface_hub, 4 months ago
.gitattributes: 1.59 kB, Safe, Upload folder using huggingface_hub, 4 months ago