Finetuning Question
Hello, I was wondering if you could share the notebook or script you use with your datasets to distill the reasoning into the CoT, and not just the responses. When I fine-tune using your reasoning datasets, it tends to create a hybrid reasoning model with no CoT, one that will only sometimes reason through its response for certain prompts.
Interesting. We don't do any sort of 'reasoning-only loss' or anything like that; we just train until the loss is around 0.1-0.04 using slightly altered settings from the Unsloth notebook. Please add me on Discord (@armand0e) and I can share the scripts we've been using and help you debug your issue.
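For anyone else hitting this: the usual reason a model loses its CoT is that the reasoning trace isn't in the training target at all. A minimal sketch of what keeping it there looks like, with the field names (`question`, `reasoning`, `answer`) and the `<think>` tag convention being assumptions rather than the actual script:

```python
def format_example(ex: dict) -> dict:
    """Put the reasoning trace inside <think> tags ahead of the answer,
    so the training loss covers the chain of thought, not just the
    final response. Field names here are placeholders; adapt them to
    your dataset's schema."""
    target = f"<think>\n{ex['reasoning']}\n</think>\n{ex['answer']}"
    return {"text": f"### Question:\n{ex['question']}\n\n### Response:\n{target}"}

# Example record (hypothetical):
sample = {
    "question": "What is 6 * 7?",
    "reasoning": "Multiplying: 6 * 7 = 42.",
    "answer": "42",
}
formatted = format_example(sample)
print(formatted["text"])
```

If your formatting function only emits the `answer` field, the model never sees the reasoning during training, which matches the "sometimes reasons, sometimes doesn't" behavior described above.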
I added you and received the script. I believe I was using far too few training steps, as well as the wrong notebook for training the thinking models. Anyway, it is working better now, so thank you!