QED-Nano: Teaching a Tiny Model to Prove Hard Theorems (Featured, 41 likes) - Who needs 1T parameters? Olympiad proofs with a 4B model
Bringing paper to life: A modern template for scientific writing (52 likes) - Download a ready-to-use scientific paper template
The Ultra-Scale Playbook (3.7k likes) - The ultimate guide to training LLMs on large GPU clusters
Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks (88 likes) - Evaluate multilingual models using FineTasks
TxT360: Trillion Extracted Text (132 likes) - Explore and download the TxT360 LLM pre-training dataset