HRM-Text: Efficient Pretraining Beyond Scaling Paper • 2605.20613 • Published 15 days ago • 210
Progressive Residual Warmup for Language Model Pretraining Paper • 2603.05369 • Published Mar 5 • 36