LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories • arXiv:2604.15311 • Published 12 days ago • 12 upvotes
Cross-Tokenizer LLM Distillation through a Byte-Level Interface • arXiv:2604.07466 • Published 15 days ago • 6 upvotes
LongAct: Harnessing Intrinsic Activation Patterns for Long-Context Reinforcement Learning • arXiv:2604.14922 • Published 12 days ago • 7 upvotes
Exploration and Exploitation Errors Are Measurable for Language Model Agents • arXiv:2604.13151 • Published 14 days ago • 24 upvotes
Tracing the Roots: A Multi-Agent Framework for Uncovering Data Lineage in Post-Training LLMs • arXiv:2604.10480 • Published 16 days ago • 20 upvotes
You Only Judge Once: Multi-response Reward Modeling in a Single Forward Pass • arXiv:2604.10966 • Published 15 days ago • 11 upvotes
How to Fine-Tune a Reasoning Model? A Teacher-Student Cooperation Framework to Synthesize Student-Consistent SFT Data • arXiv:2604.14164 • Published Mar 23 • 34 upvotes
KV Packet: Recomputation-Free Context-Independent KV Caching for LLMs • arXiv:2604.13226 • Published 14 days ago • 10 upvotes
Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation • arXiv:2604.10098 • Published 17 days ago • 76 upvotes
Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music • arXiv:2604.10905 • Published 15 days ago • 28 upvotes
VLMs Need Words: Vision Language Models Ignore Visual Detail In Favor of Semantic Anchors • arXiv:2604.02486 • Published 26 days ago • 10 upvotes
FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling • arXiv:2604.06916 • Published 20 days ago • 34 upvotes
Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning • arXiv:2604.04746 • Published 20 days ago • 70 upvotes
Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction • arXiv:2505.11254 • Published May 16, 2025 • 49 upvotes
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU • arXiv:2502.08910 • Published Feb 13, 2025 • 150 upvotes