TUA-Bench: A Benchmark for General-Purpose Terminal-Use Agents • Paper • 2606.28480 • Published 7 days ago • 44
The Geometry of Reasoning: Flowing Logics in Representation Space • Paper • 2510.09782 • Published Oct 10, 2025 • 7
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents • Paper • 2408.07060 • Published Aug 13, 2024 • 41