VNHSGE: VietNamese High School Graduation Examination Dataset for Large Language Models Paper • 2305.12199 • Published May 20, 2023 • 1
WMPO: World Model-based Policy Optimization for Vision-Language-Action Models Paper • 2511.09515 • Published Nov 12, 2025 • 18
Reasoning with Confidence: Efficient Verification of LLM Reasoning Steps via Uncertainty Heads Paper • 2511.06209 • Published Nov 9, 2025 • 18
Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs Paper • 2507.09477 • Published Jul 13, 2025 • 86
LongCodeZip: Compress Long Context for Code Language Models Paper • 2510.00446 • Published Oct 1, 2025 • 106
Reverse-Engineered Reasoning for Open-Ended Generation Paper • 2509.06160 • Published Sep 7, 2025 • 150
SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering? Paper • 2502.12115 • Published Feb 17, 2025 • 46
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR Paper • 2508.14029 • Published Aug 19, 2025 • 118
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL Paper • 2508.13167 • Published Aug 6, 2025 • 129
Introducing smolagents: simple agents that write actions in code. Article • Published Dec 31, 2024 • 1.16k
Agent Lightning: Train ANY AI Agents with Reinforcement Learning Paper • 2508.03680 • Published Aug 5, 2025 • 122
Efficient Agents: Building Effective Agents While Reducing Cost Paper • 2508.02694 • Published Jul 24, 2025 • 86
Cyber-Zero: Training Cybersecurity Agents without Runtime Paper • 2508.00910 • Published Jul 29, 2025 • 8
SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories? Paper • 2507.12415 • Published Jul 16, 2025 • 42
DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge Article • Published Feb 7, 2025 • 267
BENCHAGENTS: Automated Benchmark Creation with Agent Interaction Paper • 2410.22584 • Published Oct 29, 2024 • 1
Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models Paper • 2506.05176 • Published Jun 5, 2025 • 77