shoaibmohd's Collections: Learning from examples - training/inference
ExGRPO: Learning to Reason from Experience
Paper
• 2510.02245
• Published
• 80
A Practitioner's Guide to Multi-turn Agentic Reinforcement Learning
Paper
• 2510.01132
• Published
• 6
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
Paper
• 2510.04618
• Published
• 129
MixReasoning: Switching Modes to Think
Paper
• 2510.06052
• Published
• 22
Agent Learning via Early Experience
Paper
• 2510.08558
• Published
• 273
Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks
Paper
• 2510.08002
• Published
• 23
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs
Paper
• 2510.07499
• Published
• 48
Dr.LLM: Dynamic Layer Routing in LLMs
Paper
• 2510.12773
• Published
• 32
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning
Paper
• 2511.16043
• Published
• 109
Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning
Paper
• 2511.14460
• Published
• 21