A Survey of Context Engineering for Large Language Models
Paper
• 2507.13334
• Published
• 261
GUI-G^2: Gaussian Reward Modeling for GUI Grounding
Paper
• 2507.15846
• Published
• 133
ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents
Paper
• 2507.22827
• Published
• 100
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Paper
• 2508.18265
• Published
• 214
Group Sequence Policy Optimization
Paper
• 2507.18071
• Published
• 318
Why Language Models Hallucinate
Paper
• 2509.04664
• Published
• 196
Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search
Paper
• 2509.07969
• Published
• 59
Visual Representation Alignment for Multimodal Large Language Models
Paper
• 2509.07979
• Published
• 84
Detect Anything via Next Point Prediction
Paper
• 2510.12798
• Published
• 50
Less is More: Recursive Reasoning with Tiny Networks
Paper
• 2510.04871
• Published
• 509
Diffusion Language Models are Super Data Learners
Paper
• 2511.03276
• Published
• 129