Gaia2: Benchmarking LLM Agents on Dynamic and Asynchronous Environments Paper • 2602.11964 • Published 16 days ago • 12
P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads Paper • 2602.09443 • Published 18 days ago • 57
Code2World: A GUI World Model via Renderable Code Generation Paper • 2602.09856 • Published 18 days ago • 195
InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning Paper • 2602.06960 • Published 22 days ago • 13
Reinforcement World Model Learning for LLM-based Agents Paper • 2602.05842 • Published 23 days ago • 27
JustRL: Scaling a 1.5B LLM with a Simple RL Recipe Paper • 2512.16649 • Published Dec 18, 2025 • 27
P1: Mastering Physics Olympiads with Reinforcement Learning Paper • 2511.13612 • Published Nov 17, 2025 • 134
The Smol Training Playbook 📚: The secrets to building world-class LLMs • Featured • 3.02k
Scaling Agent Learning via Experience Synthesis Paper • 2511.03773 • Published Nov 5, 2025 • 82
The End of Manual Decoding: Towards Truly End-to-End Language Models Paper • 2510.26697 • Published Oct 30, 2025 • 117
The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution Paper • 2510.25726 • Published Oct 29, 2025 • 46
Glyph: Scaling Context Windows via Visual-Text Compression Paper • 2510.17800 • Published Oct 20, 2025 • 68