Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora • Paper • 2604.24819 • Published 4 days ago • 79
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning • Paper • 2604.02721 • Published 28 days ago • 375
ClawBench: Can AI Agents Complete Everyday Online Tasks? • Paper • 2604.08523 • Published 22 days ago • 261