Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models • Paper • 2601.22060 • Published Jan 29 • 158
Code2World: A GUI World Model via Renderable Code Generation • Paper • 2602.09856 • Published 21 days ago • 196
PaperBanana: Automating Academic Illustration for AI Scientists • Paper • 2601.23265 • Published Jan 30 • 207
Does Your Reasoning Model Implicitly Know When to Stop Thinking? • Paper • 2602.08354 • Published 22 days ago • 255
Green-VLA: Staged Vision-Language-Action Model for Generalist Robots • Paper • 2602.00919 • Published about 1 month ago • 307
CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation • Paper • 2602.24286 • Published 4 days ago • 57
Reasoning Path and Latent State Analysis for Multi-view Visual Spatial Reasoning: A Cognitive Science Perspective • Paper • 2512.02340 • Published Dec 2, 2025 • 1
Solaris: Building a Multiplayer Video World Model in Minecraft • Paper • 2602.22208 • Published 6 days ago • 27
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration • Paper • 2602.05400 • Published 26 days ago • 343