PAN: A World Model for General, Interactable, and Long-Horizon World Simulation • Paper • 2511.09057 • Published Nov 12, 2025 • 76
TinySQL: A Progressive Text-to-SQL Dataset for Mechanistic Interpretability Research • Paper • 2503.12730 • Published Mar 17, 2025 • 4
Plan and Budget: Effective and Efficient Test-Time Scaling on Large Language Model Reasoning • Paper • 2505.16122 • Published May 22, 2025 • 5