BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing? Paper • 2603.03194 • Published 1 day ago • 50
MiniCPM4 Collection MiniCPM4: Ultra-Efficient LLMs on End Devices • 30 items • Updated 22 days ago • 84
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Paper • 2505.22617 • Published May 28, 2025 • 131
Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models Paper • 2503.21380 • Published Mar 27, 2025 • 38
ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering Paper • 2503.16867 • Published Mar 21, 2025 • 12
An Empirical Study on Eliciting and Improving R1-like Reasoning Models Paper • 2503.04548 • Published Mar 6, 2025 • 9
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models Paper • 2501.03262 • Published Jan 4, 2025 • 104