R-HORIZON Models Collection models of R-HORIZON: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth? • 5 items • Updated Mar 10 • 1
Advancing Block Diffusion Language Models for Test-Time Scaling Paper • 2602.09555 • Published Feb 10 • 4
OPE: Overcoming Information Saturation in Parallel Thinking via Outline-Guided Path Exploration Paper • 2602.08344 • Published Feb 9 • 5
OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions Paper • 2602.05843 • Published Feb 5 • 61
EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience Paper • 2601.15876 • Published Jan 22 • 92
Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models Paper • 2601.14004 • Published Jan 20 • 47
OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent Paper • 2601.07779 • Published Jan 12 • 28
Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone Paper • 2512.22615 • Published Dec 27, 2025 • 50
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published Nov 6, 2025 • 242
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows Paper • 2510.24411 • Published Oct 28, 2025 • 72
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence Paper • 2510.23538 • Published Oct 27, 2025 • 98
BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping Paper • 2510.18927 • Published Oct 21, 2025 • 85
R-HORIZON Collection The training and evaluation datasets for Paper "How Far Can Your Large Reasoning Model Really Go in Breadth and Depth?" • 6 items • Updated Oct 22, 2025 • 8
MOSS-Speech: Towards True Speech-to-Speech Models Without Text Guidance Paper • 2510.00499 • Published Oct 1, 2025 • 21
R-Horizon: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth? Paper • 2510.08189 • Published Oct 9, 2025 • 27
R-HORZION Datasets Collection Training and evaluation datasets of R-HORIZON: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth? • 6 items • Updated Mar 9 • 6
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data Paper • 2509.15221 • Published Sep 18, 2025 • 111
Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents Paper • 2509.09265 • Published Sep 11, 2025 • 47