Olmo 3 Pre-training Collection All artifacts related to Olmo 3 pre-training • 10 items • Updated 5 days ago • 31
Olmo 3.1 Collection The latest members of the Olmo 3 family: another 3 weeks of RL for 32B Think, the 32B Instruct model, large post-training research datasets... • 9 items • Updated 5 days ago • 36
Olmo 3 Post-training Collection All artifacts for post-training Olmo 3. Datasets follow the model that resulted from training on them. • 32 items • Updated 5 days ago • 46
Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning Paper • 2512.05591 • Published 23 days ago • 16
Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning Paper • 2512.05591 • Published 23 days ago • 16
Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning Paper • 2512.05591 • Published 23 days ago • 16 • 2
The Path Not Taken: RLVR Provably Learns Off the Principals Paper • 2511.08567 • Published Nov 11 • 33
AEPO Collection The official datasets and model checkpoints of AEPO • 5 items • Updated 8 days ago • 4