suu

Suu

AI & ML interests

None yet

Recent Activity

upvoted a paper 9 days ago

Olmo 3

Paper • 2512.13961 • Published 12 days ago • 22

upvoted a paper 10 days ago

Step-GUI Technical Report

Paper • 2512.15431 • Published 11 days ago • 123

upvoted a collection 11 days ago

Olmo 3 Pre-training

All artifacts related to Olmo 3 pre-training • 10 items • Updated 5 days ago • 31

upvoted a collection 12 days ago

Olmo 3.1

The latest members of the Olmo 3 family: another 3 weeks of RL for 32B Think, the 32B Instruct model, large post-training research datasets... • 9 items • Updated 5 days ago • 36

liked a model 13 days ago

allenai/Olmo-3.1-32B-Think

Text Generation • 32B • Updated 13 days ago • 3.72k • 59

upvoted 2 collections 16 days ago

Olmo 3

Artifacts for the Olmo 3 release. • 9 items • Updated 5 days ago • 156

Olmo 3 Post-training

All artifacts for post-training Olmo 3. Datasets follow the model that resulted from training on them. • 32 items • Updated 5 days ago • 46

liked 2 models 16 days ago

allenai/Olmo-3-32B-Think

Text Generation • 1.05M • Updated 16 days ago • 10.1k • 163

allenai/Olmo-3-1125-32B

Text Generation • 32B • Updated 25 days ago • 6.78k • 101

commented a paper 17 days ago

Soft Adaptive Policy Optimization

Paper • 2511.20347 • Published Nov 25 • 41

authored a paper 20 days ago

Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning

Paper • 2512.05591 • Published 23 days ago • 16

updated a collection 20 days ago

KlearReasoner

KlearReasoner • 7 items • Updated 19 days ago • 5

upvoted a paper 20 days ago

Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning

Paper • 2512.05591 • Published 23 days ago • 16

commented a paper 20 days ago

Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning

Paper • 2512.05591 • Published 23 days ago • 16

liked a model about 1 month ago

deepseek-ai/DeepSeek-Math-V2

Text Generation • 685B • Updated about 1 month ago • 11.3k • 669

upvoted a paper about 2 months ago

The Path Not Taken: RLVR Provably Learns Off the Principals

Paper • 2511.08567 • Published Nov 11 • 33

upvoted a collection 2 months ago

AEPO

The official datasets and model checkpoints of AEPO • 5 items • Updated 8 days ago • 4

upvoted a paper 2 months ago

Agentic Entropy-Balanced Policy Optimization

Paper • 2510.14545 • Published Oct 16 • 104

updated a collection 3 months ago

KlearReasoner

KlearReasoner • 7 items • Updated 19 days ago • 5

updated a model 3 months ago

Kwai-Klear/Klear-Reasoner-8B-SFT

8B • Updated Sep 27 • 19 • 2