Milad Aghajohari

miladink

AI & ML interests

NLP, ML, Multi-Agent RL, SSL, AI

Recent Activity

upvoted a paper 3 months ago

Grounding Computer Use Agents on Human Demonstrations

authored a paper 4 months ago

LOQA: Learning with Opponent Q-Learning Awareness

authored a paper 4 months ago

VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment

Organizations

upvoted a paper 3 months ago

Grounding Computer Use Agents on Human Demonstrations

Paper • 2511.07332 • Published Nov 10, 2025 • 106

authored 3 papers 4 months ago

LOQA: Learning with Opponent Q-Learning Awareness

Paper • 2405.01035 • Published May 2, 2024

VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment

Paper • 2410.01679 • Published Oct 2, 2024 • 27

The Markovian Thinker

Paper • 2510.06557 • Published Oct 8, 2025 • 31

upvoted a paper 4 months ago

The Markovian Thinker

Paper • 2510.06557 • Published Oct 8, 2025 • 31

commented a paper 4 months ago

Inference-Time Hyper-Scaling with KV Cache Compression

Paper • 2506.05345 • Published Jun 5, 2025 • 28

updated a model 5 months ago

MathMindsAGI/DT_1.5B_3X8K_YOLO_BF_20250823Tamia_seed_2746318213

Updated Aug 30, 2025

published a model 5 months ago

MathMindsAGI/DT_1.5B_3X8K_YOLO_BF_20250823Tamia_seed_2746318213

Updated Aug 30, 2025

updated a model 5 months ago

MathMindsAGI/DT_1.5B_3X4K_YOLO_20250823Tamia_seed_2746318213

Updated Aug 27, 2025

published a model 5 months ago

MathMindsAGI/DT_1.5B_3X4K_YOLO_20250823Tamia_seed_2746318213

Updated Aug 27, 2025

updated a model 5 months ago

MathMindsAGI/delethinkIter_R1DistillQwen1.5B_DEEPSCALER_BASELINE_20250812Baseline_seed_274631821

Updated Aug 26, 2025

published a model 5 months ago

MathMindsAGI/delethinkIter_R1DistillQwen1.5B_DEEPSCALER_BASELINE_20250812Baseline_seed_274631821

Updated Aug 26, 2025

updated a model 5 months ago

MathMindsAGI/delethinkIter_R1DistillQwen1.5B_DEEPSCALER_LONG_20250812Long_seed_2746318213

Updated Aug 26, 2025

published a model 5 months ago

MathMindsAGI/delethinkIter_R1DistillQwen1.5B_DEEPSCALER_LONG_20250812Long_seed_2746318213

Updated Aug 26, 2025

updated 3 models 5 months ago

published 3 models 5 months ago

MathMindsAGI/DT_1.5B_3X8K_YOLO_KL_20250823Tamia_seed_2746318213

Updated Aug 25, 2025

MathMindsAGI/DT_1.5B_3X8K_YOLO_DRGRPO_20250823Tamia_seed_2746318213

Updated Aug 25, 2025

MathMindsAGI/DT_1.5B_3X8K_YOLO_20250823Tamia_seed_2746318213

Updated Aug 26, 2025
