ML Foundations Development

non-profit

https://github.com/mlfoundations

Recent Activity

penfever authored a paper 4 days ago

OpenThoughts-Agent: Data Recipes for Agentic Models

RZ412 authored a paper 7 days ago

OpenThoughts-Agent: Data Recipes for Agentic Models

penfever published a dataset 28 days ago

mlfoundations-dev/sachin_khanacademy_web_scraped_questions

authored a paper 4 days ago

OpenThoughts-Agent: Data Recipes for Agentic Models

Paper • 2606.24855 • Published 9 days ago • 46

published 6 datasets 28 days ago

mlfoundations-dev/sachin_khanacademy_web_scraped_questions

Updated Oct 25, 2024 • 27

mlfoundations-dev/slim-orca

Viewer • Updated Oct 28, 2024 • 260k • 12

mlfoundations-dev/subsampled_flan_v2_w_system_instructions

Viewer • Updated Oct 14, 2024 • 3.56M • 20

mlfoundations-dev/subsampled_flan_v2

Viewer • Updated Oct 10, 2024 • 3.51M • 20

mlfoundations-dev/TIGER-Lab-WebInstructFull-copy

Viewer • Updated Oct 20, 2024 • 11.6M • 8

mlfoundations-dev/open-orca

Preview • Updated Oct 18, 2024 • 4

authored 7 papers 3 months ago

DWIM: Towards Tool-aware Visual Reasoning via Discrepancy-aware Workflow Generation & Instruct-Masking Tuning

Paper • 2503.19263 • Published Mar 25, 2025 • 2

Executable Functional Abstractions: Inferring Generative Programs for Advanced Math Problems

Paper • 2504.09763 • Published Apr 14, 2025 • 12

OpenThoughts: Data Recipes for Reasoning Models

Paper • 2506.04178 • Published Jun 4, 2025 • 56

One Life to Learn: Inferring Symbolic World Models for Stochastic Environments from Unguided Exploration

Paper • 2510.12088 • Published Oct 14, 2025 • 5

PRInTS: Reward Modeling for Long-Horizon Information Seeking

Paper • 2511.19314 • Published Nov 24, 2025 • 8

Cog-DRIFT: Exploration on Adaptively Reformulated Instances Enables Learning from Hard Reasoning Problems

Paper • 2604.04767 • Published Apr 6 • 7

Playing Along: Learning a Double-Agent Defender for Belief Steering via Theory of Mind

Paper • 2604.11666 • Published Apr 13 • 4

submitted a paper to Daily Papers 3 months ago

Test-Time Scaling Makes Overtraining Compute-Optimal

Paper • 2604.01411 • Published Apr 1 • 28

submitted a paper to Daily Papers 3 months ago

Composer 2 Technical Report

Paper • 2603.24477 • Published Mar 25 • 20

submitted a paper to Daily Papers 4 months ago

Calibrate-Then-Act: Cost-Aware Exploration in LLM Agents

Paper • 2602.16699 • Published Feb 18 • 16

authored a paper and submitted it to Daily Papers 4 months ago

UniT: Unified Multimodal Chain-of-Thought Test-time Scaling

Paper • 2602.12279 • Published Feb 12 • 20