AI & ML interests

None defined yet.

Recent Activity

burtenshaw  updated a Space 3 days ago
openenv/sudoku
burtenshaw  updated a Space 3 days ago
openenv/browsergym_env
burtenshaw  updated a Space 3 days ago
openenv/repl
View all activity

burtenshaw 
updated a Space 9 days ago
sergiopaniego 
posted an update 10 days ago
view post
Post
397
Meet the Post-Training Toolkit (PTT), which easily integrates with TRL via a single callback, by Aditya Challapally ( @microsoft ):

🔍 Detects training issues early
🛠 Lets you intervene safely
📊 Keeps long training runs stable, auditable & efficient

Microsoft blog: https://devblogs.microsoft.com/engineering-at-microsoft/diagnosing-instability-in-production-scale-agent-rl/

Integration guide: https://huggingface.co/docs/trl/main/en/ptt_integration

Code: https://github.com/microsoft/post-training-toolkit
sergiopaniego 
posted an update 11 days ago
sergiopaniego 
posted an update 13 days ago
sergiopaniego 
posted an update 20 days ago
view post
Post
1596
FunctionGemma Tuning Lab is a new no-code tool by @google that lets you fine-tune a model directly from the browser, with no coding knowledge required, using TRL behind the scenes.

blog: https://developers.googleblog.com/a-guide-to-fine-tuning-functiongemma/

try it out: google/functiongemma-tuning-lab

This example builds on a more advanced one for learning fine-tuning with SFT using TRL: https://ai.google.dev/gemma/docs/functiongemma/finetuning-with-functiongemma
  • 1 reply
·
sergiopaniego 
posted an update 23 days ago
sergiopaniego 
posted an update 26 days ago
view post
Post
2999
New REPL environment in OpenEnv available! ✨
Used in the Recursive Language Models (RLM) paper by Alex Zhang.

Ready for inference & post-training using trajectories. Handles long contexts:

> Run Python code in a sandbox
> Make recursive calls to LMs
> Explore data programmatically
> Return final result

Docs: https://meta-pytorch.org/OpenEnv/environments/repl/
Inference script: https://github.com/meta-pytorch/OpenEnv/blob/main/examples/repl_oolong_simple.py
sergiopaniego 
posted an update 27 days ago
view post
Post
486
Recursive Language Models (RLM) is a new interface for LLMs with cool ideas by Alex Zhang!

⚠️ LLMs struggle with long prompts → attention overload & lost info
🔄 RLMs inspect, split & call themselves on chunks, then aggregate results
✅ Handles millions of tokens, reduces noise, improves reasoning
💡 System prompt guides recursion
🎯 RLM trajectories can be used for RL training or distillation (OpenEnv+TRL!!)

We're adding it to OpenEnv (with Kashif Rasul): https://github.com/meta-pytorch/OpenEnv/pull/282

More resources:

> Paper: Recursive Language Models (2512.24601)
> Paper blog: https://alexzhang13.github.io/blog/2025/rlm/
> RLM repo: https://github.com/alexzhang13/rlm
  • 2 replies
·
sergiopaniego 
posted an update about 1 month ago