ClawGym: A Scalable Framework for Building Effective Claw Agents Paper • 2604.26904 • Published 13 days ago • 50
Article Training mRNA Language Models Across 25 Species for $165 OpenMed • Mar 31 • 27
InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation Paper • 2507.17520 • Published Jul 23, 2025 • 15
Article Page-to-Video: Generate videos from webpages 🪄🎬 burtenshaw • May 6, 2025 • 27
Hymba: A Hybrid-head Architecture for Small Language Models Paper • 2411.13676 • Published Nov 20, 2024 • 48
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance Paper • 2409.04593 • Published Sep 6, 2024 • 26
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 711
Article Llama 3.1 - 405B, 70B & 8B with multilinguality and long context philschmid, osanseviero, alvarobartt, lvwerra, dvilasuero, reach-vb, marcsun13, pcuenq • Jul 23, 2024 • 241
Retrieval-Enhanced Machine Learning: Synthesis and Opportunities Paper • 2407.12982 • Published Jul 17, 2024 • 6
Model Merging and Safety Alignment: One Bad Model Spoils the Bunch Paper • 2406.14563 • Published Jun 20, 2024 • 30
Article Welcome Gemma - Google’s new open LLM philschmid, osanseviero, pcuenq • Feb 21, 2024 • 26
abliterated-v3 Collection Latest gen of the abliterated models I've produced • 17 items • Updated Jun 3, 2024 • 139
Article SeeMoE: Implementing a MoE Vision Language Model from Scratch AviSoori1x • Jun 23, 2024 • 39
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect Paper • 2403.03853 • Published Mar 6, 2024 • 65
Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding Paper • 2401.12954 • Published Jan 23, 2024 • 32
LLM in a flash: Efficient Large Language Model Inference with Limited Memory Paper • 2312.11514 • Published Dec 12, 2023 • 264
Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers Paper • 2311.10642 • Published Nov 17, 2023 • 25