James PRO

jtatman

AI & ML interests

Improving domain-specific models and re-sampling data, refining datasets for use in different modalities, small-scale micro-LLM clusters using quantized and smoothed models, and all emerging technologies that connect the LLM stack. Small models rock.

Recent Activity

liked a dataset 10 days ago

HuggingFaceH4/ultrafeedback_binarized

upvoted a paper 11 days ago

ClawGym: A Scalable Framework for Building Effective Claw Agents

liked a model 11 days ago

syntheticbot/clip-face-attribute-classifier



upvoted a paper 11 days ago

ClawGym: A Scalable Framework for Building Effective Claw Agents

Paper • 2604.26904 • Published 13 days ago • 50

upvoted a paper 15 days ago

OpenGame: Open Agentic Coding for Games

Paper • 2604.18394 • Published 22 days ago • 78

upvoted an article 18 days ago

Training mRNA Language Models Across 25 Species for $165

Article • OpenMed • Mar 31 • 27

upvoted a paper 9 months ago

InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation

Paper • 2507.17520 • Published Jul 23, 2025 • 15

upvoted an article about 1 year ago

Page-to-Video: Generate videos from webpages 🪄🎬

Article • burtenshaw • May 6, 2025 • 27

upvoted 2 papers over 1 year ago

Hymba: A Hybrid-head Architecture for Small Language Models

Paper • 2411.13676 • Published Nov 20, 2024 • 48

Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance

Paper • 2409.04593 • Published Sep 6, 2024 • 26

upvoted a collection almost 2 years ago

Llama 3.1

Collection

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 711

upvoted an article almost 2 years ago

Llama 3.1 - 405B, 70B & 8B with multilinguality and long context

Article • philschmid, osanseviero, alvarobartt, lvwerra, dvilasuero, reach-vb, marcsun13, pcuenq • Jul 23, 2024 • 241

upvoted 2 papers almost 2 years ago

Retrieval-Enhanced Machine Learning: Synthesis and Opportunities

Paper • 2407.12982 • Published Jul 17, 2024 • 6

Model Merging and Safety Alignment: One Bad Model Spoils the Bunch

Paper • 2406.14563 • Published Jun 20, 2024 • 30

upvoted an article almost 2 years ago

Welcome Gemma - Google’s new open LLM

Article • philschmid, osanseviero, pcuenq • Feb 21, 2024 • 26

upvoted a collection almost 2 years ago

abliterated-v3

Collection

Latest gen of the abliterated models I've produced • 17 items • Updated Jun 3, 2024 • 139

upvoted an article about 2 years ago

SeeMoE: Implementing a MoE Vision Language Model from Scratch

Article • AviSoori1x • Jun 23, 2024 • 39

upvoted a paper about 2 years ago

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect

Paper • 2403.03853 • Published Mar 6, 2024 • 65

upvoted 4 papers over 2 years ago

Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers

Paper • 2311.10642 • Published Nov 17, 2023 • 25
