On Data Engineering for Scaling LLM Terminal Capabilities Paper • 2602.21193 • Published 9 days ago • 90
Nemotron-Terminal Collection • We are releasing Nemotron-Terminal models and training datasets. • 5 items • Updated 1 day ago • 25
Revisiting the Platonic Representation Hypothesis: An Aristotelian View Paper • 2602.14486 • Published 17 days ago • 11
Does Socialization Emerge in AI Agent Society? A Case Study of Moltbook Paper • 2602.14299 • Published 18 days ago • 26
Reinforced Fast Weights with Next-Sequence Prediction Paper • 2602.16704 • Published 15 days ago • 13
Health AI Developer Foundations (HAI-DEF) Collection • Groups models released for use in health AI by Google. Read more about HAI-DEF at http://goo.gle/hai-def • 22 items • Updated Jan 12 • 200
Judging What We Cannot Solve: A Consequence-Based Approach for Oracle-Free Evaluation of Research-Level Math Paper • 2602.06291 • Published 27 days ago • 23
Revisiting the Shape Convention of Transformer Language Models Paper • 2602.06471 • Published 27 days ago • 4
Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR Paper • 2602.05261 • Published 28 days ago • 49
Horizon-LM: A RAM-Centric Architecture for LLM Training Paper • 2602.04816 • Published 29 days ago • 17
Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation Paper • 2601.22813 • Published Jan 30 • 57
Linear representations in language models can change dramatically over a conversation Paper • 2601.20834 • Published Jan 28 • 21
Scaling Embeddings Outperforms Scaling Experts in Language Models Paper • 2601.21204 • Published Jan 29 • 100
CGPT: Cluster-Guided Partial Tables with LLM-Generated Supervision for Table Retrieval Paper • 2601.15849 • Published Jan 22 • 14
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability Paper • 2601.18778 • Published Jan 26 • 41