• GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models (arXiv:2508.06471, 210 upvotes)
• NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model (arXiv:2508.14444, 46 upvotes)
• Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities (arXiv:2507.06261, 67 upvotes)
• MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention (arXiv:2506.13585, 274 upvotes)
• arXiv:2506.10910 (67 upvotes)
• arXiv:2505.09388 (339 upvotes)
• MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining (arXiv:2505.07608, 82 upvotes)
• Phi-4-reasoning Technical Report (arXiv:2504.21318, 54 upvotes)
• Llama-Nemotron: Efficient Reasoning Models (arXiv:2505.00949, 42 upvotes)
• CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training (arXiv:2504.13161, 97 upvotes)
• DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning (arXiv:2504.07128, 87 upvotes)
• Rethinking Reflection in Pre-Training (arXiv:2504.04022, 80 upvotes)
• OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens (arXiv:2504.07096, 77 upvotes)
• arXiv:2503.19786 (55 upvotes)
• LIMO: Less is More for Reasoning (arXiv:2502.03387, 62 upvotes)
• Skywork Open Reasoner 1 Technical Report (arXiv:2505.22312, 54 upvotes)
• Every Sample Matters: Leveraging Mixture-of-Experts and High-Quality Data for Efficient and Accurate Code LLM (arXiv:2503.17793, 24 upvotes)
• RedStone: Curating General, Code, Math, and QA Data for Large Language Models (arXiv:2412.03398, 2 upvotes)
• Nemotron-CC-Math: A 133 Billion-Token-Scale High Quality Math Pretraining Dataset (arXiv:2508.15096, 7 upvotes)
• RLBFF: Binary Flexible Feedback to Bridge between Human Feedback & Verifiable Rewards (arXiv:2509.21319, 8 upvotes)
• StarCoder 2 and The Stack v2: The Next Generation (arXiv:2402.19173, 154 upvotes)
• Nemotron-Cascade: Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models (arXiv:2512.13607, 37 upvotes)
• Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning (arXiv:2512.20848, 41 upvotes)
• X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests (arXiv:2601.06953, 46 upvotes)