mmBERT: a modern multilingual encoder Collection mmBERT is trained on 3T tokens from over 1,800 languages, achieving SoTA benchmark scores and exceptional low-resource performance. • 16 items • Updated Sep 9, 2025 • 51
Falcon-H1 Collection Falcon-H1 Family of Hybrid-Head Language Models (Transformer-SSM), including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B (pretrained & instruction-tuned). • 39 items • Updated Jan 9 • 59
Falcon Edge series Collection A series of powerful, universal, and fine-tunable small language models. • 7 items • Updated Nov 6, 2025 • 25
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6, 2025 • 189
Qwen3 Collection Qwen's new Qwen3 models, in Unsloth Dynamic 2.0, GGUF, and 4-bit and 16-bit Safetensors formats. Includes 128K context length variants. • 79 items • Updated about 16 hours ago • 261
BitNet Collection 🔥 BitNet family of large language models (1-bit LLMs). • 7 items • Updated May 1, 2025 • 56
Granite Experiments Collection Experimental projects under consideration for the Granite family. • 22 items • Updated 27 days ago • 15
Granite 3.3 Language Models Collection Our latest language models, licensed under Apache 2.0. • 4 items • Updated Nov 17, 2025 • 45
ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization Paper • 2502.02631 • Published Feb 4, 2025 • 4
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16, 2025 • 167