mmBERT: a modern multilingual encoder Collection mmBERT is trained on 3T tokens from over 1,800 languages, achieving SoTA benchmark scores and exceptional low-resource performance. • 16 items • Updated Sep 9, 2025 • 51
Falcon-H1 Collection Falcon-H1 Family of Hybrid-Head Language Models (Transformer-SSM), including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B (pretrained & instruction-tuned). • 39 items • Updated Jan 9 • 59
Falcon Edge series Collection A series of powerful, universal, and fine-tunable small language models. • 7 items • Updated Nov 6, 2025 • 25
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6, 2025 • 189
Qwen3 Collection Qwen's new Qwen3 models, in Unsloth Dynamic 2.0, GGUF, and 4-bit and 16-bit Safetensors formats. Includes 128K context length variants. • 79 items • Updated about 16 hours ago • 261
BitNet Collection 🔥 BitNet family of large language models (1-bit LLMs). • 7 items • Updated May 1, 2025 • 56
Granite Experiments Collection Experimental projects under consideration for the Granite family. • 22 items • Updated 27 days ago • 15
Granite 3.3 Language Models Collection Our latest language models, licensed under Apache 2.0. • 4 items • Updated Nov 17, 2025 • 45
ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization Paper • 2502.02631 • Published Feb 4, 2025 • 4
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16, 2025 • 167