Models

45,407

Active filters: 4-bit

unsloth/DeepSeek-R1-Distill-Qwen-7B-unsloth-bnb-4bit

Text Generation • 8B • Updated Feb 14, 2025 • 24.2k • 25

Qwen/Qwen2.5-VL-3B-Instruct-AWQ

Image-Text-to-Text • 4B • Updated Apr 6, 2025 • 40.9k • 64

iFaz/llama32_3B_en_emo_2000_stp

Text Generation • 3B • Updated Mar 7, 2025 • 10 • 1

mlx-community/gemma-3-text-27b-it-4bit

Text Generation • Updated Mar 18, 2025 • 164 • 3

sleepdeprived3/Reformed-Christian-Bible-Expert-v2.1-12B_EXL2_4bpw_H8

Text Generation • Updated Apr 20, 2025 • 13 • 1

unsloth/Qwen3-8B-unsloth-bnb-4bit

Updated May 13, 2025 • 126k • 20

unsloth/Qwen3-8B-bnb-4bit

Updated May 13, 2025 • 38.7k • 10

lmstudio-community/Phi-4-mini-reasoning-MLX-4bit

Text Generation • 0.6B • Updated May 1, 2025 • 58.5k • 4

mlx-community/Qwen3-Embedding-0.6B-4bit-DWQ

Text Generation • Updated Jun 6, 2025 • 17.8k • 9

unsloth/LFM2-700M-unsloth-bnb-4bit

Text Generation • 0.8B • Updated Jul 14, 2025 • 32 • 1

steampunque/GLM-Z1-9B-0414-MP-GGUF

9B • Updated Feb 18 • 21 • 2

mlx-community/Qwen3-30B-A3B-Instruct-2507-4bit

Text Generation • Updated Aug 6, 2025 • 1.15k • 10

QuantTrio/Qwen3-30B-A3B-Thinking-2507-AWQ-BF16Mix

Text Generation • 31B • Updated Sep 5, 2025 • 1.17k • 5

lmstudio-community/Qwen3-4B-Thinking-2507-MLX-4bit

Text Generation • 0.6B • Updated Aug 6, 2025 • 61.8k • 13

mlx-community/Qwen3-Next-80B-A3B-Instruct-4bit

Text Generation • Updated Sep 12, 2025 • 2.7k • 24

LeDXIII/NuMarkdown-8B-Thinking-bnb4

8B • Updated Sep 16, 2025 • 21 • 1

QuantTrio/Qwen3-VL-30B-A3B-Instruct-AWQ

Text Generation • 31B • Updated Oct 8, 2025 • 1.25M • 43

SEOKDONG/gpt-oss-safeguard-20b-kor-enterprise-gptq-4bit

Text Generation • 21B • Updated Dec 2, 2025 • 69 • 4

mlx-community/Trinity-Mini-4bit

Text Generation • Updated Dec 2, 2025 • 47 • 2

mlx-community/Ministral-3-3B-Instruct-2512-4bit

Updated Dec 3, 2025 • 21.2k • 5

MaziyarPanahi/GLM-4.6V-Flash-GGUF

Text Generation • 9B • Updated Dec 8, 2025 • 80.2k • 6

mlx-community/LFM2.5-VL-1.6B-4bit

Image-Text-to-Text • 0.6B • Updated Jan 6 • 5.09k • 3

unsloth/medgemma-1.5-4b-it-unsloth-bnb-4bit

Image-Text-to-Text • 4B • Updated Jan 14 • 1.81k • 3

steampunque/GLM-4.7-Flash-MP-GGUF

30B • Updated 2 days ago • 762 • 1

mlx-community/DeepSeek-OCR-2-4bit

Image-Text-to-Text • 0.9B • Updated Jan 28 • 267 • 1

lmstudio-community/Qwen3-Coder-Next-MLX-4bit

80B • Updated Feb 2 • 224k • 23

mlx-community/Qwen3.5-397B-A17B-nvfp4

Text Generation • 396B • Updated Feb 16 • 362 • 5

saricles/Qwen3-Coder-Next-NVFP4-GB10

Text Generation • Updated Mar 1 • 8.66k • 28

mlx-community/Qwen3.5-27B-4bit

Image-Text-to-Text • 5B • Updated Feb 24 • 96.9k • 47

mlx-community/Qwen3.5-35B-A3B-4bit

Image-Text-to-Text • 6B • Updated Feb 24 • 8.19k • 37