Active filters: 4-bit
Each entry: model • pipeline • params • last updated • downloads • likes (fields missing on the original cards are left out).

MaziyarPanahi/Meta-Llama-3.1-70B-Instruct-GGUF • Text Generation • 71B • Updated • 120k • 40
unsloth/Meta-Llama-3.1-70B-Instruct-bnb-4bit • Text Generation • 73B • Updated • 5.53k • 32
hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4 • Text Generation • 8B • Updated • 9.03k • 40
Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int4 • Image-Text-to-Text • 74B • Updated • 462 • 29
Qwen/Qwen2.5-32B-Instruct-AWQ • Text Generation • 33B • Updated • 1.1M • 90
Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ • Text Generation • 2B • Updated • 90.8k • 4
unsloth/Qwen2.5-Coder-7B-bnb-4bit • Text Generation • 8B • Updated • 14.7k • 13
unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit • Text Generation • 4B • Updated • 97.9k • 8
unsloth/llava-1.5-7b-hf-bnb-4bit • Image-Text-to-Text • 4B • Updated • 194k • 5
da-fr/Mistral-NeMo-Minitron-8B-ARChitects-Full-bnb-4bit • 8B • Updated • 17 • 7
MaziyarPanahi/Llama-3.3-70B-Instruct-GGUF • Text Generation • 71B • Updated • 121k • 20
shuyuej/Llama-3.3-70B-Instruct-GPTQ • 71B • Updated • 1.06k • 6
MaziyarPanahi/Mistral-Small-24B-Instruct-2501-GGUF • Text Generation • 24B • Updated • 121k • 7
Qwen/Qwen2.5-VL-72B-Instruct-AWQ • Image-Text-to-Text • 74B • Updated • 80.7k • 69
unsloth/Llama-3.1-8B-unsloth-bnb-4bit • Text Generation • 5B • Updated • 5.18k • 5
empirischtech/DeepSeek-R1-Distill-Qwen-32B-gptq-4bit • Text Generation • Updated • 1.12k • 5
RichardErkhov/MrezaPRZ_-_CodeLlama-7B-postgres-expert-4bits • 7B • Updated • 14 • 1
secemp9/TraceBack-12b • Text Generation • 7B • Updated • 59 • 32
MaziyarPanahi/gemma-3-1b-it-GGUF • Text Generation • 1.0B • Updated • 122k • 11
RichardErkhov/zjj815_-_Qwen1.5-4B-Chinese-toxic-content-detection-4bits • 4B • Updated • 28 • 1
unsloth/Qwen3-14B-unsloth-bnb-4bit • Text Generation • 15B • Updated • 47.3k • 10
unsloth/Qwen3-1.7B-unsloth-bnb-4bit • Text Generation • 2B • Updated • 47.2k • 10
mlx-community/Qwen3-0.6B-4bit • Text Generation • Updated • 6.13k • 7
mlx-community/Qwen3-4B-4bit • Text Generation • Updated • 8.01k • 10
wolfofbackstreet/Qwen2.5-Omni-3B-4Bit • 6B • Updated • 13 • 4
Qwen/Qwen3-8B-AWQ • Text Generation • 8B • Updated • 97k • 30
Qwen/Qwen3-4B-AWQ • Text Generation • 4B • Updated • 144k • 20
Qwen/Qwen2.5-Omni-7B-AWQ • Any-to-Any • 11B • Updated • 32k • 15
Qwen/Qwen3-0.6B-MLX-4bit • Text Generation • 83.9M • Updated • 1.29k • 15
Qwen/Qwen3-14B-MLX-4bit • Text Generation • 2B • Updated • 420 • 6