Active filters: 4-bit
Each entry: model • pipeline • params • last updated • downloads • likes (fields missing on the original cards are left out).

MaziyarPanahi/Meta-Llama-3.1-70B-Instruct-GGUF • Text Generation • 71B • Updated • 120k • 40
unsloth/Meta-Llama-3.1-70B-Instruct-bnb-4bit • Text Generation • 73B • Updated • 5.53k • 32
hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4 • Text Generation • 8B • Updated • 9.03k • 40
Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int4 • Image-Text-to-Text • 74B • Updated • 462 • 29
Qwen/Qwen2.5-32B-Instruct-AWQ • Text Generation • 33B • Updated • 1.1M • 90
Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ • Text Generation • 2B • Updated • 90.8k • 4
unsloth/Qwen2.5-Coder-7B-bnb-4bit • Text Generation • 8B • Updated • 14.7k • 13
unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit • Text Generation • 4B • Updated • 97.9k • 8
unsloth/llava-1.5-7b-hf-bnb-4bit • Image-Text-to-Text • 4B • Updated • 194k • 5
da-fr/Mistral-NeMo-Minitron-8B-ARChitects-Full-bnb-4bit • 8B • Updated • 17 • 7
MaziyarPanahi/Llama-3.3-70B-Instruct-GGUF • Text Generation • 71B • Updated • 121k • 20
shuyuej/Llama-3.3-70B-Instruct-GPTQ • 71B • Updated • 1.06k • 6
MaziyarPanahi/Mistral-Small-24B-Instruct-2501-GGUF • Text Generation • 24B • Updated • 121k • 7
Qwen/Qwen2.5-VL-72B-Instruct-AWQ • Image-Text-to-Text • 74B • Updated • 80.7k • 69
unsloth/Llama-3.1-8B-unsloth-bnb-4bit • Text Generation • 5B • Updated • 5.18k • 5
empirischtech/DeepSeek-R1-Distill-Qwen-32B-gptq-4bit • Text Generation • Updated • 1.12k • 5
RichardErkhov/MrezaPRZ_-_CodeLlama-7B-postgres-expert-4bits • 7B • Updated • 14 • 1
secemp9/TraceBack-12b • Text Generation • 7B • Updated • 59 • 32
MaziyarPanahi/gemma-3-1b-it-GGUF • Text Generation • 1.0B • Updated • 122k • 11
RichardErkhov/zjj815_-_Qwen1.5-4B-Chinese-toxic-content-detection-4bits • 4B • Updated • 28 • 1
unsloth/Qwen3-14B-unsloth-bnb-4bit • Text Generation • 15B • Updated • 47.3k • 10
unsloth/Qwen3-1.7B-unsloth-bnb-4bit • Text Generation • 2B • Updated • 47.2k • 10
mlx-community/Qwen3-0.6B-4bit • Text Generation • Updated • 6.13k • 7
mlx-community/Qwen3-4B-4bit • Text Generation • Updated • 8.01k • 10
wolfofbackstreet/Qwen2.5-Omni-3B-4Bit • 6B • Updated • 13 • 4
Qwen/Qwen3-8B-AWQ • Text Generation • 8B • Updated • 97k • 30
Qwen/Qwen3-4B-AWQ • Text Generation • 4B • Updated • 144k • 20
Qwen/Qwen2.5-Omni-7B-AWQ • Any-to-Any • 11B • Updated • 32k • 15
Qwen/Qwen3-0.6B-MLX-4bit • Text Generation • 83.9M • Updated • 1.29k • 15
Qwen/Qwen3-14B-MLX-4bit • Text Generation • 2B • Updated • 420 • 6