Models: 424

Active filters: torchao

Cloudmaster/Llama-3.2-3B-torchao-W8A8

Text Generation • Updated May 29, 2025 • 2

Cloudmaster/Llama-3.2-3B-torchao-W8-g128

Text Generation • Updated May 29, 2025 • 2

Cloudmaster/Llama-3.2-3B-torchao-W8-g256

Text Generation • Updated May 29, 2025 • 2

Cloudmaster/Llama-3.2-3B-torchao-W8-g64

Text Generation • Updated May 29, 2025 • 2

Cloudmaster/Llama-3.2-3B-torchao-W8-g32

Text Generation • Updated May 29, 2025 • 1

Cloudmaster/Llama-3.2-3B-torchao-W8A4-g128

Text Generation • Updated May 30, 2025 • 2

Daiphuoc/test_gemma_torchao

Text Generation • Updated Jul 7, 2025 • 1

testdummyvt/Qwen2.5-VL-3B-Instruct-int8-weightonly-torchao

Image-Text-to-Text • Updated May 30, 2025 • 15

dropbox-dash/Llama-3.1-8B-Instruct_gemlite-ao_a16w4_gs_128_pack_32bit

Text Generation • Updated Jun 3, 2025 • 1 • 2

dropbox-dash/Llama-3.1-8B-Instruct_gemlite-ao_a16w4_gs_128_pack_16bit

Text Generation • Updated Jun 4, 2025 • 3 • 1

dropbox-dash/Phi-4-mini-instruct_gemlite-ao_a16w4_gs_128_pack_32bit

Text Generation • Updated Jun 4, 2025 • 3 • 1

dropbox-dash/Phi-4-mini-instruct_gemlite-ao_a16w4_gs_128_pack_16bit

Text Generation • Updated Jun 4, 2025 • 2 • 1

dropbox-dash/Qwen2.5-7B-Instruct_gemlite-ao_a16w4_gs_128_pack_32bit

Text Generation • Updated Jun 4, 2025 • 6 • 2

dropbox-dash/Qwen3-32B_gemlite-ao_a16w4_gs_128_pack_32bit

Text Generation • Updated Jun 4, 2025 • 1 • 1

dropbox-dash/Qwen2.5-VL-7B-Instruct_gemlite-ao_a8w8

Image-Text-to-Text • Updated Jun 6, 2025 • 4 • 3

appy1234/Llama3.1-8B-Int8DynamicActivationInt8WeightQuantized

Text Generation • Updated Jun 4, 2025 • 3

appy1234/Llama-3.2-3B-Instruct-Int8DynamicActivationInt8WeightQuantized

Text Generation • Updated Jun 4, 2025 • 3

dropbox-dash/Qwen2.5-VL-7B-Instruct_gemlite-ao_a16w4_gs_128_pack_32bit

Image-Text-to-Text • Updated Jun 4, 2025 • 4 • 2

mobicham/Qwen2.5-VL-3B-Instruct_int8wo_ao

Image-Text-to-Text • Updated Jun 5, 2025 • 22.6k

arthurrpp/qwen3-0.6B-sq-w8a8

Updated Jun 6, 2025 • 1

testdummyvt/Qwen2.5-VL-3B-Instruct-int8-dynamic-torchao

Image-Text-to-Text • Updated Jun 9, 2025 • 1

yutotom/Meta-Llama-3-8B-torchao-int8_weight_only

Updated Jun 9, 2025

mikaylagawarecki/foo_bar

Feature Extraction • Updated Jun 11, 2025 • 1

BillionForgeAi/Meta-Llama-3-8B-torchao-int8_dynamic_activation_int8_weight

Updated Jun 12, 2025 • 2

RoadToNowhere/Qwen3-0.6B-ao-float8wo

Text Generation • Updated Jun 13, 2025 • 2

MidnightPhreaker/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview-ao-int4wo-gs128

Updated Jun 20, 2025 • 1

jcaip/fp8fnuz-opt-125m

Text Generation • Updated Jun 20, 2025 • 1

appy1234/Phi-4-mini-instruct-float8dq

Text Generation • Updated Jun 23, 2025 • 2

poinka/gemma-3-4b-pt-q8bits

Updated Jun 27, 2025 • 1

kexve/DeepSeek-R1-Distill-Qwen-1.5B-torchao-int8_weight_only

Updated Jul 2, 2025