Active filters: torchao
Cloudmaster/Llama-3.2-3B-torchao-W8A8
Text Generation
• Updated • 2
Cloudmaster/Llama-3.2-3B-torchao-W8-g128
Text Generation
• Updated • 2
Cloudmaster/Llama-3.2-3B-torchao-W8-g256
Text Generation
• Updated • 2
Cloudmaster/Llama-3.2-3B-torchao-W8-g64
Text Generation
• Updated • 2
Cloudmaster/Llama-3.2-3B-torchao-W8-g32
Text Generation
• Updated • 1
Cloudmaster/Llama-3.2-3B-torchao-W8A4-g128
Text Generation
• Updated • 2
Daiphuoc/test_gemma_torchao
Text Generation
• Updated • 1
testdummyvt/Qwen2.5-VL-3B-Instruct-int8-weightonly-torchao
Image-Text-to-Text
• Updated • 15
dropbox-dash/Llama-3.1-8B-Instruct_gemlite-ao_a16w4_gs_128_pack_32bit
Text Generation
• Updated • 1
• 2
dropbox-dash/Llama-3.1-8B-Instruct_gemlite-ao_a16w4_gs_128_pack_16bit
Text Generation
• Updated • 3
• 1
dropbox-dash/Phi-4-mini-instruct_gemlite-ao_a16w4_gs_128_pack_32bit
Text Generation
• Updated • 3
• 1
dropbox-dash/Phi-4-mini-instruct_gemlite-ao_a16w4_gs_128_pack_16bit
Text Generation
• Updated • 2
• 1
dropbox-dash/Qwen2.5-7B-Instruct_gemlite-ao_a16w4_gs_128_pack_32bit
Text Generation
• Updated • 6
• 2
dropbox-dash/Qwen3-32B_gemlite-ao_a16w4_gs_128_pack_32bit
Text Generation
• Updated • 1
• 1
dropbox-dash/Qwen2.5-VL-7B-Instruct_gemlite-ao_a8w8
Image-Text-to-Text
• Updated • 4
• 3
appy1234/Llama3.1-8B-Int8DynamicActivationInt8WeightQuantized
Text Generation
• Updated • 3
appy1234/Llama-3.2-3B-Instruct-Int8DynamicActivationInt8WeightQuantized
Text Generation
• Updated • 3
dropbox-dash/Qwen2.5-VL-7B-Instruct_gemlite-ao_a16w4_gs_128_pack_32bit
Image-Text-to-Text
• Updated • 4
• 2
mobicham/Qwen2.5-VL-3B-Instruct_int8wo_ao
Image-Text-to-Text
• Updated • 22.6k
arthurrpp/qwen3-0.6B-sq-w8a8
testdummyvt/Qwen2.5-VL-3B-Instruct-int8-dynamic-torchao
Image-Text-to-Text
• Updated • 1
yutotom/Meta-Llama-3-8B-torchao-int8_weight_only
Updated
Feature Extraction
• Updated • 1
BillionForgeAi/Meta-Llama-3-8B-torchao-int8_dynamic_activation_int8_weight
RoadToNowhere/Qwen3-0.6B-ao-float8wo
Text Generation
• Updated • 2
MidnightPhreaker/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview-ao-int4wo-gs128
Text Generation
• Updated • 1
appy1234/Phi-4-mini-instruct-float8dq
Text Generation
• Updated • 2
poinka/gemma-3-4b-pt-q8bits
kexve/DeepSeek-R1-Distill-Qwen-1.5B-torchao-int8_weight_only
Updated