Active filter: W4A16
| Model | Task | Params | Downloads | Likes |
|---|---|---|---|---|
| ModelCloud/DeepSeek-R1-Distill-Qwen-7B-gptqmodel-4bit-vortex-v1 | Text Generation | 8B | 8 | 6 |
| ModelCloud/DeepSeek-R1-Distill-Qwen-7B-gptqmodel-4bit-vortex-v2 | Text Generation | 8B | 640 | 8 |
| ModelCloud/QwQ-32B-Preview-gptqmodel-4bit-vortex-v2 | Text Generation | 33B | 2 | 16 |
| ModelCloud/QwQ-32B-Preview-gptqmodel-4bit-vortex-v3 | Text Generation | 33B | 4 | 14 |
| ModelCloud/Falcon3-10B-Instruct-gptqmodel-4bit-vortex-v1 | Text Generation | 10B | 144 | 3 |
| ModelCloud/Qwen2.5-0.5B-Instruct-gptqmodel-w4a16 | Text Generation | 0.5B | 76 | 1 |
| RedHatAI/phi-4-quantized.w4a16 | Text Generation | 3B | 3.45k | 4 |
| RedHatAI/Mistral-Small-3.1-24B-Instruct-2503-quantized.w4a16 | Image-Text-to-Text | 5B | 21.8k | 10 |
| RedHatAI/Llama-4-Scout-17B-16E-Instruct-quantized.w4a16 | Image-Text-to-Text | 20B | 209k | 12 |
| pyrymikko/nomic-embed-code-W4A16-AWQ | — | 1B | 22.9k | — |
| tcclaviger/Minimax-M2-Thrift-GPTQ-W4A16-AMD | Text Generation | 24B | 3 | 1 |
| TevunahAi/granite-34b-code-instruct-8k-Ultra-Hybrid | Text Generation | 11B | 3 | — |
| TevunahAi/Llama-3.1-70B-Instruct-Ultra-Hybrid | Text Generation | 22B | 2 | — |
| Vishva007/Qwen3-4B-Instruct-2507-W4A16-AutoRound | Text Generation | 0.9B | 9 | — |
| Vishva007/Qwen3-VL-8B-Instruct-W4A16-AutoRound | Image-Text-to-Text | 2B | 253 | — |
| Vishva007/Qwen3-VL-2B-Instruct-W4A16-AutoRound | Image-Text-to-Text | 0.9B | 28 | — |
| Vishva007/Qwen3-VL-2B-Instruct-W4A16-AutoRound-GPTQ | Image-Text-to-Text | 2B | 23 | — |
| Vishva007/Qwen3-VL-2B-Instruct-W4A16-AutoRound-AWQ | Image-Text-to-Text | 2B | 113 | — |
| Vishva007/Qwen3-VL-4B-Instruct-W4A16-AutoRound | Image-Text-to-Text | 1B | 22 | — |
| Vishva007/Qwen3-VL-4B-Instruct-W4A16-AutoRound-GPTQ | Image-Text-to-Text | 4B | 16 | — |
| Vishva007/Qwen3-VL-4B-Instruct-W4A16-AutoRound-AWQ | Image-Text-to-Text | 4B | 44 | 1 |

Parameter counts and update status are as reported on the listing; entries showing a single count had no second metric displayed (marked —).