Models: 26,901

Active filters: 8-bit

openai/gpt-oss-120b

Text Generation • 120B • Updated Aug 26, 2025 • 2.92M • 4.44k

nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4

Text Generation • 18B • Updated 4 days ago • 44k • 57

openai/gpt-oss-20b

Text Generation • 22B • Updated Aug 26, 2025 • 6.06M • 4.29k

microsoft/bitnet-b1.58-2B-4T

Text Generation • 0.8B • Updated Dec 17, 2025 • 6.36k • 1.28k

GadflyII/GLM-4.7-Flash-NVFP4

Text Generation • 18B • Updated 14 days ago • 239k • 56

openai/gpt-oss-safeguard-20b

Text Generation • 22B • Updated 20 days ago • 26.1k • 188

unsloth/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4

Text Generation • 18B • Updated 6 days ago • 76 • 5

mlx-community/Qwen3-ASR-1.7B-8bit

0.8B • Updated 5 days ago • 314 • 5

FabioSarracino/VibeVoice-Large-Q8

Text-to-Audio • 9B • Updated Oct 1, 2025 • 2.62k • 82

lukealonso/MiniMax-M2.1-NVFP4

115B • Updated 28 days ago • 27.1k • 21

lmstudio-community/GLM-4.7-Flash-MLX-8bit

Text Generation • 30B • Updated 12 days ago • 644k • 7

mlx-community/Qwen3-TTS-12Hz-1.7B-CustomVoice-8bit

Text-to-Speech • 0.8B • Updated 8 days ago • 741 • 4

RedHatAI/Qwen3-VL-235B-A22B-Instruct-NVFP4

Text Generation • 133B • Updated Dec 4, 2025 • 13k • 10

nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-NVFP4-QAD

Image-Text-to-Text • 8B • Updated Nov 13, 2025 • 36.3k • 17

Salyut1/GLM-4.7-NVFP4

Text Generation • 177B • Updated Dec 23, 2025 • 5.12k • 13

MultiverseComputingCAI/HyperNova-60B

Text Generation • 60B • Updated 26 days ago • 1.65k • 51

GadflyII/GLM-4.7-Flash-MXFP4

Text Generation • 18B • Updated 8 days ago • 1.9k • 6

mlx-community/Jan-v3-4B-base-instruct-8bit

Text Generation • 1B • Updated 7 days ago • 83 • 3

StefanKrsteski/Phi-3-mini-4k-instruct-GPTQ-8bit

Text Generation • 4B • Updated Jun 8, 2024 • 7 • 2

nvidia/DeepSeek-R1-NVFP4

Text Generation • 397B • Updated Jun 6, 2025 • 14k • 270

nvidia/Qwen3-30B-A3B-NVFP4

Text Generation • 16B • Updated Sep 10, 2025 • 26.1k • 23

nvidia/Qwen2.5-VL-7B-Instruct-NVFP4

Text Generation • 5B • Updated Dec 6, 2025 • 4.41k • 13

openai/gpt-oss-safeguard-120b

Text Generation • 120B • Updated Oct 29, 2025 • 20.8k • 84

nvidia/NVIDIA-Nemotron-Nano-9B-v2-NVFP4

Text Generation • 6B • Updated 26 days ago • 11.5k • 17

kldzj/gpt-oss-120b-heretic-v2

Text Generation • 117B • Updated Nov 18, 2025 • 370 • 18

numind/NuMarkdown-8B-Thinking-mlx-8bits

Image-to-Text • Updated Nov 24, 2025 • 53 • 4

RedHatAI/Qwen3-VL-32B-Instruct-NVFP4

Text Generation • 20B • Updated Dec 10, 2025 • 4.48k • 2

Tengyunw/GLM-4.7-NVFP4

Text Generation • 177B • Updated Dec 26, 2025 • 2.01k • 6

mlx-community/GLM-4.7-Flash-8bit

Text Generation • 30B • Updated 9 days ago • 14.4k • 17

MaziyarPanahi/rank_zephyr_7b_v1_full-GGUF

Text Ranking • 7B • Updated Apr 2, 2025 • 124 • 6