inference-optimization/test_tencentbac_fastmtp
Updated • 38
inference-optimization/test_qwen3_next_mtp
Updated • 41
inference-optimization/Qwen3-Next-80B-A3B-Instruct_mtp_speculator
Text Generation • 2B • Updated • 58
inference-optimization/Qwen3-Next-80B-A3B-Instruct-MTP-ultrachat-epoch3
2B • Updated • 18
Inference Optimization
community
AI & ML interests
None defined yet.
FP8-block, FP8-dynamic, NVFP4, w4a16, and w8a8 quantized versions of ibm-granite/granite-4.0-h-small and ibm-granite/granite-4.0-h-tiny
models 206
inference-optimization/Qwen3-30B-A3B-Instruct-2507_5.5_bits_mode_heuristic
22B • Updated
inference-optimization/Qwen3-30B-A3B-Instruct-2507_5.5_bits_mode_noise
22B • Updated
inference-optimization/Qwen3-30B-A3B-Instruct-2507_5.5_bits_mode_hybrid
22B • Updated
inference-optimization/Qwen3-30B-A3B-Instruct-2507_5.0_bits_mode_heuristic
20B • Updated
inference-optimization/Qwen3-30B-A3B-Instruct-2507_5.0_bits_mode_noise
20B • Updated
inference-optimization/MiniMax-M2.5-NVFP4
130B • Updated • 4
inference-optimization/Qwen3-30B-A3B-Instruct-2507_5.0_bits_mode_hybrid
20B • Updated • 10
inference-optimization/Qwen3-30B-A3B-Instruct-2507-quant-test-1
25B • Updated • 38
inference-optimization/gpt-oss-120b-from-self-ckpt5-speculator.eagle3
0.9B • Updated • 69
inference-optimization/gpt-oss-120b-from-self-ckpt3-speculator.eagle3
0.9B • Updated • 58
datasets 6
inference-optimization/speculators-qwen3-30b-a3b-instruct
Preview • Updated • 14
inference-optimization/speculators-qwen3-32b-instruct
Preview • Updated • 29
inference-optimization/gpt-oss-20b-nan-hidden-states-repro
Updated • 28
inference-optimization/SWE-bench_Multilingual
Viewer • Updated • 300 • 13
inference-optimization/SWE-bench_Verified
Viewer • Updated • 500 • 82
inference-optimization/SWE-bench_Lite
Viewer • Updated • 323 • 57