- #9: Redacted (opened 4 days ago by pathosethoslogos)
- #8: Question regarding quantization hardware and modelopt sharding (opened 19 days ago by Mario12355)
- #7: Request: NVFP4 version of MiniMax-M2.5-REAP-139B (to fit on a single RTX 6000 Pro) (14 comments; opened 26 days ago by mondovero)
- #6: VLLM error for kv weight scaling - workaround (7 comments; opened 29 days ago by ShaunEvansMD)
- #5: Thanks for your effort (5 comments; opened 29 days ago by darkstar3537)
- #4: fp8 kv cache (15 comments; opened 30 days ago by festr2)
- #3: KeyError: '110.w1.input_scale' with TRT (2 comments; opened about 1 month ago by guanwenyu1995)
- #2: "w1_weight_scale_2 must match w3_weight_scale_2. Accuracy may be affected." (20 comments; 👍 1; opened about 1 month ago by zenmagnets)
- #1: Here's the vLLM recipe I'm using with 2x RTX Pro 6000 (17 comments; 👍 3; opened about 1 month ago by zenmagnets)