GGUF quants and llama.cpp inference?
#7 opened 5 days ago
by
ljupco
Add MMLU-Pro evaluation result
#6 opened 29 days ago
by
burtenshaw
REAP-ing/REAM-ing LongCat-Flash-Lite
#5 opened 30 days ago
by
TomLucidor
Inference is very slow when using sglang on two H200s
2
#4 opened about 1 month ago
by
taozi555
Update model files by removing redundant n-gram embedding weight duplication
3
#3 opened about 1 month ago
by
LongCat0830
sampler settings?
1
#2 opened about 1 month ago
by
doc-acula
llama.cpp support please.
❤️ 17
3
#1 opened about 1 month ago
by
rosspanda0