Update README.md

README.md (CHANGED)
@@ -9,8 +9,7 @@ EXL3 quants of [Qwen3-30B-A3B](https://huggingface.co/Qwen/Qwen3-30B-A3B)
 [4.00 bits per weight](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/4.0bpw)
 [5.00 bits per weight](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/5.0bpw)
 [6.00 bits per weight](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/6.0bpw)
-
-While I work out a way to meaningfully measure perplexity for such a sparse model, here are some other tests:
+[8.00 bits per weight / H8](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/8.0bpw_H8)
 
 | Model | HumanEval pass@1 | KL-div vs FP16 (wiki2 20k tokens) | Top-1 agreement vs FP16 |
 |----------|------------------|-----------------------------------|-------------------------|
@@ -19,4 +18,7 @@ While I work out a way to meaningfully measure perplexity for such a sparse model
 | 4.00 bpw | 92.07% | 0.0215 | 94.33% |
 | 5.00 bpw | 93.29% | 0.0094 | 96.24% |
 | 6.00 bpw | 92.68% | 0.0054 | 97.45% |
-
+| 8.00 bpw | 91.46% | 0.0020 | 98.36% |
+| FP16 | 91.46% | - | - |
+
+
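Each bitrate lives on its own branch of the repo (the `tree/...` part of the links above), so fetching a single quant means pinning that branch as the download revision. Below is a minimal sketch, not part of the model card itself, using `huggingface_hub`; the branch name and target directory are just example choices:

```python
# Sketch: download one quantization branch of this repo with huggingface_hub.
# The branch name ("4.0bpw", "5.0bpw", "6.0bpw", "8.0bpw_H8") is passed as the
# `revision`; the local directory name is an arbitrary choice for this example.
from huggingface_hub import snapshot_download

path = snapshot_download(
    repo_id="turboderp/Qwen3-30B-A3B-exl3",
    revision="6.0bpw",
    local_dir="Qwen3-30B-A3B-exl3-6.0bpw",
)
print(f"Model files downloaded to {path}")
```

The same branch name should also work on the command line, e.g. `huggingface-cli download turboderp/Qwen3-30B-A3B-exl3 --revision 6.0bpw`.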
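For context on the two right-hand columns of the table: the KL divergence and top-1 agreement numbers compare the quantized model's next-token distribution against the FP16 model's over the same token stream (the table uses 20k tokens of wiki2), while HumanEval pass@1 is independent of FP16 and is simply the fraction of the 164 HumanEval problems solved with a single generated sample. The sketch below is one generic way to compute the two FP16-relative metrics once both sets of logits have been collected; it is not the exact evaluation script behind the table, and the tensor names are placeholders:

```python
# Generic sketch of the FP16-comparison metrics, assuming `logits_fp16` and
# `logits_quant` are [num_tokens, vocab_size] tensors produced by the FP16 and
# quantized models on the same token stream.
import torch
import torch.nn.functional as F

def compare_to_fp16(logits_fp16: torch.Tensor, logits_quant: torch.Tensor):
    log_p = F.log_softmax(logits_fp16.float(), dim=-1)   # reference distribution P (FP16)
    log_q = F.log_softmax(logits_quant.float(), dim=-1)  # quantized distribution Q
    # Mean per-token KL(P || Q): how far the quantized output drifts from FP16.
    kl = F.kl_div(log_q, log_p, log_target=True, reduction="batchmean").item()
    # Fraction of positions where both models pick the same top-1 token.
    top1 = (log_q.argmax(dim=-1) == log_p.argmax(dim=-1)).float().mean().item()
    return kl, top1
```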