Commit History

llama : add gpt-oss (llama/15091)
bf225d6

ggerganov, ngxson, slaren committed on

vulkan: use uint array index to avoid glslang bug (llama/13193)
fd2d86d

jeffbolznv committed on

vulkan: In coopmat2 mmq, load q4_k/q5_k scales through shared memory (llama/12833)
4b7a407

jeffbolznv committed on

vulkan: optimize iq1 coopmat2 dequant functions (llama/12427)
53dd8ad

jeffbolznv committed on

vulkan: use fp32 in coopmat2 q4_k dequant function (llama/12309)
9ca84c6

jeffbolznv committed on

vulkan: matmul dequantization improvements (llama/12015)
ffdf466

Eve committed on

vulkan: initial support for IQ1_S and IQ1_M quantizations (llama/11528)
0d2e888

Rémy O committed on

vulkan: optimize coopmat2 iq2/iq3 callbacks (llama/11521)
3731f13

jeffbolznv committed on

vulkan: initial support for IQ4_XS quantization (llama/11501)
ed46ad5

Rémy O committed on

vulkan: implement initial support for IQ2 and IQ3 quantizations (llama/11360)
bd93c1b

Rémy Oudompheng, jeffbolznv committed on

vulkan: optimize coopmat2 q4_k/q5_k dequant functions. (llama/11206)
ee122d3

jeffbolznv committed on

vulkan: optimize coopmat2 q2_k dequant function (llama/11130)
d49a569

jeffbolznv committed on

vulkan: optimize coopmat2 dequant functions (llama/10855)
5e70c43

jeffbolznv committed on

vulkan: Add VK_NV_cooperative_matrix2 support for mul_mat and flash attention (llama/10206)
d10b47b

jeffbolznv committed on