Commit History

ggml : fix fallback to CPU for unsupported ops (llama/15118)
2b7ae5e

Diego Devesa committed on

CANN: add support for ACL Graph (llama/15065)
137a0dc

Chenguang Li committed on

llama : add gpt-oss (llama/15091)
bf225d6

ggerganov, ngxson, and slaren committed on

sycl: fix mul_mat selection (llama/15092)
344310a

Romain Biessy committed on

cmake: Add GGML_BACKEND_DIR option (llama/15074)
6e460b6

Christian Kastner committed on

vulkan: fix build when using glslang that does not support coopmat2 (llama/15062)
863e083

jeffbolznv committed on

vulkan: Use coopmat2 for conv2d (llama/14982)
6df82f4

jeffbolznv committed on

opencl: fix adreno compiler detection logic (llama/15029)
e6a209e

lhez committed on

CUDA: use mma FA kernel for gqa > 4 on RTX 4000 (llama/15035)
9e85264

JohannesGaessler committed on

cuda: make im2col a little faster (llama/15025)
9a85c65

leejet committed on

cuda, sycl : fix batched gemm when ne02 == 1 && ne03 > 1 (llama/15038)
cc3a2ed

ggerganov committed on

vulkan: coopmat2 mul_mat optimizations (llama/14934)
ca86566

jeffbolznv committed on

vulkan: Support ne[3]>1 in noncontig matrix-vector multiply (llama/15015)
d4c4115

jeffbolznv committed on

CUDA: fix MMQ nwarps for AMD with warp_size==32 (llama/15014)
fbc3cd1

JohannesGaessler committed on

opencl: add f16 for `add`, `sub`, `mul`, `div` (llama/14984)
4dc1834

lhez committed on

ggml : Q2k interleaving implementation - x86/x64 SIMD (llama/14373)
e2965b0

Srihari-mcw and Manognasree committed on

docker : add cann build pipeline (llama/14591)
2d993ad

diannao, ggerganov, and Xuan-Son Nguyen committed on

Vulkan: Fix minor debug mode issues (llama/14899)
a81bc86

OccamRazor committed on

CANN: Improve loading efficiency after converting weights to NZ format. (llama/14985)
7612978

hipudding committed on

opencl: add `mul_mat_f32_f32_l4_lm` and `mul_mat_f16_f32_l4_lm` (llama/14809)
05577c3

lhez committed on

HIP: enable mfma mmq on gfx908 and gfx90a for select datatypes and shapes (llama/14949)
149f5a5

uvos committed on

CUDA: skip masked KV slices for all FA kernels (llama/14924)
0c60f80

JohannesGaessler committed on

HIP: remove the use of __HIP_PLATFORM_AMD__, explicitly support only AMD targets (llama/14945)
e37eff3

uvos committed on

HIP: add GGML_HIP_MMQ_MFMA option to allow disabling the MFMA path. (llama/14930)
f9dbd96

uvos committed on

HIP: Ignore unsupported unroll transformation in fattn-vec (llama/14931)
8e133f7

uvos committed on

CANN: Add ggml_set_rows (llama/14943)
fa22f70

hipudding committed on

cuda : add softcap fusion (llama/14907)
2237878

Sigbjørn Skjæret committed on

CUDA: add roll (llama/14919)
d41a4ec

am17an committed on

ggml-cpu : deduplicate scalar implementations (llama/14897)
1d58d7c

xctan committed on

SYCL: Add set_rows support for quantized types (llama/14883)
c55b72b

Akarshan Biswas committed on

CUDA: fix pointer incrementation in FA (llama/14916)
eb84e7e

JohannesGaessler committed on

sycl: refactor quantization to q8_1 (llama/14815)
31edd77

Alberto Cabrera Pérez committed on

cmake : Fix BLAS link interface (ggml/1316)
3020711

Kai Pastor committed on

vulkan : fix 32-bit builds (ggml/1313)
96b66fd

Kai Pastor committed on

scripts : update sync scripts
311eccd

ggerganov committed on

node : add win platform check for require path (#3363)
29b8653

danbev committed on

ci : update main-cuda.Dockerfile (#3371)
e79709c

ustas committed on

whisper : fixed crash in GPU device selection on multi-GPU systems (#3372)
0869200

Dw9 committed on

wasm : change ggml model host to HF (#3369)
ac86ad0

ggerganov committed on

ruby : Add ruby binding for max_len (#3365)
b408b8e

Adam Debono committed on

stream.wasm : add language selection support (#3354)
e8933c1

danbev committed on

whisper : reset conv scheduler when CoreML is used (#3350)
f425556

ggerganov committed on

ggml : remove old kompute, cann (skip) (#3349)
d321914

ggerganov committed on

talk-llama : sync llama.cpp
844e617

ggerganov committed on

vulkan : add fp16 support for the conv_2d kernel (llama/14872)
48e92ad

Erik Scholz committed on

vulkan: skip empty set_rows to avoid invalid API usage (llama/14860)
22fb24a

jeffbolznv committed on

HIP: Enable Matrix cores for MMQ Kernels, Enable stream-K for CDNA 3 (llama/14624)
5422b31

deepsek committed on

CANN: Implement GLU ops (llama/14884)
851010b

hipudding committed on