Commits · Xenobd/whisper.cpp

Diego Devesa committed on Aug 6

Chenguang Li committed on Aug 6

ngxson, slaren committed on Aug 5

Romain Biessy committed on Aug 5

Christian Kastner committed on Aug 4

jeffbolznv committed on Aug 4

jeffbolznv committed on Aug 3

lhez committed on Aug 2

leejet committed on Aug 2

ggerganov committed on Aug 2

jeffbolznv committed on Aug 2

jeffbolznv committed on Aug 2

OccamRazor committed on Aug 2

lhez committed on Aug 1

Manognasree committed on Aug 1

ggerganov, Xuan-Son Nguyen committed on Aug 1

OccamRazor committed on Jul 31

hipudding committed on Jul 31

lhez committed on Jul 30

uvos committed on Jul 30

JohannesGaessler committed on Jul 30

uvos committed on Jul 29

uvos committed on Jul 29

uvos committed on Jul 29

hipudding committed on Jul 29

Sigbjørn Skjæret committed on Jul 29

am17an committed on Jul 29

xctan committed on Jul 28

Akarshan Biswas committed on Jul 28

JohannesGaessler committed on Jul 28

Alberto Cabrera Pérez committed on Jul 28

Kai Pastor committed on Jul 30

Kai Pastor committed on Jul 30

ggerganov committed on Aug 18

danbev committed on Aug 15

ustas committed on Aug 13

Dw9 committed on Aug 12

ggerganov committed on Aug 10

Adam Debono committed on Aug 7

danbev committed on Aug 2

ggerganov committed on Jul 30

ggerganov committed on Jul 30

ggerganov committed on Jul 28

ggerganov committed on Jul 28

Erik Scholz committed on Jul 27

jeffbolznv committed on Jul 27

deepsek committed on Jul 26

hipudding committed on Jul 26

Commit History

ggml : fix fallback to CPU for ununsupported ops (llama/15118) 2b7ae5e

CANN: add support for ACL Graph (llama/15065) 137a0dc

llama : add gpt-oss (llama/15091) bf225d6

sycl: fix mul_mat selection (llama/15092) 344310a

cmake: Add GGML_BACKEND_DIR option (llama/15074) 6e460b6

vulkan: fix build when using glslang that does not support coopmat2 (llama/15062) 863e083

vulkan: Use coopmat2 for conv2d (llama/14982) 6df82f4

opencl: fix adreno compiler detection logic (llama/15029) e6a209e

CUDA: use mma FA kernel for gqa > 4 on RTX 4000 (llama/15035) 9e85264

cuda: make im2col a little faster (llama/15025) 9a85c65

cuda, sycl : fix batched gemm when ne02 == 1 && ne03 > 1 (llama/15038) cc3a2ed

vulkan: coopmat2 mul_mat optimizations (llama/14934) ca86566

vulkan: Support ne[3]>1 in noncontig matrix-vector multiply (llama/15015) d4c4115

vulkan: optimizations for direct convolution (llama/14933) 215f463

CUDA: fix MMQ nwarps for AMD with warp_size==32 (llama/15014) fbc3cd1

opencl: add f16 for `add`, `sub`, `mul`, `div` (llama/14984) 4dc1834

ggml : Q2k interleaving implementation - x86/x64 SIMD (llama/14373) e2965b0

docker : add cann build pipline (llama/14591) 2d993ad

Vulkan: Fix minor debug mode issues (llama/14899) a81bc86

CANN: Improve loading efficiency after converting weights to NZ format. (llama/14985) 7612978

opencl: add `mul_mat_f32_f32_l4_lm` and `mul_mat_f16_f32_l4_lm` (llama/14809) 05577c3

HIP: enable mfma mmq on gfx908 and gfx90a for select datatypes and shapes (llama/14949) 149f5a5

CUDA: skip masked KV slices for all FA kernels (llama/14924) 0c60f80

HIP: remove the use of __HIP_PLATFORM_AMD__, explicitly support only AMD targets (llama/14945) e37eff3

HIP: add GGML_HIP_MMQ_MFMA option to allow disableing the MFMA path. (llama/14930) f9dbd96

HIP: Ignore unsupported unroll transformation in fattn-vec (llama/14931) 8e133f7

CANN: Add ggml_set_rows (llama/14943) fa22f70

cuda : add softcap fusion (llama/14907) 2237878

CUDA: add roll (llama/14919) d41a4ec

ggml-cpu : deduplicate scalar implementations (llama/14897) 1d58d7c

SYCL: Add set_rows support for quantized types (llama/14883) c55b72b

CUDA: fix pointer incrementation in FA (llama/14916) eb84e7e

sycl: refactor quantization to q8_1 (llama/14815) 31edd77

cmake : Fix BLAS link interface (ggml/1316) 3020711

vulkan : fix 32-bit builds (ggml/1313) 96b66fd

scripts : update sync scripts 311eccd

node : add win platform check for require path (#3363) 29b8653 unverified

ci : update main-cuda.Dockerfile (#3371) e79709c unverified

whisper : fixed crash in GPU device selection on multi-GPU systems (#3372) 0869200 unverified

wasm : change ggml model host to HF (#3369) ac86ad0 unverified

ruby : Add ruby binding for max_len (#3365) b408b8e unverified

stream.wasm : add language selection support (#3354) e8933c1 unverified

whisper : reset conv scheduler when CoreML is used (#3350) f425556 unverified

ggml : remove old kompute, cann (skip) (#3349) d321914 unverified

talk-llama : sync llama.cpp 844e617

sync : ggml 7d38d31

vulkan : add fp16 support for the conv_2d kernel (llama/14872) 48e92ad

vulkan: skip empty set_rows to avoid invalid API usage (llama/14860) 22fb24a

HIP: Enable Matrix cores for MMQ Kernels, Enable stream-K for CDNA 3 (llama/14624) 5422b31

CANN: Implement GLU ops (llama/14884) 851010b
