Commits · Xenobd/whisper.cpp

ggml : add logging for native build options/vars (#2935)

aaf8a91
unverified

danbev committed on Mar 24

examples : command.wasm updates (#2904)

0db3249
unverified

danbev committed on Mar 20

cmake : fix ggml-config (ggml/0)

40f0325

ggerganov committed on Mar 8

ggml-cpu: faster AVX2 variant for IQ1_M (llama/12216)

591cbfb

Rémy O committed on Mar 7

metal : simplify kernel arguments using a struct (ggml/3229) (llama/12194)

092277a

BB-fat alexju committed on Mar 7

metal : fix default.metallib build (llama/12224)

838efb6

danbev committed on Mar 7

opencl: Noncontiguous `norm`, `rms_norm`, disable `fp16` for some ops (llama/12217)

94449e3

lhez committed on Mar 7

cmake : fix undefined reference errors for std::filesystem in ggml (#12092) (llama/12094)

dc68418

xiaofei Ray Lee committed on Mar 6

CUDA: fix FA logic for PTX 7.0 and CC >= 7.5 (llama/12222)

4dc8a81

JohannesGaessler committed on Mar 6

HIP/CUDA: set the paramerter value in maintain_cuda_graph instead of replaceing it. (llama/12209)

18afa4b

uvos committed on Mar 6

opencl : fix buffer alignment (llama/12197)

7d25156

linehill committed on Mar 6

opencl : fix `ulong` kernel args were set from `int` variables (llama/12174)

67ffff0

linehill committed on Mar 6

opencl : fix profile-related errors (llama/12095)

e11a847

simon886212 ubuntu committed on Mar 6

ggml-cpu: Faster IQ1 mul_mat_vec on AVX2 using BMI2 instructions (llama/12154)

05466a9

Rémy O committed on Mar 6

SYCL: Disable f16 Unary OPs as not supported by the kernels (llama/12201)

723b8b4

Akarshan Biswas committed on Mar 5

ggml : fix GGMLMetalClass ODR (llama/12200)

2094cb7

pacominev committed on Mar 5

ggml : ggml_compute_forward_concat() for arbitrary tensor type (ggml/1118)

c9a49f9

vmobilis committed on Mar 7

vulkan : sync (llama/0)

4c17fa1

ggerganov committed on Mar 4

ggml : portability fixes for VS 2017 (llama/12150)

49e3343

mgroeber9110 Marcus Groeber committed on Mar 4

HIP: implement FlashAttention via rocWMMA for CDNA and RDNA3+ (llama/12032)

a027c1d

David Huang committed on Mar 3

ggml : fix kleidiai build (llama/12159)

dbc0180

ag2s20150909 committed on Mar 3

SYCL: Move CPY kernels to a separate file and add few missing kernels (llama/12133)

1d6d451

Akarshan Biswas committed on Mar 3

ggml-backend : keep paths in native string type when possible (llama/12144)

6e89d8c

Diego Devesa committed on Mar 2

CUDA: compress mode option and default to size (llama/12029)

4ec988a

Erik Scholz committed on Mar 1

ggml : upgrade init_tensor API to return a ggml_status (llama/11854)

d6b6852

William Tambellini slaren committed on Feb 28

vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (llama/11595)

d7d82b9

Rémy O committed on Feb 28

CUDA: fix logic for V100 + GGML_CUDA_FORCE_MMQ (llama/12098)

0b52fcc

JohannesGaessler committed on Feb 28

ggml: aarch64: implement SVE kernels for q2_k_q8_k vector dot (llama/12064)

459beb1

Prashant Vithule vithulep committed on Feb 28

CANN: Fix build error with GCC 13 (llama/11990)

dcf68db

hipudding committed on Feb 28

vulkan: matmul dequantization improvements (llama/12015)

ffdf466

Eve committed on Feb 28

vulkan: improve im2col (llama/11826)

f6cff0a

Daniele committed on Feb 28

cmake: Fix ggml backend dependencies and installation (llama/11818)

c6c2a2c

Vladimir Vuksanovic committed on Feb 27

vulkan: fix assertion when qy_needs_dequant (llama/12068)

271c7e4

jeffbolznv committed on Feb 25

ggml-cpu: Fix build with sve (llama/12059)

4be146e

mollysama committed on Feb 25

cuda: unary ops as float + de-duplicate (ggml/1130)

4bec2e4

cmdr2 committed on Mar 3

cuda/vulkan: specify fp32-only support for some operations in supports_op (ggml/1129)

f959b90

cmdr2 committed on Feb 28

cuda/cpu: Increase support for fp16 unary operations (ggml/1125)

67e8c32

cmdr2 committed on Feb 28

Told cmake to install ggml-cpp.h as a public header file. (ggml/1126)

3d4f29c

petterreinholdtsen Petter Reinholdtsen committed on Feb 26

whisper : support GGML_BACKEND_DL (#2843)

2e6437e
unverified

Diego Devesa

ggerganov committed on Feb 27

Support pure float16 add/sub/mul/div operations in the CUDA (and CPU) backend (ggml/1121)

2b94a24

cmdr2 committed on Feb 25

metal : copy kernels for quant to F32/F16 conversions (llama/12017)

6c8e7ec

Garf

ggerganov committed on Feb 25

opencl: fix for small models (llama/11950)

4532dc6

lhez Shawn Gu Skyler Szot committed on Feb 24

Optimize mul_mat for Q4_0 on Intel GPU (llama/12035)

14fd317

Neo Zhang Jianyu arthw committed on Feb 24

SYCL: Fix GGML_SYCL_DEBUG macro (llama/11995)

310a36c

qnixsynapse committed on Feb 24

ggml-cpu: Support s390x SIMD Instruction Set (llama/12019)

4aa54ec

Aaron Teo Jinyang He junchao-zhao committed on Feb 22

CUDA: app option to compile without FlashAttention (llama/12025)

fbc5f16

JohannesGaessler committed on Feb 22

CUDA: optimize FA for GQA + large batches (llama/12014)

6662d54

JohannesGaessler committed on Feb 22

cuda: Add Q5_1, Q5_0, Q4_1 and Q4_0 to F32 conversion support. (llama/12000)

6cb8158

Garf committed on Feb 22

CUDA: correct the lowest Maxwell supported by CUDA 12 (llama/11984)

6641178

PureJourney

JohannesGaessler committed on Feb 21

MUSA: support ARM64 and enable dp4a .etc (llama/11843)

ab96dac

Bodhi Bodhi Hu committed on Feb 21
