Commits · Xenobd/whisper.cpp

CUDA: add conv_2d_transpose (llama/14287)

a728b83

am17an committed on Jun 20

sycl: add usage of enqueue_functions extension (llama/14244)

2e59a96

Nicolò Scipione committed on Jun 20

Implement GGML_CPU_ALL_VARIANTS for PowerPC (llama/14286)

0bcd751

Christian Kastner, Diego Devesa committed on Jun 20

cuda : synchronize graph capture and cublas handle destruction (llama/14288)

39c4fa5

Diego Devesa committed on Jun 20

ggml : fix repack work size for mul_mat_id (llama/14292)

4b0d2de

ggerganov committed on Jun 20

ggml: Update KleidiAI to v1.9.0 (llama/14277)

90ccf35

Charles Xu committed on Jun 20

CUDA: add conv_2d_dw (llama/14265)

5cca3ec

am17an committed on Jun 20

ggml-cpu : remove unnecesary arm feature detection (llama/14281)

62cf694

Diego Devesa committed on Jun 19

build : suppress gcc15 compile warnings (llama/14261)

0454008

fanyang committed on Jun 19

sycl: Cleanup codepaths in Get Rows in sycl backend (llama/14215)

feee739

Anton Mitkov committed on Jun 19

llamafile : support s390x SIMD instruction set (llama/14273)

26bafb6

taronaeo committed on Jun 19

Vulkan: Set device max size for host memory to avoid OOM warning and fallback to CPU buffer (llama/14249)

08debcd

OccamRazor committed on Jun 19

metal : add mean kernel (llama/14267)

a726ecc

ggerganov committed on Jun 19

ggml-cpu: reduce asm calls for hsum (llama/14037)

17c0dfa

taronaeo committed on Jun 18

ggml-cpu: fix uncaught underscore terminators (llama/14023)

c005248

taronaeo committed on Jun 18

ggml: Add Apple support for GGML_CPU_ALL_VARIANTS (llama/14258)

9d1d21b

Charles Xu committed on Jun 18

Add `ggml_roll` (ggml/1274)

71923e5

Acly committed on Jun 18

cmake: remove shader-gen step-targets from ggml-vulkan (llama/14226)

b7a7257

bandoti committed on Jun 17

ggml-cpu : remove the weak alias trick (llama/14221)

a1bcb29

xctan committed on Jun 17

musa: fix build warning (unused variable) (llama/14231)

165c242

yeahdongcn committed on Jun 17

llama : add thread safety test (llama/14035)

acc9311

Diego Devesa, ggerganov committed on Jun 16

cmake: clean up external project logic for vulkan-shaders-gen (llama/14179)

bc8b1f7

bandoti committed on Jun 16

HIP: disable rocwmma on gfx12 by default until rocm 7.0 (llama/14202)

f95736f

uvos committed on Jun 16

ggml: Add Android support for GGML_CPU_ALL_VARIANTS (llama/14206)

7ddd89c

Charles Xu committed on Jun 16

vulkan: mutex around vkQueueSubmit (llama/14127)

ef3a7d0

jeffbolznv committed on Jun 16

ggml-cpu : rework weak alias on apple targets (llama/14146)

de5e986

xctan committed on Jun 16

CUDA/HIP: fix ssm_scan on devices where warp size is not 32 (llama/14196)

adf6b4b

uvos committed on Jun 15

HIP: Replace usage of depricated preprocessor macro __AMDGCN_WAVEFRONT_SIZE__ (llama/14183)

c3467c7

uvos committed on Jun 15

sycl: Adding additional cpy dbg print output (llama/14034)

6799437

Anton Mitkov committed on Jun 13

SYCL: Bump oneMath commit (llama/14152)

4d12916

Ewan Crawford committed on Jun 13

sycl: Remove not needed copy f16->f32 for dnnl mul mat (llama/14125)

eed049f

Anton Mitkov committed on Jun 12

cmake : handle whitepsaces in path during metal build (llama/14126)

8076017

ggerganov, danbev committed on Jun 12

Implement GGML_CPU_ALL_VARIANTS for ARM (llama/14080)

c9cec9d

Christian Kastner committed on Jun 11

vulkan: Better thread-safety for command pools/buffers (llama/14116)

fdc26e7

jeffbolznv committed on Jun 11

vulkan: Track descriptor pools/sets per-context (llama/14109)

855a3bf

jeffbolznv committed on Jun 11

opencl: add `mul_mv_id_q4_0_f32_8x_flat` (llama/14003)

d0a458b

lhez committed on Jun 10

Vulkan: Don't default to CPU device (like llvmpipe), even if no other device is available, to allow fallback to CPU backend (llama/14099)

dcb106f

OccamRazor committed on Jun 10

rpc : nicer error messages for RPC server crash (llama/14076)

5d5056e

mcfadyeni committed on Jun 10

ggml : disable warnings for tests when using MSVC (ggml/1273)

1669c07

danbev committed on Jun 13

ggml : remove unused ggml_context_container (ggml/1272)

e6d6988

danbev committed on Jun 13

examples : include examples in msvc disable warn (ggml/1270)

0c191be

danbev committed on Jun 12

ggml : fix weak alias win32 (#0)

d47070d

ggerganov committed on Jun 10

files : remove old sources (part 2)

c1c9908

ggerganov committed on Jun 10

files : remove old sources

e4ae8c6

ggerganov committed on Jun 10

metal : use less stack memory in FA kernel (llama/14088)

014afb6

ggerganov committed on Jun 9

ggml-cpu : split arch-specific implementations (llama/13892)

8c833e9

xctan, ggerganov committed on Jun 9

cuda : fix device sync on buffer clear (llama/14033)

8f2e8d6

Diego Devesa committed on Jun 9

CANN: Simplify the environment variable setting(#13104)

f1535d7

dou112 committed on Jun 9

sycl: Add reorder to Q6_K mmvq implementation (llama/13885)

56f0e48

Nicolò Scipione committed on Jun 9

cuda : fix buffer type check with integrated GPUs (llama/14069)

747ad97

Diego Devesa committed on Jun 8
