whisper.cpp / ggml / src / ggml-vulkan / ggml-vulkan.cpp

Commit History

Vulkan: Set device max size for host memory to avoid OOM warning and fallback to CPU buffer (llama/14249)
08debcd

OccamRazor committed

vulkan: mutex around vkQueueSubmit (llama/14127)
ef3a7d0

jeffbolznv committed
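
The Vulkan spec requires external synchronization for vkQueueSubmit when several threads share one VkQueue. A minimal sketch of the idea named in the title, assuming the Vulkan-Hpp bindings; the struct and function names are illustrative, not the patch's actual code:

```cpp
#include <mutex>
#include <vulkan/vulkan.hpp>

// vkQueueSubmit is not thread-safe on a shared VkQueue, so every submit
// takes a per-queue mutex.
struct vk_queue_guarded {
    vk::Queue  queue;
    std::mutex mutex;
};

static void queue_submit(vk_queue_guarded & q, const vk::SubmitInfo & info, vk::Fence fence) {
    std::lock_guard<std::mutex> lock(q.mutex);  // serialize concurrent submitters
    q.queue.submit({ info }, fence);
}
```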

vulkan: Better thread-safety for command pools/buffers (llama/14116)
fdc26e7

jeffbolznv committed

vulkan: Track descriptor pools/sets per-context (llama/14109)
855a3bf

jeffbolznv committed

Vulkan: Don't default to CPU device (like llvmpipe), even if no other device is available, to allow fallback to CPU backend (llama/14099)
dcb106f

OccamRazor committed

vulkan: Enable VK_KHR_cooperative_matrix extension for Intel Xe2 GPUs (llama/14001)
e5107fe

rillomas committed

vulkan: automatically deduce size of push constants (llama/13936)
00a9e2f

jeffbolznv committed
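
A hedged sketch of what deducing the push-constant size could look like, assuming Vulkan-Hpp; deriving the byte count from the struct type removes a class of mismatched-size bugs. The helper name is illustrative:

```cpp
#include <type_traits>
#include <vulkan/vulkan.hpp>

// The size is taken from the push-constant struct itself instead of being
// passed (and possibly mistyped) at every call site.
template <typename PC>
static void push_constants(vk::CommandBuffer cmd, vk::PipelineLayout layout, const PC & pc) {
    static_assert(std::is_trivially_copyable<PC>::value, "push constants must be trivially copyable");
    cmd.pushConstants(layout, vk::ShaderStageFlagBits::eCompute, 0, sizeof(PC), &pc);
}
```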

ggml-vulkan: adds support for op CONV_TRANSPOSE_1D (llama/13813)
32985b0

etasnadi committed

vulkan: fix warnings in perf logger querypool code (llama/13937)
11bac96

jeffbolznv committed

vulkan: use timestamp queries for GGML_VULKAN_PERF (llama/13817)
56ddc5b

jeffbolznv committed

vulkan : Remove unexpected ; (ggml/1253)
c4be6fb

Kai Pastor committed

vulkan: mark IM2COL as supporting non-contig (llama/13783)
09c03ad

jeffbolznv committed

vulkan: support CPY from any type to itself (llama/13695)
f5f766b

jeffbolznv committed

vulkan: Disable coopmat/coopmat2/bfloat extensions if glslc doesn't support it (llama/13696)
69679f5

jeffbolznv committed

use LOG_WARN to replace `std::cerr` (llama/13657)
6975ec2

Judd committed
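
Illustrative before/after for this change, assuming ggml's internal GGML_LOG_WARN macro from ggml-impl.h; routing warnings through the macro lets an application-installed log callback capture them instead of having them written straight to stderr:

```cpp
#include "ggml-impl.h"  // GGML_LOG_WARN (internal header; an assumption in this sketch)

static void report_warning(const char * msg) {
    // before: std::cerr << "ggml_vulkan: " << msg << std::endl;
    GGML_LOG_WARN("ggml_vulkan: %s\n", msg);  // after: goes through ggml's log handler
}
```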

vulkan: fix warnings (llama/13626)
8602d10

Eve committed

Vulkan: Add f32 accumulator support to quantized mul mat to fix GLM4 32B incoherence (llama/13607)
dfa38af

OccamRazor committed

vulkan: use scalar FA rather than coopmat2 when N==1 (llama/13554)
97d9aa6

jeffbolznv committed
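
A hypothetical sketch of the dispatch heuristic the title describes: with a single query row, a cooperative-matrix tile is mostly wasted work, so the scalar flash-attention shader wins. The function is illustrative, not the actual selection logic:

```cpp
#include <cstdint>

// N is the number of query rows in the flash-attention workload.
static bool use_coopmat2_fa(bool coopmat2_supported, uint32_t N) {
    // Matrix tiles pay off only when there is more than one row to fill
    // them with; N == 1 falls back to the scalar shader.
    return coopmat2_supported && N > 1;
}
```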

vulkan: KHR_coopmat flash attention (llama/13506)
4d1bd4f

jeffbolznv committed

vulkan: scalar flash attention implementation (llama/13324)
3331abd

jeffbolznv committed

vulkan: Allow up to 4096 elements for mul_mat_id row_ids (llama/13326)
53f8fee

jeffbolznv committed

vulkan: Additional type support for unary, binary, and copy (llama/13266)
b9cb11e

jeffbolznv committed

vulkan: Add bfloat16 support (llama/12554)
b21f8a1

jeffbolznv committed

vulkan: Handle src1 batch dimension in non-contiguous mat-vec-mul shader (llama/13191)
710fdcf

jeffbolznv committed

vulkan : kernels for depthwise 2D convolution (CONV_2D_DW) (ggml/1204)
43d9f3e

Acly committed

vulkan: matmul gcn tuning (llama/13016)
ac537d2

Eve authored and OccamRazor committed

vulkan: support noncontiguous rms_norm (llama/13031)
e4d1f59

jeffbolznv committed

graph : make FA compatible with MLA + add initial Metal kernels (llama/12953)
fb0d243

ggerganov committed

vulkan: enable coopmat2 FA gqa and split_k optimizations more often (llama/12931)
f844153

jeffbolznv committed

vulkan: In coopmat2 mmq, load q4_k/q5_k scales through shared memory (llama/12833)
4b7a407

jeffbolznv committed

ggml : add bilinear upscale support (ggml/1185)
4c5e449

Diego Devesa committed

vulkan: Use unclamped loads for flash attention mask (llama/12720)
a76ef69

jeffbolznv committed

Vulkan: Tune Vulkan mmq int dot shader for performance (llama/12767)
b3bf710

OccamRazor committed

vulkan: Hybrid waitForFences/getFenceStatus to reduce fence latency (llama/12630)
ee422be

jeffbolznv committed
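
A minimal sketch of the hybrid idea, assuming Vulkan-Hpp: spin on the cheap vkGetFenceStatus for a short window to catch fences that signal almost immediately, then fall back to a blocking vkWaitForFences. The spin duration is an illustrative placeholder:

```cpp
#include <chrono>
#include <cstdint>
#include <vulkan/vulkan.hpp>

static void hybrid_wait(vk::Device device, vk::Fence fence) {
    using clock = std::chrono::steady_clock;
    const auto spin_deadline = clock::now() + std::chrono::microseconds(100);
    // Phase 1: poll; returns with minimal latency if the GPU finishes soon.
    while (clock::now() < spin_deadline) {
        if (device.getFenceStatus(fence) == vk::Result::eSuccess) {
            return;
        }
    }
    // Phase 2: block in the driver until the fence signals.
    (void) device.waitForFences({ fence }, VK_TRUE, UINT64_MAX);
}
```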

vulkan: Implement split_k for coopmat2 flash attention. (llama/12627)
5ab06d6

jeffbolznv committed

vulkan: Implement grouped query attention in the coopmat2 FA shader (llama/12559)
e7bebe6

jeffbolznv committed

vulkan: fix build when glslc doesn't support coopmat (llama/12683)
f91eb88

Wagner Bruna committed

Vulkan: Add DP4A MMQ and Q8_1 quantization shader (llama/12135)
06ec111

OccamRazor committed

metal : improve FA + improve MoE (llama/12612)
04a3389

ggerganov committed

vulkan: Optimize mul_mat_vec p021 and nc shaders (llama/12505)
6868981

jeffbolznv committed

Vulkan: RTE rounding for cpy to quant (llama/12480)
8707beb

stduhpf authored and jeffbolznv committed

vulkan: Submit once enough matmul work has been recorded (llama/12406)
ec77b2c

jeffbolznv committed

Vulkan: Default to 1GB allocations instead of 4GB to avoid fragmentation and driver issues (llama/12434)
55088d3

OccamRazor committed
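
The gist, as a hedged sketch: clamp the size of each device allocation to 1 GiB even if the driver advertises more, so big buffers are split across several allocations. Constant and function names are illustrative:

```cpp
#include <algorithm>
#include <cstdint>

static constexpr uint64_t ALLOC_CAP = 1ull << 30;  // 1 GiB instead of the old 4 GiB

static uint64_t max_allocation_size(uint64_t device_reported_max) {
    // Smaller blocks fragment less and dodge driver limits on huge allocations.
    return std::min(device_reported_max, ALLOC_CAP);
}
```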

llama: Add support for RWKV v7 architecture (llama/12412)
727de7e

mollysama committed

vulkan: Add N/2 and N/4 optimized paths in coopmat2 shader (llama/12312)
c9f86c1

jeffbolznv committed

vulkan: Pad N dimension of B matrix for coopmat2 perf, to avoid bounds checking (llama/12273)
5d51f1c

jeffbolznv committed
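
A sketch of the padding arithmetic implied by the title: rounding N up to a whole number of tiles means shader loads never straddle the end of the buffer, so the per-load bounds check can be dropped. Names are illustrative:

```cpp
#include <cstdint>

// Round n up to the next multiple of tile_n; the extra padded columns cost
// a little memory but remove bounds checks from the coopmat2 inner loop.
static uint64_t padded_n(uint64_t n, uint64_t tile_n) {
    return ((n + tile_n - 1) / tile_n) * tile_n;
}
```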

vulkan: Adjust coopmat2 tile sizes and selection heuristic (llama/12258)
3cc6539

jeffbolznv committed

vulkan : sync (llama/0)
4c17fa1

ggerganov committed

ggml : upgrade init_tensor API to return a ggml_status (llama/11854)
d6b6852

William Tambellini authored and slaren committed
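
A hedged sketch of the interface change: init_tensor in the backend-buffer interface goes from returning void to returning a ggml_status, so per-tensor initialization failures can propagate instead of aborting. The enum values mirror ggml.h; the function body is illustrative:

```cpp
enum ggml_status {
    GGML_STATUS_ALLOC_FAILED = -2,
    GGML_STATUS_FAILED       = -1,
    GGML_STATUS_SUCCESS      =  0,
    GGML_STATUS_ABORTED      =  1,
};

struct ggml_tensor;  // opaque for this sketch

// before: static void buffer_init_tensor(ggml_tensor * t);
static enum ggml_status buffer_init_tensor(ggml_tensor * t) {
    if (t == nullptr) {
        return GGML_STATUS_FAILED;  // now reportable instead of fatal
    }
    // ... backend-specific initialization ...
    return GGML_STATUS_SUCCESS;
}
```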