Commit History
ggml : update ggml_backend_cpu_device_supports_op (llama/10867) 2f11d1e
vulkan: bugfixes for small subgroup size systems + llvmpipe test (llama/10809) 9220b51
Eve committed on
rwkv6: add wkv6 support for Vulkan backend (llama/10829) c7285d6
Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (llama/10693) 83a0899
lhez, Skyler Szot, Shangqing Gu, Alexander Angus, Hongqiang Wang, Max Krasnyansky committed on
Fix crash caused by ggml_backend_load_all when launching on Android Activity (llama/10812) e1df33d
谢乃闻, Diego Devesa committed on
vulkan: small mul_mat_vec optimizations (llama/10665) ec98109
Eve committed on
SYCL: Reduce most of the compiler warnings (llama/10748) 050e6ce
ggml : Fix compilation issues on ARM platform when building without fp16 (llama/10811) f76ba41
Karol Kontny committed on
CUDA: faster non-contiguous concat (llama/10760) 4621719
a3sh, Diego Devesa committed on
remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (llama/10797) b38cecf
Diego Devesa committed on
Vulkan: Use improved q4_k and q5_k dequant code in dequant shaders (llama/10798) a812efc
Vulkan: Add VK_EXT_subgroup_size_control support to ensure full subgroups for coopmats (llama/10721) 488f19e
ggml: load all backends from a user-provided search path (llama/10699) c6de218
Gilad S, Diego Devesa committed on
vulkan: request round-to-even for fp16 in im2col/rope_head (llama/10767) 461484c
vulkan: dynamic subgroup size for the remaining k quants (llama/10745) 1bbdb81
Eve committed on
CUDA: rename macros to avoid conflicts with WinAPI (llama/10736) 8544072
Andreas Kieslinger committed on
vulkan: disable spirv-opt for coopmat shaders (llama/10763) 2ac53b2
ggml : remove return from ggml_gallocr_allocate_node (ggml/1048) f9d4408
ggml : add check for grad_accs (ggml/1046) eacc95c
CUDA: fix shared memory access condition for mmv (llama/10740) 99a4546
vulkan: fix compile warnings (llama/10731) cdcb67c
Vulkan: fix NaN in tanh.comp with AMD proprietary driver on Windows (llama/10723) a618c84
stduhpf committed on
vulkan: compile a test shader in cmake to check for coopmat2 support (llama/10713) 980eeb3
ggml : disable iq4_nl interleave size 8 (llama/10709) a5294e7
ggml : refactor online repacking (llama/10446) 163128e
Vulkan: VK_KHR_cooperative_matrix support to speed up prompt processing (llama/10597) 9a4de04
metal : Extend how Llama.cpp locates metal resources (llama/10676) 44e7250
vulkan: Add VK_NV_cooperative_matrix2 support for mul_mat and flash attention (llama/10206) d10b47b
cmake : fix "amd64" processor string (#2638) 8a49dc4 unverified
vulkan : fix soft_max.comp division by zero (#2633) 1ce577d unverified
ggml : remove old files (skip) (#0) 6284570 unverified
ggml : sync remnants (skip) (#0) 451937f unverified
ggml : add predefined list of CPU backend variants to build (llama/10626) 1794b43
Diego Devesa committed on
ggml-cpu : fix HWCAP2_I8MM value (llama/10646) b3e6ea8
Diego Devesa committed on
vulkan: Implement "fast divide" (mul+shift) for unary ops like copy (llama/10642) e9ee893
SYCL : Move to compile time oneMKL interface backend selection for NVIDIA backend (llama/10584) 385f335
Nicolò Scipione committed on
Avoid using __fp16 on ARM with old nvcc (llama/10616) 19743b6
Frankie Robertson committed on
vulkan: optimize and reenable split_k (llama/10637) bca95f5
ggml: add `GGML_SET` Metal kernel + i32 CPU kernel (ggml/1037) dd775d5
ggml : add `GGML_PAD_REFLECT_1D` operation (ggml/1034) 154bbc0
files : remove make artifacts d3e3ea1
ggml : move AMX to the CPU backend (llama/10570) 3732429
Diego Devesa committed on
metal : small-batch mat-mul kernels (llama/10581) 58b0822
SYCL: Fix and switch to GGML_LOG system instead of fprintf (llama/10579) f083887
ggml-cpu: replace AArch64 NEON assembly with intrinsics in ggml_gemv_q4_0_4x4_q8_0() (llama/10567) 1c781a8
Adrien Gallouët committed on
vulkan: Dynamic subgroup size support for Q6_K mat_vec (llama/10536) 59600b5
Eve committed on