Commit History
test: fix OPT_STEP_ADAMW for test-backend-ops (ggml/974) 76aa810
vulkan : mul_mat: fix UB with small warps (ggml/952) d1a29c6
ggml : fix ggml_cast (ggml/973) c44d575
ggml: fix gradient allocation logic (ggml/966) ad3f29d
ggml : define missing HWCAP flags (llama/9684) 1d52105
ggml : add run-time detection of neon, i8mm and sve (llama/9331) 12c0e23
Dan Johansson committed on
Enable use of the rebar feature to upload buffers to the device. (llama/9251) 760f8c2
Markus Tavenrath committed on
mtgpu: enable VMM (llama/9597) e84b4f5
R0CKSTAR committed on
ggml : remove assert for AArch64 GEMV and GEMM Q4 kernels (llama/9217) 50395aa
Charles Xu committed on
cann: fix crash when llama-bench is running on multiple cann devices (llama/9627) 068c697
CUDA: remove bad assert (ggml/972) 91954a7
vulkan : multithread pipeline creation (ggml/963) ba60f98
vulkan : fix build for GGML_VULKAN_RUN_TESTS, add TFLOPS to log (ggml/961) 85e2387
vulkan : argsort barriers must be under uniform control flow (ggml/951) b2602d7
ggml : fix GGML_MAX_N_THREADS + improve formatting (ggml/969) ad34655
ggml : add ggml-cpu-impl.h (skip) (#0) 958f2d3
ggml : add AVX512DQ requirement for AVX512 builds (llama/9622) 14b5848
Eric Zhang committed on
log : add CONT level for continuing previous log entry (llama/9610) a29a4c5
threads: fix msvc build without openmp (llama/9615) 97b3eb5
Max Krasnyansky committed on
cuda: add q8_0->f32 cpy operation (llama/9571) 6201c74
threads: improve ggml_barrier scaling with large number of threads (llama/9598) aca04d5
Max Krasnyansky committed on
ggml : AVX512 gemm for Q4_0_8_8 (llama/9532) 7349efc
metal : use F32 prec for K*Q in vec FA (llama/9595) 99c4239
Revert "[SYCL] fallback mmvq (ggml/9088)" (llama/9579) 5aceb3d
Akarshan Biswas committed on
musa: enable building fat binaries, enable unified memory, and disable Flash Attention on QY1 (MTT S80) (llama/9526) 8ec75c3
R0CKSTAR committed on
Fix merge error in #9454 (llama/9589) 3142fa9
CUDA: enable Gemma FA for HIP/Pascal (llama/9581) 97cb7ce
RWKV v6: RWKV_WKV op CUDA implementation (llama/9454) 8d3e707
ggml-alloc : fix list of allocated tensors with GGML_ALLOCATOR_DEBUG (llama/9573) 673df39
slaren committed on
Update CUDA graph on scale change plus clear nodes/params (llama/9550) 6b63eb1
agray3 committed on
examples : adapt to ggml.h changes (ggml/0) 91c7734
ggml : refactoring (llama/#0) 1b62c96
ggml : fix builds (llama/0) 524a01b
ggml : fix trailing whitespace (llama/0) 214f95e
CUDA: fix sum.cu compilation for CUDA < 11.7 (llama/9562) b305ecf
ggml : fix n_threads_cur initialization with one thread (llama/9538) af82b69
slaren, Max Krasnyansky committed on
threadpool : skip polling for unused threads (llama/9461) 9d11a7a
Max Krasnyansky committed on
ggml : link MATH_LIBRARY not by its full path (llama/9339) 07d57ec
Michael Podvitskiy committed on
cmake : do not hide GGML options + rename option (llama/9465) 8c32d36
ggml : IQ4_NL sgemm + Q4_0 AVX optimization (llama/9422) f2986f6
Eve committed on
metal : handle zero-sized allocs (llama/9466) 868283e
common : reimplement logging (llama/9418) e893c97
cmake : correct order of sycl flags (llama/9497) 45ddbb5
Michael Podvitskiy committed on
cmake : try to fix sycl+intel build (llama/9487) dd66fc9
Michael Podvitskiy committed on
ggml : ggml_type_name return "NONE" for invalid values (llama/9458) 8a1bb27
Yuri Khrustalev committed on
cmake : use list(APPEND ...) instead of set() + dedup linker (llama/9463) 5497c27
cann: Add host buffer type for Ascend NPU (llama/9406) 7cbca42
Dou Xinpeng committed on
riscv : modify Makefile and add a RISCV_VECT to print log info (llama/9442) f77ad34
Ahmad Tameem committed on
cann: Fix error when running a non-exist op (llama/9424) 74dcc66
Xinpeng Dou committed on