Commit History
ggml : add ALiBi support for ggml_soft_max_ext (llama/5488) 26c019a unverified
ggml : add numa options (llama/5377) 7c952d2 unverified
ggml : add mmla kernels for quantized GEMM (llama/4966) 0d50a29 unverified
snadampal committed on
ggml-alloc : v3 (ggml/727) 5cffd6f unverified
slaren committed on
llava : add MobileVLM support (llama/5132) f17a416 unverified
JidongZhang-THU slaren committed on
kompute : llama-bench support and ggml_cpu_has_kompute() (llama/5226) 0c9c434 unverified
ggml : add abort_callback for cpu backend (ggml/725) a8ea91b unverified
Michael Podvitskiy committed on
SOTA 3-bit quants (llama/5196) 4649943 unverified
ggml : add Vulkan backend (llama/2059) 5a97aba unverified
minor : clean-up some warnings and style (llama/5094) 7df090b unverified
llava : MobileVLM support (llama/4954) dc8f956 unverified
ggml : add IQ2 to test-backend-ops + refactoring (llama/4990) 227f2ae unverified
imatrix : offload to GPU support (llama/4957) 6490f98 unverified
ggml : introduce GGML_CALL function annotation (llama/4850) 7815f68 unverified
2-bit quantizations (llama/4897) 8a399ab unverified
llama : ggml-backend integration (llama/4766) 362430b unverified
ggml : SOTA 2-bit quants (add IQ2_XS) (llama/4856) 5e827d5 unverified
ggml : remove ggml_cpy_inplace and ggml_cont_inplace (ggml/693) 6469bfe unverified
Timothy Cronin committed on
ggml : change GGML_MAX_NAME at compile time (ggml/682) ded2b1a unverified
SOTA 2-bit quants (llama/4773) 75de5bf unverified
sync : ggml (VMM, sync-ggml-am, dotprod ARM fixes, CUDA fixes) (#1691) 919a447 unverified
sync : ggml (ggml_scale, ggml_row_size, etc.) (#1677) aa86ade unverified
sync : ggml (Metal fixes, new ops, tests) (#1633) a0d4b48 unverified
sync : ggml (new ops, new backend, etc) (#1602) 895e87a unverified
sync : ggml (ggml-alloc + linker + gguf fixes) (#1501) 58507b9 unverified
whisper : add full CUDA and Metal offloading (#1472) da4acca unverified
sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.) (#1422) 7006035 unverified
sync : ggml (const correctness) 4ce2d25 unverified
metal : add F32 support + update bench output 02d7878 unverified
ggml : sync latest llama.cpp (view_src + alloc improvements) (#1247) 8bb66c1 unverified
ggml : sync (ggml-alloc, GPU, eps, etc.) (#1220) d41ba35 unverified
ggml : detect SSSE3 (#1211) 82a619c unverified
Przemysław Pawełczyk committed on