whisper : add check that target name exists (#3103) 60ff3ed unverified danbev committed on May 1, 2025
server : add --no-gpu option to print usage output (#3098) 1eb0f64 unverified danbev committed on May 1, 2025
ruby : ignore "Downloading" output in test_log_suppress (#3106) fdb6c7e unverified danbev committed on May 1, 2025
make : fix samples glob pattern (#3100) 0a9e5b1 unverified ggerganov HF Staff committed on Apr 30, 2025
whisper : fix grammar advance stack warning (#3087) e4a0565 unverified danbev committed on Apr 28, 2025
examples : expose language detection probabilities to server example (#3044) 6b8d348 unverified sachaarbonel committed on Apr 28, 2025
whisper : remove empty .gitmodules file [no ci] (#3085) aa54166 unverified danbev committed on Apr 28, 2025
ci : disable publishing of java binding [no ci] (#3086) 4b6e041 unverified danbev committed on Apr 28, 2025
build : Add Moore Threads GPU support and update GitHub workflow for MUSA build (#3069) 8ede9a1 unverified R0CKSTAR committed on Apr 28, 2025
examples : fix deprecated FFmpeg functions (#3073) 0aa41e8 unverified podre-henrique committed on Apr 28, 2025
ruby : add encoder begin callback related methods (#3076) 855927b unverified KitaitiMakoto committed on Apr 25, 2025
opencl : remove obsolete files (skip) (ggml/1200) adc6542 ggerganov HF Staff committed on Apr 24, 2025
opencl: split ggml-opencl.cl into multiple files and cleanup (llama/12886) 291a5b7 lhez Shangqing Gu committed on Apr 24, 2025
CUDA: use switch statements in constexpr functions (llama/13095) f5cd546 JohannesGaessler committed on Apr 24, 2025
metal : fix floating-point range of attention scores in FA kernels (llama/13090) e093044 ggerganov HF Staff committed on Apr 24, 2025
CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID (llama/13014) 285a334 JohannesGaessler committed on Apr 22, 2025
ggml : add SSE 4.2 and x64 base variant for CPUs without AVX (llama/12871) f8795d3 Diego Devesa committed on Apr 21, 2025
SYCL: Add non-contiguous support in ROPE (llama/12993) a29a2c3 Akarshan Biswas committed on Apr 21, 2025
SYCL: Refactor and enable FP16 in binary broadcast OPs (llama/12975) 1377b05 Akarshan Biswas committed on Apr 18, 2025
graph : make FA compatible with MLA + add initial Metal kernels (llama/12953) fb0d243 ggerganov HF Staff committed on Apr 17, 2025
ggml: Re-enable CUDA graphs in presence of CONT and DUP nodes (llama/12970) 3944ae5 Alan Gray committed on Apr 17, 2025
CANN: Add support for async operator submission (llama/12864) 1b9d0f0 hipudding committed on Apr 17, 2025
opencl: fix incorrect local_size index in profiling log (llama/12868) 8f5d919 kimminsu committed on Apr 16, 2025
vulkan: enable coopmat2 FA gqa and split_k optimizations more often (llama/12931) f844153 jeffbolznv committed on Apr 16, 2025
metal : add FA-vec kernels for head size 96 (llama/12952) f1f88b8 ggerganov HF Staff committed on Apr 15, 2025
CUDA/HIP: Share the same unified memory allocation logic. (llama/12934) 143cb70 David Huang committed on Apr 15, 2025
ggml : Add AVX512 implementation of GEMM - Q4_Kx8 (llama/12829) 2457b99 Srihari-mcw committed on Apr 15, 2025
CANN: Optimize CANN buffer pool memory management (llama/12875) 66b93b3 dou112 committed on Apr 15, 2025
ggml: use _mm[512/256]_dpbusd[_avx]_epi32 to directly accumulate into the result register (llama/12773) acb674d sxx-404 committed on Apr 14, 2025
ggml: disable CUDA graphs for unsupported DUP and CONT node types (llama/12891) 9e42c4d Alan Gray committed on Apr 13, 2025