Commit History
rpc : set SO_REUSEADDR for the server socket (llama/7320) 195fe29
ggml-quants, llama : removed excess checks (llama/7274) 142d95e
ggml : rewrite silu and softmax for cpu (llama/7154) c78b872
Justine Tunney committed on
rpc : add command line arg for specifying backend memory b441739
Add support for properly optimized Windows ARM64 builds with LLVM and MSVC (llama/7191) c917076
ggml : use dynamic thread scheduling for matrix multiplication (llama/6915) 6f8daf7
kunnis committed on
Avoid unnecessarily disabling CUDA graphs (llama/7302) 4816f6a
agray3 committed on
ggml : tag ggml_tensor::backend as deprecated (llama/7290) 1a5606e
slaren committed on
Add missing " (llama/7303) 2c417da
AidanBeltonS committed on
ggml : add `ggml_upscale_ext` (ggml/814) 04a5333
scripts : update sync 9e35f6d unverified
whisper : use ggml-cuda in mel calc, set appropriate device (#2236) 93af41a unverified
cuda : fix HIPBLAS build (#2234) a8eb666 unverified
cuda : fix bounds check for src0 rows in MMVQ kernel (#2231) 4fdb9d2 unverified
ci : fix CUDA builds (#2232) 41b22d2 unverified
whisper : auto-grow working areas for mel_calc_cuda (#2227) 6282f63 unverified
whisper : free whisper_mel instances (#2220) 9373d6b unverified
whisper : whisper_state/backend fixes (#2217) adde036 unverified
whisper : calculate mel spectrogram directly into a ggml_tensor (#2208) 521186a unverified
whisper : add CUDA-specific computation mel spectrograms (#2206) c6894d3 unverified
whisper : remove `speed_up` and `phase_vocoder*` functions (#2198) 7ef0c95 unverified
readme : add conan badge (#2196) f08dc65 unverified
Martin Delille committed on
readme : add install instructions for Conan (#2189) fb4f721 unverified
Carlos Zoido committed on
whisper: use global cache for sin/cos vals and Hann window (#2194) 3a04f56 unverified
release : v1.6.2 3e54141 unverified
Revert "whisper : remove extra backend instance (huh?)" (#2182) b708d81 unverified
server : fix typo (#2181) 18c60fc unverified
Daniel Valdivia committed on
ruby : update bindings (#2154) a2bce18 unverified
Todd committed on
release : v1.6.1 ca6f4b2 unverified
examples : add support for decoding input with ffmpeg (Linux) (#2133) c160b58 unverified
William Tambellini committed on
node : add flash_attn param (#2170) b4d05df unverified
ci: Update build.yml to suppress warnings about node.js versions (#2166) e9954d9 unverified
Tamotsu Takahashi committed on
release : v1.6.0 d823237 unverified
whisper : use flash attention (#2152) 27c0a97 unverified
talk-llama : reject runs without required arguments (#2153) b445508 unverified
sync : ggml aac57a1 unverified
metal : support FA without mask + add asserts (llama/7278) 98ce302 unverified
ggml : add RPC backend (llama/6829) 5838a14 unverified
rm wait() (llama/7233) 328702a unverified
Neo Zhang committed on
CUDA: add FP32 FlashAttention vector kernel (llama/7188) 03d4b22 unverified
scripts : sync ggml-rpc 7b58c58 unverified
whisper : fix model path encoding in windows (#2086) 49f8792 unverified
thewh1teagle committed on
server : return utf-8 (#2138) 2719aa0 unverified
cmake : fix HIP/ROCm build (#2102) a90ae59 unverified
aldorof committed on
node : add additional params (#2000) 933eb40 unverified
valVk committed on
js : remove un-needed request header from fetchRemote (#2119) 6c54394 unverified
Mark Karpelès committed on
cmake : fix metal embed sources path (#2110) 087b1a8 unverified
main : dont print timings with --no-prints (#2108) 685d1c1 unverified
Daniel Ziegenberg committed on