Commit History

CUDA: faster large batch FA without tensor cores (llama/7314)
a6d9f2d

JohannesGaessler committed on

rpc : set SO_REUSEADDR for the server socket (llama/7320)
195fe29

rgerganov committed on

ggml-quants, llama : removed excess checks (llama/7274)
142d95e

germanaizek committed on

ggml : rewrite silu and softmax for cpu (llama/7154)
c78b872

Justine Tunney committed on

rpc : add command line arg for specifying backend memory
b441739

rgerganov committed on

Add support for properly optimized Windows ARM64 builds with LLVM and MSVC (llama/7191)
c917076

Max Krasnyansky, ggerganov committed on

ggml : use dynamic thread scheduling for matrix multiplication (llama/6915)
6f8daf7

kunnis committed on

Avoid unnecessarily disabling CUDA graphs (llama/7302)
4816f6a

agray3 committed on

ggml : tag ggml_tensor::backend as deprecated (llama/7290)
1a5606e

slaren committed on

Add missing " (llama/7303)
2c417da

AidanBeltonS committed on

ggml : add `ggml_upscale_ext` (ggml/814)
04a5333

John Balis, ggerganov committed on

scripts : update sync
9e35f6d

ggerganov committed on

whisper : use ggml-cuda in mel calc, set appropriate device (#2236)
93af41a

stanimirovb committed on

cuda : fix HIPBLAS build (#2234)
a8eb666

ggerganov committed on

cuda : fix bounds check for src0 rows in MMVQ kernel (#2231)
4fdb9d2

ggerganov, JohannesGaessler committed on

ci : fix CUDA builds (#2232)
41b22d2

ggerganov committed on

whisper : auto-grow working areas for mel_calc_cuda (#2227)
6282f63

stanimirovb committed on

whisper : free whisper_mel instances (#2220)
9373d6b

ggerganov committed on

whisper : whisper_state/backend fixes (#2217)
adde036

ggerganov committed on

whisper : calculate mel spectrogram directly into a ggml_tensor (#2208)
521186a

stanimirovb committed on

whisper : add CUDA-specific computation mel spectrograms (#2206)
c6894d3

stanimirovb committed on

whisper : remove `speed_up` and `phase_vocoder*` functions (#2198)
7ef0c95

stanimirovb committed on

readme : add conan badge (#2196)
f08dc65

Martin Delille committed on

readme : add install instructions for Conan (#2189)
fb4f721

Carlos Zoido committed on

whisper: use global cache for sin/cos vals and Hann window (#2194)
3a04f56

stanimirovb committed on

release : v1.6.2
3e54141

ggerganov committed on

Revert "whisper : remove extra backend instance (huh?)" (#2182)
b708d81

ggerganov committed on

server : fix typo (#2181)
18c60fc

Daniel Valdivia committed on

ruby : update bindings (#2154)
a2bce18

Todd committed on

release : v1.6.1
ca6f4b2

ggerganov committed on

examples : add support for decoding input with ffmpeg (Linux) (#2133)
c160b58

William Tambellini committed on

node : add flash_attn param (#2170)
b4d05df

pprobst committed on

ci: Update build.yml to suppress warnings about node.js versions (#2166)
e9954d9

Tamotsu Takahashi committed on

release : v1.6.0
d823237

ggerganov committed on

whisper : use flash attention (#2152)
27c0a97

ggerganov committed on

talk-llama : reject runs without required arguments (#2153)
b445508

petterreinholdtsen, ggerganov committed on

sync : ggml
aac57a1

ggerganov committed on

metal : support FA without mask + add asserts (llama/7278)
98ce302

ggerganov committed on

ggml : add RPC backend (llama/6829)
5838a14

rgerganov committed on

rm wait() (llama/7233)
328702a

Neo Zhang committed on

CUDA: add FP32 FlashAttention vector kernel (llama/7188)
03d4b22

JohannesGaessler committed on

scripts : sync ggml-rpc
7b58c58

ggerganov committed on

whisper : fix model path encoding in windows (#2086)
49f8792

thewh1teagle committed on

server : return utf-8 (#2138)
2719aa0

ggerganov committed on

node : add audio_ctx and audio buffer params (#2123)
9b4d9d5

pprobst, ggerganov committed on

cmake : fix HIP/ROCm build (#2102)
a90ae59

aldorof committed on

node : add additional params (#2000)
933eb40

valVk committed on

js : remove un-needed request header from fetchRemote (#2119)
6c54394

Mark Karpelès committed on

cmake : fix metal embed sources path (#2110)
087b1a8

ggerganov committed on

main : dont print timings with --no-prints (#2108)
685d1c1

Daniel Ziegenberg committed on