Commit History

Update Dockerfile
0d923d0
verified

Xenobd committed

remove unnecessary output
84f8c68

3v324v23 committed

Unified return format
e408c7b

3v324v23 committed

fix build
1b9b118

3v324v23 committed

fix build
d937c3a

3v324v23 committed

fix dockerfile path
03ff3a5

3v324v23 committed

add meta
36ff0ea

3v324v23 committed

chore: track binaries with git-lfs
aa000f7

3v324v23 committed

chore: track binaries with git-lfs
f33d63d

3v324v23 committed
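The two `git-lfs` commits above move binary files under Git LFS tracking. As a hedged illustration (the actual file patterns are not shown in this log), running `git lfs track "*.bin"` appends entries of this form to `.gitattributes`:

```text
# hypothetical pattern; substitute the repository's real binary extensions
*.bin filter=lfs diff=lfs merge=lfs -text
```

Committing the updated `.gitattributes` alongside the binaries is what makes the tracking take effect for other clones.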

add sync task
46ebeba

3v324v23 committed

Handle negative value in padding (#3389)
6e115ac
unverified

Treboko committed

models : update `./models/download-ggml-model.cmd` to allow for tdrz download (#3381)
0b65831
unverified

Thea Mukhi and danbev committed

talk-llama : sync llama.cpp
4321600

ggerganov committed

ggml: Add initial WebGPU backend (llama/14521)
4b3da1d

Reese Levine committed

ggml : initial zDNN backend (llama/14975)
6dd510c

taronaeo committed

common : handle mxfp4 enum
fd4c0e1

ggerganov committed

ggml-quants : fix make_qp_quants NANs and IQ1 assertion errors (llama/15379)
a575f57

compilade committed

vulkan: disable spirv-opt for bfloat16 shaders (llama/15352)
cf24af7

jeffbolznv committed

vulkan: Use larger workgroups for mul_mat_vec when M is small (llama/15355)
054584a

jeffbolznv and OccamRazor committed

vulkan: support sqrt (llama/15370)
e5406c0

Dong Won Kim committed

vulkan: Optimize argsort (llama/15354)
80a188c

jeffbolznv committed

vulkan: fuse adds (llama/15252)
ad199b1

jeffbolznv committed

vulkan: Support mul_mat_id with f32 accumulators (llama/15337)
41a76e6

jeffbolznv committed

vulkan: Add missing bounds checking to scalar/coopmat1 mul_mat_id (llama/15334)
a6fa78e

jeffbolznv committed

OpenCL: add initial FA support (llama/14987)
8ece1ee

mrfatso committed

opencl: add initial mxfp4 support via mv (llama/15270)
1a0281c

lhez and shawngu-quic committed

vulkan : fix out-of-bounds access in argmax kernel (llama/15342)
78a1865

ggerganov committed

vulkan : fix compile warnings on macos (llama/15340)
e3107ff

ggerganov committed

ggml: initial IBM zDNN backend (llama/14975)
449e1a4

taronaeo committed

CUDA: fix negative KV_max values in FA (llama/15321)
6e3a7b6

JohannesGaessler committed

vulkan: perf_logger improvements (llama/15246)
d48d508

jeffbolznv committed

ggml: fix ggml_conv_1d_dw bug (ggml/1323)
4496862

jasonni2 committed

cuda : fix GGML_CUDA_GRAPHS=OFF (llama/15300)
59c694d

Sigbjørn Skjæret committed

HIP: bump requirement to rocm 6.1 (llama/15296)
58a3802

uvos committed

ggml : update `ggml_rope_multi` (llama/12665)
b4896dc

Judd and ggerganov committed

ggml : repack block_iq4_nlx8 (llama/14904)
db4407f

ggerganov committed

CUDA: Optimize `reduce_rows_f32` kernel, leading up to 25x perf improvement on kernel-level and 10% perf increase for Gemma3n (llama/15132)
c768824

ORippler committed

ggml-rpc: chunk send()/recv() to avoid EINVAL for very large tensors over RPC (macOS & others) (llama/15188)
c8284f2

aixsatoshi and Shinnosuke Takagi committed
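The ggml-rpc commit above works around per-call size limits on `send()`/`recv()` (which can fail with `EINVAL` for very large buffers on macOS) by splitting large tensors into chunks. A minimal sketch of the chunking idea, not the actual ggml-rpc code (`send_in_chunks` and its parameters are hypothetical names):

```python
def send_in_chunks(send_fn, data: bytes, chunk_size: int = 1 << 20) -> int:
    """Transmit `data` by repeatedly calling `send_fn` on slices no larger
    than `chunk_size`. `send_fn` returns the number of bytes it accepted,
    like a socket's send(), so partial sends are handled by advancing the
    offset rather than assuming the whole slice went through."""
    sent = 0
    while sent < len(data):
        n = send_fn(data[sent:sent + chunk_size])
        if n <= 0:
            raise OSError("send failed")
        sent += n
    return sent
```

The same loop shape works for the receive side: keep calling `recv()` with a bounded buffer and append until the expected byte count arrives.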

HIP: disable sync warp shuffle operators from clr amd_warp_sync_functions.h (llama/15273)
8fca6dd

uvos committed

sycl: Fix and disable more configurations of mul_mat (llama/15151)
7b868ed

Romain Biessy committed

opencl: allow mixed f16/f32 `add` (llama/15140)
345810b

mrfatso committed

CUDA cmake: add `-lineinfo` for easier debug (llama/15260)
008e169

am17an committed

CANN: GGML_OP_CPY optimization (llama/15070)
73e90ff

Chenguang Li committed

musa: fix failures in test-backend-ops for mul_mat_id op (llama/15236)
4168dda

yeahdongcn committed

CANN: Add broadcast for softmax and FA (llama/15208)
db87c9d

hipudding committed