Commit History
ggml : update `ggml_rope_multi` (llama/12665)
b4896dc
sync : resolve conflicts (ggml/0)
497add0
ggml : add ggml_scale_bias (llama/14417)
573d50a
CUDA: add bilinear interpolation for upscale (llama/14563)
68ded09
ggml : implement GEGLU_ERF and GEGLU_QUICK ops (llama/14445)
f798922
Sigbjørn Skjæret
committed on
ggml : fix FA mask dim 2 and 3 (llama/14505)
a89dc81
llama : initial Mamba-2 support (llama/9126)
1b4087e
ggml : support bcast ggml_soft_max_ext, ggml_flash_attn_ext (llama/14435)
ebacb3e
ggml : Callback before abort (llama/14481)
ccee17d
Add Conv2d for CPU (llama/14388)
68eb27a
ggml : implement REGLU/GEGLU/SWIGLU ops (llama/14158)
add5c0f
ggml-cpu : "align corners" for bilinear upscale/downscale (ggml/1285)
88e7829
Add `ggml_roll` (ggml/1274)
71923e5
threading: support for GGML_SCHED_PRIO_LOW, update thread info on Windows to avoid throttling (llama/12995)
d5d55f2
Max Krasnyansky
Diego Devesa
committed on
ggml : add ggml_repeat_4d (llama/13824)
3fe8af8
ggml : remove ggml_graph_import and ggml_graph_export declarations (ggml/1247)
3c9a1d2
ggml : fix the order of ggml_unary_op (llama/13718)
bdae2b3
ggml : add ggml_gelu_erf() (llama/13667)
6c9cd9a
llama/ggml: add LLM training support (llama/10544)
8d3b3c1
CUDA: fix bad asserts for partial offload (llama/13337)
23e676b
CUDA: fix q_nope_absorbed prec for DS 2 Lite f16 (llama/13137)
e9c9d4b
ggml : Depthwise 2D convolution (ggml/1152)
0c950d5
ggml : add bilinear upscale support (ggml/1185)
4c5e449
Diego Devesa
committed on
ggml : add more generic custom op, remove deprecated custom ops (ggml/1183)
ba7a5f8
Diego Devesa
committed on
metal : improve FA + improve MoE (llama/12612)
04a3389
llama: Add support for RWKV v7 architecture (llama/12412)
727de7e
ggml : portability fixes for VS 2017 (llama/12150)
49e3343
Marcus Groeber (mgroeber9110)
committed on
cleanup: fix compile warnings associated with gnu_printf (llama/11811)
ef6a968
bandoti
committed on
CUDA: use mma PTX instructions for FlashAttention (llama/11583)
f328957
CUDA: backwards pass for misc. ops, add tests (llama/11257)
2fbcec1
RoPE: fix back, CUDA support for back + noncont. (llama/11240)
131a21e
GGUF: C++ refactor, backend support, misc fixes (llama/11030)
21c5b64
tts : add OuteTTS support (llama/10784)
8d0f0ac
ggml : refactor online repacking (llama/10446)
163128e
ggml : add `GGML_PAD_REFLECT_1D` operation (ggml/1034)
154bbc0
ggml-cpu: support IQ4_NL_4_4 by runtime repack (llama/10541)
bf73242
ggml : add support for dynamic loading of backends (llama/10469)
b73266f
ggml: new optimization interface (ggml/988)
dd33ace
ggml : build backends as libraries (llama/10256)
3dc93f3
metal : optimize FA kernels (llama/10171)
44ff932
ggml : move CPU backend to a separate file (llama/10144)
0f447f2
Diego Devesa
committed on
llama : add simple-chat example (llama/10124)
41ff26f
Diego Devesa
Xuan Son Nguyen
committed on
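
For reference, the ops introduced in the commits above are exposed as plain graph-building functions in ggml's C API. Below is a minimal sketch exercising one of them, ggml_gelu_erf() (added in llama/13667), on the CPU backend. The unary-op signature for ggml_gelu_erf() is assumed to follow ggml's convention for other activations, and ggml_graph_compute_with_ctx() is assumed to live in ggml-cpu.h after the backend split ("ggml : build backends as libraries") noted above:

    #include <stdio.h>
    #include "ggml.h"
    #include "ggml-cpu.h"   // ggml_graph_compute_with_ctx after the backend split

    int main(void) {
        // small scratch context; the size is arbitrary for this sketch
        struct ggml_init_params params = {
            /*.mem_size   =*/ 16*1024*1024,
            /*.mem_buffer =*/ NULL,
            /*.no_alloc   =*/ false,
        };
        struct ggml_context * ctx = ggml_init(params);

        struct ggml_tensor * x = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 4);
        float * xd = (float *) x->data;
        for (int i = 0; i < 4; i++) xd[i] = i - 1.5f;

        // exact (erf-based) GELU, applied elementwise; signature assumed
        struct ggml_tensor * y = ggml_gelu_erf(ctx, x);

        struct ggml_cgraph * gf = ggml_new_graph(ctx);
        ggml_build_forward_expand(gf, y);
        ggml_graph_compute_with_ctx(ctx, gf, /*n_threads=*/1);

        for (int i = 0; i < 4; i++) {
            printf("gelu_erf(% .2f) = % .6f\n", xd[i], ((float *) y->data)[i]);
        }

        ggml_free(ctx);
        return 0;
    }

The same pattern (build a node, expand the graph, compute) applies to the other ops listed above, such as ggml_scale_bias, ggml_roll, and ggml_repeat_4d.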