Releases: ggerganov/llama.cpp

b4174

26 Nov 02:50
0eb4e12
vulkan: Fix a vulkan-shaders-gen argument parsing error (#10484)

vulkan-shaders-gen was not parsing the --no-clean argument correctly.
The previous code only handled arguments that take a value, and
--no-clean does not take one, so it was mishandled. This commit
correctly parses arguments that don't have values.
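
The distinction between valued options and bare flags can be sketched as follows. This is a minimal illustration, not the actual vulkan-shaders-gen code: `parsed_args`, `parse_args`, and the option names `--input` and `--output` are hypothetical, and only `--no-clean` comes from the commit message. The idea is that a flag consumes one token while a valued option consumes two.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical result type: options that carry a value, plus bare flags.
struct parsed_args {
    std::map<std::string, std::string> values; // option name -> its value
    std::set<std::string>              flags;  // options given without a value
};

// Parse argv-style tokens. `known_flags` lists the options that take no value;
// any other token starting with "--" is expected to be followed by a value.
static parsed_args parse_args(const std::vector<std::string> & args,
                              const std::set<std::string> & known_flags) {
    parsed_args out;
    for (size_t i = 0; i < args.size(); ++i) {
        const std::string & arg = args[i];
        if (arg.rfind("--", 0) != 0) {
            continue; // not an option; ignored in this sketch
        }
        if (known_flags.count(arg) > 0) {
            out.flags.insert(arg);       // valueless flag: consume one token
        } else if (i + 1 < args.size()) {
            out.values[arg] = args[++i]; // valued option: consume two tokens
        }
    }
    // Sanity check: every recorded flag came from known_flags.
    assert(out.flags.size() <= known_flags.size());
    return out;
}
```

A parser that always treats the next token as a value would swallow whatever follows --no-clean, which is the kind of failure the commit message describes.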

b4173

25 Nov 22:56
0cc6375
Introduce llama-run (#10291)

It's like simple-chat, but it uses smart pointers to avoid manual
memory cleanup. Fewer memory leaks in the code now. Avoids printing
multiple dots. Splits the code into smaller functions. Uses no exception
handling.

Signed-off-by: Eric Curtin <[email protected]>
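
The smart-pointer approach mentioned above can be illustrated with a `std::unique_ptr` and a custom deleter wrapping a C-style handle. This is a sketch under assumptions: `resource`, `resource_create`, `resource_free`, and `use_resource` are hypothetical stand-ins and not the llama.cpp API or the actual llama-run code.

```cpp
#include <cassert>
#include <memory>
#include <new>

// Hypothetical C-style handle with explicit create/free functions.
struct resource { int id; };

static int g_live = 0; // count of live resources, for illustration only

static resource * resource_create(int id) {
    resource * r = new (std::nothrow) resource{id}; // returns nullptr on failure
    if (r != nullptr) {
        ++g_live;
    }
    return r;
}

static void resource_free(resource * r) {
    assert(r != nullptr); // unique_ptr never invokes the deleter on nullptr
    --g_live;
    delete r;
}

// The deleter lets unique_ptr call the C-style free function automatically.
struct resource_deleter {
    void operator()(resource * r) const { resource_free(r); }
};
using resource_ptr = std::unique_ptr<resource, resource_deleter>;

static int use_resource(int id) {
    resource_ptr r(resource_create(id));
    if (!r) {
        return -1; // failure reported by return value, no exceptions
    }
    return r->id * 2; // r is released on every return path
}
```

Because ownership lives in `resource_ptr`, early returns need no matching cleanup call, which is how this pattern reduces leaks.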

b4171

25 Nov 22:32
9fd8c26
server : add more information about error (#10455)

b4170

25 Nov 21:55
47f931c
server : enable cache_prompt by default (#10501)

ggml-ci

b4169

25 Nov 21:52
106964e
metal : enable mat-vec kernels for bs <= 4 (#10491)

b4168

25 Nov 21:34
80acb7b
Rename Olmo1124 to Olmo2 (#10500)

b4167

25 Nov 21:26
10bce04
llama : accept a list of devices to use to offload a model (#10497)

* llama : accept a list of devices to use to offload a model

* accept `--dev none` to completely disable offloading

* fix dev list with dl backends

* rename env parameter to LLAMA_ARG_DEVICE for consistency
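
How a device-list value might be interpreted can be sketched as below. This is an illustration only: `device_selection` and `parse_device_list` are hypothetical names, the comma separator is an assumption, and the device names in the usage note are made up. Only the special value `none` (disable offloading) is taken from the notes above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical result of interpreting a device-list argument.
struct device_selection {
    bool                     offload_enabled = true;
    std::vector<std::string> names; // empty while enabled: no explicit choice
};

// Split a comma-separated list (assumed format); "none" disables offloading.
static device_selection parse_device_list(const std::string & value) {
    device_selection sel;
    if (value == "none") {
        sel.offload_enabled = false;
        return sel;
    }
    size_t start = 0;
    while (start <= value.size()) {
        size_t end = value.find(',', start);
        if (end == std::string::npos) {
            end = value.size();
        }
        if (end > start) { // skip empty entries such as "a,,b"
            sel.names.push_back(value.substr(start, end - start));
        }
        start = end + 1;
    }
    assert(sel.offload_enabled); // only the "none" branch disables offloading
    return sel;
}
```

For example, `parse_device_list("CUDA0,CUDA1")` yields two names with offloading enabled, while `parse_device_list("none")` yields an empty list with offloading disabled.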

b4164

25 Nov 18:09
9ca2e67
server : add speculative decoding support (#10455)

* server : add speculative decoding support

ggml-ci

* server : add helper function slot.can_speculate()

ggml-ci

b4163

25 Nov 17:41
5931c1f
ggml : add support for dynamic loading of backends (#10469)

* ggml : add support for dynamic loading of backends

---------

Co-authored-by: Georgi Gerganov <[email protected]>

b4162

25 Nov 17:17
f6d12e7
tests : fix compile warning