Releases: leixy76/llama.cpp
b3454
Build Llama SYCL Intel with static libs (#8668)
Ensure SYCL CI builds both static & dynamic libs for testing purposes.
Signed-off-by: Joe Todd <[email protected]>
b3439
*.py: Stylistic adjustments for python (#8233)
* Removed superfluous parens in conditionals.
* Removed unused args in functions.
* Replaced unused `idx` var with `_`.
* Initialized file_format and format_version attributes.
* Renamed constant to capitals.
* Prevented redefinition of the `f` var.
Signed-off-by: Jiri Podivin <[email protected]>
b3432
flake.lock: Update (#8610)
b3414
server: use relative routes for static files in new UI (#8552)
* server: public: fix api_url on non-index pages
* server: public: use relative routes for static files in new UI
b3409
CONTRIBUTING.md : remove mention of noci (#8541)
b3405
make/cmake: add missing force MMQ/cuBLAS for HIP (#8515)
b3389
llama : fix Gemma-2 Query scaling factors (#8473)
* 9B: query_pre_attn_scalar should be 256, not 224 (self.config.hidden_size // self.config.num_attention_heads). See https://github.com/google/gemma_pytorch/commit/03e657582d17cb5a8617ebf333c1c16f3694670e
* llama : fix Gemma-2 Query scaling factor
Co-authored-by: Daniel Han <[email protected]>
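A minimal sketch of what this fix changes, assuming the Gemma-2 9B config values from the upstream gemma_pytorch commit referenced above (hidden_size = 3584, num_attention_heads = 16, query_pre_attn_scalar = 256); the variable names here are illustrative, not llama.cpp internals:

```python
# Gemma-2 9B attention-logit scaling: the buggy value 224 came from
# hidden_size // num_attention_heads; the fix uses the model's
# query_pre_attn_scalar = 256 instead.
hidden_size = 3584
num_attention_heads = 16
query_pre_attn_scalar = 256

wrong_divisor = hidden_size // num_attention_heads  # 224
wrong_scale = wrong_divisor ** -0.5                 # 1/sqrt(224)
fixed_scale = query_pre_attn_scalar ** -0.5         # 1/sqrt(256) == 0.0625

print(wrong_divisor, fixed_scale)
```

The attention scores are multiplied by this scale before softmax, so a ~7% difference in the divisor subtly degrades 9B output quality rather than failing outright.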
b3384
server : handle content array in chat API (#8449)
* server : handle content array in chat API
* Update examples/server/utils.hpp
Co-authored-by: Xuan Son Nguyen <[email protected]>
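For illustration, a request the server's OpenAI-compatible chat endpoint can now accept: a message whose `content` is an array of text parts rather than a plain string. The part schema follows the OpenAI chat format; the model name and exact endpoint behavior are assumptions, not taken from this changelog:

```python
import json

# Message `content` given as an array of {"type": "text", ...} parts,
# as supported after #8449, instead of a single string.
payload = {
    "model": "llama",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Hello, "},
                {"type": "text", "text": "world!"},
            ],
        }
    ],
}
body = json.dumps(payload)
print(body)
```

Before this change, only string-valued `content` was handled, which broke clients that always send the array form.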
b3372
gitignore : deprecated binaries
b3369
Initialize default slot sampling parameters from the global context. …