v1.10
Library updates
- llama-cpp-python: bump to 0.2.82.
- ExLlamaV2: bump to 0.1.7 (adds Gemma-2 support).
Changes
- Add new `--no_xformers` and `--no_sdpa` flags for ExLlamaV2.
- Note: to use Gemma-2 with ExLlamaV2, you currently must use the `--no_flash_attn --no_xformers --no_sdpa` flags, or check the corresponding checkboxes in the UI before loading the model; otherwise it will perform very badly.
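As a sketch, a Gemma-2 launch with those flags might look like the following. The `server.py` entry point is the project's usual launcher; the model directory name is a hypothetical placeholder, and only the three `--no_*` flags come from this release.

```shell
# Hypothetical example: load a Gemma-2 EXL2 quant with ExLlamaV2.
# The model name below is a placeholder; substitute your own download.
# The three --no_* flags disable flash-attn, xformers, and SDPA, which
# this release notes are currently required for correct Gemma-2 output.
python server.py \
  --model gemma-2-9b-it-exl2 \
  --loader ExLlamaV2 \
  --no_flash_attn --no_xformers --no_sdpa
```

The same effect can be had by ticking the corresponding checkboxes on the Model tab before loading.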
- Minor UI updates.