v1.15

@oobabooga oobabooga released this 01 Oct 17:48
· 58 commits to main since this release
3b06cb4

Backend updates

  • Transformers: bump to 4.45.
  • ExLlamaV2: bump to 0.2.3.
  • flash-attention: bump to 2.6.3.
  • llama-cpp-python: bump to 0.3.1.
  • bitsandbytes: bump to 0.44.
  • PyTorch: bump to 2.4.1.
  • ROCm: bump wheels to 6.1.2.
  • Remove AutoAWQ, AutoGPTQ, HQQ, and AQLM from requirements.txt:
    • AutoAWQ and AutoGPTQ were removed due to lack of support for PyTorch 2.4.1 and CUDA 12.1.
    • HQQ and AQLM were removed to make the project leaner since they're experimental with limited use.
    • You can still install those libraries manually if you are interested.

Changes

  • Exclude Top Choices (XTC): A sampler that boosts creativity, breaks writing clichés, and inhibits non-verbatim repetition (#6335). Thanks @p-e-w.
  • Make it possible to sort repetition penalties with "Sampler priority". The new keywords are:
    • repetition_penalty
    • presence_penalty
    • frequency_penalty
    • dry
    • encoder_repetition_penalty
    • no_repeat_ngram
    • xtc (not a repetition penalty but also added in this update)
  • Don't import PEFT unless necessary. This makes the web UI launch faster.
  • Add beforeunload event to add confirmation dialog when leaving page (#6279). Thanks @leszekhanusz.
  • Update API documentation with examples on how to list/load models (#5902). Thanks @joachimchauvet.
  • Training PRO: update script.py (#6359). Thanks @FartyPants.
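
For intuition, the XTC sampler above can be sketched as follows. This is a minimal illustration of the idea described in #6335 (not the project's actual implementation): with some probability per step, every token whose probability meets a threshold is excluded except the least likely of them, steering generation away from the most predictable choices. The function name, parameter names, and defaults here are illustrative assumptions.

```python
import random

def xtc_filter(probs, threshold=0.1, probability=0.5, rng=None):
    """Sketch of Exclude Top Choices (XTC) sampling.

    probs: list of (token, prob) pairs sorted by prob, descending.
    With chance `probability`, remove every token whose probability
    is >= threshold EXCEPT the least likely of them.
    """
    rng = rng or random.Random()
    if rng.random() >= probability:
        return probs  # sampler not triggered on this step
    above = [i for i, (_, p) in enumerate(probs) if p >= threshold]
    if len(above) < 2:
        return probs  # need at least two viable candidates to exclude any
    removed = set(above[:-1])  # keep only the least probable "top choice"
    return [tp for i, tp in enumerate(probs) if i not in removed]
```

Because the most obvious continuations are removed only when several candidates clear the threshold, coherence is preserved when the model is confident in a single token.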

Bug fixes

  • Fix UnicodeDecodeError for BPE-based Models (especially GLM-4) (#6357). Thanks @GralchemOz.
  • API: Relax multimodal format, fixes HuggingFace Chat UI (#6353). Thanks @Papierkorb.
  • Force /bin/bash shell for conda (#6386). Thanks @Thireus.
  • Do not set value for histories in chat when --multi-user is used (#6317). Thanks @mashb1t.
  • Fix typo in OpenAI response format (#6365). Thanks @jsboige.