
Commit

prepare for 1.5.4 release (#1026)
Qubitium authored Jan 5, 2025
1 parent d67fd01 commit 55c594b
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
```diff
@@ -9,7 +9,7 @@
 </p>
 
 ## News
-* 01/05/2025 [1.5.4 Patch](https://github.com/ModelCloud/GPTQModel/releases/tag/v1.5.4): Fix regression where `quantize_config` is not properly read from `config.json` if `quantize_config.json` does not exists.
+* 01/05/2025 [1.5.4](https://github.com/ModelCloud/GPTQModel/releases/tag/v1.5.4): 25% faster quantization. Fixed regression where `quantize_config` is not properly read from `config.json`.
 * 01/04/2025 [1.5.3](https://github.com/ModelCloud/GPTQModel/releases/tag/v1.5.3): AMD ROCm (6.2+) support added and validated for 7900XT+ GPU. Auto-tokenizer loader via `load()` api. For most models you no longer need to manually init a tokenizer for both inference and quantization. ~25-30% memory reduction in quantization vs previous release.
 * 01/01/2025 [1.5.1](https://github.com/ModelCloud/GPTQModel/releases/tag/v1.5.1): 🎉 2025! Added `QuantizeConfig.device` to clearly define which device is used for quantization: default = `auto`. Non-quantized models are always loaded on cpu by-default and each layer is moved to `QuantizeConfig.device` during quantization to minimize vram usage. Compatibility fixes for `attn_implementation_autoset` in latest transformers.
 * 12/23/2024 [1.5.0](https://github.com/ModelCloud/GPTQModel/releases/tag/v1.5.0): Multi-modal (image-to-text) optimized quantization support has been added for Qwen 2-VL and Ovis 1.6-VL. Previous image-to-text model quantizations did not use image calibration data, resulting in less than optimal post-quantization results. Version 1.5.0 is the first release to provide a stable path for multi-modal quantization: only text layers are quantized.
```
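The regression fixed here concerns the config-loading fallback: quantization settings should be read from the standalone `quantize_config.json` when present, otherwise from the `quantization_config` section embedded in a Hugging Face-style `config.json`. The following is a minimal sketch of that fallback pattern, not GPTQModel's actual loader; the function name `load_quantize_config` is hypothetical.

```python
# Hypothetical sketch of the config-fallback pattern the 1.5.4 patch restores:
# prefer the standalone quantize_config.json, and only fall back to the
# "quantization_config" section of config.json when that file is absent.
import json
from pathlib import Path

def load_quantize_config(model_dir: str) -> dict:
    standalone = Path(model_dir) / "quantize_config.json"
    if standalone.exists():
        return json.loads(standalone.read_text())
    # Fallback path (the one the regression broke): read the settings
    # embedded in the HF-style config.json.
    config = json.loads((Path(model_dir) / "config.json").read_text())
    return config.get("quantization_config", {})
```

With only a `config.json` containing `"quantization_config": {"bits": 4}` in the directory, the fallback branch returns those settings; once a `quantize_config.json` is written, it takes precedence.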
