GPTQModel v1.2.3
Stable release with all feature and model unit tests passing. Many model unit tests that failed, or passed incorrectly, in previous releases are now fixed.
HF GLM support added. GLM/ChatGLM has two distinct code forks: the original one that is not HF-integrated, and the latest one that is integrated into transformers. HF GLM and non-HF GLM are not weight compatible, and we support both variants.
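To make the distinction concrete, here is a minimal loading sketch. The model ids are illustrative examples, and the `from_pretrained`/`QuantizeConfig` call shape is an assumption based on the project README rather than a verified v1.2.3 API:

```python
from gptqmodel import GPTQModel, QuantizeConfig

quant_config = QuantizeConfig(bits=4, group_size=128)

# HF-integrated GLM: modeling code lives inside transformers itself.
# (Example model id, not taken from this release.)
hf_glm = GPTQModel.from_pretrained("THUDM/glm-4-9b-chat-hf", quant_config)

# Non-HF ChatGLM fork: modeling code ships with the checkpoint and needs
# trust_remote_code=True. Its weights are NOT interchangeable with the
# HF-integrated variant above, even though both variants are supported.
chatglm = GPTQModel.from_pretrained(
    "THUDM/chatglm3-6b", quant_config, trust_remote_code=True
)
```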
What's Changed
- Add GLM (HF-ied) support by @Qubitium in #581
- Unit tests: add USE_VLLM arg by @ZYC-ModelCloud in #582
- Quantize record info by @ZYC-ModelCloud in #583
- [MISC] add gptqmodel[eval] and remove sentencepiece by @PZS-ModelCloud in #602
- [MISC] requirements remove gekko, ninja, huggingface-hub, protobuf by @PZS-ModelCloud in #603
- Release GPU VRAM after layer.fwd by @LRL-ModelCloud in #616
- Delete unsupported model & skip gptneox by @CSY-ModelCloud in #617
- [FIX] Some models put hidden_states in kwargs instead of args. by @ZX-ModelCloud in #621
- lm_eval vLLM task: add max_model_len=4096 arg by @LRL-ModelCloud in #625 (see the sketch after this list)
- try/catch should only apply to lm_eval by @CSY-ModelCloud in #628
- set USE_VLLM = False by @LRL-ModelCloud in #629
- [FIX] Do not monkey_patch forward when loading a quantized model by @LRL-ModelCloud in #638
- Simplify ModelLoader/ModelWriter functions by @ZYC-ModelCloud in #637
- disable chat for test_mpt by @CSY-ModelCloud in #641
- Update unit_tests.yml by @Qubitium in #642
- Fix wrong tokenized[0] value when reading from BatchEncoding by @CSY-ModelCloud in #643
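For context on the vLLM-backed lm_eval runs touched by #582, #625, and #629, the sketch below shows what passing max_model_len=4096 through lm-evaluation-harness looks like. This illustrates the public lm_eval API, not the project's internal test harness; the checkpoint path is a placeholder:

```python
import lm_eval

# "pretrained" should point at a quantized checkpoint (placeholder path).
# max_model_len=4096 caps vLLM's context length, matching the default
# added in #625.
results = lm_eval.simple_evaluate(
    model="vllm",
    model_args="pretrained=/path/to/quantized-model,max_model_len=4096",
    tasks=["arc_challenge"],
)
print(results["results"])
```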
New Contributors
- @jiqing-feng made their first contribution in #527
Full Changelog: v1.2.1...v1.2.3