[GPTQ Enhence] Support GPTQ & AWQ inference for QWENv1, v1.5 and Mixtral. #134

Zhenzhong1 · 2024-02-21T03:18:18Z

Type of Change

Feature added
Bug fixed
Feature enhancement

Related PR: #140

Description

Feature Added

Bug fixed:
Fixed issues in convert_quantized_mistral.py.

Feature enhancement:
Added parameter counting and printing for the NE format.

Expected Behavior & Potential Risk

N/A

How has this PR been tested?

Qwen:

Mixtral

Dependency Change?

N/A

a32543254

LGTM

Zhenzhong1 and others added 13 commits February 20, 2024 19:17

add GPTQ Qwen

d8a989e

[pre-commit.ci] auto fixes from pre-commit.com hooks

7ac5689

for more information, see https://pre-commit.ci

debugging

2c31fb6

[pre-commit.ci] auto fixes from pre-commit.com hooks

3d61ea5

for more information, see https://pre-commit.ci

qwen fp32 inference ok

904fb91

[pre-commit.ci] auto fixes from pre-commit.com hooks

b374934

for more information, see https://pre-commit.ci

fixed the QWEN-GPTQ bfloat16 issue

af228cc

[pre-commit.ci] auto fixes from pre-commit.com hooks

83f96bd

for more information, see https://pre-commit.ci

Merge branch 'main' into zhenzhong/QwenQuantize

e38116f

qwen fp32 inference pass

8be219a

[pre-commit.ci] auto fixes from pre-commit.com hooks

edaadb7

for more information, see https://pre-commit.ci

qwen GPTQ inference pass

e1af367

[pre-commit.ci] auto fixes from pre-commit.com hooks

c240c64

for more information, see https://pre-commit.ci

Zhenzhong1 changed the title ~~add GPTQ Qwen~~ [GPTQ Enhence] Support QWEN-GPTQ Mar 4, 2024

Zhenzhong1 added 3 commits March 4, 2024 01:24

yapf

04f1973

Merge branch 'main' into zhenzhong/QwenQuantize

e7ad670

# Conflicts: # neural_speed/models/qwen/qwen_utils.cpp

mixtral GPTQ done & fixed mistral issues

cf82e89

Zhenzhong1 changed the title ~~[GPTQ Enhence] Support QWEN-GPTQ~~ [GPTQ Enhence] Support QWEN and Mixtral GPTQ inference Mar 5, 2024

cleancode

a4c5641

Zhenzhong1 marked this pull request as ready for review March 5, 2024 07:10

Zhenzhong1 requested review from a32543254, intellinjun and zhenwei-intel March 5, 2024 07:10

Zhenzhong1 changed the title ~~[GPTQ Enhence] Support QWEN and Mixtral GPTQ inference~~ [GPTQ Enhence] Support QWEN and Mixtral GPTQ & AWQ inference Mar 5, 2024

Zhenzhong1 added 4 commits March 5, 2024 00:04

AWQ pass

b4ad961

clang-format

3fa8f7d

update doc

26a04d4

QWEN2-GPTQ pass

6707336

zhenwei-intel approved these changes Mar 6, 2024

intellinjun approved these changes Mar 6, 2024

update docs

457fdc5

Zhenzhong1 changed the title ~~[GPTQ Enhence] Support QWEN and Mixtral GPTQ & AWQ inference~~ [GPTQ Enhence] Support QWENv1 & v2 and Mixtral GPTQ & AWQ inference Mar 6, 2024

Zhenzhong1 changed the title ~~[GPTQ Enhence] Support QWENv1 & v2 and Mixtral GPTQ & AWQ inference~~ [GPTQ Enhence] Support QWENv1, v2 and Mixtral GPTQ & AWQ inference Mar 6, 2024

Zhenzhong1 changed the title ~~[GPTQ Enhence] Support QWENv1, v2 and Mixtral GPTQ & AWQ inference~~ [GPTQ Enhence] Support GPTQ & AWQ inference for QWENv1, v2 and Mixtral. Mar 6, 2024

Zhenzhong1 added the ready to merge label Mar 6, 2024

Zhenzhong1 added 3 commits March 5, 2024 23:02

removed qwen2 scripts & clang-format

98b5358

fixed qwen convert issues.

3d9a9bd

fixed qwen convert issues.

b817c56

a32543254 approved these changes Mar 6, 2024

a32543254 changed the title ~~[GPTQ Enhence] Support GPTQ & AWQ inference for QWENv1, v2 and Mixtral.~~ [GPTQ Enhence] Support GPTQ & AWQ inference for QWENv1, v1.5 and Mixtral. Mar 6, 2024

VincyZhang merged commit a129213 into main Mar 6, 2024
11 checks passed
