This repository has been archived by the owner on Aug 30, 2024. It is now read-only.

[GPTQ Enhence] Support GPTQ & AWQ inference for QWENv1, v1.5 and Mixtral. #134

Merged
25 commits merged on Mar 6, 2024

@Zhenzhong1 Zhenzhong1 commented Feb 21, 2024

Type of Change

  • Feature added
  • Bug fixed
  • Feature enhancement

Related PR: #140

Description

Feature Added

Bug fixed:
Fixed issues in convert_quantized_mistral.py.

Feature enhancement:
Added a parameter count and printout for the NE format.

Expected Behavior & Potential Risk

N/A

How has this PR been tested?

Qwen:
image

Mixtral:
image

Dependency Change?

N/A

@Zhenzhong1 Zhenzhong1 changed the title add GPTQ Qwen [GPTQ Enhence] Support QWEN-GPTQ Mar 4, 2024
@Zhenzhong1 Zhenzhong1 changed the title [GPTQ Enhence] Support QWEN-GPTQ [GPTQ Enhence] Support QWEN and Mixtral GPTQ inference Mar 5, 2024
@Zhenzhong1 Zhenzhong1 marked this pull request as ready for review March 5, 2024 07:10
@Zhenzhong1 Zhenzhong1 changed the title [GPTQ Enhence] Support QWEN and Mixtral GPTQ inference [GPTQ Enhence] Support QWEN and Mixtral GPTQ & AWQ inference Mar 5, 2024
@Zhenzhong1 Zhenzhong1 changed the title [GPTQ Enhence] Support QWEN and Mixtral GPTQ & AWQ inference [GPTQ Enhence] Support QWENv1 & v2 and Mixtral GPTQ & AWQ inference Mar 6, 2024
@Zhenzhong1 Zhenzhong1 changed the title [GPTQ Enhence] Support QWENv1 & v2 and Mixtral GPTQ & AWQ inference [GPTQ Enhence] Support QWENv1, v2 and Mixtral GPTQ & AWQ inference Mar 6, 2024
@Zhenzhong1 Zhenzhong1 changed the title [GPTQ Enhence] Support QWENv1, v2 and Mixtral GPTQ & AWQ inference [GPTQ Enhence] Support GPTQ & AWQ inference for QWENv1, v2 and Mixtral. Mar 6, 2024

@a32543254 a32543254 left a comment

LGTM

@a32543254 a32543254 changed the title [GPTQ Enhence] Support GPTQ & AWQ inference for QWENv1, v2 and Mixtral. [GPTQ Enhence] Support GPTQ & AWQ inference for QWENv1, v1.5 and Mixtral. Mar 6, 2024
@VincyZhang VincyZhang merged commit a129213 into main Mar 6, 2024
11 checks passed