change use_optimum_format=True and add bias #1431
Conversation
Signed-off-by: Xin He <[email protected]>
Shall we specify the format, e.g., use_gptq_format? "HF format" sounds too general - what about the AWQ and GGUF formats? People also upload those to HF.
It's actually general: we can generate RTN and AWQ models in this format. GGUF is a different format, which we don't support yet.
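For context, a minimal sketch of how one format flag can cover multiple quantization algorithms in an export path. The function name `export_quantized_model`, its signature, and the `pack` method are illustrative assumptions, not the project's actual API:

```python
import torch.nn as nn

def export_quantized_model(model: nn.Module, use_optimum_format: bool = True) -> nn.Module:
    """Hypothetical sketch: repack quantized linear layers into the
    Optimum-compatible layout. RTN- and AWQ-quantized models can both be
    emitted in this single layout, which is why the flag is named after
    the format rather than after any one algorithm."""
    for module in model.modules():
        # `pack` is an assumed method exposed by quantized linear modules.
        if hasattr(module, "pack"):
            module.pack(use_optimum_format=use_optimum_format)
    return model
```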
/azp run Code-Scan
Azure Pipelines successfully started running 1 pipeline(s).
Type of Change
bug fix
Description
Optimum sets bias=True for QuantLinear when packing a model. For compatibility with this Hugging Face/Optimum format, we follow the same design and set use_optimum_format=True by default. The argument is also renamed from use_hf_format to use_optimum_format.
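To make the compatibility point concrete, here is a minimal sketch of the bias handling described above. `QuantLinearSketch` and its buffer layout are illustrative assumptions, not the actual QuantLinear implementation:

```python
import torch
import torch.nn as nn

class QuantLinearSketch(nn.Module):
    """Illustrative stand-in for an Optimum-style QuantLinear.

    Optimum packs models with bias=True, so the bias buffer is always
    allocated; a layer that originally had no bias simply carries zeros.
    The qweight shape and dtype below are simplified for illustration.
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.register_buffer(
            "qweight", torch.zeros(in_features, out_features, dtype=torch.int32)
        )
        # Always allocate the bias buffer for format compatibility,
        # even when the source layer has no bias.
        self.register_buffer("bias", torch.zeros(out_features, dtype=torch.float32))

    def pack(self, linear: nn.Linear) -> None:
        # Copy the float bias if the source layer has one; otherwise the
        # zero-initialized buffer stands in for the missing bias.
        if linear.bias is not None:
            self.bias.copy_(linear.bias.detach())
```

A loader that expects the Optimum layout can then always read a bias tensor, regardless of whether the original module was configured with one.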
Expected Behavior & Potential Risk
Unit tests pass.