AQLM quantizer support #28928
You can move this method and make it public under `integrations/aqlm.py`, and import the method locally inside `_process_model_before_weight_loading`.
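For context, a minimal sketch of what that refactor could look like. The helper name `replace_with_aqlm_linear` and the `QuantizedLinear` keyword arguments are assumptions for illustration, not the PR's exact diff:

```python
# integrations/aqlm.py — hypothetical sketch of the public helper
import torch.nn as nn


def replace_with_aqlm_linear(
    model,
    quantization_config=None,
    linear_weights_not_to_quantize=None,
    current_key_name=None,
):
    # aqlm is an optional dependency, so import it lazily.
    from aqlm import QuantizedLinear  # assumed API of the aqlm package

    if linear_weights_not_to_quantize is None:
        linear_weights_not_to_quantize = []
    if current_key_name is None:
        current_key_name = []

    for name, module in model.named_children():
        current_key_name.append(name)
        if isinstance(module, nn.Linear):
            # Skip layers whose full weight name is excluded in the config.
            if ".".join(current_key_name) + ".weight" not in linear_weights_not_to_quantize:
                model._modules[name] = QuantizedLinear(
                    module.in_features,
                    module.out_features,
                    bias=module.bias is not None,
                    in_group_size=quantization_config.in_group_size,
                    out_group_size=quantization_config.out_group_size,
                    num_codebooks=quantization_config.num_codebooks,
                    nbits_per_codebook=quantization_config.nbits_per_codebook,
                )
        elif len(list(module.children())) > 0:
            # Recurse into composite modules.
            replace_with_aqlm_linear(
                module,
                quantization_config=quantization_config,
                linear_weights_not_to_quantize=linear_weights_not_to_quantize,
                current_key_name=current_key_name,
            )
        current_key_name.pop()
    return model


# quantizer_aqlm.py — the quantizer then imports the helper locally,
# as suggested above
def _process_model_before_weight_loading(self, model, **kwargs):
    from ..integrations import replace_with_aqlm_linear

    replace_with_aqlm_linear(
        model,
        quantization_config=self.quantization_config,
        linear_weights_not_to_quantize=self.quantization_config.linear_weights_not_to_quantize,
    )
```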
Done
I saw in the config of the model you pushed on the Hub that you also included layer norm weights inside `linear_weights_not_to_quantize`. I think these can be excluded from the config, as they are not an instance of `nn.Linear`, right?
They certainly can be excluded. It's just that, when converting from a freshly quantized AQLM checkpoint, it would be troublesome to check whether an unquantized `.weight` parameter belongs to an `nn.Linear` module or not, so I simply included all of them just in case. That Mixtral config can, indeed, be made somewhat shorter.