Working around new int4wo weight packing #1389
Comments
there is no change of the input shape I believe, so the old code should work after you add
This seems to be an error from loading an unquantized model state dict into a quantized model?
@yanbing-j can you make the corresponding changes in torchchat (https://github.com/pytorch/torchchat/blob/main/torchchat/utils/gguf_loader.py#L609C17-L614C18) as well? Also, it would be helpful to add some docs for https://github.com/pytorch/pytorch/blob/7939b5f5f9b073984c26adef1446fa250a20bceb/aten/src/ATen/native/LinearAlgebra.cpp#L3457 and friends so that the input and output dimensions are clear.
I followed https://github.com/pytorch/torchchat/blob/main/.github/workflows/pull.yml#L830-L874 to reproduce this issue.
Thanks @yanbing-j, I'll follow up in the other PR
There is a change in the input type and output shape, I believe?
Given the change in output shape/behavior in pytorch/pytorch#139611 + #1278
Question: What is the recommended way of migrating to the new cpu implementation of
while maintaining the previous behavior?
Specifically _convert_weight_to_int4pack
and _weight_int4pack_mm
Tested: With no code changes
The following error is encountered:
Tested: Naive (Just add *_for_cpu)
Size mismatch was encountered (expected since signatures are different)
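One plausible reading of the size mismatch, offered as an assumption and not confirmed anywhere in this thread: the older `_convert_weight_to_int4pack` path takes weights already packed two int4 values per byte (roughly `[n, k/2]`), while the `*_for_cpu` variant takes one value per element (roughly `[n, k]`). The plain-Python sketch below only illustrates that shape relationship. The helper names and the nibble order are hypothetical, and no PyTorch ops are called.

```python
def pack_int4_pairs(row):
    """Pack a row of int4 values (0..15) into bytes, two values per byte."""
    if len(row) % 2 != 0:
        raise ValueError("row length must be even")
    if any(not 0 <= v <= 15 for v in row):
        raise ValueError("values must fit in 4 bits")
    # Assumed nibble order: even index -> high nibble, odd index -> low nibble.
    return [(row[i] << 4) | row[i + 1] for i in range(0, len(row), 2)]


def unpack_int4_pairs(packed_row):
    """Inverse of pack_int4_pairs: expand each byte back into two int4 values."""
    out = []
    for byte in packed_row:
        out.append((byte >> 4) & 0xF)  # high nibble
        out.append(byte & 0xF)         # low nibble
    return out


# Toy weight matrix with n=2 rows and k=4 columns, one int4 value per element.
weights = [[1, 2, 3, 4], [15, 0, 7, 8]]

# Packed form has shape [n, k/2]; unpacking restores shape [n, k].
packed = [pack_int4_pairs(r) for r in weights]
restored = [unpack_int4_pairs(r) for r in packed]
```

If this assumption holds, code that pre-packs weights for the old op would need to skip (or undo) that step before calling the CPU variant, instead of only renaming the function.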
cc: @yanbing-j @jerryzh168 who worked on the changes