MUSA: support ARM64 and enable dp4a .etc #11843
base: master
…pp into bodhi/smoe+musa-ups
Hi @JohannesGaessler, @ggerganov, @slaren, @yeahdongcn, could you please help review this PR? Thanks a lot.
The changes to the CUDA backend look fine to me other than the things I commented on. I don't know whether the changes for model support are correct.
Co-authored-by: Johannes Gäßler <[email protected]>
Please run the functionality tests and the tests under the |
Hi @JohannesGaessler, the changes to model support are to enable the |
Hi @yeahdongcn, I see #11822 has been merged. When running
FYI, the above |
Hi @yeahdongcn, the model running issue has been fixed on x86,
Hi @slaren , the |
This PR covers:
- `dp4a` on MUSA;
- `expert_weights_scale` for MoE sparsified LLaMA models.

Tested with the following models:
ARM64:
x86:
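
For readers unfamiliar with the `dp4a` operation named in the description: it is a 4-way dot product of packed signed 8-bit integers, accumulated into a 32-bit integer. CUDA exposes it as the `__dp4a` intrinsic. How the MUSA toolchain exposes it is not stated in this thread, so the snippet below is only a plain C++ reference emulation of the arithmetic, not the code from this PR. The names `dp4a_ref` and `pack_i8x4` are hypothetical helpers introduced for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: pack four signed 8-bit values into one 32-bit word,
// with x0 in the lowest byte.
static inline int32_t pack_i8x4(int8_t x0, int8_t x1, int8_t x2, int8_t x3) {
    const uint32_t u =
        (static_cast<uint32_t>(static_cast<uint8_t>(x0)))       |
        (static_cast<uint32_t>(static_cast<uint8_t>(x1)) << 8)  |
        (static_cast<uint32_t>(static_cast<uint8_t>(x2)) << 16) |
        (static_cast<uint32_t>(static_cast<uint8_t>(x3)) << 24);
    return static_cast<int32_t>(u);
}

// Hypothetical CPU reference for a signed dp4a:
// returns c + sum over the four byte lanes of a[i] * b[i].
static inline int32_t dp4a_ref(int32_t a, int32_t b, int32_t c) {
    const uint32_t ua = static_cast<uint32_t>(a);
    const uint32_t ub = static_cast<uint32_t>(b);
    for (int i = 0; i < 4; ++i) {
        // Extract lane i and reinterpret it as a signed 8-bit value.
        const int8_t ai = static_cast<int8_t>(static_cast<uint8_t>((ua >> (8 * i)) & 0xFFu));
        const int8_t bi = static_cast<int8_t>(static_cast<uint8_t>((ub >> (8 * i)) & 0xFFu));
        c += static_cast<int32_t>(ai) * static_cast<int32_t>(bi);
    }
    return c;
}
```

A reference like this can be useful for checking a hardware path against a scalar fallback: for example, `dp4a_ref(pack_i8x4(1, 2, 3, 4), pack_i8x4(5, 6, 7, 8), 10)` computes 1*5 + 2*6 + 3*7 + 4*8 + 10.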