Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix SuScaledRotaryEmbedding for Phi-3 mini 128k #155

Closed

Conversation

DePasqualeOrg
Copy link
Contributor

I've verified that this works with Phi-3 mini 128k 4-bit and Phi-3.5 mini 4-bit. I think this should fix #92 and #127.

Comment on lines +35 to +44
if longFactorArray.size == 1 {
// If longFactor is a single value, broadcast it
self._freqs = MLXArray(longFactor[0]) * freqs
} else {
// If longFactor is an array, it should match the frequency dimensions
precondition(
longFactorArray.size == freqs.size,
"longFactor size must match frequency dimensions")
self._freqs = longFactorArray * freqs
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why this condition is necessary. Once you have the longFactorArray it will broadcast correctly if it has size one and if it has the wrong size it will raise an error. So you could replace it like so:

Suggested change
if longFactorArray.size == 1 {
// If longFactor is a single value, broadcast it
self._freqs = MLXArray(longFactor[0]) * freqs
} else {
// If longFactor is an array, it should match the frequency dimensions
precondition(
longFactorArray.size == freqs.size,
"longFactor size must match frequency dimensions")
self._freqs = longFactorArray * freqs
}
self._freqs = longFactorArray * freqs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, it looks like no change is necessary in SuScaledRotaryEmbedding. I'm confused, because I think I was able to reproduce the error in #92, which now no longer seems to occur.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I didn't notice that was how it was originally. I think you fixed #92 in #107 :) and this PR can be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Huggingface mlx-community/Phi-3-mini-128k-instruct-4bit model cannot be loaded
2 participants