Phi-3 #30423
Conversation
I bolded some questions that I had from the previous review. Could you please check them, @ArthurZucker? Done:
Review:
I think the documentation test failures are fine; they are only happening because the tests ran when phi-3-4k-instruct didn't exist yet. Waiting on approval for the pull request.
Hey! As I have mentioned offline, I think that converting to the Llama format would allow anyone to use your model without any release (for any other frameworks), and we can add the new scalings (SuScaledRotaryEmbedding, YarnScaledRotaryEmbedding) separately! This would unlock the entire community! 🤗 EDIT: this seems impossible, as other frameworks are adapting to this format instead. We'll support the fused weights. Let's make the weights standardized for the good of the entire community! 🚀 🔥
As mentioned in Slack, another team already sent a PR with the previous
But that should not impact the weights, no?
No, let me see what I can do!
Updated the rotary embedding classes. Fingers crossed that all tests will pass!
Overall LGTM. Let's pay attention to RoPE and make it as simple and as close as possible to what our users are used to!
So no chance we could get a Llama-converted version?
I think RoPE should be clearer now and closer to what transformers proposes. What do you think? For now, I would like to keep short_factor and long_factor in config.json because I am still waiting for some replies on how they were created.
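For context, here is a minimal sketch of how per-dimension factors like these could rescale the standard RoPE inverse frequencies. The attribute names (rope_scaling, rope_theta, original_max_position_embeddings) and the assumption that each factor list has dim // 2 entries are illustrative, not a description of the final merged code:

```python
import torch

def scaled_inv_freq(config, seq_len: int, dim: int) -> torch.Tensor:
    # Pick the per-dimension factors: short_factor within the original
    # context window, long_factor beyond it (assumed config layout).
    if seq_len > config.original_max_position_embeddings:
        factors = torch.tensor(config.rope_scaling["long_factor"], dtype=torch.float32)
    else:
        factors = torch.tensor(config.rope_scaling["short_factor"], dtype=torch.float32)
    # Standard RoPE inverse frequencies, rescaled elementwise by the
    # factors (each list is assumed to hold dim // 2 entries).
    exponents = torch.arange(0, dim, 2, dtype=torch.float32) / dim
    return 1.0 / (factors * config.rope_theta**exponents)
```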
🔥 Very clean. I left 2 small nits about using the config instead of passing kwargs for RoPE, and no one-liners. That's it!
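To illustrate the first nit, a hypothetical sketch (not the exact code from the PR) of a rotary embedding that reads its parameters off the config object rather than accepting them as individual keyword arguments:

```python
import torch

class Phi3RotaryEmbedding(torch.nn.Module):
    # Hypothetical sketch: derive every RoPE parameter from the config
    # instead of threading them through as separate kwargs.
    def __init__(self, config):
        super().__init__()
        self.dim = config.hidden_size // config.num_attention_heads
        self.base = config.rope_theta
        self.max_position_embeddings = config.max_position_embeddings
```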
Solving the conflicts: accept all changes from main, ignore current.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Hopefully I was able to address everything! But please let me know if anything else is needed.
🚀 great work
Hi, a CI run has been triggered: https://github.com/huggingface/transformers/actions/runs/8818618312
Sorry about the mistake in the workflow file.
Well, =========== 100 passed, 47 skipped, 26 warnings in 391.83s (0:06:31) =========== 💯 I ❤️ to see this! Thank you for the great contribution 🚀
CircleCI seems to be acting up running my branch #30455. If it is not green, it's fine. The last commit: https://app.circleci.com/pipelines/github/huggingface/transformers?branch=pull%2F30423
Merging! 👍🏻
Thanks for everything, folks!
Thanks for yet another great contribution to the ecosystem!
Hi @gugarosa (I believe). The commit on the Hub repo https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/commit/3c0c9df9c11252fb61789d7847fa7d03f2825596 failed some tests; see this job: https://github.com/huggingface/transformers/actions/runs/8842226344/job/24280809384. Could you take a look and open a PR with a fix, please? You can run things like `RUN_SLOW=1 python3 -m pytest -rs -v tests/models/phi3/test_modeling_phi3.py::Phi3IntegrationTest::test_phi3_mini_4k_instruct_generation`. The same goes for the example in the file docs/source/en/model_doc/phi3.md.
And the doc example for the class gives `This is an example script .`, which is just the input text itself. Would be great if you could check this too, thanks.
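For anyone trying to reproduce this, a sketch along the lines of the doc example under discussion (the exact prompt and generation arguments in phi3.md may differ):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("This is an example script .", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
# If the decoded output equals the input text, the model echoed the
# prompt without generating any new tokens: the symptom described above.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```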
Hi @gugarosa, it would be very nice if you could take a look 🤗.
* chore(root): Initial commit of Phi-3 files.
* fix(root): Fixes Phi-3 missing on readme.
* fix(root): Ensures files are consistent.
* fix(phi3): Fixes unit tests.
* fix(tests): Fixes style of phi-3 test file.
* chore(tests): Adds integration tests for Phi-3.
* fix(phi3): Removes additional flash-attention usage, e.g., swiglu and rmsnorm.
* fix(phi3): Fixes incorrect docstrings.
* fix(phi3): Fixes docstring typos.
* fix(phi3): Adds support for Su and Yarn embeddings.
* fix(phi3): Improves according to first batch of reviews.
* fix(phi3): Uses up_states instead of y in Phi3MLP.
* fix(phi3): Uses gemma rotary embedding to support torch.compile.
* fix(phi3): Improves how rotary embedding classes are defined.
* fix(phi3): Fixes inv_freq not being re-computed for extended RoPE.
* fix(phi3): Adds last suggestions to modeling file.
* fix(phi3): Splits inv_freq calculation in two lines.
What does this PR do?
Integrates Phi-3 within `transformers`.
Before submitting
* Did you read the contributor guideline, Pull Request section?
* Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
* Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.