Phi-3 #30423
Conversation
I bolded some questions that I had from the previous review. Could you please check them, @ArthurZucker? Done:
Review:
I think the documentation test failures are fine; they are only happening because the tests ran when phi-3-4k-instruct didn't exist yet. Waiting on approval for the pull request.
Hey! As I have mentioned offline, I think that converting to the Llama format would allow anyone to use your model without any release (for any other frameworks), and we can add the new scalings (SuScaledRotaryEmbedding, YarnScaledRotaryEmbedding) separately! This would unlock the entire community! 🤗 EDIT: this seems impossible, as other frameworks are adapting to this format instead. We'll support the fused weights. Let's make the weights standardized for the good of the entire community! 🚀 🔥
As mentioned in Slack, another team already sent a PR with the previous
But that should not impact the weights, no?
No, let me see what I can do!
Updated the rotary embedding classes. Fingers crossed that all tests will pass!
Overall LGTM. Let's pay attention to RoPE and make it as simple and as close as possible to what our users are used to!
So no chance we could get a Llama-converted version?
I think RoPE should be clearer now and closer to what transformers proposes. What do you think? For now, I would like to keep short_factor and long_factor in config.json because I am still waiting for some replies on how they were created.
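For context, here is a minimal sketch of how per-dimension factors like these could rescale the standard RoPE inverse frequencies. The attribute names (rope_scaling, rope_theta, original_max_position_embeddings) and the assumption that each factor list has dim // 2 entries are illustrative, not a description of the final merged code:

```python
import torch

def scaled_inv_freq(config, seq_len: int, dim: int) -> torch.Tensor:
    # Pick the per-dimension factors: short_factor within the original
    # context window, long_factor beyond it (assumed config layout).
    if seq_len > config.original_max_position_embeddings:
        factors = torch.tensor(config.rope_scaling["long_factor"], dtype=torch.float32)
    else:
        factors = torch.tensor(config.rope_scaling["short_factor"], dtype=torch.float32)
    # Standard RoPE inverse frequencies, rescaled elementwise by the
    # factors (each list is assumed to hold dim // 2 entries).
    exponents = torch.arange(0, dim, 2, dtype=torch.float32) / dim
    return 1.0 / (factors * config.rope_theta**exponents)
```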
🔥 Very clean. I left 2 small nits about using the config instead of passing kwargs for RoPE, and no one-liners. That's it!
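To illustrate the first nit, a hypothetical sketch (not the exact code from the PR) of a rotary embedding that reads its parameters off the config object rather than accepting them as individual keyword arguments:

```python
import torch

class Phi3RotaryEmbedding(torch.nn.Module):
    # Hypothetical sketch: derive every RoPE parameter from the config
    # instead of threading them through as separate kwargs.
    def __init__(self, config):
        super().__init__()
        self.dim = config.hidden_size // config.num_attention_heads
        self.base = config.rope_theta
        self.max_position_embeddings = config.max_position_embeddings
```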
Solving the conflicts: accept all changes from main, ignore current.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Hopefully I was able to address everything! But please let me know if anything else is needed.
🚀 great work
Hi, a CI run has been triggered: https://github.com/huggingface/transformers/actions/runs/8818618312
Sorry about the mistake in the workflow file.
Well, =========== 100 passed, 47 skipped, 26 warnings in 391.83s (0:06:31) =========== 💯 I ❤️ to see this! Thank you for the great contribution 🚀
CircleCI seems to be acting up running my branch #30455. If it is not green, it's fine. The last commit: https://app.circleci.com/pipelines/github/huggingface/transformers?branch=pull%2F30423
Merging! 👍🏻
Thanks for everything, folks!
Thanks for yet another great contribution to the ecosystem!
Hi @gugarosa (I believe). The commit on the Hub repo https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/commit/3c0c9df9c11252fb61789d7847fa7d03f2825596 failed some tests; see this job: https://github.com/huggingface/transformers/actions/runs/8842226344/job/24280809384. Could you take a look and open a PR with a fix, please? You can run things like `RUN_SLOW=1 python3 -m pytest -rs -v tests/models/phi3/test_modeling_phi3.py::Phi3IntegrationTest::test_phi3_mini_4k_instruct_generation`. The same goes for the example in the file docs/source/en/model_doc/phi3.md.
And the doc example for the class gives `This is an example script .`, which is just the input text itself. Would be great if you could check this too, thanks.
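For anyone trying to reproduce this, a sketch along the lines of the doc example under discussion (the exact prompt and generation arguments in phi3.md may differ):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("This is an example script .", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
# If the decoded output equals the input text, the model echoed the
# prompt without generating any new tokens: the symptom described above.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```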
Hi @gugarosa, it would be very nice if you could take a look 🤗.
* chore(root): Initial commit of Phi-3 files.
* fix(root): Fixes Phi-3 missing on readme.
* fix(root): Ensures files are consistent.
* fix(phi3): Fixes unit tests.
* fix(tests): Fixes style of phi-3 test file.
* chore(tests): Adds integration tests for Phi-3.
* fix(phi3): Removes additional flash-attention usage, e.g., swiglu and rmsnorm.
* fix(phi3): Fixes incorrect docstrings.
* fix(phi3): Fixes docstring typos.
* fix(phi3): Adds support for Su and Yarn embeddings.
* fix(phi3): Improves according to first batch of reviews.
* fix(phi3): Uses up_states instead of y in Phi3MLP.
* fix(phi3): Uses gemma rotary embedding to support torch.compile.
* fix(phi3): Improves how rotary embedding classes are defined.
* fix(phi3): Fixes inv_freq not being re-computed for extended RoPE.
* fix(phi3): Adds last suggestions to modeling file.
* fix(phi3): Splits inv_freq calculation in two lines.
What does this PR do?
Integrates Phi-3 within `transformers`.
Before submitting
* Did you read the contributor guideline, Pull Request section?
* Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
* Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.