
Fix MistralIntegrationTest #31231

Merged
@ydshieh merged 4 commits into main from fix_mistral_int on Jun 4, 2024

Conversation

@ydshieh (Collaborator) commented on Jun 4, 2024

What does this PR do?

Also, this cleanup block:

        del model
        backend_empty_cache(torch_device)
        gc.collect()

is not helping and, worse, with it we actually get a GPU OOM (here for test_model_7b_long_prompt_sdpa). After removing it, all the tests pass now:

111 passed, 48 skipped, 276 warnings in 205.46s (0:03:25)
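For context, the backend_empty_cache(torch_device) helper used in that block comes from transformers' testing utilities and, on CUDA, essentially amounts to clearing the caching allocator. A minimal sketch of the idea (simplified; the real helper dispatches on the backend, and empty_device_cache here is just an illustrative name):

    import gc
    import torch

    def empty_device_cache(device: str) -> None:
        # Drop Python-level references first, then ask the caching allocator
        # to release cached (currently unused) blocks back to the driver.
        # Memory held by live tensors is NOT freed by this.
        gc.collect()
        if device.startswith("cuda"):
            torch.cuda.empty_cache()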

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@@ -526,7 +526,7 @@ def test_model_7b_logits(self):
        # Note: Key 9 is currently set for MI300, but may need potential future adjustments for H100s,
        # considering differences in hardware processing and potential deviations in output.
        EXPECTED_SLICE = {
-           7: torch.tensor([-5.8781, -5.8616, -0.1052, -4.7200, -5.8781, -5.8774, -5.8773, -5.8777, -5.8781, -5.8780, -5.8781, -5.8779, -1.0787, 1.7583, -5.8779, -5.8780, -5.8783, -5.8778, -5.8776, -5.8781, -5.8784, -5.8778, -5.8778, -5.8777, -5.8779, -5.8778, -5.8776, -5.8780, -5.8779, -5.8781]),
+           7: torch.tensor([-5.8828, -5.8633, -0.1042, -4.7266, -5.8828, -5.8789, -5.8789, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -1.0801, 1.7598, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828]),
@ydshieh (Collaborator, Author):

should update in #29905 but forgot
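For context, these EXPECTED_SLICE dicts are keyed by the GPU's major compute capability (the 7 here; the note above mentions key 9 for MI300). A hedged sketch of how such a dict is typically consumed in the test, assuming out and EXPECTED_SLICE come from the surrounding test body:

    import torch

    # Pick the reference values for the current GPU by major compute capability.
    major, _minor = torch.cuda.get_device_capability()
    expected = EXPECTED_SLICE[major]  # e.g. the key-7 entry shown in the diff above
    # Compare a slice of the logits against the reference within a tolerance.
    torch.testing.assert_close(out.logits[0, 0, :30].cpu(), expected, atol=1e-4, rtol=1e-4)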

    @slow
    @require_bitsandbytes
    def test_model_7b_generation(self):
        EXPECTED_TEXT_COMPLETION = {
-           7: "My favourite condiment is 100% ketchup. I love it on everything. I'm not a big",
+           7: "My favourite condiment is 100% ketchup. Im not a fan of mustard, mayo,",
@ydshieh (Collaborator, Author):

should update in #29905 but forgot
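Similarly hedged, a sketch of how a capability-keyed text expectation like this one is usually checked; model, tokenizer, and inputs are placeholders for whatever the surrounding test builds:

    import torch

    major, _minor = torch.cuda.get_device_capability()
    # Greedy decoding so the completion is deterministic and comparable to the string above.
    generated_ids = model.generate(**inputs, max_new_tokens=20, do_sample=False)
    text = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
    assert text == EXPECTED_TEXT_COMPLETION[major]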

Comment on lines -562 to -565
del model
backend_empty_cache(torch_device)
gc.collect()

@ydshieh (Collaborator, Author):

This doesn't help and, worse, it causes GPU OOM in subsequent tests.

@amyeroberts (Collaborator):

Happy to have this deleted but very confused why this would cause OOM 😭

@ydshieh (Collaborator, Author) commented on Jun 4, 2024:

I have to say I am confused too. torch.cuda.empty_cache is not really magic:

empty_cache() doesn’t increase the amount of GPU memory available for PyTorch.

but I was not expecting it to have an undesired side effect like this (even if it is not helpful).

I haven't checked whether del model and gc.collect() play a role here, though.
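A small sketch of the distinction being made here, using only standard PyTorch memory counters (nothing project-specific assumed):

    import torch

    allocated = torch.cuda.memory_allocated()  # bytes held by live tensors
    reserved = torch.cuda.memory_reserved()    # bytes held by the caching allocator
    # empty_cache() can only shrink `reserved` toward `allocated`; memory that is
    # still referenced by live tensors stays in use either way.
    torch.cuda.empty_cache()
    print(f"allocated={allocated >> 20} MiB, reserved={reserved >> 20} MiB")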

@ydshieh (Collaborator, Author):

Out of curiosity, and to keep the info here for the record:

  • it is test_model_7b_long_prompt that gets the OOM.
    • previously, with those empty-cache calls, nvidia-smi shows 150MiB / 15360MiB at the beginning of test_model_7b_long_prompt, which looks nice, but we get an OOM afterwards inside this test
    • without the empty-cache calls, nvidia-smi shows 9066MiB / 15360MiB, which looks not great, but we DON'T get an OOM afterwards inside this test

It's very mysterious to me.
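For anyone who wants to reproduce these numbers from inside the test rather than via nvidia-smi, a minimal sketch; torch.cuda.mem_get_info reports driver-level free/total device memory, which is roughly what nvidia-smi shows:

    import torch

    free_bytes, total_bytes = torch.cuda.mem_get_info()
    used_mib = (total_bytes - free_bytes) // (1 << 20)
    total_mib = total_bytes // (1 << 20)
    print(f"{used_mib}MiB / {total_mib}MiB")  # comparable to the readings quoted above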

@@ -635,7 +622,7 @@ def test_speculative_generation(self):
        # Note: Key 9 is currently set for MI300, but may need potential future adjustments for H100s,
        # considering differences in hardware processing and potential deviations in generated text.
        EXPECTED_TEXT_COMPLETION = {
-           7: "My favourite condiment is 100% Sriracha. I love the heat, the tang and the fact costs",
+           7: "My favourite condiment is 100% ketchup. I love it on everything. I'm not a big",
@ydshieh (Collaborator, Author):

see PR description

@ydshieh requested a review from @amyeroberts on June 4, 2024 at 14:13
@amyeroberts (Collaborator) left a comment:

Thanks for fixing!


@ydshieh merged commit fd3238b into main on Jun 4, 2024
21 checks passed
@ydshieh deleted the fix_mistral_int branch on June 4, 2024 at 16:04
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request Jun 11, 2024
* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>