Fix MistralIntegrationTest
#31231
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
```diff
@@ -526,7 +526,7 @@ def test_model_7b_logits(self):
         # Note: Key 9 is currently set for MI300, but may need potential future adjustments for H100s,
         # considering differences in hardware processing and potential deviations in output.
         EXPECTED_SLICE = {
-            7: torch.tensor([-5.8781, -5.8616, -0.1052, -4.7200, -5.8781, -5.8774, -5.8773, -5.8777, -5.8781, -5.8780, -5.8781, -5.8779, -1.0787, 1.7583, -5.8779, -5.8780, -5.8783, -5.8778, -5.8776, -5.8781, -5.8784, -5.8778, -5.8778, -5.8777, -5.8779, -5.8778, -5.8776, -5.8780, -5.8779, -5.8781]),
+            7: torch.tensor([-5.8828, -5.8633, -0.1042, -4.7266, -5.8828, -5.8789, -5.8789, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -1.0801, 1.7598, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828, -5.8828]),
```
should update in #29905 but forgot
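For context, the dict-of-expected-values pattern these tests rely on (integer hardware keys such as 7 for the T4-class runner, per the note in the diff) can be sketched without torch. The keys, truncated values, and tolerance below are illustrative assumptions, not the test suite's actual settings:

```python
import math

# Hypothetical per-hardware expected logits, mirroring the tests'
# EXPECTED_SLICE dict keyed by an integer hardware identifier
# (7 corresponds to the T4 values in the diff above; slices truncated).
EXPECTED_SLICE = {
    7: [-5.8828, -5.8633, -0.1042, -4.7266],
    9: [-5.8781, -5.8616, -0.1052, -4.7200],  # hypothetical values for key 9
}

def logits_match(actual, hardware_key, abs_tol=1e-2):
    """Compare a logits slice against the values recorded for this hardware."""
    expected = EXPECTED_SLICE[hardware_key]
    return len(actual) == len(expected) and all(
        math.isclose(a, e, abs_tol=abs_tol) for a, e in zip(actual, expected)
    )

print(logits_match([-5.8828, -5.8633, -0.1042, -4.7266], 7))  # True
```

The per-key dict lets one test body serve several CI hardware pools; the cost, as this PR shows, is that every dtype or quantization change requires re-recording the values for each key.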
```diff
     @slow
     @require_bitsandbytes
     def test_model_7b_generation(self):
         EXPECTED_TEXT_COMPLETION = {
-            7: "My favourite condiment is 100% ketchup. I love it on everything. I'm not a big",
+            7: "My favourite condiment is 100% ketchup. I’m not a fan of mustard, mayo,",
```
should update in #29905 but forgot
```diff
-        del model
-        backend_empty_cache(torch_device)
-        gc.collect()
-
```
These calls don't help and, worse, cause GPU OOM in subsequent tests.
Happy to have this deleted but very confused why this would cause OOM 😭
Got to say I am confused too. `torch.cuda.empty_cache` is not really magic:

> `empty_cache()` doesn’t increase the amount of GPU memory available for PyTorch.

but I was not expecting it to have an undesired side effect like this (even if it is not helpful). I didn't check whether `del model` and `gc.collect()` play a role here, though.
Out of curiosity, and to keep the info here for the record:

- it is `test_model_7b_long_prompt` that gets OOM.
- previously, with those empty-cache calls, at the beginning of `test_model_7b_long_prompt` `nvidia-smi` shows `150MiB / 15360MiB`, which looks nice, but we get OOM afterward inside this test.
- without the empty-cache calls, `nvidia-smi` shows `9066MiB / 15360MiB`, which looks not great, but we DON'T get OOM afterward inside this test.
It's very mysterious to me.
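The teardown pattern under discussion can be sketched as a small helper; this is my illustration (not the repo's actual helper), with torch treated as an optional import so the sketch degrades gracefully on a CPU-only machine:

```python
import gc

try:
    import torch  # optional: the sketch still runs without torch installed
except ImportError:
    torch = None

def release_gpu_memory():
    """Force a GC pass, then ask the CUDA caching allocator to return
    unused cached blocks to the driver.

    Caveat (per the discussion above): empty_cache() only releases
    *cached* blocks, never memory that is still referenced, and handing
    blocks back to the driver can leave later allocations worse off
    than simply reusing the warm cache would have.
    """
    collected = gc.collect()  # number of unreachable objects collected
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()
    return collected

release_gpu_memory()  # safe no-op without a GPU
```

The caller must still drop its own references (e.g. `del model`) before calling this; `gc.collect()` cannot free a tensor that a live name still points to.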
```diff
@@ -635,7 +622,7 @@ def test_speculative_generation(self):
         # Note: Key 9 is currently set for MI300, but may need potential future adjustments for H100s,
         # considering differences in hardware processing and potential deviations in generated text.
         EXPECTED_TEXT_COMPLETION = {
-            7: "My favourite condiment is 100% Sriracha. I love the heat, the tang and the fact costs",
+            7: "My favourite condiment is 100% ketchup. I love it on everything. I’m not a big",
```
see PR description
Thanks for fixing!
```diff
-        del model
-        backend_empty_cache(torch_device)
-        gc.collect()
-
```
Happy to have this deleted but very confused why this would cause OOM 😭
* fix

Co-authored-by: ydshieh <[email protected]>
What does this PR do?

- `test_speculative_generation`: it failed due to 9efec11 (Jan 19, 2024) and then 2e27291 (May 13, 2024). I would trust those 2 PRs and simply update the expected outputs (cc @gante).
- `test_model_7b_generation` and `test_model_7b_logits`: the failures come from my PR "Fix slow tests for important models to be compatible with A10 runners" #29905, where I changed `dtype` and/or `load_in_4bit` in `from_pretrained` in the tests but forgot to update the expected output values for T4.

Also, the `del model` / `backend_empty_cache(torch_device)` / `gc.collect()` teardown is not helping and, worse, with it we actually get GPU OOM (here for `test_model_7b_long_prompt_sdpa`). After removing those calls, the tests are all passing now.
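As an aside on why a dtype change forces re-recording expected values: rounding through a lower-precision format perturbs every logit, so exact-valued expectations recorded under the old dtype stop matching. A stdlib-only illustration (the float16 round-trip here is my analogy for the effect of a dtype change, not what `from_pretrained` does internally):

```python
import struct

def to_float16(x):
    """Round-trip a Python float through IEEE-754 half precision ('e')."""
    return struct.unpack('e', struct.pack('e', x))[0]

old_expected = [-5.8781, -5.8616, -0.1052]  # pre-PR T4 values from the diff
rounded = [to_float16(v) for v in old_expected]
# None of the old values sit exactly on the float16 grid, so each one
# moves slightly after rounding -- exact comparisons would start failing.
print(all(r != v for r, v in zip(rounded, old_expected)))  # True
```

Notably, the updated T4 values in the diff (e.g. -5.8828125 = 1506 × 2⁻⁸, printed as -5.8828) sit on the half-precision grid, which is consistent with a lower-precision dtype being involved in the new expected outputs.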