
Replaces calls to .cuda with .to(torch_device) in tests #25571

Merged

ArthurZucker merged 6 commits into huggingface:main from vvvm23:testing-remove-dotcuda on Aug 18, 2023

Conversation

vvvm23
Contributor

@vvvm23 vvvm23 commented Aug 17, 2023

torch.Tensor.cuda() is a pre-0.4 way of changing a tensor's device. It is recommended to prefer .to(...) for greater flexibility and error handling. Furthermore, this makes these tests more consistent with other tests (which tend to use .to(torch_device)) and ensures the correct device backend is used (if torch_device is neither cpu nor cuda).

This could be the case if TRANSFORMERS_TEST_DEVICE is not cpu or cuda. See #25506.

By default, I don't think this PR should change any test behaviour, but let me know if this is misguided.
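For illustration, here is a minimal sketch of the substitution applied throughout the tests (not a snippet taken from the PR itself). It assumes torch_device is imported from transformers.testing_utils, the same helper the other tests already use:

import torch
from transformers.testing_utils import torch_device  # "cpu", "cuda", or whatever device the test suite is configured for

x = torch.ones(2, 2)

# Old pattern: hard-codes the CUDA backend and fails on any other accelerator
# x = x.cuda()

# Pattern used after this PR: follows the configured test device
x = x.to(torch_device)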

What does this PR do?

Replaces calls to torch.Tensor.cuda() with .to(torch_device) equivalents. This not only ensures consistency between different tests and their management of device, but also makes tests more flexible with regard to custom or less common PyTorch backends.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

This affects multiple tests and doesn't target any specific modality. However, they are all PyTorch models. @sgugger, hope you don't mind me tagging you again 🙂

Collaborator

@ArthurZucker ArthurZucker left a comment

If cuda was specified as the device, we should use cuda and not torch_device, since these tests are usually meant to be run on GPU, where results can vary.

@vvvm23
Contributor Author

vvvm23 commented Aug 17, 2023

Isn't this the case for a lot of other tests? They use the decorator @require_torch_gpu to skip the test if torch_device != cuda, so the tests will still only be run on GPU as they are meant to be.
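To make that concrete, here is a hypothetical sketch (not a test from this repository; the class and test names are illustrative) of how such tests are guarded. It assumes the require_torch_gpu, slow, and torch_device helpers from transformers.testing_utils, so the body only runs when torch_device is "cuda" and .to(torch_device) therefore still lands on the GPU:

import unittest

import torch
from transformers.testing_utils import require_torch_gpu, slow, torch_device


class ExampleDevicePlacementTest(unittest.TestCase):
    @require_torch_gpu  # skipped unless torch_device == "cuda"
    @slow
    def test_tensor_lands_on_gpu(self):
        # Equivalent to .cuda() whenever this test actually runs
        x = torch.ones(2, 2).to(torch_device)
        self.assertEqual(x.device.type, "cuda")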

Collaborator

@ArthurZucker ArthurZucker left a comment

Mostly talking about jukebox, which does not have the require_gpu decorator if I am not mistaken. You can just revert the jukebox changes, it's not really important.

Otherwise this does not seem to increase readability. If you can make the snippets fit in two lines, that would be better!

Comment on lines 453 to 458
     greedy_output = model.generate(
-        input_ids["input_ids"].cuda(), attention_mask=input_ids["attention_mask"], max_length=50, do_sample=False
+        input_ids["input_ids"].to(torch_device),
+        attention_mask=input_ids["attention_mask"],
+        max_length=50,
+        do_sample=False,
     )
Collaborator

Can fit in two lines

Contributor Author

Sadly not without causing the CI to fail when checking style 😓. Splitting into four lines was a direct result of calling `make style`.

Collaborator

You can still do something like:

Suggested change
-    greedy_output = model.generate(
-        input_ids["input_ids"].to(torch_device),
-        attention_mask=input_ids["attention_mask"],
-        max_length=50,
-        do_sample=False,
-    )
+    input_ids, attention_mask = input_ids["input_ids"].to(torch_device), input_ids["attention_mask"]
+    greedy_output = model.generate(input_ids, attention_mask=attention_mask, max_length=50, do_sample=False)

@vvvm23
Contributor Author

vvvm23 commented Aug 17, 2023

I added torch_device into jukebox as I didn't see why it shouldn't be there, given it is in basically every other test. If there is some extra behaviour or meaning I am missing, it would be helpful to know. In any case, the modifications to Jukebox are limited to two tests: one is skipped entirely anyway, and the other is test_fp16_slow_sampling, which I noticed does not have the require_torch_gpu decorator, only the slow decorator. I'll add one if you think it is worth it here 🙂

Collaborator

@ArthurZucker ArthurZucker left a comment

Just one last nit on the formatting.

tests/models/jukebox/test_modeling_jukebox.py (comment marked outdated and resolved)
@vvvm23
Contributor Author

vvvm23 commented Aug 18, 2023

Nice suggestions, I misunderstood what you meant initially by splitting into two lines. Hope it is all good now~

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@vvvm23
Contributor Author

vvvm23 commented Aug 18, 2023

This should be good now 👍

Collaborator

@ArthurZucker ArthurZucker left a comment

Thanks

@ArthurZucker ArthurZucker merged commit 9d7afd2 into huggingface:main Aug 18, 2023
@vvvm23 vvvm23 deleted the testing-remove-dotcuda branch August 18, 2023 10:41
blbadger pushed a commit to blbadger/transformers that referenced this pull request on Nov 8, 2023:
Replaces calls to `.cuda` with `.to(torch_device)` in tests (huggingface#25571)

* Replaces calls to `.cuda` with `.to(torch_device)` in tests
`torch.Tensor.cuda()` is a pre-0.4 way of changing a tensor's device. It is recommended to prefer `.to(...)` for greater flexibility and error handling. Furthermore, this makes it more consistent with other tests (which tend to use `.to(torch_device)`) and ensures the correct device backend is used (if `torch_device` is neither `cpu` nor `cuda`).

* addressing review comments

* more formatting changes in Bloom test

* `make style`

* Update tests/models/bloom/test_modeling_bloom.py

Co-authored-by: Arthur <[email protected]>

* fixes style failures

---------

Co-authored-by: Arthur <[email protected]>