Replaces calls to `.cuda` with `.to(torch_device)` in tests #25571

Conversation
`torch.Tensor.cuda()` is a pre-0.4 solution to changing a tensor's device. It is recommended to prefer `.to(...)` for greater flexibility and error handling. Furthermore, this makes it more consistent with other tests (which tend to use `.to(torch_device)`) and ensures the correct device backend is used (if `torch_device` is neither `cpu` nor `cuda`).
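To make the change concrete, here is a minimal sketch of the pattern being replaced (illustrative only; the tensor `x` is made up and not taken from the actual diff):

```python
import torch

from transformers.testing_utils import torch_device  # resolves to "cpu", "cuda", etc.

x = torch.ones(2, 2)

# Before: hard-codes the CUDA backend and fails outright on machines without it.
# x = x.cuda()

# After: follows whatever device the test suite is configured to use.
x = x.to(torch_device)
```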
If `cuda` was specified as the device, we should use `cuda` and not `torch_device`, since these tests are usually meant to be run on GPU, where results can vary.
Isn't this the case for a lot of other tests? They use the decorator
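For context, a minimal sketch of the gating pattern presumably meant here, assuming the decorator in question is `transformers.testing_utils.require_torch_gpu`:

```python
import torch

from transformers.testing_utils import require_torch_gpu, torch_device


@require_torch_gpu  # the test is skipped entirely unless a CUDA device is available
def test_runs_on_gpu():
    # Under this decorator torch_device is effectively "cuda", so .to(torch_device)
    # and .cuda() land on the same device.
    x = torch.ones(2, 2).to(torch_device)
    assert x.device.type == "cuda"
```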
Mostly talking about jukebox, which does not have the `require_gpu` decorator, if I am not mistaken. You can just revert the jukebox changes; not really important. Otherwise this does not seem to increase readability. If you can make the snippets fit in two lines, that would be better!
```diff
 greedy_output = model.generate(
-    input_ids["input_ids"].cuda(), attention_mask=input_ids["attention_mask"], max_length=50, do_sample=False
+    input_ids["input_ids"].to(torch_device),
+    attention_mask=input_ids["attention_mask"],
+    max_length=50,
+    do_sample=False,
 )
```
Can fit in two lines
Sadly not without causing the CI to fail when checking style 😓 Splitting into four lines was a direct result of running `make style`.
You can still do something like:
```diff
-greedy_output = model.generate(
-    input_ids["input_ids"].to(torch_device),
-    attention_mask=input_ids["attention_mask"],
-    max_length=50,
-    do_sample=False,
-)
+input_id, attention_mask = input_ids["input_ids"].to(torch_device), input_ids["attention_mask"]
+greedy_output = model.generate(input_id, attention_mask=attention_mask, max_length=50, do_sample=False)
```

(Hoisting the `.to(torch_device)` call into its own assignment keeps the `generate` call short enough that the style checker leaves it on one line.)
I added
Just one last nit on the formatting.
```diff
 greedy_output = model.generate(
-    input_ids["input_ids"].cuda(), attention_mask=input_ids["attention_mask"], max_length=50, do_sample=False
+    input_ids["input_ids"].to(torch_device),
+    attention_mask=input_ids["attention_mask"],
+    max_length=50,
+    do_sample=False,
 )
```
You can still do something like:
```diff
-greedy_output = model.generate(
-    input_ids["input_ids"].to(torch_device),
-    attention_mask=input_ids["attention_mask"],
-    max_length=50,
-    do_sample=False,
-)
+input_id, attention_mask = input_ids["input_ids"].to(torch_device), input_ids["attention_mask"]
+greedy_output = model.generate(input_id, attention_mask=attention_mask, max_length=50, do_sample=False)
```
Nice suggestions, I misunderstood what you meant initially by splitting into two lines. Hope it is all good now~
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Co-authored-by: Arthur <[email protected]>
This should be good now 👍
Thanks
Replaces calls to `.cuda` with `.to(torch_device)` in tests (huggingface#25571)

* Replaces calls to `.cuda` with `.to(torch_device)` in tests

  `torch.Tensor.cuda()` is a pre-0.4 solution to changing a tensor's device. It is recommended to prefer `.to(...)` for greater flexibility and error handling. Furthermore, this makes it more consistent with other tests (which tend to use `.to(torch_device)`) and ensures the correct device backend is used (if `torch_device` is neither `cpu` nor `cuda`).

* addressing review comments
* more formatting changes in Bloom test
* `make style`
* Update tests/models/bloom/test_modeling_bloom.py

  Co-authored-by: Arthur <[email protected]>

* fixes style failures

---------

Co-authored-by: Arthur <[email protected]>
`torch.Tensor.cuda()` is a pre-0.4 solution to changing a tensor's device. It is recommended to prefer `.to(...)` for greater flexibility and error handling. Furthermore, this makes it more consistent with other tests (which tend to use `.to(torch_device)`) and ensures the correct device backend is used (if `torch_device` is neither `cpu` nor `cuda`). This could be the case if `TRANSFORMERS_TEST_DEVICE` is not `cpu` or `cuda`; see #25506.

By default, I don't think this PR should change any test behaviour, but let me know if this is misguided.
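A hedged sketch of that scenario (`TRANSFORMERS_TEST_DEVICE` comes from #25506; the `"xpu"` value is just a stand-in for any non-default backend):

```python
import os

# Must be set before transformers.testing_utils is imported, since torch_device
# is resolved at import time.
os.environ["TRANSFORMERS_TEST_DEVICE"] = "xpu"  # stand-in for a custom backend

from transformers.testing_utils import torch_device

# Tests written with .to(torch_device) now follow the override, whereas a
# hard-coded .cuda() call would still try to use CUDA.
print(torch_device)  # -> "xpu"
```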
What does this PR do?

Replaces calls to `torch.Tensor.cuda()` with `.to(torch_device)` equivalents. This not only ensures consistency between different tests and their handling of devices, but also makes tests more flexible with regard to custom or less common PyTorch backends.

Before submitting

- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?

This affects multiple tests and doesn't target any specific modality. However, they are all PyTorch models. @sgugger, hope you don't mind me tagging you again 🙂