
[AIR] Address UserWarning from Torch training #28004

Conversation

bveeramani (Member)

Why are these changes needed?

Related issue number

Fixes #28003

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests; see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

bveeramani marked this pull request as draft on August 18, 2022, 22:07
Comment on lines +122 to 125
# NOTE: PyTorch raises a `UserWarning` if `ndarray` isn't writeable. See #28003.
if not ndarray.flags["WRITEABLE"]:
    ndarray = np.copy(ndarray)
return torch.as_tensor(ndarray, dtype=dtype, device=device)
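
For context, the warning under discussion is easy to reproduce. A minimal sketch, assuming a recent NumPy/PyTorch (the exact warning text varies slightly across PyTorch versions):

import warnings

import numpy as np
import torch

# Simulate a read-only buffer, like a zero-copy view of a shared-memory object.
arr = np.ones(4, dtype=np.float32)
arr.setflags(write=False)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    tensor = torch.as_tensor(arr)

print(caught[0].category)  # <class 'UserWarning'> about the non-writable array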
Contributor

I don't think that copying the ndarray here is the right solution, since we want to be able to reuse the shared-memory tensor data buffers without incurring unnecessary copies. We also operate under the expectation that the user/model will not mutate these input tensors during training and inference, so I don't think the copy is necessary for correctness.

What if we suppressed the warning instead via something like:

Suggested change
- # NOTE: PyTorch raises a `UserWarning` if `ndarray` isn't writeable. See #28003.
- if not ndarray.flags["WRITEABLE"]:
-     ndarray = np.copy(ndarray)
- return torch.as_tensor(ndarray, dtype=dtype, device=device)
+ with warnings.catch_warnings():
+     warnings.simplefilter("ignore")
+     tensor = torch.as_tensor(ndarray, dtype=dtype, device=device)
+ return tensor
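
If suppression is the route, a narrower filter than simplefilter("ignore") could silence just this warning while letting others through. A sketch, not from the PR; the message prefix is an assumption, since PyTorch's exact wording has changed across versions:

import warnings

import numpy as np
import torch

ndarray = np.ones(4, dtype=np.float32)
ndarray.setflags(write=False)

with warnings.catch_warnings():
    # Match only the non-writable-array warning; anything else raised
    # by `torch.as_tensor` still propagates.
    warnings.filterwarnings(
        "ignore",
        message="The given NumPy array is not writ",  # matches "writable"/"writeable"
        category=UserWarning,
    )
    tensor = torch.as_tensor(ndarray, dtype=torch.float32)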

bveeramani (Member Author)

Are we confident that the warning is harmless? If so, I think this is fine.

Contributor

If the user tries to mutate the tensor (e.g. with in-place tensor operations), then this is undefined behavior, but in the common case the tensor data buffer is left untouched in Plasma until we either (a) transfer the tensor to the GPU, or (b) create a new tensor via an operation that makes a copy; in either case, we're unlikely to mutate it.

Contributor

I'm not sure if we should suppress the warning, though, since it's a useful signal to the user that they will need to .clone() it if they wish to mutate it.
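
A quick illustration of both points above (the copy-on-operation lifecycle and the .clone() escape hatch); a sketch, not from the PR:

import numpy as np
import torch

arr = np.ones(4, dtype=np.float32)
arr.setflags(write=False)    # read-only, like a Plasma-backed buffer

view = torch.as_tensor(arr)  # zero-copy view; this is where the warning fires
fresh = view + 1             # out-of-place ops allocate a new, writable tensor
owned = view.clone()         # explicit private copy for callers who must mutate
owned += 1                   # safe: only the copy changes
# `view += 1` would write to the read-only buffer: undefined behavior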

Contributor

It'd be great to suppress the repeated warnings (if that is somehow possible).

bveeramani self-assigned this on Aug 30, 2022
bveeramani (Member Author)

Closing this PR because I don't know how to suppress the repeated warning with multiple workers.
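
For what it's worth, warning filters live in per-process state, so a filter set in the driver won't reach Ray's worker processes; one conceivable (untested) approach is to install the filter at the top of the function each Train worker runs. A hypothetical sketch; the message prefix and the placement are assumptions:

import warnings

def train_loop_per_worker(config):
    # Each Ray Train worker is a separate process, so the filter must be
    # installed here rather than once in the driver.
    warnings.filterwarnings(
        "ignore",
        message="The given NumPy array is not writ",
        category=UserWarning,
    )
    ...  # per-worker training logic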

bveeramani closed this on Sep 7, 2022