
[dataloader] Multiple warnings printed when torch.as_tensor is applied to a read-only NumPy array #37581

Closed
vadimkantorov opened this issue Apr 30, 2020 · 22 comments
Assignees: mruberry
Labels: module: dataloader (Related to torch.utils.data.DataLoader and Sampler), module: numpy (Related to NumPy support, and also NumPy compatibility of our operators), triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Comments

@vadimkantorov (Contributor) commented Apr 30, 2020

I get this warning printed every time I do:

```python
import scipy.io.wavfile
import torch

sample_rate_, signal = scipy.io.wavfile.read(audio_path)
signal = torch.as_tensor(signal)
```

```
UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program.
```

Maybe it's related to the fact that this code gets executed within DataLoader worker threads? So I get one warning per thread, which is still nasty.

It would be nice to be able to suppress this warning through PyTorch itself, without having to clone the tensor. This situation is pretty common, and I'd suggest it shouldn't require figuring out how to suppress general Python warnings :) E.g., torch.as_tensor could take a flag like writable=True that would ensure the user is conscious of what's going on.

cc @ssnl

@pbelevich added the module: dataloader, module: numpy, and triage review labels on May 1, 2020
@mruberry self-assigned this on May 4, 2020
@mruberry (Collaborator) commented May 4, 2020

We're actually planning to update our warning code to make these warnings less intrusive, but a workaround is to mark the NumPy array as writeable before calling torch.as_tensor() (see https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.setflags.html).
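
For example, a minimal sketch of that workaround (note this assumes the array owns its memory):

```python
import numpy as np
import torch

arr = np.arange(10)
arr.setflags(write=False)   # simulate a read-only array

arr.setflags(write=True)    # re-mark it as writeable (only possible when the
                            # array owns its memory)
t = torch.as_tensor(arr)    # no warning now
```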

@ezyang (Contributor) commented May 4, 2020

You mean multiprocessing dataloader? I'm pretty sure TORCH_WARN_ONCE triggers only once per process execution.

Another workaround would be to just manually suppress the warning using Python's warning handlers.
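
For example, a sketch using the standard warnings module (the message pattern is taken from the warning text above; audio_path is a placeholder):

```python
import warnings

import scipy.io.wavfile
import torch

audio_path = "audio.wav"  # placeholder path

with warnings.catch_warnings():
    # suppress only the read-only-array warning, leave everything else alone
    warnings.filterwarnings(
        "ignore",
        message="The given NumPy array is not writeable",
        category=UserWarning,
    )
    sample_rate_, signal = scipy.io.wavfile.read(audio_path)
    signal = torch.as_tensor(signal)
```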

@pbelevich added the triaged label and removed the triage review label on May 4, 2020
@albanD (Collaborator) commented May 4, 2020

I think this is made worse by #15849, which prevents the warn-once behavior from working properly.

@vadimkantorov (Contributor, Author) commented:

Setflags workaround doesn't work:

```
ValueError: cannot set WRITEABLE flag to True of this array
```
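
A hypothetical reproduction, assuming the array is backed by a read-only memory map (e.g. from scipy.io.wavfile.read(..., mmap=True)) and therefore does not own its memory:

```python
import numpy as np

np.zeros(16, dtype=np.uint8).tofile("audio.raw")      # placeholder file
a = np.memmap("audio.raw", dtype=np.uint8, mode="r")  # read-only, non-owning
a.setflags(write=True)  # ValueError: cannot set WRITEABLE flag to True of this array
```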

@mruberry (Collaborator) commented May 6, 2020

> Setflags workaround doesn't work:
>
> `ValueError: cannot set WRITEABLE flag to True of this array`

That's interesting... this kind of increases the importance of making a copy or warning our users. I guess @ezyang's suggestion is the thing in this case, then.

@vadimkantorov (Contributor, Author) commented:

A special, more intuitive suppression flag for torch.as_tensor could be useful.

@vadimkantorov (Contributor, Author) commented:

Related: pytorch/vision#2194

@ssnl (Collaborator) commented May 13, 2020

Given that we don't support non-writable tensors, shouldn't as_tensor just copy the data for non-writable NumPy arrays?

@vadimkantorov (Contributor, Author) commented May 13, 2020

Often these non-writable NumPy arrays appear as wrappers around contiguous memory from an external source (like a file read). An optional no-copy behavior is often desirable even in these cases: e.g., audio after decompression can be huge, and a copy 1) requires a second large free chunk of memory, and 2) introduces copying overhead.
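
A sketch of that tradeoff (the file name and dtype are placeholders):

```python
import numpy as np
import torch

# hypothetical stand-in for a large decompressed audio signal on disk
np.zeros(1_000_000, dtype=np.int16).tofile("audio.raw")

signal = np.memmap("audio.raw", dtype=np.int16, mode="r")  # read-only view

t = torch.as_tensor(signal)         # zero-copy: wraps the mapped memory, but warns
t = torch.as_tensor(signal.copy())  # silent, but costs a second signal-sized
                                    # allocation plus the copy itself
```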

@ssnl (Collaborator) commented May 13, 2020

Fair points. I wonder if we should just support non-writable tensors... given that we already have notions of views and in-place ops.

@albanD (Collaborator) commented May 13, 2020

The problem with that is that the notions of views and in-place ops are only tracked by the autograd. So everything that is done outside the scope of the autograd (most of our low-level code) will not respect these constructs :/

@ssnl (Collaborator) commented May 13, 2020

@albanD Hmm, but suppose we just add an `assert self->storage().is_mutable()` (or something) in the codegen for in-place methods and for non-const `storage.data_ptr()`. What else could trigger in-place writes?

@albanD (Collaborator) commented May 13, 2020

Oh, I'm not saying it's impossible. But it's a bit of work, I think.

Do all the methods use `storage.data_ptr()` to access the content? Don't we ever go around it?
Also, Tensors created from blobs can be written to by anything else.
Or a user could get the `data_ptr()` in Python and write into it by other means.

Also, I think the JIT would benefit from knowing that a Tensor is immutable :)

But that feels like overkill for this issue.
I think the default being a warning is good here. And adding a flag like "ignore_non_writable" to the from_numpy() method (or as_tensor, whichever is the right one to use here) would do the trick to avoid the spam in some special cases.
@mruberry what do you think?

@mruberry (Collaborator) commented May 13, 2020

We decided to warn when creating a tensor from a read-only NumPy array. Users can mark the array as writeable, make a copy of it, or avoid writing to it. Disallowing the creation seemed onerous.

In the future it'd be great to support read-only tensors.

@vadimkantorov (Contributor, Author) commented:

> We decided to warn when creating a tensor from a read-only NumPy array.

Then I propose that the DataLoader docs show an example of how to suppress warnings in all workers (it can be useful in other scenarios as well!).

Marking an array as writable does not work, copying can be wasteful (since uncompressed audio is quite large), and avoiding writes by itself does not reduce the spam of many, many warnings in the DataLoader setting.
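
A hypothetical sketch of what such a docs example could look like, using worker_init_fn to install the filter in every worker process (the dataset here is a placeholder):

```python
import warnings

import torch
from torch.utils.data import DataLoader, TensorDataset

def suppress_readonly_warning(worker_id):
    # runs once in each worker process, before any samples are loaded
    warnings.filterwarnings(
        "ignore",
        message="The given NumPy array is not writeable",
        category=UserWarning,
    )

dataset = TensorDataset(torch.arange(8))  # placeholder dataset
loader = DataLoader(dataset, num_workers=4,
                    worker_init_fn=suppress_readonly_warning)

for batch in loader:  # each worker has installed the filter before loading
    pass
```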

@mruberry (Collaborator) commented:

I don't know if this scenario is common enough / alarming enough to warrant a note in the docs, especially since suppressing warnings is more of a Python feature than a PyTorch-specific issue. Would we, for example, want docs describing every time PyTorch could throw a warning and showing how to use Python's warning-filtering mechanisms?

I could see a brief blog post on the issue of writeable/read-only NumPy arrays, warnings, and filtering them being interesting, though.

@vadimkantorov (Contributor, Author) commented May 17, 2020

The problem is with DataLoader, where warnings are printed many times (by default they would be printed only once), and the DataLoader-specific mechanism for suppressing them (worker_init_fn?) is advanced.

I'll try again with a warnings filter and will let you know.

@mruberry (Collaborator) commented:

You can also catch the warnings with a guard, or set the warning filter.

@ssnl (Collaborator) commented May 18, 2020

The problem is not with DataLoader. The problem is with multiprocessing. I'm not sure if we should list all multiprocessing pitfalls in our DataLoader docs...

@vadimkantorov (Contributor, Author) commented:

ok

@mhnatiuk commented:

What is the use case for warning about the array being read-only? Probably like 99% of users, I use DataLoader to (surprisingly enough) **load data**, and I consider forcing the user to have a writable array one of the most absurd patterns I have yet seen in PyTorch. I have 4M rows of data; I won't copy it, because it's stored and loaded using numpy.memmap, and I don't want to allow anything to write to it. I also think that in a multiprocessing context that becomes even a requirement! I can't follow the logic behind this decision.

@mruberry (Collaborator) commented:

PyTorch doesn't (yet) support read-only tensors, so you're making the data writable when you create a PyTorch tensor from it. That's why we warn.
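
A small sketch of what the warning is protecting against: writes through the tensor land in the supposedly read-only array's memory:

```python
import numpy as np
import torch

arr = np.zeros(3)
arr.setflags(write=False)   # arr is now read-only from NumPy's point of view

t = torch.as_tensor(arr)    # UserWarning: the given NumPy array is not writeable
t[0] = 1.0                  # ...but the shared memory can still be written through t
print(arr)                  # [1. 0. 0.]
```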
