torchvision.io.read_image return tensor shape is different. #3332
Comments
It seems that the documentation and the development version are out of sync. |
@kairos03 The latest master of TorchVision has been updated to support reading grayscale images, transparency, etc., so what you report is not a bug but expected behaviour. See #2984, #2988 and #3024 for details on the feature. This feature is not included in version 0.8.2; it is only available on the latest master. On version 0.8.2 you should be getting an exception: Which version of TorchVision are you currently using? |
@datumbox I'm using v0.8.2 |
Could you please confirm by checking the outputs of:
|
Here:
|
@kairos03 I can't reproduce what you see: Reading a grayscale or transparent image using 0.8.2 throws an exception for me:
I suspect that you might have installed a different torchvision version in a virtual environment, possibly the latest master or a nightly. Version 0.8.2 did not support non-RGB images; that support was added later on the latest master. In any case, what you report is not a bug. We just added support for additional image types. The documentation has also been updated to reflect that, and the website will be updated with the next release. I'll close the issue, but if you feel you need more support, feel free to reopen it. |
Setting the argument mode=ImageReadMode.RGB changes the output shape to [3, height, width]. |
What can we do if we are stuck with torchvision 0.8.2? Is there no solution? |
@ckyleda If you can't update torchvision to the latest version, you'll have to add some extra logic in your code to handle it. Something like:
import torchvision

img = torchvision.io.read_image("my_img.png")
if img.shape[0] == 4:
    img = img[:3]  # RGBA: drop the alpha channel
elif img.shape[0] == 1:
    img = img.repeat(3, 1, 1)  # grayscale: replicate into 3 channels
|
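The same logic can be wrapped in a helper and checked on synthetic tensors. This is a sketch only: `to_three_channels` is a hypothetical name, not a torchvision function, and it assumes the input is laid out as (channels, height, width). Note that a 3-channel input is returned untouched, so existing RGB data is not replicated or altered.

```python
import torch


def to_three_channels(img: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: return a (3, H, W) tensor from a (C, H, W) image."""
    channels = img.shape[0]
    if channels == 3:
        return img                  # RGB: returned as-is
    if channels == 4:
        return img[:3]              # RGBA: drop the alpha channel
    if channels == 1:
        return img.repeat(3, 1, 1)  # grayscale: replicate into 3 channels
    raise ValueError(f"unsupported channel count: {channels}")


# Synthetic inputs standing in for decoded images.
gray = torch.zeros(1, 4, 6, dtype=torch.uint8)
rgba = torch.zeros(4, 4, 6, dtype=torch.uint8)
rgb = torch.arange(72, dtype=torch.uint8).reshape(3, 4, 6)

for name, img in [("gray", gray), ("rgba", rgba), ("rgb", rgb)]:
    print(name, tuple(to_three_channels(img).shape))
```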
This works for images that are grayscale; but I have RGB images where the actual channels are important and replicating the information across all channels is not desired behavior. It blows the mind that defaulting to single-channel image reading was ever implemented in the first place. I suspect this probably means I cannot use torch for my use case. |
@ckyleda I'm sorry, I don't understand your last comment. The default behavior for |
This mode loads PNGs as 3-channel images. |
it works! Thanks~ |
🐛 Bug
torchvision.io.read_image returns a tensor whose shape differs from the [3, height, width] given in the documentation when reading a grayscale or RGBA image. It returns [1, height, width] or [4, height, width].
https://pytorch.org/docs/stable/torchvision/io.html#torchvision.io.read_image
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Environment
PyTorch version: 1.7.1
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A
OS: Ubuntu 18.04.3 LTS (x86_64)
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Clang version: Could not collect
CMake version: version 3.10.2
Python version: 3.7 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: GeForce GTX 1080 Ti
Nvidia driver version: 440.100
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip] numpy==1.19.4
[pip] torch==1.7.1
[pip] torchaudio==0.7.0a0+a853dff
[pip] torchvision==0.8.2
[conda] blas 1.0 mkl
[conda] cudatoolkit 10.2.89 hfd86e86_1
[conda] mkl 2020.0 166
[conda] mkl-service 2.3.0 py37he904b0f_0
[conda] mkl_fft 1.0.15 py37ha843d7b_0
[conda] mkl_random 1.1.0 py37hd6b4f25_0
[conda] numpy 1.19.4 pypi_0 pypi
[conda] pytorch 1.7.1 py3.7_cuda10.2.89_cudnn7.6.5_0 pytorch
[conda] torchaudio 0.7.2 py37 pytorch
[conda] torchvision 0.8.2 py37_cu102 pytorch