torchvision.io.read_image return tensor shape is different. #3332

kairos03 · 2021-02-01T05:00:52Z

🐛 Bug

torchvision.io.read_image returns a tensor whose shape differs from the documented [3, width, height] when reading a grayscale or RGBA image. It returns [1, width, height] or [4, width, height] instead.

https://pytorch.org/docs/stable/torchvision/io.html#torchvision.io.read_image

To Reproduce

Steps to reproduce the behavior:

>>> img =  torchvision.io.read_image(<grayscale image>)
>>> img.shape
(1, 123, 123)

>>> img =  torchvision.io.read_image(<RGBA image>)
>>> img.shape
(4, 123, 123)

Expected behavior

>>> img =  torchvision.io.read_image(<grayscale image>)
>>> img.shape
(3, 123, 123)

>>> img =  torchvision.io.read_image(<RGBA image>)
>>> img.shape
(3, 123, 123)

Environment

PyTorch version: 1.7.1
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.3 LTS (x86_64)
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Clang version: Could not collect
CMake version: version 3.10.2

Python version: 3.7 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: GeForce GTX 1080 Ti
Nvidia driver version: 440.100
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip] numpy==1.19.4
[pip] torch==1.7.1
[pip] torchaudio==0.7.0a0+a853dff
[pip] torchvision==0.8.2
[conda] blas 1.0 mkl
[conda] cudatoolkit 10.2.89 hfd86e86_1
[conda] mkl 2020.0 166
[conda] mkl-service 2.3.0 py37he904b0f_0
[conda] mkl_fft 1.0.15 py37ha843d7b_0
[conda] mkl_random 1.1.0 py37hd6b4f25_0
[conda] numpy 1.19.4 pypi_0 pypi
[conda] pytorch 1.7.1 py3.7_cuda10.2.89_cudnn7.6.5_0 pytorch
[conda] torchaudio 0.7.2 py37 pytorch
[conda] torchvision 0.8.2 py37_cu102 pytorch

kairos03 · 2021-02-01T09:52:57Z

It seems that the documentation and the released code are out of sync.
#2988

datumbox · 2021-02-01T09:53:07Z

@kairos03 The latest master of TorchVision has been updated to support reading grayscale images, transparency etc. So what you report is not a bug but an expected behaviour. See #2984, #2988 and #3024 for details on the feature.

This feature is not included in version 0.8.2; it is only available on the latest master. On version 0.8.2 you should be getting an exception:
https://github.com/pytorch/vision/blob/v0.8.2/torchvision/csrc/cpu/image/readpng_cpu.cpp#L74-L78

Which version of TorchVision are you currently using?

kairos03 · 2021-02-01T09:54:11Z

@datumbox I'm using v0.8.2

datumbox · 2021-02-01T10:05:01Z

Could you please confirm by checking the outputs of:

import torchvision
torchvision.__version__

kairos03 · 2021-02-01T10:07:32Z

Here:

Python 3.7.7 (default, Mar 23 2020, 22:36:06) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torchvision
>>> torchvision.__version__
'0.8.2'
>>>

datumbox · 2021-02-01T11:42:04Z

@kairos03 I can't reproduce what you see:

Reading a grayscale or transparent image using 0.8.2 throws an exception for me:

import torchvision

print(torchvision.__version__) #0.8.2

img =  torchvision.io.read_image("logos/gray_pytorch.png") #RuntimeError: Non RGB images are not supported.
print(img.shape)

I suspect that you might have installed a different torchvision version in a virtual environment, possibly the latest master or a nightly. Version 0.8.2 did not have support for non-RGB images; this was added later on the latest master.
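One way to check which installation the interpreter actually picks up is to look at where the package would be imported from, alongside the version recorded by the package metadata. The following is only a sketch: describe_install is a hypothetical helper name, and it assumes Python 3.8 or later, where importlib.metadata is part of the standard library.

```python
import importlib.util
import sys
from importlib import metadata


def describe_install(package: str) -> dict:
    """Report where `package` would be imported from and its installed version.

    `describe_install` is a hypothetical helper, not part of torchvision.
    """
    spec = importlib.util.find_spec(package)
    if spec is None:
        # The package is not importable from this interpreter at all.
        return {"package": package, "found": False, "origin": None, "version": None}
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable but without distribution metadata (e.g. a stdlib module).
        version = None
    return {"package": package, "found": True, "origin": spec.origin, "version": version}


if __name__ == "__main__":
    # The interpreter path shows which environment is active.
    print(sys.executable)
    print(describe_install("torchvision"))
```

If the reported origin points into a different environment than expected, or the version is not 0.8.2, that would explain seeing 1- or 4-channel tensors instead of an exception.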

In any case, what you report is not a bug. We just added support for additional image types. The documentation has also been updated to reflect that, and the website will be updated with the next release.

I'll close the issue but if you feel you need more support feel free to reopen it.

GDkids · 2021-05-26T10:16:43Z

Setting the argument 'mode=ImageReadMode.RGB' changes the output to [3, width, height].
The ImageReadMode class controls this directly.
More information can be found at 'https://github.com/pytorch/vision/blob/master/torchvision/io/image.py#L234-L248'.
I ran into this question today and this thread was the first result I found,
so I think a comment here may be useful for later readers.

ckyleda · 2021-06-11T19:26:36Z

What can we do if we are stuck with torchvision 0.8.2?

Is there no solution?

fmassa · 2021-06-14T11:08:35Z

@ckyleda if you can't update torchvision to the latest version, you'll have to add some extra logic in your code to handle it.

Something like

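import torchvision

img = torchvision.io.read_image("my_img.png")
if img.shape[0] == 4:
    # RGBA: drop the alpha channel and keep R, G, B
    img = img[:3]
elif img.shape[0] == 1:
    # Grayscale: replicate the single channel three times
    img = img.repeat(3, 1, 1)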
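The same logic can be wrapped in a reusable helper that also validates its input. This is only a sketch: to_rgb_channels is a hypothetical name, and treating a 2-channel tensor as grayscale plus alpha is an assumption that goes beyond the snippet above. Note that dropping the alpha channel discards transparency rather than compositing it onto a background.

```python
import torch


def to_rgb_channels(img: torch.Tensor) -> torch.Tensor:
    """Return a [3, H, W] tensor from a [C, H, W] tensor with C in {1, 2, 3, 4}.

    Hypothetical helper; 2 channels are assumed to be grayscale + alpha.
    """
    if img.dim() != 3:
        raise ValueError(f"expected a [C, H, W] tensor, got shape {tuple(img.shape)}")
    channels = img.shape[0]
    if channels == 3:
        return img  # already RGB, leave the channels untouched
    if channels == 4:
        return img[:3]  # RGBA: drop the alpha channel
    if channels in (1, 2):
        # Grayscale (optionally with alpha): replicate the gray channel.
        return img[:1].repeat(3, 1, 1)
    raise ValueError(f"unsupported number of channels: {channels}")


# Usage with synthetic tensors in place of images read from disk.
rgba = torch.zeros(4, 5, 7, dtype=torch.uint8)
gray = torch.ones(1, 5, 7, dtype=torch.uint8)
print(tuple(to_rgb_channels(rgba).shape))  # (3, 5, 7)
print(tuple(to_rgb_channels(gray).shape))  # (3, 5, 7)
```

RGB inputs pass through unchanged, so genuine 3-channel images keep their original channel values.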

ckyleda · 2021-06-15T12:26:47Z

This works for images that are grayscale; but I have RGB images where the actual channels are important and replicating the information across all channels is not desired behavior.

It blows the mind that defaulting to single-channel image reading was ever implemented in the first place. I suspect this probably means I cannot use torch for my use case.

fmassa · 2021-06-16T08:23:26Z

@ckyleda I'm sorry, I don't understand your last comment.

The default behavior for torchvision.io.read_image in torchvision 0.8.2 was to only support RGB images for PNG, returning 3 channels.

majnas · 2021-09-23T08:58:08Z

This mode loads a PNG as 3 channels:
img = torchvision.io.read_image("my_img.png", mode=torchvision.io.image.ImageReadMode.RGB)

kuangxiaoye · 2022-11-12T08:24:43Z

> This works for images that are grayscale; but I have RGB images where the actual channels are important and replicating the information across all channels is not desired behavior.
>
> It blows the mind that defaulting to single-channel image reading was ever implemented in the first place. I suspect this probably means I cannot use torch for my use case.

It works! Thanks~

datumbox added the module: io label Feb 1, 2021

datumbox closed this as completed Feb 1, 2021
