Bug: torchvision/transforms/functional/to_pil_image always converts 1-channel (gray) FloatTensor images to 8-bit unsigned int #448
Comments
Yes, it looks like we currently don't handle this case properly. One workaround for the moment seems to be to convert the torch tensor to a numpy array, but it would be better to fix this case.
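For illustration, the numpy workaround mentioned above might look like this (a sketch, not code from the thread; it assumes a single-channel float tensor and uses `PIL.Image.fromarray`, which accepts a 2-D float32 array for mode 'F'):

```python
import numpy as np
import torch
from PIL import Image

# single-channel float tensor, values in [0, 1]
t = torch.rand(1, 64, 64)

# drop the channel dim and go through numpy to keep float precision,
# bypassing to_pil_image's byte conversion
arr = t.squeeze(0).numpy().astype(np.float32)
img = Image.fromarray(arr, mode='F')
print(img.mode, img.size)  # F (64, 64)
```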
Thank you very much for your response. My workaround has been to use local copies of a few of the torchvision functions. Currently PIL has very little support for scientific (floating-point) imaging. Thank you again.
Unfortunately such a change would not be backward compatible, as we assume that images in float tensors are in the 0-1 range, so a different fix should be added.
Fairly new to coding / deep learning / PyTorch / vision, so I hope I am doing this right. Here is what I found:

From versions 1-4 the API has been kept so that passing in a float tensor in the 0-1 range with mode=None was a requirement. Since #4 (Oct. 17, 2017), passing in a FloatTensor with mode='F' has been broken. My thoughts on the options:

What is the preference on the fix? Either way, I would be interested in trying to code this one if it isn't too much of a problem.
My current thinking to solve this issue is to have a wrapper.
I'd be interested to see this. Please let me know if there is anything I can contribute here.
Hi @mathski, I'm currently looking for datasets from other domains, like medical imagery or astronomy, which do have images but in specialized formats.
Hey @fmassa, I've worked with multi-spectral satellite and medical image data previously, but those datasets are not publicly available. I actually opened this issue not because I was dealing with an uncommon dataset type, but because I was attempting to do some simple image processing on intermediate outputs during training. Thanks for the support in any case.
I see. I think we might want to make torchvision support backpropagation, and possibly avoid the need to convert back and forth to PIL images or numpy arrays. I think we are getting there. And thanks for the dataset!
Just so I understand, functions like these support autograd?
Yes, definitely! If you want your kernel to be fixed (non-trainable), you can do something like:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianBlur(nn.Module):
    def __init__(self):
        super().__init__()
        # a buffer moves with the module (.to, .cuda) but is not trained
        self.register_buffer('filter', torch.rand(1, 1, 3, 3))

    def forward(self, input):
        return F.conv2d(input, self.filter, padding=1)
```
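As a quick sanity check of the pattern above (a sketch, not from the thread; the random 3x3 kernel stands in for real Gaussian weights), gradients flow to the input while the buffer stays out of the trainable parameters:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianBlur(nn.Module):
    # minimal version of the module sketched above
    def __init__(self):
        super().__init__()
        self.register_buffer('filter', torch.rand(1, 1, 3, 3))

    def forward(self, x):
        return F.conv2d(x, self.filter, padding=1)

blur = GaussianBlur()
x = torch.rand(1, 1, 8, 8, requires_grad=True)
out = blur(x)
out.sum().backward()

# the input receives gradients; the buffer is not an nn.Parameter,
# so it does not appear in blur.parameters()
print(x.grad.shape)                   # torch.Size([1, 1, 8, 8])
print(len(list(blur.parameters())))   # 0
```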
Excellent. Thank you very much.
Would it be possible to document this behavior in relevant places until this issue gets resolved? I encountered this issue in my own use of the transform.
@tbung I'll flag this for follow-up so we can make a decision on it.
@tbung I agree, I think we should improve the documentation here. Would you mind sending a PR improving the documentation?
Thanks for the help with this a while back. I have a follow-up question: is there a list of torchvision functions that are supported by autograd? I'm trying to figure out which transformation functions I can avoid re-implementing myself.
ERROR:

```
ValueError: Incorrect mode (<class 'float'>) supplied for input type <class 'numpy.dtype'>. Should be L
```
The torchvision transform ToPILImage(mode=float) will always break for input of type torch.FloatTensor.

ToPILImage() uses the internal function to_pil_image found in torchvision/transforms/functional.py.

In https://github.com/pytorch/vision/blob/master/torchvision/transforms/functional.py:

- Line 104 checks if the input is of type torch.FloatTensor.
- If so, line 105 scales the input by 255, but then converts it to byte.
- Lines 113-127 check whether the user-specified mode matches the expected mode, and raise an error if not.
- The expected mode is derived from npimg.dtype, which returns np.uint8 if line 105 is executed.
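The mechanism can be seen directly (a sketch mimicking the cited lines, not the library code itself): the byte cast produces a uint8 numpy array, so the inferred PIL mode becomes 'L' rather than the requested 'F'.

```python
import torch

# mimic what to_pil_image does to a FloatTensor (per the lines cited above)
pic = torch.rand(1, 64, 64)
npimg = pic.mul(255).byte().numpy()

# the cast leaves a uint8 array, so the expected PIL mode is 'L', not 'F'
print(npimg.dtype)  # uint8
```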
I believe the bug can be fixed by changing line 105 from:

```python
pic = pic.mul(255).byte()
```

to:

```python
pic = pic.mul(255)
```
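With the cast removed, the numpy dtype would stay float32, which is what PIL's 'F' mode expects (a sketch of the dtype path only, not the full function; whether this breaks other modes is a separate question):

```python
import torch

pic = torch.rand(1, 64, 64)
npimg = pic.mul(255).numpy()  # no .byte() cast, dtype is preserved
print(npimg.dtype)  # float32
```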
Test script:

```python
import torch
from torchvision import transforms

a = torch.FloatTensor(1, 64, 64)
tform = transforms.Compose([transforms.ToPILImage(mode='F')])
b = tform(a)
```
Please let me know if I am in error.
Thank you.