Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

return image tensors in WH format #34

Closed
wants to merge 1 commit into from
Closed

Conversation

CarloLucibello
Copy link
Member

@CarloLucibello CarloLucibello commented Mar 17, 2020

Most downstream packages such as Flux, Knet, and the Images ecosystem, assume the image tensors to be in WH format for greyscale and in WHC format for rgb.
Our SVHN2 dataset is consistent with julia ecosystem, while MNIST, FashionMNIST, CIFAR10 and CIFAR100 return images HW format. This is annoying at the very least, but most importantly it is the potential source of very hard to detect bugs.

Therefore this PR does the following:

  • makes every vision dataset return tensors in WH format
  • Since this is breaking, I'm also bumping version number.
  • I'm removing the convert2features since it's just a reshape into a vector (or a matrix for a batch of images)

@johnnychen94
Copy link
Member

Most downstream packages such as Flux, Knet, and the Images ecosystem, assume the image tensors to be in WH format for greyscale and in WHC format for rgb.

Didn't check the code changes. I'm a little confused and I think you're suggesting the opposite. The whole JuliaImages are using column-major order, i.e., HW format

image

I'm removing the convert2features since it's just a reshape into a vector (or a matrix for a batch of images)

Removing it is fine, but there should be deprecations since this name is exported.

@johnnychen94 johnnychen94 requested a review from oxinabox March 17, 2020 11:12
@CarloLucibello
Copy link
Member Author

jeez, I got confused by the current docstring saying
"""
the first dimension corresponds to the pixel rows (x) of the
image, the second dimension to the pixel columns (y)
"""
which means HW (right?) while I get a rotated 9 here:

Schermata del 2020-03-17 13-44-32

@CarloLucibello
Copy link
Member Author

So Knet and Flux use WHCN, while Images uses HW. MLDatasets does WHCN although its docstrings say HWCN. What should I do here, just fix the docstring?

@johnnychen94
Copy link
Member

I think the docstring is correct, because unlike other languages, in julia row is the second dimension.

julia> x = reshape(1:16, 2, 8)./16
2×8 Array{Float64,2}:
 0.0625  0.1875  0.3125  0.4375  0.5625  0.6875  0.8125  0.9375
 0.125   0.25    0.375   0.5     0.625   0.75    0.875   1.0

julia> first(eachrow(x)) == x[1, :]
true

@CarloLucibello
Copy link
Member Author

wait, looking again at the docstring
"""
the first dimension corresponds to the pixel rows (x) of the
image, the second dimension to the pixel columns (y)
"""
I think it does mean WH although it is very confusing (at least to me) and should be rephrased

@johnnychen94
Copy link
Member

johnnychen94 commented Mar 17, 2020

Yeah, perhaps some emphasis on the layout difference would be sufficient.

P.S. I tried to not use WH/HW because I find myself always confused 😅

Copy link
Member

@oxinabox oxinabox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no review to add, I don't do images.
I trust people who do to review this better than I
could

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants