-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add GIF decoder #8406
Add GIF decoder #8406
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/8406
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit b8e9427 with merge base 1644fff (): This comment was automatically generated by Dr. CI and updates every 15 minutes. |
@@ -9,7 +9,7 @@ eval "$($(which conda) shell.bash hook)" && conda deactivate && conda activate c | |||
|
|||
echo '::group::Install testing utilities' | |||
# TODO: remove the <8 constraint on pytest when https://github.com/pytorch/vision/issues/8238 is closed | |||
pip install --progress-bar=off "pytest<8" pytest-mock pytest-cov expecttest!=0.2.0 | |||
pip install --progress-bar=off "pytest<8" pytest-mock pytest-cov expecttest!=0.2.0 requests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needed for newly added test.
libraries=image_link_flags, | ||
extra_compile_args=extra_compile_args, | ||
) | ||
ext_modules.append( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are now always building the image
library even if png and jpeg haven't been found, since we can always build libgif anyways. Maybe we'll add a WITH_LIBGIF
flag or something similar, but this can come later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to add a TORCHVISION_USE_GIF
env var similar to all other image backends for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious why you'd need it @adamjstewart ?
Those are useful for the other libraries as we dynamically link against them and they introduce dependencies, but since we vendor libgif we don't have the same issues with the gif decoder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm coming from the perspective of a maintainer of a from-source package manager: Spack. We need to be able to build packages like torchvision from source on all kinds of weird systems (Linux aarch64, ppc64le, etc.) and with compilers (OneAPI, AOCC, Fujitsu, etc.) for which they are not widely tested. This requires patching these packages to fix compilation errors. For example, our giflib recipe already has multiple patches required for certain systems. Vendoring results in us having to patch multiple locations instead of one. I'm not opposed to it being optionally vendored, but would love support for externally-installed copies too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be useful to have an option to use the shared library. We maintain things at conda-forge for shared linking which would make this easier to integrated.
One bug i'm running into (with the current startegy) is that clang for example likes to complain that -std-c++17
is added to compilation of a c file....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adam, you can likely take my small patches to unvendor
conda-forge/torchvision-feedstock#94
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would love to see these merged upstream so we don't need to maintain a separate set of patches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they would definitely need more work to be submitted here
# and torchvision decoded outputs are equal. | ||
# We're not testing against "welcome2" because PIL and GIFLIB disagee on what | ||
# the background color should be (likely a difference in the way they handle | ||
# transparency?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the curious, top row is what giflib / torchvision is decoding, bottom row is what PIL is decoding (PIL doesn't use giflib and has its own decoder). There seem to be a disagreement in what the background color should be. When I open the file on a file viewer, that background is marked as "transparent" anyways. I don't think it's worth worrying about that, at least for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is PIL using internally to decode gif files? I wonder whether it would be more interesting to (re-)implement PIL's version of gif decoding instead of vendoring giflib here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of the PIL GIF decoder is here: https://github.com/python-pillow/Pillow/blob/58a47978af9f34851ce926303d05fa677010ce2a/src/libImaging/GifDecode.c
But I think there are other parts elsewhere because I suspect this part only decodes the file into a "colormap" image, i.e. the values aren't RGB, only indices pointing to values to the image's colormap. I'm not sure where the other part is, I didn't look.
I wonder whether it would be more interesting to (re-)implement PIL's version of gif decoding instead of vendoring giflib here?
I wouldn't want to re-implement a GIF decoder on our own, there is simply too much room for us to make mistakes. It is a lot safer and simpler to rely on GIFLIB which has been around for decades, and is still maintained to this day (https://sourceforge.net/p/giflib/code/ci/master/tree/NEWS).
We could vendor the PIL decoder instead of vendoring GIFLIB but I don't see how this would simplify anything.
# transparency?) | ||
|
||
path = tmpdir / f"{name}.gif" | ||
url = f"https://sourceforge.net/p/giflib/code/ci/master/tree/pic/{name}.gif?format=raw" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have avoided downloading test stuff from the internet in the past because it was tricky to make those work on the internal CI, but I'll just skip those tests internally and add something else.
This avoids having weak tests and polluting the repo with image test files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can also create random gif files using PIL for the internal CI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might eventually do that, but this won't be a problem for a while. We just need to focus on the GitHub CI for this PR, I'll handle the internal stuff later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to review any file in cpu/giflib
, except maybe for this one.
// TODO: We are potentially doing an unnecessary copy of the encoded bytes: | ||
// - 1 copy in from file to tensor (in read_file()) | ||
// - 1 copy from tensor to GIFLIB buffers (in read_from_tensor()) | ||
// Since we're vendoring GIFLIB we can potentially modify the calls to | ||
// InternalRead() and just set the `buf` pointer to the tensor data directly. | ||
// That might even save allocation of those buffers. | ||
// If we do that, we'd have to make sure the buffers are never written to by | ||
// GIFLIB, otherwise we'd be overridding the tensor data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving this out for now, we can try to address it if gif decoding ever becomes a bottleneck...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NicolasHug the code looks OK to me, except maybe reader_helper_t
to use simple datatypes... As for CL output, this may be done in a follow-up PR...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the noob comments.
// This reads `len` bytes and writes them into `buf`. The data is read from the | ||
// input tensor passed to decode_gif() starting at the `num_bytes_read` | ||
// position. | ||
int read_from_tensor(GifFileType* gifFile, GifByteType* buf, int len) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sizes and lengths should ideally be size_t not ints
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately this function's signature is enforced by GIFLIB, so I'm afraid we can't change it? https://sourceforge.net/p/giflib/code/ci/master/tree/dgif_lib.c#l33
(it forces us to cast below in std::min
which is annoying!)
|
||
auto options = torch::TensorOptions() | ||
.dtype(torch::kU8) | ||
.memory_format(torch::MemoryFormat::ChannelsLast); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why this is ChannelsLast while below and in the loop the channels are before height and width? I am likely missing something. Could you explain in a code comment so other readers can also understand better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offline the shape of the tensor is NCHW, but its format/layout is channels-last i.e. NHWC. Below we are using an accessor on the tensor data, whose indexing follows that of the shape i.e. [n][c][h][w]
but the underlying operation still respects the channels-last format. If it didn't, I think our correctness tests would fail hard (and maybe even we'd have a segfault since W >> C
?)
@vfdev-5 can you confirm that the accessor out_a
properly respects the channels-last attribute?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, accessor
is respecting tensor strides:
https://github.com/pytorch/pytorch/blob/7864d287a1e56685aa754285cc2d3c31ff055f62/aten/src/ATen/core/TensorAccessor.h#L97-L109
out = out.squeeze(0); // remove batch dim if there's only one image | ||
|
||
DGifCloseFile(gifFile, &error); | ||
TORCH_CHECK(error == D_GIF_SUCCEEDED, "DGifCloseFile() failed - ", error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use TORCH_CHECK_EQ here so when it fails it prints both the expected and actual value in the error message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't able to useTORCH_CHECK_EQ
while still provinding a custom error message 🤔 , not sure that is supported.
TORCH_CHECK_EQ lead to a segfault. According to https://github.com/pytorch/pytorch/blob/a89177936cc46c476f73a555f499a89ee4c3aa01/c10/util/Exception.h#L421-L439, TORCH_CHECK is the right util to use here.
reader_helper.encoded_data_size = encoded_data.numel(); | ||
reader_helper.num_bytes_read = 0; | ||
GifFileType* gifFile = | ||
DGifOpen(static_cast<void*>(&reader_helper), read_from_tensor, &error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ahmadsharif1 I ended up using a static cast here as well - is that OK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @NicolasHug !
Hey @NicolasHug! You merged this PR, but no labels were added. |
Reviewed By: vmoens Differential Revision: D57099453 fbshipit-source-id: 0e85ec0c92cc4e2ee06b5d4183fedc639f38dec0
Reviewers: DON'T PANIC this PR is a lot simpler than it looks. It's not 2.6k locs, only ~300 loc.
This PR adds the ability to decode GIF files via the
decode_gif()
function.We are vendoring GIFLIB i.e. I copy/pasted the relevant giflib files into
torchvision/csrc/io/image/cpu/giflib
- this is why the PR looks big, but there's no need to review these files. GIFLIB is MIT licensed so we can do that. Vendoring avoids the dependency nightmare that libjpeg[-turbo] and libpng have been. I don't think giflib even exists on conda (except onconda-forge
but even if it did, vendoring would still be easier.Benchmark
TL;DR Sometimes it's faster than PIL (2x at most), somtimes it's slower (2x at most)
Benchmark script:
Benchmark results: