Transforms with nested tensor #7761

Open
agunapal opened this issue Jul 26, 2023 · 3 comments

@agunapal

🚀 The feature

For batched inference on images of different sizes, we currently need to do the following:

  • Resize each image to the same size and convert to a tensor
  • Stack the batch of tensors
  • Do further image transformations on the batched tensors
  • Run inference
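
The current workflow above can be sketched roughly as follows. This is a minimal illustration, not torchvision's implementation: `resize_and_stack` is a hypothetical helper, the image sizes and normalization values are placeholders, and `torch.nn.functional.interpolate` is called directly in place of a `Resize` transform.

```python
import torch
import torch.nn.functional as F


def resize_and_stack(images, size):
    """Resize each (C, H, W) image to `size`, then stack into (N, C, *size).

    Hypothetical helper for illustration only.
    """
    resized = [
        # interpolate expects a batch dimension, so add one and drop it again
        F.interpolate(
            img.unsqueeze(0), size=size, mode="bilinear", align_corners=False
        ).squeeze(0)
        for img in images
    ]
    return torch.stack(resized)


# Two images of different spatial sizes (placeholder data)
images = [torch.rand(3, 32, 48), torch.rand(3, 64, 40)]
batch = resize_and_stack(images, size=(24, 24))

# Further transforms can now run on the whole batch at once,
# e.g. per-channel normalization (placeholder statistics)
mean = torch.tensor([0.5, 0.5, 0.5]).view(1, 3, 1, 1)
std = torch.tensor([0.25, 0.25, 0.25]).view(1, 3, 1, 1)
batch = (batch - mean) / std
```

Only the resize step has to run per image here; everything after `torch.stack` operates on the full batch.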

It would be nice to do the following instead:

  • Create a nested tensor of images of different sizes
  • Run transformations, including resizing, on the nested tensor
  • Run inference
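
For reference, the first step of this proposal (building the nested tensor) is already possible with the prototype `torch.nested` API. The sketch below assumes a recent PyTorch version and uses placeholder image sizes; it only shows construction and inspection, since transforms operating directly on the nested tensor are exactly what this issue requests.

```python
import torch

# Two images with different heights and widths (placeholder data)
images = [torch.rand(3, 32, 48), torch.rand(3, 64, 40)]

# Pack them into a single nested tensor; no padding or resizing is needed
nt = torch.nested.nested_tensor(images)

# Each entry keeps its own shape and can be recovered with unbind()
shapes = [tuple(t.shape) for t in nt.unbind()]
```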

Motivation, pitch

This would improve image pre-processing performance.

Alternatives

  • Resize each image to the same size and convert to a tensor
  • Stack the batch of tensors

Additional context

No response

@AnimeshMaheshwari22

Hello @agunapal, this is quite interesting! This will be a part of transforms, right?

@agunapal
Author

If this is possible, yes. This would need to be supported by transforms in order to address the pre-processing bottleneck in inference.

@NicolasHug
Member

Hi @agunapal, thanks for the feature request.

I understand that in general, processing input in batches makes the transforms faster. And I also acknowledge that passing batches of images to Resize is pretty much impossible as-is, because, well, we can't batch images of different sizes.

So that's where NestedTensor comes in, as it provides a nice UX to manipulate tensors of different sizes. But unfortunately, I'm afraid NestedTensor won't help regarding perf. There aren't a lot of torch operations that natively support NestedTensor, and in particular torch.nn.functional.interpolate isn't one of them: that's what Resize() relies on. So technically, even if there were support for NestedTensor in torchvision, we wouldn't be able to do much more than manually loop over the entries of the NestedTensor and pass them one-by-one to interpolate().
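
To make that concrete, here is a rough sketch of what such a manual loop could look like. `resize_nested` is a hypothetical helper, not an existing torchvision function, and the sizes are placeholders; the point is that the resize still runs once per image, so no batching speedup is gained.

```python
import torch
import torch.nn.functional as F


def resize_nested(nt, size):
    """Resize every entry of a nested tensor and return a regular batch.

    Hypothetical helper: it loops in Python because interpolate()
    does not accept nested tensors directly.
    """
    resized = []
    for img in nt.unbind():  # one (C, H, W) tensor per entry
        out = F.interpolate(
            img.unsqueeze(0), size=size, mode="bilinear", align_corners=False
        )
        resized.append(out.squeeze(0))
    # All entries now share a shape, so a dense batch is possible
    return torch.stack(resized)


nt = torch.nested.nested_tensor([torch.rand(3, 32, 48), torch.rand(3, 64, 40)])
batch = resize_nested(nt, size=(24, 24))
```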

Regarding the UX: I tried to see if our V2 transforms could natively support NestedTensor. It's pretty much the same story as for TensorDict (#7763): NestedTensor doesn't integrate too well with pytree, so right now nothing really works. Maybe the NestedTensor devs would be open to integrating it with pytree?
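
For context, here is a small sketch of the kind of pytree traversal that works today with plain Python containers. It is an illustration under assumptions, not the V2 transforms code: `torch.utils._pytree` is a private module that may change, and `resize_leaf` and the `sample` structure are hypothetical.

```python
import torch
import torch.nn.functional as F
from torch.utils._pytree import tree_map  # private module; may change


def resize_leaf(x, size=(24, 24)):
    """Resize (C, H, W) tensor leaves; pass every other leaf through."""
    if isinstance(x, torch.Tensor) and x.ndim == 3:
        return F.interpolate(
            x.unsqueeze(0), size=size, mode="bilinear", align_corners=False
        ).squeeze(0)
    return x


# A plain list of differently sized images nested inside a dict
sample = {"images": [torch.rand(3, 32, 48), torch.rand(3, 64, 40)], "label": 7}

# tree_map visits each leaf and rebuilds the same container structure
out = tree_map(resize_leaf, sample)
```

A list of tensors is traversed leaf by leaf here, which is the behavior a NestedTensor would need to expose in order to slot into the same mechanism.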

(Thinking about all this made me open #7774, which is partially related to what you want to do. But I still don't think it will be a silver bullet, sorry :/ )

3 participants