make processing of arbitrary inputs to transforms.v2 public and document it #8721

Closed
liopeer opened this issue Nov 11, 2024 · 3 comments · Fixed by #8787
@liopeer
liopeer commented Nov 11, 2024

🚀 The feature

Supporting arbitrary input structures in custom transforms is very important in the case of transform compositions:

tr = Compose([RandomCrop((128, 128)), CustomTransform()])

This can be done by inheriting from torchvision.transforms.v2.Transform and implementing the private ._transform method, which avoids having to unravel the data structure on your own (since this is done anyway in the .forward method).

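from torchvision.transforms.v2 import Transform
from torchvision.tv_tensors import BoundingBoxes, Image

class CustomTransform(Transform):
  def __init__(self, **kwargs):
    super().__init__()
  def _transform(self, inpt, params):
    # Dispatch on the type of each input item.
    if isinstance(inpt, Image):
      transformed_inpt = inpt  # image-specific logic goes here
    elif isinstance(inpt, BoundingBoxes):
      transformed_inpt = inpt  # bounding-box-specific logic goes here
    else:
      transformed_inpt = inpt  # pass everything else through unchanged
    return transformed_inpt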
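
To illustrate what "unraveling the data structure" means here, below is a minimal pure-Python sketch of the idea. It is not torchvision's implementation and does not import torchvision: `SketchTransform`, `FakeImage`, `FakeBoxes`, and `ShiftRight` are hypothetical names invented for this example, and the real `Transform.forward` may differ in its details. The sketch only assumes that a base class walks the nested input, samples parameters once, and calls a per-item method on each leaf.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class FakeImage:
    """Hypothetical stand-in for an image type (a single row of pixels)."""
    pixels: List[int]


@dataclass
class FakeBoxes:
    """Hypothetical stand-in for bounding boxes (x-coordinates only)."""
    coords: List[int]


class SketchTransform:
    """Toy base class: walks the input structure and calls transform() per leaf."""

    def make_params(self) -> Dict[str, Any]:
        return {}

    def transform(self, inpt: Any, params: Dict[str, Any]) -> Any:
        raise NotImplementedError

    def __call__(self, *inputs: Any) -> Any:
        sample = inputs if len(inputs) > 1 else inputs[0]
        params = self.make_params()  # sampled once, shared by every leaf
        return self._walk(sample, params)

    def _walk(self, obj: Any, params: Dict[str, Any]) -> Any:
        # Recurse into containers and rebuild them with transformed leaves.
        if isinstance(obj, dict):
            return {k: self._walk(v, params) for k, v in obj.items()}
        if isinstance(obj, (list, tuple)):
            return type(obj)(self._walk(v, params) for v in obj)
        return self.transform(obj, params)


class ShiftRight(SketchTransform):
    """Shifts images and boxes by the same offset; leaves other items alone."""

    def make_params(self) -> Dict[str, Any]:
        return {"dx": 2}

    def transform(self, inpt: Any, params: Dict[str, Any]) -> Any:
        dx = params["dx"]
        if isinstance(inpt, FakeImage):
            kept = inpt.pixels[: len(inpt.pixels) - dx]
            return FakeImage([0] * dx + kept)
        if isinstance(inpt, FakeBoxes):
            return FakeBoxes([x + dx for x in inpt.coords])
        return inpt  # e.g. labels pass through unchanged


sample = {"image": FakeImage([1, 2, 3, 4]), "target": (FakeBoxes([0, 1]), 7)}
out = ShiftRight()(sample)
print(out)
```

Because the subclass only implements the per-item method, the same transform works whether it receives a dict, a tuple, or several positional arguments, which is the property that makes composition with other transforms practical.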

The method has also been described in the blog post "How to Create Custom Torchvision V2 Transforms", but the official torchvision docs do not yet describe it and instead suggest hard-coding the input structure.

Having to implement a private method for this (even though the Transform class is public) feels wrong: it means that things could break on our side at any time. I would appreciate it if the ._transform method were made public (as .transform) and the Transform class received proper documentation on how this method should be implemented for custom transforms.

Motivation, pitch

The torchvision.transforms.v2 API has now been around for quite some time already and it would be nice to give developers the chance to develop transforms of the same quality and flexibility as the originally implemented ones!

Alternatives

No response

Additional context

No response

@lparolari

Following!

@NicolasHug
Member

Thanks for the feature request, @liopeer. That's fair; I think the existing design has been in use for long enough that we should be comfortable making it public. I'll try to expose it before the next release. I don't guarantee that there will be nice docs for that just yet, but making those public should be fine.

@NicolasHug
Member

Done in #8787, this will be available with the next release, early next year.

> I don't guarantee that there will be nice docs

I ended up writing a basic tutorial for that.


3 participants