masks_to_bounding_boxes op #4290

Merged: 40 commits merged into pytorch:main on Sep 21, 2021

Conversation

@0x00b1 (Contributor) commented Aug 18, 2021

This (draft) pull request resolves #3960. I created a draft to kick off new-contributor on-boarding (e.g. the CLA).

I'm working on a gallery example now. I'll also add tests against different dtypes.

@facebook-github-bot

Hi @0x00b1!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@facebook-github-bot

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

@oke-aditya (Contributor)

Sorry for an early poke at this PR.

I think it would be nice to place the test in test_ops.py.

I'm not sure how it should be tested. My initial thought was that we could manually create boolean masks and test them like the ones in the utils.draw_segmentation_masks test?

Also, could the code be kept directly in boxes? No strong opinion here though.

@0x00b1 (Contributor, Author) commented Aug 19, 2021

Hi, @oke-aditya! I appreciate any and all feedback! 😄 My code organization was my preference but I am more than happy to adopt whatever convention you or any other maintainers prefer.

I will parametrize the dtype in the fixtures but other than that I think the unit test does its job. If you're curious, I used the random_shapes function from skimage.draw to create the fixtures. I was the original author of that function, or the code that would become that function, and its purpose was to solve the exact issue of writing unit tests for object localization methods. Ideally, in the future, if torchvision starts adding more of these types of operators it would be nice to port over that function and similar data generators from scikit-image to simplify writing these types of tests (e.g. extending the functionality to videos and volumes or arbitrary color spaces). It would also allow for some basic fuzzing.
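
(For illustration only, not part of this PR: a minimal sketch of how skimage.draw.random_shapes could seed such a fixture. It assumes the scikit-image behaviour where each returned label is a (shape_name, ((r0, r1), (c0, c1))) tuple; the exact end-index convention should be checked against the scikit-image docs.)

import torch
from skimage.draw import random_shapes

# Draw up to 5 non-overlapping random shapes on a 128x128 canvas.
image, labels = random_shapes((128, 128), max_shapes=5, min_size=10, allow_overlap=False)

# Each label carries the ground-truth bounding box of one shape, which could be
# compared against boxes recovered from a rasterized mask of that shape.
boxes = torch.tensor(
    [[c0, r0, c1, r1] for _, ((r0, r1), (c0, c1)) in labels],
    dtype=torch.float,
)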

@oke-aditya (Contributor)

if torchvision starts adding more of these types of operators it would be nice to port over that function and similar data generators from scikit-image to simplify writing these types of tests

Not sure if they fall into the same category, but there too we hardcoded some of the boxes and masks for images to test them.
Actually, we have a bunch of tests here (these are operators used for plotting boxes and masks), as well as many operators like box_iou and box_area where the boxes are predefined (but those are mostly mathematical, so probably OK). You can have a look at these; I would be glad to hear your thoughts!

My initial thought was that this function, too, would be tested in a similar way instead of drawing random shapes/boxes on an image.

P.S. I'm just a small contributor to the library (and a novice developer), so please don't mind.

@NicolasHug (Member) left a comment

Thanks a lot for the PR @0x00b1 and @oke-aditya for the review. I made a few comments and will look at the rest once this isn't draft anymore. Thanks for the initiative of writing a gallery example!!

Review comments on test/test_masks_to_bounding_boxes.py and torchvision/ops/_masks_to_bounding_boxes.py (outdated, resolved).

@pytest.fixture
def masks() -> torch.Tensor:
    with PIL.Image.open(os.path.join(ASSETS_DIRECTORY, "masks.tiff")) as image:

@NicolasHug (Member) commented:

Do you think it would be possible to write a test without the need for new images and hard-coded coordinates?

Ideally, we could generate random masks and have a super simple version of masks_to_boxes which we could use as the reference implementation?
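
(A rough sketch of what such a reference-implementation test could look like, not the test that was merged; the random rectangular masks and the function names below are made up for illustration.)

import torch
from torchvision.ops import masks_to_boxes

def reference_masks_to_boxes(masks):
    # Naive per-mask loop used only as a test oracle.
    boxes = []
    for mask in masks:
        ys, xs = torch.nonzero(mask, as_tuple=True)
        boxes.append(torch.stack([xs.min(), ys.min(), xs.max(), ys.max()]))
    return torch.stack(boxes).to(torch.float)

def test_masks_to_boxes_against_reference():
    torch.manual_seed(0)
    masks = torch.zeros((5, 32, 32), dtype=torch.bool)
    for mask in masks:
        y0, x0 = torch.randint(0, 16, (2,))
        h, w = torch.randint(1, 16, (2,))
        mask[y0:y0 + h, x0:x0 + w] = True  # every mask has at least one pixel
    # Cast to float so the comparison is agnostic to the op's output dtype.
    torch.testing.assert_close(masks_to_boxes(masks).float(), reference_masks_to_boxes(masks))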

@0x00b1 (Contributor, Author) replied:

Yep. I wrote about this elsewhere in the thread. I'd love to add a generator for various outputs similar to the function @goldsborough and I wrote for scikit-image (skimage.draw.random_shapes). However, would you mind if I did this in a follow-up commit?

@0x00b1 (Contributor, Author) replied:

@NicolasHug a friendly bump

@datumbox (Contributor)

@0x00b1 Just checking that you still plan to complete the PR. Please let me know :)

@0x00b1 (Contributor, Author) commented Aug 31, 2021

@datumbox Yep! I was on vacation last week (it was lovely) and started on-boarding at Facebook yesterday. I'll finish this today or tomorrow. Thanks, @NicolasHug for the comments!

@RylanSchaeffer

Can you generalize this to 3D images?

@datumbox (Contributor)

@RylanSchaeffer it's definitely worth discussing on a new issue. I would prefer if we did this in a separate PR to avoid blocking this one for longer.

@RylanSchaeffer

@datumbox I have two opinions. On one hand, I agree that not blocking this PR is good. On the other, a half solution means a fix to the complete problem will probably be delayed.

@NicolasHug (Member)

a half solution means a fix to the complete problem will probably be delayed

This PR isn't half a solution, it's a complete solution to a complete problem: 2D images.

3D images are a different problem, which we'll be happy to tackle at a future time once this PR is merged, as @datumbox suggested.

@RylanSchaeffer commented Aug 31, 2021

This PR isn't half a solution, it's a complete solution to a complete problem: 2D images.

That's a strange way to think about things. Imagine someone submitted a cross entropy loss implementation for a 2D array. By that metric, it's a complete solution to a complete problem, i.e. 1-dimensional classification. But look at the cross entropy loss implementation: it works for arbitrary dimensions, not just one, because we shouldn't be limited to a 2D array.


On this topic, the real problem is more general than 2D images. For us, the real problem is: given an N-dimensional segmentation mask, how do we convert the mask to N-dimensional bounding boxes? A solution for 2D is a partial solution.
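
(For illustration, a hypothetical sketch of the N-dimensional generalization being described, with boxes encoded as (min_1, ..., min_k, max_1, ..., max_k); this function is not part of this PR.)

import torch

def masks_to_nd_boxes(masks):
    # masks: (N, d1, ..., dk) boolean tensor; returns an (N, 2 * k) box tensor.
    n, *spatial = masks.shape
    k = len(spatial)
    boxes = torch.zeros((n, 2 * k), dtype=torch.float, device=masks.device)
    for index, mask in enumerate(masks):
        coords = torch.nonzero(mask)             # (num_foreground_voxels, k)
        boxes[index, :k] = coords.min(dim=0).values
        boxes[index, k:] = coords.max(dim=0).values
    return boxes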

@NicolasHug (Member)

@RylanSchaeffer , just because we can generalize a problem doesn't mean that one problem is less "real" or "complete" than the other. 2D masks are a normal use-case that a lot of people have and solving this will be valuable on its own.

When it comes to software development, a merged PR that solves one problem is worth more than an unmerged PR that solves 2 problems.

Again, we'll be happy to consider an extension if the use-case is compelling.

@RylanSchaeffer

Ok issue opened! #4339

@datumbox (Contributor)

@RylanSchaeffer In addition to what Nicolas said, and for full transparency, here are some reasons why we often choose not to go straight for the most generic/complicated implementation:

  1. We often create bite-sized issues to help on-board new contributors and new members of the team to the code-base. Limiting the scope can help keep the work manageable.
  2. We might have an urgent need to cover a specific, limited use-case, or there might be a time constraint to release a feature.
  3. We might not be certain about some technical parts of the generic implementation, and additional discussion may be required.

I think it's worth continuing the discussion of how this can be made generic on the new issue that you opened.

@0x00b1 Welcome back. Sounds great; if you have any issues with the CI, let us know and we can help.

@0x00b1 (Contributor, Author) commented Aug 31, 2021

@RylanSchaeffer I agree that this should come in a future PR. But don't worry, 3D or n-D images are something I care about too so I'm more than happy to do that work.

Repurposing annotations
=======================

The following example illustrates the operations available in :ref:`the torchvision.ops module <ops>` for repurposing

@oke-aditya (Contributor) commented Sep 20, 2021:

After some debugging I found out the reason for the build_docs CI failure. The problem is that torchvision.ops does not have an index entry on the right side (basically an HTML link for #ops, like transforms has). This causes the CI failure.

We need to remove the ref and it will work fine. This is a slightly hacky fix, but it works.
I tried running it locally and was able to build the gallery example. It looks nice.

Suggested change
The following example illustrates the operations available in :ref:`the torchvision.ops module <ops>` for repurposing
The following example illustrates the operations available in the torchvision.ops module for repurposing

@0x00b1 (Contributor, Author) replied:

Nice! I appreciate the debugging.

@oke-aditya (Contributor) left a comment

Hey Allen, you need to add docs in docs/ops.rst, where you can use:

.. autofunction:: masks_to_boxes

This will add docs for this code.

@datumbox (Contributor) left a comment

@0x00b1 sorry for the back and forth. Adding an operator is possibly one of the most complex things as one needs to add many things across many files. I think we are almost there to merge. Let me summarize the comments that I think remain unresolved:

  1. Address the docs failure as described here: https://github.com/pytorch/vision/pull/4290/files#r712457642
  2. We missed one use of numpy vs torch. Just copy paste what you got on the examples and we should be good to go. https://github.com/pytorch/vision/pull/4290/files#r712884767
  3. Add the masks_to_boxes in docs as described here: masks_to_bounding_boxes op #4290 (review)

@0x00b1 (Contributor, Author) commented Sep 21, 2021

Hey Allen, you need to add docs in docs/ops.rst, where you can use:

.. autofunction:: masks_to_boxes

This will add docs for this code.

Nice catch! Fixed.

@0x00b1 (Contributor, Author) commented Sep 21, 2021

@0x00b1 sorry for the back and forth. Adding an operator is possibly one of the most complex things as one needs to add many things across many files. I think we are almost there to merge. Let me summarize the comments that I think remain unresolved:

  1. Address the docs failure as described here: https://github.com/pytorch/vision/pull/4290/files#r712457642
  2. We missed one use of numpy vs torch. Just copy paste what you got on the examples and we should be good to go. https://github.com/pytorch/vision/pull/4290/files#r712884767
  3. Add the masks_to_boxes in docs as described here: #4290 (review)

It's no problem dude! I sincerely appreciate your and @oke-aditya's patience! In the future, it might be worth investigating whether someone should add a cookiecutter or cookiecutter-like method for generating op scaffolding.

@0x00b1 (Contributor, Author) commented Sep 21, 2021

@datumbox OK. Everything has been addressed. Hopefully we don't see any CI failures!

I would also be more than happy to squash these commits down.

Review comment on torchvision/ops/boxes.py (outdated, resolved).
@0x00b1 marked this pull request as ready for review on September 21, 2021 at 18:42

@datumbox (Contributor) left a comment

LGTM, thanks a lot @0x00b1. Congrats on your first contribution. :)

@datumbox merged commit f0422e7 into pytorch:main on Sep 21, 2021
@0x00b1 deleted the issues/3960 branch on September 21, 2021 at 23:15

@NicolasHug (Member) commented Sep 22, 2021

Hi @0x00b1 ,

Thank you for the great work on this PR!
I'm sorry I wasn't able to make a last pass; I think I missed the point where this PR got un-drafted (#4290 (review)).

I only have 2 remaining comments at this point:

  • Would it be possible to not rely on a hard-coded image for the tests, as suggested in masks_to_bounding_boxes op #4290 (comment) (sorry again I missed your ping)? It would help make the tests more robust, and also avoid storing files in the repo, which can end up bloated. The PR was merged already so the file will be included anyway I guess, but this can still help when we make shallow clones of the repo.
  • Would it be possible to use the draw_segmentation_masks and draw_bounding_boxes utilities in the example, as suggested by @oke-aditya in masks_to_bounding_boxes op #4290 (comment)? (A rough sketch follows after this comment.) It would likely simplify the example and trim it down to its essential part, the new masks_to_boxes operator, instead of having lots of plotting code. It would also help users discover these plotting tools, which might come in useful in other scenarios.

Would you or @oke-aditya be interested in a follow-up PR with these? The first point might be a bit trickier, but the second one should be reasonably simple. We can do them in separate PRs. Thanks!
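
(A rough sketch of what the second suggestion could look like, not the actual gallery code; the placeholder image and masks below are made up, and in the real example they would come from a dataset such as PennFudan.)

import torch
from torchvision.ops import masks_to_boxes
from torchvision.utils import draw_segmentation_masks, draw_bounding_boxes

img = torch.randint(0, 256, (3, 240, 320), dtype=torch.uint8)  # placeholder uint8 image
masks = torch.zeros((2, 240, 320), dtype=torch.bool)           # placeholder boolean masks
masks[0, 50:100, 60:120] = True
masks[1, 120:200, 150:260] = True

boxes = masks_to_boxes(masks)
with_masks = draw_segmentation_masks(img, masks, alpha=0.7)
with_boxes = draw_bounding_boxes(with_masks, boxes, width=2)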

@oke-aditya (Contributor)

I'm fine with either. I will leave the choice to @0x00b1.

@oke-aditya (Contributor)

Also, another thought about the gallery example.
Another example that could be added is to show how a simple segmentation dataset can be rewritten as a detection dataset, as pointed out in #3960. This is a common use of masks_to_boxes.

Adding this example will help users easily convert datasets such as PennFudan / Panoptic to detection.
This might look simple, but let's keep an example to help users.

from torch.utils.data import Dataset
from torchvision.ops import masks_to_boxes, box_convert

class SegmentationToDetectionDataset(Dataset):
    def __getitem__(self, idx):
        # segmentation_masks: boolean (num_objects, H, W) tensor loaded for this sample
        boxes_xyxy = masks_to_boxes(segmentation_masks)

        # Convert the boxes to COCO (xywh) format.
        boxes_xywh = box_convert(boxes_xyxy, in_fmt="xyxy", out_fmt="xywh")
        return boxes_xywh


n = masks.shape[0]

bounding_boxes = torch.zeros((n, 4), device=masks.device, dtype=torch.int)

@oke-aditya (Contributor) commented Sep 23, 2021:

My initial thought was that the dtype should be torch.float, since all the other ops use a float dtype.

cc @datumbox @NicolasHug

A Contributor replied:

Agreed. Also, the zeros above needs to have a device:
torch.zeros((0, 4), device=masks.device)

Could you please send a PR that fixes these 2 issues? The rest of the doc/test improvements discussed here can happen on a separate PR.
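
(A sketch of what the two requested fixes might look like in masks_to_boxes; this is an illustration, close to but not necessarily identical to the actual follow-up PR.)

import torch

def masks_to_boxes(masks):
    # Degenerate case: no masks at all -> empty (0, 4) box tensor on the right device.
    if masks.numel() == 0:
        return torch.zeros((0, 4), device=masks.device, dtype=torch.float)

    n = masks.shape[0]
    # float dtype to match the convention used by the other box ops.
    bounding_boxes = torch.zeros((n, 4), device=masks.device, dtype=torch.float)
    for index, mask in enumerate(masks):
        y, x = torch.where(mask != 0)
        bounding_boxes[index, 0] = torch.min(x)
        bounding_boxes[index, 1] = torch.min(y)
        bounding_boxes[index, 2] = torch.max(x)
        bounding_boxes[index, 3] = torch.max(y)
    return bounding_boxes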

facebook-github-bot pushed a commit that referenced this pull request Sep 30, 2021
Summary:
* ops.masks_to_bounding_boxes

* test fixtures

* unit test

* ignore lint e201 and e202 for in-lined matrix

* ignore e121 and e241 linting rules for in-lined matrix

* draft gallery example text

* removed type annotations from pytest fixtures

* inlined fixture

* renamed masks_to_bounding_boxes to masks_to_boxes

* reformat inline array

* import cleanup

* moved masks_to_boxes into boxes module

* docstring cleanup

* updated docstring

* fix formatting issue

* gallery example

* use torch

* use torch

* use torch

* use torch

* updated docs and test

* cleanup

* updated import

* use torch

* Update gallery/plot_repurposing_annotations.py

* Update gallery/plot_repurposing_annotations.py

* Update gallery/plot_repurposing_annotations.py

* Autodoc

* use torch instead of numpy in tests

* fix build_docs failure

* Closing quotes.

Reviewed By: datumbox

Differential Revision: D31268025

fbshipit-source-id: 65f88779516ff0a411600a25b783f00369d56719

Co-authored-by: Aditya Oke <[email protected]>
Co-authored-by: Aditya Oke <[email protected]>
Co-authored-by: Aditya Oke <[email protected]>
Co-authored-by: Vasilis Vryniotis <[email protected]>
Co-authored-by: Aditya Oke <[email protected]>
Successfully merging this pull request may close these issues: Ops to convert masks to boxes (#3960).