Add utility to draw bounding boxes #2785
Conversation
Current caveats. I need help here @pmeier @fmassa
I would like to know the best way to handle this.
Thanks for the PR @oke-aditya! About your questions:
- IMO this basically reduces to: do we allow batch processing or not. Given that we can't parallelize this due to our PIL usage, I'd say we don't allow it. Passing batches would internally mean we use a for loop anyway, and thus the user has no real advantage. I think we can simply try to squeeze(0) if we encounter an image with 4 dimensions and handle the error if this fails (see the sketch below).
- Yes, we should return the tensor. Why wouldn't we, or more importantly: why would a user use this function if he gets no results back?
- It can only be transparent if we use a fourth channel for the alpha, i.e. change the image from RGB to RGBA. I'm not sure if this is a good idea. Is it common to fill the bounding boxes? I've never seen it before, which does not mean it does not exist. Otherwise I see no point in adding a fill option.
- I agree testing this will not be easy, but I think we can cover some basic stuff. For example you can start with an all-white image, draw a bounding box and check if only the pixels you would expect are changed. You can also test the colors with this.

In addition to your questions, I have some other remarks below. The linter is also failing, but let's postpone this until we have the functionality right.
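A minimal sketch of the squeeze(0) idea from the first bullet; the helper name and error messages are hypothetical, not part of the PR:

```python
import torch

def _ensure_single_image(image: torch.Tensor) -> torch.Tensor:
    # tolerate a 4-dim tensor only when the batch dimension is 1
    if image.dim() == 4:
        if image.size(0) != 1:
            raise ValueError(f"Expected a single image, got a batch of size {image.size(0)}")
        image = image.squeeze(0)
    if image.dim() != 3:
        raise ValueError(f"Expected a 3-dim (C, H, W) tensor, got {image.dim()} dims")
    return image
```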
torchvision/utils.py
Outdated
if colors is None:
    draw.rectangle(bbox, width=width)
else:
    draw.rectangle(bbox, width=width, outline=colors[label])
I think it would be more clear to set colors = {} if it is None and use colors.get(label) here. With this we would have no branching and would use the default color if a label is not included in colors.
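A minimal sketch of that suggestion; bboxes, labels, draw and width are assumed from the surrounding diff:

```python
colors = {} if colors is None else colors
for bbox, label in zip(bboxes, labels):
    # .get() returns None for labels without an entry, which behaves exactly like
    # the colors-is-None branch above (PIL's default outline), so no branching is needed
    draw.rectangle(bbox, width=width, outline=colors.get(label))
```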
The colors parameter is optional, and white boxes will be drawn if it is None.
colors is a very tricky parameter: PIL will throw an error for an unsupported color.
If we have to handle unsupported colors, we might need to catch the exception from PIL and then revert to a default color.
We could use getrgb() to parse the colors before we enter the loop and handle the exception there. In general, we should not restrict colors to strings, but also allow int triplets.
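For example, something along these lines could validate the colors up front (helper name hypothetical):

```python
from PIL import ImageColor

def _parse_colors(colors):
    parsed = []
    for color in colors:
        if isinstance(color, str):
            # raises ValueError for unknown color names before any drawing happens
            parsed.append(ImageColor.getrgb(color))
        else:
            # assume an (R, G, B) int triplet and pass it through
            parsed.append(tuple(color))
    return parsed
```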
torchvision/utils.py
Outdated
else:
    if draw_labels is True:
        draw.text((bbox[0], bbox[1]), label_names[int(label)])
Similar to above: can we maybe use dict.get() to assign a default label if it is not present in label_names?
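For instance, assuming label_names becomes a dict keyed by class id (a hypothetical change, with names taken from the diff above):

```python
# fall back to the raw class id when a label has no entry in label_names
name = label_names.get(int(label), str(int(label)))
draw.text((bbox[0], bbox[1]), name)
```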
Hey @pmeier, I also gave some thought to the points above. Let me summarize a few pointers.
I guess there is a lot of scope for these improvements, but we should first have ...
I agree with 1.-3. IMO we should include the proper color handling in the original PR. The automatic line width adaptation as well as the label drawing could be done in separate PRs.
Codecov Report
@@ Coverage Diff @@
## master #2785 +/- ##
==========================================
- Coverage 73.22% 73.13% -0.10%
==========================================
Files 96 96
Lines 8446 8473 +27
Branches 1320 1329 +9
==========================================
+ Hits 6185 6197 +12
- Misses 1859 1868 +9
- Partials 402 408 +6
Continue to review full report at Codecov.
@pmeier I have handled the image batch issue now, but there are a few doubts. I will try to fix up the colors in this PR. There are a lot of edge cases to think through. Below are my thoughts.
The same doubt applies to label names.
The problem arises when case 2 for colors and case 2 for label names occur together, because this leads to some unexpected conditions. I feel a user should either provide everything needed or nothing at all. We can raise errors to ensure the user passes correct content.
Hi @oke-aditya
Thanks for the PR!
I've left a few comments, let me know what you think.
On a more high level, I think we are going in the right direction, but we should think a bit more about a few aspects of the API so that we can cover most use-cases for the users.
Also, I wouldn't want to rush this into the 0.8.0 release, as the deadline is fast approaching; I think it would be better to have it in the next release (0.9.0).
Let me know what you think.
I think the same too.
Let me write here how this API works with torchvision models right now
Yes, let's chat a bit more about this after the release. I'm pretty busy today preparing the branch cut etc., so I won't have much time to iterate today, but I gave this function a try and I faced a few issues / problems that I enumerate below.

I'll check back on this PR by the end of the week, but here are a few things I would like us to think about:

Labels

Now we need to pass two tensors for printing the labels. What I was originally thinking was to let the user directly specify the labels for each box, so that they can do arbitrary customization (including scores). This way, we don't need to expose a separate label_names argument:

    # let the user create the description for each box
    description = [f'{CLASSES[cls_id]} : {s:02f}' for cls_id, s in zip(labels, scores)]
    # now just pass it to the visualization function
    draw_bounding_boxes(images, boxes, description)

I'm not sure how much of a hassle for the user it would be to do this one-liner, but at least it makes things more generic. Thoughts?
Let's not hurry over this. We can work on this after the release 😄
For the labels API which you proposed, I think the following:

I guess the trade-off question is: is this API meant to be plug-and-play with torchvision models, or a generic one ❓ I had in mind that it is plug-and-play (hence supporting images with 4 dims and allowing detection model outputs to be passed directly). I guess there is a lot to discuss over this (maybe I have misunderstood something). Let's catch up on this after the release.
Hi @oke-aditya

Sorry for the delay in getting back to you.

In general, I still think that it is preferable to have a slightly more generic function, even if it requires the user to be a bit more verbose. I think it's an ok trade-off to make, as it would allow more use-cases for the function. The function would become something like ...

If you don't have enough bandwidth to work on this refactoring for now it's ok, @datumbox agreed to help and to build on top of your PR so that we can get this functionality merged in torchvision soon. Otherwise @datumbox will help co-design / review this PR.

Let us know what you think.
Hi @fmassa. I think this will take some significant changes and refactoring. I think @datumbox will have something great in mind. Two people handling this might slow it down.
hi @oke-aditya, it's completely up to you! We think that this PR is useful and we would like to merge it soon. If you think you have the time to make the changes discussed above, I'm happy to support. Else I'll take your PR and make the necessary changes, so that we can merge it ASAP. Just don't close your branch because I plan to make the changes in-place. Let me know! :)
Looks like all of us want this 😄. @datumbox You can go ahead 🚀. I won't close or delete this branch. Let me know if you need access to this fork, etc. Super eager to see this PR getting into master.
Thanks!
I've made a few comments, let me know what you think
torchvision/utils.py
Outdated
colors: Optional[List[str]] = None,
labels: Optional[List[str]] = None,
width: int = 1,
font: Optional[ImageFont] = None
Given that we won't be using this function in torchscript, I'm ok with the input type of the function being PIL-specific.
I'm not terribly excited about this TBH:
- On one hand, the method receives a uint8 tensor as input (not a PIL image) and completely hides any dependency on PIL. I would agree with your earlier comments that it's a bit odd that we expose ImageFont here.
- On the other hand, using PIL's ImageFont gives the user the flexibility to do whatever they want without us having to deal with the details of how to instantiate the object. It surely is ugly though and makes for a weird API.

I could try to create a font parameter similar to PIL's, with description "A filename or file-like object containing a TrueType font.", and a font_size parameter. Thoughts?
Have a look at the latest commit for an alternative to passing ImageFont. We can choose either of the two options, I'm OK with both.
test/test_utils.py
Outdated
boxes = torch.tensor([[0, 0, 100, 100], [0, 0, 0, 0],
                      [10, 15, 30, 35], [23, 35, 93, 95]], dtype=torch.float)
labels = ['a', 'b', 'c', 'd']
utils.draw_bounding_boxes(img, boxes, labels=labels)
can you also add a test checking the color of the output image at pixel values, e.g. out[:, 0, 0:100] == fillcolor etc., so that we know that we are masking the correct pixels in the image?
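A rough sketch of such a pixel-level check, assuming the draw_bounding_boxes signature that this PR converges on (per-box colors given as a list):

```python
import torch
from torchvision.utils import draw_bounding_boxes

img = torch.full((3, 100, 100), 255, dtype=torch.uint8)   # all-white image
boxes = torch.tensor([[0, 0, 99, 99]], dtype=torch.float)
out = draw_bounding_boxes(img, boxes, colors=["red"])

# the top edge of the box should now be red ...
assert (out[:, 0, :] == torch.tensor([255, 0, 0]).view(3, 1)).all()
# ... while pixels well inside the (unfilled) box stay white
assert (out[:, 50, 50] == 255).all()
```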
I agree we should test pixels, but I would rather test all functionalities including labels, fonts etc. I wonder if that's possible or if it will create a flaky test due to differences in fonts across platforms. I'll give the approach I proposed in the earlier comment a try and see if this works.
See the latest code for the proposed testing approach.
def set_rng_seed(seed):
    torch.manual_seed(seed)
    random.seed(seed)
    np.random.seed(seed)
This change was originally done on an intermediate commit where I was producing a random image and had to fix the seed. Though I switched to non-random to reduce the size, I think it's a good idea to move this method from test_models.py to common_utils.py, so I kept the change in this PR.
Looks great, thanks a lot!
if not os.path.exists(path):
    Image.fromarray(result.permute(1, 2, 0).numpy()).save(path)
nit: any particular reason why you use PIL to save the result, and not write_image? Although this is not really important as the file is committed to the repo.
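For reference, a possible variant using torchvision's own I/O; write_png is used here since that is the format-specific writer available in torchvision.io, and result and path are assumed from the test snippet above:

```python
import os

from torchvision.io import write_png

# `result` is the uint8 (C, H, W) tensor produced by draw_bounding_boxes
if not os.path.exists(path):
    write_png(result, path)
```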
I agree that this is worth changing.
draw.rectangle(bbox, width=width, outline=color)

if labels is not None:
    txt_font = ImageFont.load_default() if font is None else ImageFont.truetype(font=font, size=font_size)
nit for a follow-up PR: we can move this outside of the for loop
Agreed, this can move outside of the loop.
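A sketch of that follow-up; the helper name and parameter names are assumed from the diff above, not the final code:

```python
from typing import List, Optional

from PIL import ImageDraw, ImageFont


def _draw_boxes(draw: ImageDraw.ImageDraw, img_boxes: List[List[int]],
                labels: Optional[List[str]], colors: List, width: int,
                font: Optional[str], font_size: int) -> None:
    # build the font once, outside the per-box loop
    txt_font = None
    if labels is not None:
        txt_font = (ImageFont.load_default() if font is None
                    else ImageFont.truetype(font=font, size=font_size))
    for i, bbox in enumerate(img_boxes):
        draw.rectangle(bbox, width=width, outline=colors[i])
        if labels is not None:
            draw.text((bbox[0], bbox[1]), labels[i], font=txt_font)
```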
Lots of thanks to @oke-aditya and @sumanthratna for their thorough investigations and contributions on the final API and implementation.
That's so kind of you @datumbox. Not much from me; it was your great work that got this done.
* initital prototype
* flake
* Adds documentation
* minimal working bboxes
* Adds label display
* adds colors :-)
* adds suggestions and fixes CI
* handles image of dim 4
* fixes image handling
* removes dev file
* adds suggested changes
* Updating the API.
* Update test.
* Implementing code review improvements.
* Further refactoring and adding test.
* Replace random to white to reduce size and change font on tests.

Co-authored-by: Vasilis Vryniotis <[email protected]>
Closes #2556 Supersedes #2631
As per the new API discussion, I will make this compatible with the output of detection models.
This will be compatible only with VOC-format (xmin, ymin, xmax, ymax) boxes, as this is our default for input and output in torchvision.
Users can convert the boxes using the new box_convert function and pass them in. (We could handle this internally too, but let's leave it for now.)
Will try to get this in before the October release 😃