Add masks to boundaries #7704

Open
wants to merge 27 commits into base: main
Changes shown below are from 1 commit.

Commits (27)
79dcbb1  Add masks to boundaries (bhack, Jun 27, 2023)
d171ffd  Doesn't expose directly the def (bhack, Jun 28, 2023)
9d41c0a  change erosion (bhack, Jul 1, 2023)
e277308  Add dummy test (bhack, Jul 1, 2023)
330301c  Merge pull request #1 from bhack/patch-2 (bhack, Jul 1, 2023)
a8bd95c  Update ops.rst (bhack, Jul 11, 2023)
7311956  Merge branch 'main' into patch-1 (bhack, Dec 30, 2023)
08485c0  Merge branch 'main' into patch-1 (bhack, Feb 15, 2024)
59fb72c  Add debug image option (bhack, Feb 17, 2024)
091f3fb  Merge branch 'main' into patch-1 (bhack, Feb 17, 2024)
fa68881  Merge branch 'main' into patch-1 (bhack, Mar 5, 2024)
c2d8074  Merge branch 'main' into patch-1 (bhack, Mar 5, 2024)
aa4b2e3  Merge branch 'main' into patch-1 (bhack, Mar 7, 2024)
293e436  Merge branch 'main' into patch-1 (bhack, Mar 29, 2024)
cf07bc0  Merge branch 'main' into patch-1 (bhack, Apr 18, 2024)
7abbc3b  Merge branch 'main' into patch-1 (bhack, Apr 29, 2024)
762992f  Merge branch 'main' into patch-1 (bhack, Apr 29, 2024)
0991f93  Merge branch 'main' into patch-1 (bhack, Aug 30, 2024)
4de4913  Merge branch 'main' into patch-1 (bhack, Sep 15, 2024)
91df477  Merge branch 'main' into patch-1 (bhack, Oct 18, 2024)
ebee25e  Merge branch 'main' into patch-1 (bhack, Oct 28, 2024)
9fc12a9  Merge branch 'main' into patch-1 (bhack, Nov 6, 2024)
080fa0d  Merge branch 'main' into patch-1 (bhack, Nov 11, 2024)
e526765  Merge branch 'main' into patch-1 (bhack, Nov 14, 2024)
1ec78df  Merge branch 'main' into patch-1 (bhack, Dec 2, 2024)
78062c0  Merge branch 'main' into patch-1 (bhack, Dec 18, 2024)
6cce8a6  Merge branch 'main' into patch-1 (bhack, Dec 30, 2024)
33 changes: 33 additions & 0 deletions torchvision/ops/boxes.py
@@ -382,7 +382,39 @@ def _box_diou_iou(boxes1: Tensor, boxes2: Tensor, eps: float = 1e-7) -> Tuple[Te
    # distance between boxes' centers squared.
    return iou - (centers_distance_squared / diagonal_distance_squared), iou

def masks_to_boundaries(masks: torch.Tensor, dilation_ratio: float = 0.02) -> torch.Tensor:
Member

I guess it's OK to have the implementation in this file even though this isn't related to boxes. However, I don't think we should expose it here. I think we should just expose it from the torchvision.ops namespace (otherwise the implementation will always have to stay in this file for BC, and that may lock us).

We probably just need to rename this to _masks_to_boundaries and then expose it in torchvision.ops.__init__.py like

from .boxes import _masks_to_boundaries as masks_to_boundaries

Any other suggestion @pmeier @vfdev-5 @oke-aditya ?
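
For illustration, a minimal sketch of what that exposure could look like in torchvision/ops/__init__.py (the __all__ entry and surrounding layout are assumptions about that file, not part of this diff):

# torchvision/ops/__init__.py (sketch)
# Re-export the private helper under a public name, so the implementation can
# later move out of boxes.py without breaking backward compatibility.
from .boxes import _masks_to_boundaries as masks_to_boundaries

__all__ = [
    # ... existing ops ...
    "masks_to_boundaries",
]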

Collaborator

I guess it's OK to have the implementation in this file even though this isn't related to boxes.

No strong opinion, but could we maybe also have a new _masks.py module or move it into the misc.py one?

👍 for only exposing it in the torchvision.ops namespace.

Contributor

Tbh there is demand for mask utils. There are several of them, see #4415. Candidate utils include convert_masks_format, paste_masks_in_images, etc. Maybe it's time to create a new mask_utils.py file and make future extensions possible?

Member

we can always create an ops.mask* namespace at any time. We should only do that when we know for sure we need it, i.e. when we start having 2+ mask utils. All ops are exposed in the ops namespace anyway, so there's no need to rush and create a file which would only have one single util in it ATM.

I'm OK with creating _mask.py as well (and we can rename it into mask.py later if we want to).

Contributor

I'm OK with creating _mask.py as well (and we can rename it into mask.py later if we want to).

This sounds like the best solution! We can avoid the bloat inside this file as well as keep them private 😄

"""
Compute the boundaries around the provided masks using morphological operations.

Returns a tensor of the same shape as the input masks containing the boundaries of each mask.

Args:
masks (Tensor[N, H, W]): masks to transform where N is the number of masks
and (H, W) are the spatial dimensions.
dilation_ratio (float, optional): ratio used for the dilation operation. Default: 0.02

Returns:
Tensor[N, H, W]: boundaries
"""
# If no masks are provided, return an empty tensor
if masks.numel() == 0:
return torch.zeros_like(masks)

n, h, w = masks.shape
img_diag = math.sqrt(h ** 2 + w ** 2)
dilation = int(round(dilation_ratio * img_diag))
selem_size = dilation * 2 + 1
    selem = torch.ones((n, 1, selem_size, selem_size), device=masks.device)

    # Compute the boundaries for each mask
    masks = masks.float().unsqueeze(1)
    eroded_masks = F.conv2d(masks, selem, padding=dilation, groups=n)
    eroded_masks = (eroded_masks == selem.view(n, -1).sum(1, keepdim=True)).byte()  # Make the output binary

    contours = masks.byte() - eroded_masks

    return contours.squeeze(1)
Collaborator

I do not think this code works as expected. Here is my test example and it fails in multiple places:

import torch
import numpy as np
from PIL import ImageDraw, Image

mask = torch.zeros(4, 32, 32, dtype=torch.bool)
mask[0, 1:10, 1:10] = True
mask[0, 12:20, 12:20] = True
mask[0, 15:18, 20:32] = True

mask[1, 15:23, 15:23] = True
mask[1, 22:33, 22:33] = True

mask[2, 1:5, 22:30] = True
mask[2, 5:14, 25:27] = True


pil_img = Image.new("L", (32, 32))

draw = ImageDraw.Draw(pil_img)
draw.ellipse([2, 7, 26, 26], fill=1, outline=1, width=1)

mask[3, ...] = torch.from_numpy(np.asarray(pil_img))


import math
from torch.nn import functional as F

dilation_ratio = 0.05
masks = mask.clone()

n, h, w = masks.shape
img_diag = math.sqrt(h ** 2 + w ** 2)
dilation = int(round(dilation_ratio * img_diag))
selem_size = dilation * 2 + 1
selem = torch.ones((n, 1, selem_size, selem_size), device=masks.device)


# Compute the boundaries for each mask
masks = masks.float().unsqueeze(1)
eroded_masks = F.conv2d(masks, selem, padding=dilation, groups=n)
eroded_masks = (eroded_masks == selem.view(n, -1).sum(1, keepdim=True)).byte()  # Make the output binary

contours = masks.byte() - eroded_masks
contours = contours.squeeze(1)

Error:

---> 17 eroded_masks = (eroded_masks == selem.view(n, -1).sum(1, keepdim=True)).byte()  # Make the output binary

RuntimeError: The size of tensor a (32) must match the size of tensor b (4) at non-singleton dimension 2

Masks: [image showing the four test masks]

The error is related to masks = masks.float().unsqueeze(1), where we may need unsqueeze(0) instead.
But even if that is fixed, the next line does not make much sense IMO:

eroded_masks = (eroded_masks == selem.view(n, -1).sum(1, keepdim=True)).byte()

as the eroded_masks shape won't match the size of the conv weights...

Sorry if I'm missing something...
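
For reference, a minimal sketch reproducing the broadcasting failure described above, assuming the unsqueeze(0) variant with n=4, selem_size=5 and the 32x32 masks from the test:

import torch

eroded_masks = torch.rand(1, 4, 32, 32)  # conv2d output after unsqueeze(0): (batch=1, channels=n, H, W)
threshold = torch.full((4, 1), 25.0)     # selem.view(n, -1).sum(1, keepdim=True) for a 5x5 kernel
check = eroded_masks == threshold        # raises RuntimeError: size 32 vs 4 at non-singleton dimension 2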

Author

What do you think about:

import torch
import numpy as np
from PIL import ImageDraw, Image
import math
from torch.nn import functional as F
import matplotlib.pyplot as plt

mask = torch.zeros(4, 32, 32, dtype=torch.bool)
mask[0, 1:10, 1:10] = True
mask[0, 12:20, 12:20] = True
mask[0, 15:18, 20:32] = True

mask[1, 15:23, 15:23] = True
mask[1, 22:33, 22:33] = True

mask[2, 1:5, 22:30] = True
mask[2, 5:14, 25:27] = True

pil_img = Image.new("L", (32, 32))
draw = ImageDraw.Draw(pil_img)
draw.ellipse([2, 7, 26, 26], fill=1, outline=1, width=1)
mask[3, ...] = torch.from_numpy(np.asarray(pil_img))

dilation_ratio = 0.05
masks = mask.clone()

n, h, w = masks.shape
img_diag = math.sqrt(h ** 2 + w ** 2)
dilation = int(round(dilation_ratio * img_diag))
selem_size = dilation * 2 + 1
selem = torch.ones((1, 1, selem_size, selem_size), device=masks.device)

# Compute the boundaries for each mask
masks = masks.float().unsqueeze(1)
eroded_masks = torch.zeros_like(masks)

#for i in range(n):
#    eroded_masks[i] = F.conv2d(masks[i].unsqueeze(0), selem, padding=dilation)
eroded_masks = F.conv2d(masks, selem, padding=dilation)

eroded_masks = (eroded_masks == selem.view(-1).sum()).byte()  # Make the output binary
contours = masks.byte() - eroded_masks
contours = contours.squeeze(1)

# Visualize the results
fig, ax = plt.subplots(n, 3, figsize=(10, 10))

for i in range(n):
    ax[i, 0].imshow(mask[i], cmap='gray')
    ax[i, 1].imshow(eroded_masks[i].squeeze(), cmap='gray')
    ax[i, 2].imshow(contours[i], cmap='gray')

plt.show()

[image: each test mask alongside its eroded mask and extracted contours]

Collaborator

@bhack why do we need dilation_ratio? I think we can do the following without extra parametrization:

masks = masks.float().unsqueeze(1)
w_size = 3
w = torch.ones((1, 1, w_size, w_size), device=masks.device) / (w_size ** 2)
eroded_masks = F.conv2d(masks, w, padding=1)
contours = (masks - eroded_masks) > 0
contours = contours.squeeze(1)

what do you think ?

Author

It is in the paper's official implementation:

https://github.com/bowenc0221/boundary-iou-api/blob/master/boundary_iou/utils/boundary_utils.py#L12

But it is also in the more classical F score (the Davis dataset/challenge official eval kit):

https://github.com/davisvideochallenge/davis2017-evaluation/blob/master/davis2017/metrics.py#L57

As this is often a preprocessing step for boundary-overlap metrics (Boundary IoU / Boundary F-score), the dilation gives control over how tolerant the boundary-overlap test is.

Both papers talk about bipartite graph matching, but then they always approximate it with morphological ops.

If you look at the F/Davis implementation, there is also an option where the tolerance/dilation is defined by the input resolution.
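
For illustration, a minimal sketch of how a Boundary IoU style metric could consume this helper, with dilation_ratio acting as the matching tolerance. The torchvision.ops import is hypothetical (it assumes the helper ends up exposed there), and boundary_iou below is a simplified approximation for illustration, not the reference implementation:

import torch
from torchvision.ops import masks_to_boundaries  # hypothetical import, if/when exposed

def boundary_iou(pred: torch.Tensor, target: torch.Tensor,
                 dilation_ratio: float = 0.02, eps: float = 1e-7) -> torch.Tensor:
    # pred, target: bool Tensor[N, H, W] instance masks, paired along dim 0.
    pred_b = masks_to_boundaries(pred, dilation_ratio).bool()
    target_b = masks_to_boundaries(target, dilation_ratio).bool()
    # IoU restricted to the boundary bands; a larger dilation_ratio means a wider
    # band and therefore a more tolerant match.
    inter = (pred_b & target_b).flatten(1).sum(dim=1).float()
    union = (pred_b | target_b).flatten(1).sum(dim=1).float()
    return inter / (union + eps)  # Tensor[N]: one score per mask pair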

Collaborator

Thanks for the links. According to the https://github.com/davisvideochallenge/davis2017-evaluation/blob/master/davis2017/metrics.py#L57 code, mask-to-boundary is done without using any parameters, see _seg2bmap:
https://github.com/davisvideochallenge/davis2017-evaluation/blob/ac7c43fca936f9722837b7fbd337d284ba37004b/davis2017/metrics.py#L122
Anyway, I see why they have the dilation_ratio arg.

However, I previously missed the issue description and the context for this PR:

A mask-to-boundary API is useful for implementing many segmentation metrics used in many datasets and challenges (Davis F score, BoundaryIOU, etc.).
It could also be used more generally for visualization tasks.

In this case, I'm not very sure about torchvision's interest in following line by line what https://github.com/bowenc0221/boundary-iou-api does, because 1) IMO we won't be able to reproduce the cv2.erode behaviour, and 2) since such a helper function can be used within a metric implementation, it would have to be carefully tested against the reference implementation in a lot of corner cases etc. (and that is not the role of torchvision, IMO).

In general, a method to produce mask-to-edges (a sort of edge detector) could make sense, like masks_to_boxes.

Author

Thanks for the links. According to the https://github.com/davisvideochallenge/davis2017-evaluation/blob/master/davis2017/metrics.py#L57 code, mask-to-boundary is done without using any parameters, see _seg2bmap:
https://github.com/davisvideochallenge/davis2017-evaluation/blob/ac7c43fca936f9722837b7fbd337d284ba37004b/davis2017/metrics.py#L122

Yes, but that's because in F they dilate in an extra post-processing step inside the metric, instead of the Boundary IoU approach (see the dilation disk param):
https://github.com/davisvideochallenge/davis2017-evaluation/blob/master/davis2017/metrics.py#L77

In this case, I'm not very sure about torchvision's interest in following line by line what https://github.com/bowenc0221/boundary-iou-api does, because 1) IMO we won't be able to reproduce the cv2.erode behaviour, and 2) since such a helper function can be used within a metric implementation, it would have to be carefully tested against the reference implementation in a lot of corner cases etc. (and that is not the role of torchvision, IMO).

I've tested another early implementation with some inputs, but the Boundary IoU paper's reference implementation doesn't have a test suite.

In general, a method to produce mask-to-edges (a sort of edge detector) could make sense, like masks_to_boxes.

Let me know, as I am mainly interested in implementing the metric, and possibly in also contributing an intermediate function here if it turns out to be compatible and useful for other contexts/domains.
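
For illustration, a sketch of that dilate-then-match post-processing step as used in F-measure style boundary evaluation. The boundary_f_score helper below is an assumption for illustration only (it is not from this PR or the linked code); it expects bool Tensor[N, H, W] boundary maps and uses max_pool2d with a square window as a stand-in for the disk dilation in the reference implementation:

import torch
from torch.nn import functional as F

def boundary_f_score(pred_b: torch.Tensor, gt_b: torch.Tensor,
                     tolerance: int = 1, eps: float = 1e-7) -> torch.Tensor:
    # pred_b, gt_b: bool Tensor[N, H, W] boundary maps.
    # Dilating a binary map via max pooling: every pixel within `tolerance`
    # (Chebyshev distance) of a boundary pixel becomes True. The reference code
    # uses a disk structuring element instead of this square window.
    def dilate(b: torch.Tensor) -> torch.Tensor:
        k = 2 * tolerance + 1
        return F.max_pool2d(b.float().unsqueeze(1), k, stride=1, padding=tolerance).squeeze(1) > 0

    pred_d, gt_d = dilate(pred_b), dilate(gt_b)
    # Precision: predicted boundary pixels within tolerance of the ground-truth boundary.
    precision = (pred_b & gt_d).flatten(1).sum(1) / (pred_b.flatten(1).sum(1) + eps)
    # Recall: ground-truth boundary pixels within tolerance of the predicted boundary.
    recall = (gt_b & pred_d).flatten(1).sum(1) / (gt_b.flatten(1).sum(1) + eps)
    return 2 * precision * recall / (precision + recall + eps)  # Tensor[N]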

Author
bhack, Jun 29, 2023

I see you are also a member of the MONAI project, so you already have something similar, but it still relies on a non-PyTorch implementation:
https://github.com/Project-MONAI/MetricsReloaded/blob/main/MetricsReloaded/metrics/pairwise_measures.py#L963


def masks_to_boxes(masks: torch.Tensor) -> torch.Tensor:
    """
    Compute the bounding boxes around the provided masks.
@@ -415,3 +447,4 @@ def masks_to_boxes(masks: torch.Tensor) -> torch.Tensor:
        bounding_boxes[index, 3] = torch.max(y)

    return bounding_boxes