Add masks to boundaries #7704
base: main
@@ -382,7 +382,39 @@ def _box_diou_iou(boxes1: Tensor, boxes2: Tensor, eps: float = 1e-7) -> Tuple[Tensor, Tensor]:

```python
    # distance between boxes' centers squared.
    return iou - (centers_distance_squared / diagonal_distance_squared), iou


def masks_to_boundaries(masks: torch.Tensor, dilation_ratio: float = 0.02) -> torch.Tensor:
    """
    Compute the boundaries around the provided masks using morphological operations.

    Returns a tensor of the same shape as the input masks containing the boundaries of each mask.

    Args:
        masks (Tensor[N, H, W]): masks to transform where N is the number of masks
            and (H, W) are the spatial dimensions.
        dilation_ratio (float, optional): ratio used for the dilation operation. Default: 0.02

    Returns:
        Tensor[N, H, W]: boundaries
    """
    # If no masks are provided, return an empty tensor
    if masks.numel() == 0:
        return torch.zeros_like(masks)

    n, h, w = masks.shape
    img_diag = math.sqrt(h ** 2 + w ** 2)
    dilation = int(round(dilation_ratio * img_diag))
    selem_size = dilation * 2 + 1
    selem = torch.ones((n, 1, selem_size, selem_size), device=masks.device)

    # Compute the boundaries for each mask
    masks = masks.float().unsqueeze(1)
    eroded_masks = F.conv2d(masks, selem, padding=dilation, groups=n)
    eroded_masks = (eroded_masks == selem.view(n, -1).sum(1, keepdim=True)).byte()  # Make the output binary

    contours = masks.byte() - eroded_masks

    return contours.squeeze(1)
```
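For context, the erosion-then-subtract scheme implemented above can be illustrated with a small dependency-free sketch (pure Python; `erode` and `boundary` are hypothetical names for illustration, while the actual patch performs the erosion with a `conv2d` over a ones kernel):

```python
# Sketch of the erosion-then-subtract idea: erosion keeps a pixel only if
# its whole (2*d+1)x(2*d+1) neighbourhood lies inside the mask (with zero
# padding at the borders), and the boundary is the mask minus its erosion.
# `erode` and `boundary` are hypothetical helper names, not torchvision API.

def erode(mask, d=1):
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # A pixel survives only if every neighbour within Chebyshev
            # distance d is foreground; out-of-bounds counts as background.
            out[y][x] = int(all(
                0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                for dy in range(-d, d + 1)
                for dx in range(-d, d + 1)
            ))
    return out

def boundary(mask, d=1):
    eroded = erode(mask, d)
    return [[m - e for m, e in zip(mrow, erow)] for mrow, erow in zip(mask, eroded)]

# A fully-filled 5x5 mask: erosion with d=1 keeps only the 3x3 interior,
# so the boundary is the one-pixel outer ring (16 pixels).
full = [[1] * 5 for _ in range(5)]
ring = boundary(full)
```

The `dilation_ratio` in the patch only controls `d`, i.e. how thick that ring is relative to the image diagonal.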
---
I do not think this code works as expected. Here is my test example, and it fails in multiple places:

```python
import torch
import numpy as np
from PIL import ImageDraw, Image

mask = torch.zeros(4, 32, 32, dtype=torch.bool)
mask[0, 1:10, 1:10] = True
mask[0, 12:20, 12:20] = True
mask[0, 15:18, 20:32] = True
mask[1, 15:23, 15:23] = True
mask[1, 22:33, 22:33] = True
mask[2, 1:5, 22:30] = True
mask[2, 5:14, 25:27] = True
pil_img = Image.new("L", (32, 32))
draw = ImageDraw.Draw(pil_img)
draw.ellipse([2, 7, 26, 26], fill=1, outline=1, width=1)
mask[3, ...] = torch.from_numpy(np.asarray(pil_img))

import math
from torch.nn import functional as F

dilation_ratio = 0.05
masks = mask.clone()
n, h, w = masks.shape
img_diag = math.sqrt(h ** 2 + w ** 2)
dilation = int(round(dilation_ratio * img_diag))
selem_size = dilation * 2 + 1
selem = torch.ones((n, 1, selem_size, selem_size), device=masks.device)

# Compute the boundaries for each mask
masks = masks.float().unsqueeze(1)
eroded_masks = F.conv2d(masks, selem, padding=dilation, groups=n)
eroded_masks = (eroded_masks == selem.view(n, -1).sum(1, keepdim=True)).byte()  # Make the output binary
contours = masks.byte() - eroded_masks
contours = contours.squeeze(1)
```

Error:
The error is related to the convolution, as the `eroded_masks` shape won't match the size of the conv weights... Sorry if I'm missing something...

---
What do you think about:

```python
import torch
import numpy as np
from PIL import ImageDraw, Image
import math
from torch.nn import functional as F
import matplotlib.pyplot as plt

mask = torch.zeros(4, 32, 32, dtype=torch.bool)
mask[0, 1:10, 1:10] = True
mask[0, 12:20, 12:20] = True
mask[0, 15:18, 20:32] = True
mask[1, 15:23, 15:23] = True
mask[1, 22:33, 22:33] = True
mask[2, 1:5, 22:30] = True
mask[2, 5:14, 25:27] = True
pil_img = Image.new("L", (32, 32))
draw = ImageDraw.Draw(pil_img)
draw.ellipse([2, 7, 26, 26], fill=1, outline=1, width=1)
mask[3, ...] = torch.from_numpy(np.asarray(pil_img))

dilation_ratio = 0.05
masks = mask.clone()
n, h, w = masks.shape
img_diag = math.sqrt(h ** 2 + w ** 2)
dilation = int(round(dilation_ratio * img_diag))
selem_size = dilation * 2 + 1
selem = torch.ones((1, 1, selem_size, selem_size), device=masks.device)

# Compute the boundaries for each mask
masks = masks.float().unsqueeze(1)
eroded_masks = torch.zeros_like(masks)
# for i in range(n):
#     eroded_masks[i] = F.conv2d(masks[i].unsqueeze(0), selem, padding=dilation)
eroded_masks = F.conv2d(masks, selem, padding=dilation)
eroded_masks = (eroded_masks == selem.view(-1).sum()).byte()  # Make the output binary
contours = masks.byte() - eroded_masks
contours = contours.squeeze(1)

# Visualize the results
fig, ax = plt.subplots(n, 3, figsize=(10, 10))
for i in range(n):
    ax[i, 0].imshow(mask[i], cmap='gray')
    ax[i, 1].imshow(eroded_masks[i].squeeze(), cmap='gray')
    ax[i, 2].imshow(contours[i], cmap='gray')
plt.show()
```

---
@bhack why do we need `dilation_ratio`? I think we can do the following without extra parametrization:

```python
masks = masks.float().unsqueeze(1)
w_size = 3
w = torch.ones((1, 1, w_size, w_size), device=masks.device) / (w_size ** 2)
eroded_masks = F.conv2d(masks, w, padding=1)
contours = (masks - eroded_masks) > 0
contours = contours.squeeze(1)
```

What do you think?

---
It is in the paper's official implementation, but also in the more classical F score (the DAVIS dataset/challenge official evaluation kit). As this is often a preprocessing step in boundary-overlap metrics (Boundary IoU / Boundary F-score), the dilation gives control over the tolerance for exact overlap of the boundaries. Both papers talk about bipartite graph matching, but they always approximate it with morphological ops. If you look at the F/DAVIS implementation, there is also an option where the tolerance/dilation is defined by the input resolution.

---
Thanks for the links. According to the https://github.com/davisvideochallenge/davis2017-evaluation/blob/master/davis2017/metrics.py#L57 code, mask-to-boundary is done without using any parameters. However, I previously missed the issue description and the context for this PR:
In this case, I'm not very sure about torchvision's interest in following line by line what https://github.com/bowenc0221/boundary-iou-api does, as 1) IMO we won't be able to reproduce it. In general, a method to produce mask-to-edges (a sort of edge detector) could make sense, like mask-to-bboxes.

---
Yes, but because in the F case they are dilating in an extra post-processing step in the metric instead.

I've tested another early implementation with some inputs, but the Boundary IoU paper's reference implementation doesn't have a test suite.

Let me know, as I am mainly interested in achieving the metric, and eventually in also contributing an intermediate function here in case it could be compatible and useful for other contexts/domains.

---
I see you are also a member of the MONAI project, so you already have something similar, but it still relies on a non-PyTorch implementation.
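As a side note, the parameter-free 3×3 variant floated earlier in the thread amounts to a box blur followed by a positive-difference threshold. A dependency-free sketch of that idea (pure Python; `blur3` and `contour` are hypothetical helper names, not the torch code from the thread):

```python
# Sketch of the parameter-free variant: blur the mask with a 3x3 mean
# filter (zero padding, constant 1/9 weight) and mark as boundary every
# foreground pixel whose value dropped, i.e. every foreground pixel with
# at least one background (or out-of-bounds) neighbour.

def blur3(mask):
    h, w = len(mask), len(mask[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            s = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        s += mask[ny][nx]
            out[y][x] = s / 9.0  # out-of-bounds neighbours count as zero
    return out

def contour(mask):
    blurred = blur3(mask)
    # mask - blur > 0 holds exactly for foreground pixels touching background
    return [[int(m - b > 0) for m, b in zip(mrow, brow)]
            for mrow, brow in zip(mask, blurred)]

# On a fully-filled 5x5 mask the contour is the one-pixel outer ring.
full = [[1] * 5 for _ in range(5)]
ring = contour(full)
```

The trade-off raised in the thread is visible here: this version always produces a one-pixel boundary, whereas the `dilation_ratio` version lets the metric choose a tolerance proportional to the image diagonal.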
```python
def masks_to_boxes(masks: torch.Tensor) -> torch.Tensor:
    """
    Compute the bounding boxes around the provided masks.
```

@@ -415,3 +447,4 @@ def masks_to_boxes(masks: torch.Tensor) -> torch.Tensor:

```python
        bounding_boxes[index, 3] = torch.max(y)

    return bounding_boxes
```
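The `masks_to_boxes` helper touched by this hunk reduces each mask to the min/max of its foreground coordinates. The same idea in a dependency-free sketch (pure Python; `mask_to_box` is a hypothetical name, output in the xyxy convention used by `torchvision.ops.masks_to_boxes`):

```python
# Minimal sketch: the bounding box of a binary mask is just the min/max of
# the x and y coordinates of its foreground pixels, as [x1, y1, x2, y2].
# Assumes the mask has at least one foreground pixel.

def mask_to_box(mask):
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) for v in row if v]
    return [min(xs), min(ys), max(xs), max(ys)]

# A 6x6 mask with a 3x3 block of ones at rows 1..3, cols 2..4.
m = [[0] * 6 for _ in range(6)]
for y in range(1, 4):
    for x in range(2, 5):
        m[y][x] = 1
box = mask_to_box(m)  # -> [2, 1, 4, 3]
```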
---
I guess it's OK to have the implementation in this file even though it isn't related to boxes. However, I don't think we should expose it here: I think we should just expose it from the `torchvision.ops` namespace (otherwise the implementation will always have to stay in this file for BC, and that may lock us). We probably just need to rename this to `_masks_to_boundaries` and then expose it in `torchvision.ops.__init__.py`.

Any other suggestion @pmeier @vfdev-5 @oke-aditya ?

---
No strong opinion, but could we maybe also have a new `_masks.py` module, or move it into the `misc.py` one?

👍 for only exposing it in the `torchvision.ops` namespace.

---
Tbh, there is demand for mask utils. Several of them: #4415. Candidate utils include `convert_masks_format`, `paste_masks_in_images`, etc. Maybe it's time to create a new file `mask_utils.py` and make future extensions possible?

---
We can always create an `ops.mask*` namespace at any time. We should only do that when we know for sure we need it, i.e. when we start having 2+ mask utils. All ops are exposed in the `ops` namespace anyway, so there's no need to rush and create a file which will only have one single util in it ATM.

I'm OK with creating `_mask.py` as well (and we can rename it into `mask.py` later if we want to).

---
This sounds like the best solution! We can avoid the bloat inside this file as well as keep them private 😄