3D NMS and RoiAlign for volumetric data #2402

Open
mibaumgartner opened this issue Jul 7, 2020 · 4 comments

@mibaumgartner

🚀 Feature

3D data is becoming increasingly popular in the deep learning community. As a consequence, it would be great to have unified 3D NMS and 3D RoI Align implementations for current and future projects like MONAI.

Motivation

Information added from @mjorgecardoso
Medical imaging is a huge field of research, with conferences such as ISMRM (5k+ attendees), MICCAI (2.5k+), and ISBI (1.5k+). Volumetric neural network operations (convolutions, pooling, etc.) are common and supported in PyTorch (see https://pytorch.org/docs/master/generated/torch.nn.Conv3d.html).

Spatial dimensions summarised:
N = batch size, C = channels, H = height, W = width, D = depth / T = time

Typically found in 2D: [N, C, H, W]

Typically found in 2D + time (video): [N, C, T, H, W]
Expected behaviour: operations are only applied along the spatial dimensions (H, W) and NOT along T

Typically found in 3D (volumetric): [N, C, D, H, W] (sometimes also [N, C, H, W, D] as in medicaldetectiontoolkit)
Expected behaviour: operations are applied along all spatial dimensions (D,H,W)
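
To make the layout conventions above concrete, here is a minimal sketch in plain Python. The helper names `spatial_axes` and `permutation` and the `treat_time_as_spatial` flag are hypothetical and not part of torchvision; only the layout strings come from the list above.

```python
from typing import Tuple


def spatial_axes(layout: str, treat_time_as_spatial: bool = False) -> Tuple[int, ...]:
    """Return the axis indices an op such as NMS or RoIAlign would act on.

    `layout` is a string like "NCHW", "NCTHW" or "NCDHW" (one letter per axis).
    By default T (time) is NOT treated as spatial, matching the video case above.
    """
    spatial = set("DHW")
    if treat_time_as_spatial:
        spatial.add("T")
    return tuple(i for i, name in enumerate(layout) if name in spatial)


def permutation(src: str, dst: str) -> Tuple[int, ...]:
    """Return the axis order that converts a tensor from layout `src` to `dst`."""
    if sorted(src) != sorted(dst):
        raise ValueError(f"layouts {src!r} and {dst!r} do not name the same axes")
    return tuple(src.index(name) for name in dst)


print(spatial_axes("NCHW"))           # (2, 3)
print(spatial_axes("NCTHW"))          # (3, 4): T is skipped
print(spatial_axes("NCDHW"))          # (2, 3, 4)
print(permutation("NCHWD", "NCDHW"))  # (0, 1, 4, 2, 3)
```

The tuple returned by `permutation` could be passed to `Tensor.permute` to convert a medicaldetectiontoolkit-style [N, C, H, W, D] tensor to [N, C, D, H, W].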

Pitch

Add NMS and RoiAlign support for volumetric data, and define conventions and documentation that make clear which function to use in which case.

For backward compatibility, nms and roialign should be kept as aliases for their plain 2D counterparts. Moving forward, there could be two functions, nms2d and nms3d (mirroring PyTorch's naming, e.g. Conv2d and Conv3d). I'm not quite sure what the optimal way of handling/naming the video case is (maybe a flag inside the 3D versions?).
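
As a rough illustration of what a 3D variant might compute, here is a minimal pure-Python sketch of greedy NMS over axis-aligned 3D boxes. The name `nms3d` follows the naming suggested above but is hypothetical, and the box format `(x1, y1, z1, x2, y2, z2)` is an assumption. This is not an existing torchvision API, and a real implementation would be a vectorised or CUDA kernel.

```python
from typing import List, Sequence

# Assumed box format: (x1, y1, z1, x2, y2, z2) with x1 < x2, y1 < y2, z1 < z2.
Box3D = Sequence[float]


def box_volume(box: Box3D) -> float:
    """Volume of an axis-aligned 3D box (0 if the box is degenerate)."""
    return (
        max(0.0, box[3] - box[0])
        * max(0.0, box[4] - box[1])
        * max(0.0, box[5] - box[2])
    )


def iou3d(a: Box3D, b: Box3D) -> float:
    """Intersection over union of two axis-aligned 3D boxes."""
    intersection = 1.0
    for lo, hi in ((0, 3), (1, 4), (2, 5)):
        overlap = min(a[hi], b[hi]) - max(a[lo], b[lo])
        if overlap <= 0:
            return 0.0  # no overlap along this axis means no intersection
        intersection *= overlap
    union = box_volume(a) + box_volume(b) - intersection
    return intersection / union if union > 0 else 0.0


def nms3d(boxes: Sequence[Box3D], scores: Sequence[float], iou_threshold: float) -> List[int]:
    """Greedy NMS: return indices of kept boxes, sorted by decreasing score.

    A box is dropped if its IoU with an already kept, higher-scoring box
    exceeds `iou_threshold`.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou3d(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [
    (0, 0, 0, 10, 10, 10),
    (1, 1, 1, 11, 11, 11),     # overlaps the first box with IoU of about 0.57
    (20, 20, 20, 30, 30, 30),  # disjoint from the others
]
scores = [0.9, 0.8, 0.7]
print(nms3d(boxes, scores, iou_threshold=0.5))  # [0, 2]
```

The suppression rule (drop when IoU is strictly greater than the threshold) is chosen to mirror the documented behaviour of the existing 2D `torchvision.ops.nms`; the video case is deliberately left out of this sketch.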

Alternatives

Additional context

#2337
#1678
@pfjaeger

@naga-karthik

Hello, I am wondering what the status of this issue is. Are 3D NMS and 3D ROI Align going to be implemented in a future version of torchvision anytime soon? As the OP mentioned, having access to 3D versions of the above ops would make it convenient to train models on volumetric (medical) data. Thanks!

@datumbox
Contributor

@naga-karthik Thanks for the interest. Right now we don't have the bandwidth to investigate and implement the proposed features. We are a small team and we are currently tackling other, higher-priority issues (new Datasets API, new Transforms API, etc.). Rest assured we will definitely review this in the next planning session.

@etasnadi

Dear All,

If anyone is considering implementing this in torchvision, I have a working 3D RoiAlign kernel implemented in TensorFlow that could be directly ported back into PyTorch. You can pull the 3D kernels from here: https://github.com/etasnadi/roi_align_3D.
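
For readers unfamiliar with the operation, the core of a 3D RoIAlign is trilinear interpolation at fractional coordinates inside each output bin. The following is a minimal pure-Python sketch of that idea and is not taken from the linked repository. The function names, the `(z1, y1, x1, z2, y2, x2)` RoI format, the convention that voxel centres sit at integer coordinates, and the use of a single sample per bin are all assumptions (implementations typically average several samples per bin).

```python
import math
from typing import List, Sequence, Tuple

Volume = List[List[List[float]]]  # indexed as volume[z][y][x]


def trilinear_sample(volume: Volume, z: float, y: float, x: float) -> float:
    """Interpolate `volume` at a continuous (z, y, x); coordinates are clamped to the grid."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    z = min(max(z, 0.0), depth - 1)
    y = min(max(y, 0.0), height - 1)
    x = min(max(x, 0.0), width - 1)
    z0, y0, x0 = int(math.floor(z)), int(math.floor(y)), int(math.floor(x))
    z1, y1, x1 = min(z0 + 1, depth - 1), min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    dz, dy, dx = z - z0, y - y0, x - x0
    value = 0.0
    # Weighted sum over the 8 surrounding voxels.
    for zi, wz in ((z0, 1 - dz), (z1, dz)):
        for yi, wy in ((y0, 1 - dy), (y1, dy)):
            for xi, wx in ((x0, 1 - dx), (x1, dx)):
                value += wz * wy * wx * volume[zi][yi][xi]
    return value


def roi_align_3d(
    volume: Volume, roi: Sequence[float], output_size: Tuple[int, int, int]
) -> Volume:
    """Resample `roi` = (z1, y1, x1, z2, y2, x2) to `output_size` = (D, H, W).

    Simplification: one trilinear sample at the centre of each output bin.
    """
    z1, y1, x1, z2, y2, x2 = roi
    out_d, out_h, out_w = output_size
    bin_d, bin_h, bin_w = (z2 - z1) / out_d, (y2 - y1) / out_h, (x2 - x1) / out_w
    return [
        [
            [
                trilinear_sample(
                    volume,
                    z1 + (k + 0.5) * bin_d,
                    y1 + (j + 0.5) * bin_h,
                    x1 + (i + 0.5) * bin_w,
                )
                for i in range(out_w)
            ]
            for j in range(out_h)
        ]
        for k in range(out_d)
    ]


# A 4x4x4 volume whose value is linear in the coordinates, so interpolation is exact.
volume = [[[100.0 * z + 10.0 * y + x for x in range(4)] for y in range(4)] for z in range(4)]
pooled = roi_align_3d(volume, roi=(0, 0, 0, 2, 2, 2), output_size=(2, 2, 2))
print(pooled[0][0][0], pooled[1][1][1])  # 55.5 166.5
```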

@etasnadi

etasnadi commented Apr 29, 2024

It might also be worth considering https://github.com/TimothyZero/MedVision/tree/main for the torch version.
