Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Interest in box-IoU implementation that supports distance-IoU (DIoU) and complete-IoU (CIoU)? #3026

Closed
HamsterHuey opened this issue Nov 18, 2020 · 5 comments

Comments

@HamsterHuey
Copy link

🚀 Feature

Implement Distance and Complete IoU within torchvision to complement the current supported implementations of IoU and GIoU

Motivation

Distance and Complete IoU have been published about in the literature (Nov, 2019): https://arxiv.org/abs/1911.08287

They have been shown to be very effective when utilized as loss-terms in box-regression for object detectors in comparison to conventional IoU loss or even generalized IoU (GIoU), both of which currently have implementations in torchvision. They are also actively used in training state-of-the-art object detectors (eg: Yolo v4)

Pitch

I already have a working prototype for an implementation. How this would be structured is open-ended. I think it might be better to just have a single Box IoU method that supports multiple modes? This is the call-signature of my current implementation:

def box_iou(
        boxes1: torch.Tensor,
        boxes2: torch.Tensor,
        fmt: str = 'xcycwh',
        iou_type: str = 'ciou',
) -> torch.Tensor:
    """
    A general-purpose intersection-over-union (Jaccard index) estimation method
    for boxes that supports standard IoU, Generalized IoU, Distance IoU and
    Complete IoU.

    Arguments:
        boxes1 (Tensor[N, 4]): [N, 4] Tensor of N boxes
        boxes2 (Tensor[M, 4]): [M, 4] Tensor of M boxes
        fmt (str, optional): Specifies format of input boxes. One of {'xywh',
            'xyxy', 'xcycwh'}; Default is **'xcycwh'**
        iou_type (str, optional): One of {'iou', 'giou', 'ciou', diou'};
            Default is **'ciou'**

    Returns:
        iou (Tensor[N, M]): the NxM matrix containing the pairwise
            IoU values for every element in boxes1 and boxes2. The type of IoU
            metric is determined by the input ``iou_type`` param.
    """

Obviously, the fmt param above is entirely optional. I've found it convenient to support the different formats, but it is easily stripped out if we want to enforce all boxes be in XYXY format.

I've written this up by adapting the current implementation of GIoU and there shouldn't be any extra overhead for any single pathway for the IoU type. Happy to contribute if this is of interest to the community.

@oke-aditya
Copy link
Contributor

oke-aditya commented Nov 19, 2020

Hmm, orignally while adding gIoU Issue #2545 also proposed to add CIoU and DIoU.

fmt parameter I think would need to be implemented for other ops too for consistency.
I think adding optional fmt parameter won't cause a breaking change.

E.g. box_area, clip_boxes_to_image etc. too don't take this parameter.

For all torchvision ops the format used is XYXY.
To convert into XYXY from YOLO / COCO users can use box_convert from ops.

@fmassa should ops make use of fmt parameter ?
It extends the functionality. Previously before adding ops we didn't have box_convert so we couldn't.

@HamsterHuey
Copy link
Author

I'd personally vote for a simpler API and removing fmt from here. Like you said, box_convert exists for this reason. It's probably best for all the other methods to work only on XYXY format, otherwise you end up with a bunch of format handling overhead in each new method that operates on boxes.

@senarvi
Copy link

senarvi commented Jan 17, 2022

I support the idea of adding CIoU and DIoU. I agree with @HamsterHuey that a simpler API is better than every function doing also format conversion.

There's also one thing that I don't like with the current box_iou / generalized_box_iou interface: They take two input arguments, boxes1 and boxes2 and return a matrix of output values for every possible pair (box1, box2), where box1 is from the first input argument and box2 is from the second one. What e.g. YOLOv4 needs is just the IoU values of the matching boxes, so one needs to call box_iou(boxes1, boxes2).diagonal(). This is slow (O(N^2) instead of O(N)) and it's not consistent with other PyTorch operations. For example, if you want to calculate the sum of matching elements of two vectors, you call x + y, and if you want a matrix of all possible combinations, then you call x[:, None] + y[None, :]. If we want the functions to support producing all possible combinations, I suggest making it optional.

@oke-aditya
Copy link
Contributor

We have implemented this in release 0.13 Hope you find it useful @senarvi @HamsterHuey

cc @datumbox

@datumbox
Copy link
Contributor

datumbox commented Jul 4, 2022

Ooops, yes thanks for the ping. It's available on v0.13. I'll close the issue.

@datumbox datumbox closed this as completed Jul 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants