About Complete-IoU Loss and Cluster-NMS for Improving Object Detection and Instance Segmentation #19

Wanghe1997 · 2022-03-13T13:16:03Z

Hello, can the combination of CIoU and Cluster-NMS from your paper be used in YOLOv5? How should I replace the original torchvision NMS with Cluster-NMS in YOLOv5? Thanks.

Wanghe1997 · 2022-03-13T13:51:31Z

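# For batch mode Cluster-Weighted NMS
# Note: xywh2xyxy and jaccard (batched pairwise IoU) are assumed to be defined elsewhere; they are not shown in this snippet.
import time

import torch
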
def non_max_suppression(prediction, conf_thres=0.1, iou_thres=0.6, max_box=1500, merge=False, classes=None, agnostic=False):
    """Performs Non-Maximum Suppression (NMS) on inference results
    Returns:
         detections with shape: nx6 (x1, y1, x2, y2, conf, cls)
    """

    nc = prediction[0].shape[1] - 5  # number of classes
    xc = prediction[..., 4] > conf_thres  # candidates

    # Settings
    min_wh, max_wh = 2, 4096  # (pixels) minimum and maximum box width and height
    max_det = 300  # maximum number of detections per image
    time_limit = 10.0  # seconds to quit after
    redundant = True  # require redundant detections
    multi_label = nc > 1  # multiple labels per box (adds 0.5ms/img)

    t = time.time()
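    batch_size = prediction.shape[0]
    output = [None] * batch_size
    # Zero-initialised, fixed-size buffers: images with fewer than max_box boxes stay zero-padded
    pred1 = torch.zeros((batch_size, max_box, 6), dtype=torch.float32, device=prediction.device)  # pred1.size()=[batch, max_box, 6], boxes without offset by class
    pred2 = pred1[:, :, :4].clone()  # pred2 denotes boxes with offset by class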
    for xi, x in enumerate(prediction):  # image index, image inference
        # Apply constraints
        # x[((x[..., 2:4] < min_wh) | (x[..., 2:4] > max_wh)).any(1), 4] = 0  # width-height
        x = x[xc[xi]]  # confidence

        # If none remain process next image
        if not x.shape[0]:
            continue

        # Compute conf
        x[:, 5:] *= x[:, 4:5]  # conf = obj_conf * cls_conf

        # Box (center x, center y, width, height) to (x1, y1, x2, y2)
        box = xywh2xyxy(x[:, :4])

        # Detections matrix nx6 (xyxy, conf, cls)
        if multi_label:
            i, j = (x[:, 5:] > conf_thres).nonzero(as_tuple=False).T
            x = torch.cat((box[i], x[i, j + 5, None], j[:, None].float()), 1)
        else:  # best class only
            conf, j = x[:, 5:].max(1, keepdim=True)
            x = torch.cat((box, conf, j.float()), 1)[conf.view(-1) > conf_thres]

        # Filter by class
        if classes:
            x = x[(x[:, 5:6] == torch.tensor(classes, device=x.device)).any(1)]

        # Apply finite constraint
        # if not torch.isfinite(x).all():
        #     x = x[torch.isfinite(x).all(1)]

        # If none remain process next image
        n = x.shape[0]  # number of boxes
        if not n:
            continue

        # Sort by confidence
        x = x[x[:, 4].argsort(descending=True)]
        c = x[:, 5] * 0 if agnostic else x[:, 5]  # classes

        boxes = (x[:, :4].clone() + c.view(-1, 1) * max_wh)[:max_box]  # boxes (offset by class), scores
        pred2[xi,:] = torch.cat((boxes, pred2[xi,:]), 0)[:max_box]        # If less than max_box, padding 0.
        pred1[xi,:] = torch.cat((x[:max_box], pred1[xi,:]), 0)[:max_box]

    # Batch mode Weighted Cluster-NMS

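    # jaccard is assumed to return pairwise IoU of shape [batch, max_box, max_box];
    # switch to the 'jaccard_diou' function for Cluster-DIoU-NMS
    iou = jaccard(pred2, pred2).triu_(diagonal=1)
    B = iou
    for i in range(200):
        A = B
        maxA = A.max(dim=1)[0]  # per box: highest IoU with any higher-scoring, still-active box
        E = (maxA <= iou_thres).float().unsqueeze(2).expand_as(A)  # same test as `keep` below
        B = iou.mul(E)  # zero the rows of suppressed boxes
        if A.equal(B):
            break
    keep = (maxA <= iou_thres)
    # Weighted threshold 0.8: only neighbours with IoU > 0.8 contribute to the weighted coordinates
    eye = torch.eye(max_box, device=pred1.device).expand(batch_size, max_box, max_box)
    weights = (B * (B > 0.8) + eye) * (pred1[:, :, 4].reshape((batch_size, 1, max_box)))
    pred1[:, :, :4] = torch.matmul(weights, pred1[:, :, :4]) / weights.sum(2, keepdim=True)  # weighted coordinates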

    for jj in range(batch_size):
        output[jj] = pred1[jj][keep[jj]]

    return output

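Is replacing the function of the same name, non_max_suppression, in YOLOv5's general.py with this code equivalent to using Cluster-NMS? I am using YOLOv5 version 5.0, whose default bounding-box loss is already CIoU. That is my first question.
My second question:

Which parameter in the YOLOv5 code does the parameter circled in the figure correspond to?
My third question:
During YOLOv5 evaluation (i.e. running test.py) I use a batch size of 16 without TTA. Under this setup, would replacing the original YOLOv5 NMS with your Cluster-NMS improve mAP and the other metrics? In your experience, how should the hyperparameters be set for the best results? Thanks.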
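
The snippet above calls `jaccard(pred2, pred2)` without showing it. As a rough sketch only, a batched pairwise IoU with the shape that call appears to expect could look like the following. The name `pairwise_iou_batch` and the epsilon clamp are assumptions made for illustration, not the repository's implementation.

```python
import torch


def pairwise_iou_batch(boxes_a: torch.Tensor, boxes_b: torch.Tensor) -> torch.Tensor:
    """IoU between every pair of boxes, computed per image.

    boxes_a: [batch, n, 4], boxes_b: [batch, m, 4], both (x1, y1, x2, y2).
    Returns a tensor of shape [batch, n, m].
    """
    # Broadcast to [batch, n, m, 2] to get the intersection corners.
    top_left = torch.max(boxes_a[:, :, None, :2], boxes_b[:, None, :, :2])
    bottom_right = torch.min(boxes_a[:, :, None, 2:], boxes_b[:, None, :, 2:])
    wh = (bottom_right - top_left).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]

    area_a = (boxes_a[..., 2] - boxes_a[..., 0]) * (boxes_a[..., 3] - boxes_a[..., 1])
    area_b = (boxes_b[..., 2] - boxes_b[..., 0]) * (boxes_b[..., 3] - boxes_b[..., 1])
    union = area_a[:, :, None] + area_b[:, None, :] - inter

    # Zero-padded rows have zero area; the clamp makes them yield IoU 0 instead of NaN.
    return inter / union.clamp(min=1e-9)


boxes = torch.tensor([[[0.0, 0.0, 10.0, 10.0],
                       [0.0, 0.0, 10.0, 5.0],
                       [20.0, 20.0, 30.0, 30.0],
                       [0.0, 0.0, 0.0, 0.0]]])  # last row mimics zero padding
iou = pairwise_iou_batch(boxes, boxes)
```

The first two boxes overlap on half of the larger one, so their IoU is 0.5, while the padded row contributes zeros rather than NaN.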

Zzh-tju · 2022-03-14T09:40:39Z

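This code is Batch mode Weighted Cluster-NMS. Batch mode means NMS is run once over the whole batch; otherwise you have to loop over each image. Because torchvision NMS is very fast and Batch mode Weighted Cluster-NMS needs some preprocessing, it only outpaces torchvision NMS when TTA is enabled. If you do not want batch mode Cluster-NMS, you can use https://github.com/Zzh-tju/ultralytics-YOLOv3-Cluster-NMS
Whichever you use, note that variable names may differ from the latest YOLOv5, since it is updated constantly, so a simple drop-in replacement may not run.
The weighted threshold is the 0.8 in weights = (B*(B>0.8): only neighbouring boxes whose IoU with a kept box exceeds 0.8 take part in the coordinate weighting.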
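
To make the role of that threshold concrete, here is a minimal single-image sketch of Weighted Cluster-NMS. It follows the same matrix iteration as the batch code in this thread, but the function names `box_iou` and `weighted_cluster_nms` and the parameter name `weight_thres` are hypothetical, chosen for illustration only.

```python
import torch


def box_iou(boxes: torch.Tensor) -> torch.Tensor:
    """Pairwise IoU of [n, 4] boxes in (x1, y1, x2, y2) format."""
    tl = torch.max(boxes[:, None, :2], boxes[None, :, :2])
    br = torch.min(boxes[:, None, 2:], boxes[None, :, 2:])
    wh = (br - tl).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area[:, None] + area[None, :] - inter).clamp(min=1e-9)


def weighted_cluster_nms(boxes, scores, iou_thres=0.6, weight_thres=0.8):
    """Single-image sketch. boxes: [n, 4], scores: [n]. Returns (merged boxes, scores)."""
    n = boxes.shape[0]
    if n == 0:
        return boxes, scores
    order = scores.argsort(descending=True)
    boxes, scores = boxes[order], scores[order]

    # Upper triangle: entry (i, j) is the IoU of box j with a higher-scoring box i.
    iou = box_iou(boxes).triu(diagonal=1)
    B = iou
    for _ in range(n):
        A = B
        max_a = A.max(dim=0).values       # strongest overlap with a still-active box
        E = (max_a <= iou_thres).float()  # 1 where the box is currently kept
        B = iou * E[:, None]              # suppressed boxes lose their rows
        if torch.equal(A, B):
            break
    keep = B.max(dim=0).values <= iou_thres

    # A kept box i is averaged with neighbours j whose IoU with it exceeds
    # weight_thres, weighted by IoU * score; the identity adds the box itself.
    eye = torch.eye(n, dtype=boxes.dtype, device=boxes.device)
    weights = (B * (B > weight_thres) + eye) * scores[None, :]
    merged = weights @ boxes / weights.sum(dim=1, keepdim=True)
    return merged[keep], scores[keep]


boxes = torch.tensor([[0.0, 0.0, 10.0, 10.0],
                      [0.0, 0.0, 10.0, 9.0],
                      [20.0, 20.0, 30.0, 30.0]])
scores = torch.tensor([0.9, 0.6, 0.8])
merged, kept_scores = weighted_cluster_nms(boxes, scores)
```

In this toy input the second box overlaps the first with IoU 0.9, so it is suppressed but still pulls the kept box's bottom edge from 10 towards 9. Raising `weight_thres` above 0.9 would leave the kept box unchanged.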
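According to the YOLOv5 author, YOLOv5's predicted boxes are already accurate enough, so coordinate weighting in NMS, DIoU-NMS and similar techniques may bring limited gains. For speed reasons he therefore keeps torchvision NMS (whose accuracy equals that of Original NMS). My suggestion is that Cluster-NMS and its variants may yield an accuracy gain, but you need to try several parameter settings, such as the weighted threshold and the magnitude of the centre-distance penalty in DIoU-NMS. With batch mode Cluster-NMS you can also tune the max_box parameter to trade speed against accuracy. With TTA enabled, batch mode Cluster-NMS is the fastest. It is also worth noting that if you care about recall, DIoU-NMS will definitely improve it.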
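
The centre-distance penalty mentioned above can be illustrated with a small sketch. It assumes the penalty takes the form IoU minus (d²/c²)^beta, where d is the distance between box centres and c is the diagonal of the smallest enclosing box. The function name `pairwise_diou` and the exponent name `beta` are assumptions for illustration; the thread does not show the `jaccard_diou` function itself.

```python
import torch


def pairwise_diou(boxes: torch.Tensor, beta: float = 1.0) -> torch.Tensor:
    """IoU minus a centre-distance penalty for [n, 4] boxes in (x1, y1, x2, y2)."""
    tl = torch.max(boxes[:, None, :2], boxes[None, :, :2])
    br = torch.min(boxes[:, None, 2:], boxes[None, :, 2:])
    wh = (br - tl).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    iou = inter / (area[:, None] + area[None, :] - inter).clamp(min=1e-9)

    # Squared distance between box centres.
    centres = (boxes[:, :2] + boxes[:, 2:]) / 2
    d2 = ((centres[:, None, :] - centres[None, :, :]) ** 2).sum(-1)

    # Squared diagonal of the smallest box enclosing each pair.
    enc_tl = torch.min(boxes[:, None, :2], boxes[None, :, :2])
    enc_br = torch.max(boxes[:, None, 2:], boxes[None, :, 2:])
    c2 = ((enc_br - enc_tl) ** 2).sum(-1).clamp(min=1e-9)

    # beta is the assumed knob for the penalty magnitude.
    return iou - (d2 / c2) ** beta


boxes = torch.tensor([[0.0, 0.0, 10.0, 10.0],
                      [5.0, 0.0, 15.0, 10.0]])
diou = pairwise_diou(boxes)
diou_strong = pairwise_diou(boxes, beta=0.5)
```

Because the penalty lowers the overlap score of boxes whose centres are far apart, fewer of them cross the suppression threshold, which is consistent with the recall remark above. Since d²/c² is below 1, a smaller `beta` gives a larger penalty.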

yilifan · 2022-06-27T06:44:10Z

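Hello, I have a small question about how the source code computes CIoU. It applies bboxes1 = torch.sigmoid(bboxes1) and bboxes2 = torch.sigmoid(bboxes2), and later computes the width and height as w1 = torch.exp(bboxes1[:, 2]). I do not quite understand why the boxes are activated first. Other IoU implementations I have looked at have no such activation and compute directly on the boxes.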
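
For comparison, the activation-free form that this comment refers to can be sketched as below. It is the standard CIoU definition (IoU minus the normalised centre distance minus the aspect-ratio term) applied to corner-format boxes. It is not the repository's code, and the function name `ciou` and the `eps` value are assumptions.

```python
import math

import torch


def ciou(boxes1: torch.Tensor, boxes2: torch.Tensor, eps: float = 1e-9) -> torch.Tensor:
    """Element-wise CIoU for two [n, 4] tensors of (x1, y1, x2, y2) boxes."""
    w1, h1 = boxes1[:, 2] - boxes1[:, 0], boxes1[:, 3] - boxes1[:, 1]
    w2, h2 = boxes2[:, 2] - boxes2[:, 0], boxes2[:, 3] - boxes2[:, 1]

    # Plain IoU
    inter_w = (torch.min(boxes1[:, 2], boxes2[:, 2]) - torch.max(boxes1[:, 0], boxes2[:, 0])).clamp(min=0)
    inter_h = (torch.min(boxes1[:, 3], boxes2[:, 3]) - torch.max(boxes1[:, 1], boxes2[:, 1])).clamp(min=0)
    inter = inter_w * inter_h
    iou = inter / (w1 * h1 + w2 * h2 - inter + eps)

    # Centre-distance term: squared centre distance over squared enclosing diagonal
    d2 = ((boxes1[:, 0] + boxes1[:, 2] - boxes2[:, 0] - boxes2[:, 2]) ** 2
          + (boxes1[:, 1] + boxes1[:, 3] - boxes2[:, 1] - boxes2[:, 3]) ** 2) / 4
    cw = torch.max(boxes1[:, 2], boxes2[:, 2]) - torch.min(boxes1[:, 0], boxes2[:, 0])
    ch = torch.max(boxes1[:, 3], boxes2[:, 3]) - torch.min(boxes1[:, 1], boxes2[:, 1])
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio term; alpha is treated as a constant when differentiating
    v = (4 / math.pi ** 2) * (torch.atan(w2 / (h2 + eps)) - torch.atan(w1 / (h1 + eps))) ** 2
    with torch.no_grad():
        alpha = v / (1 - iou + v + eps)
    return iou - d2 / c2 - alpha * v


b1 = torch.tensor([[0.0, 0.0, 10.0, 10.0], [0.0, 0.0, 10.0, 10.0]])
b2 = torch.tensor([[0.0, 0.0, 10.0, 10.0], [5.0, 0.0, 15.0, 10.0]])
out = ciou(b1, b2)
```

A loss would typically be `1 - ciou(pred, target)`. Whether boxes need an activation first depends on how the network encodes them, which this sketch does not address.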
