About Complete-IoU Loss and Cluster-NMS for Improving Object Detection and Instance Segmentation #19

Open
Wanghe1997 opened this issue Mar 13, 2022 · 3 comments

Comments

@Wanghe1997

Hello author, can the combination of CIoU and Cluster-NMS from your paper be used in YOLOv5? How should I replace the original torchvision NMS in YOLOv5 with Cluster-NMS? Thank you.

@Wanghe1997
Author

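# For batch mode Cluster-Weighted NMS
# Relies on the helpers xywh2xyxy and jaccard, which are not shown in this snippet.
import time

import torch
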
def non_max_suppression(prediction, conf_thres=0.1, iou_thres=0.6, max_box=1500, merge=False, classes=None, agnostic=False):
    """Performs Non-Maximum Suppression (NMS) on inference results
    Returns:
         detections with shape: nx6 (x1, y1, x2, y2, conf, cls)
    """

    nc = prediction[0].shape[1] - 5  # number of classes
    xc = prediction[..., 4] > conf_thres  # candidates

    # Settings
    min_wh, max_wh = 2, 4096  # (pixels) minimum and maximum box width and height
    max_det = 300  # maximum number of detections per image
    time_limit = 10.0  # seconds to quit after
    redundant = True  # require redundant detections
    multi_label = nc > 1  # multiple labels per box (adds 0.5ms/img)

    t = time.time()
    batch_size = prediction.shape[0]
    output = [None] * batch_size
    # Zero-padded buffers, always max_box rows per image
    pred1 = torch.zeros((batch_size, max_box, 6), device=prediction.device)  # pred1.size()=[batch, max_box, 6] denotes boxes without offset by class
    pred2 = pred1[:, :, :4].clone()  # pred2 denotes boxes with offset by class
    for xi, x in enumerate(prediction):  # image index, image inference
        # Apply constraints
        # x[((x[..., 2:4] < min_wh) | (x[..., 2:4] > max_wh)).any(1), 4] = 0  # width-height
        x = x[xc[xi]]  # confidence

        # If none remain process next image
        if not x.shape[0]:
            continue

        # Compute conf
        x[:, 5:] *= x[:, 4:5]  # conf = obj_conf * cls_conf

        # Box (center x, center y, width, height) to (x1, y1, x2, y2)
        box = xywh2xyxy(x[:, :4])

        # Detections matrix nx6 (xyxy, conf, cls)
        if multi_label:
            i, j = (x[:, 5:] > conf_thres).nonzero(as_tuple=False).T
            x = torch.cat((box[i], x[i, j + 5, None], j[:, None].float()), 1)
        else:  # best class only
            conf, j = x[:, 5:].max(1, keepdim=True)
            x = torch.cat((box, conf, j.float()), 1)[conf.view(-1) > conf_thres]

        # Filter by class
        if classes:
            x = x[(x[:, 5:6] == torch.tensor(classes, device=x.device)).any(1)]

        # Apply finite constraint
        # if not torch.isfinite(x).all():
        #     x = x[torch.isfinite(x).all(1)]

        # If none remain process next image
        n = x.shape[0]  # number of boxes
        if not n:
            continue

        # Sort by confidence
        x = x[x[:, 4].argsort(descending=True)]
        c = x[:, 5] * 0 if agnostic else x[:, 5]  # classes

        boxes = (x[:, :4].clone() + c.view(-1, 1) * max_wh)[:max_box]  # boxes (offset by class), scores
        pred2[xi,:] = torch.cat((boxes, pred2[xi,:]), 0)[:max_box]        # If less than max_box, padding 0.
        pred1[xi,:] = torch.cat((x[:max_box], pred1[xi,:]), 0)[:max_box]

    # Batch mode Weighted Cluster-NMS

    iou = jaccard(pred2, pred2).triu_(diagonal=1)  # switch to 'jaccard_diou' function for using Cluster-DIoU-NMS
    B = iou
    for i in range(200):
        A = B
        maxA = A.max(dim=1)[0]
        E = (maxA < iou_thres).float().unsqueeze(2).expand_as(A)
        B = iou.mul(E)
        if A.equal(B):
            break
    keep = maxA <= iou_thres
    # Only neighbours with IoU > 0.8 (the weighted threshold) take part in the coordinate average
    weights = (B * (B > 0.8) + torch.eye(max_box, device=pred1.device).expand(batch_size, max_box, max_box)) * (pred1[:, :, 4].reshape((batch_size, 1, max_box)))
    pred1[:, :, :4] = torch.matmul(weights, pred1[:, :, :4]) / weights.sum(2, keepdim=True)  # weighted coordinates

    for jj in range(batch_size):
        output[jj] = pred1[jj][keep[jj]]

    return output
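
The function above calls a `jaccard` helper that is not part of the snippet. As a rough idea of what such a helper has to return, here is a minimal sketch of batched pairwise IoU. It assumes xyxy boxes of shape [batch, n, 4]; the `eps` argument and the exact signature are assumptions for illustration, and the repository's own helper may differ.

```python
import torch


def jaccard(box_a: torch.Tensor, box_b: torch.Tensor, eps: float = 1e-9) -> torch.Tensor:
    """Pairwise IoU for batched xyxy boxes.

    box_a: [batch, n, 4], box_b: [batch, m, 4] -> returns [batch, n, m].
    Hypothetical stand-in, not the repository's implementation.
    """
    # Broadcast to [batch, n, m, 2] to get each pair's intersection rectangle
    top_left = torch.max(box_a[:, :, None, :2], box_b[:, None, :, :2])
    bottom_right = torch.min(box_a[:, :, None, 2:], box_b[:, None, :, 2:])
    wh = (bottom_right - top_left).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]

    area_a = ((box_a[..., 2] - box_a[..., 0]) * (box_a[..., 3] - box_a[..., 1]))[:, :, None]
    area_b = ((box_b[..., 2] - box_b[..., 0]) * (box_b[..., 3] - box_b[..., 1]))[:, None, :]
    # eps (an assumption of this sketch) avoids 0/0 for zero-area boxes
    return inter / (area_a + area_b - inter + eps)


# One image, three boxes; the last one is a zero-area padding box
boxes = torch.tensor([[[0.0, 0.0, 2.0, 2.0], [1.0, 0.0, 3.0, 2.0], [0.0, 0.0, 0.0, 0.0]]])
iou = jaccard(boxes, boxes)
print(iou[0])
```

The first two boxes overlap on a 1x2 strip, so their IoU is 2 / (4 + 4 - 2) = 1/3.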

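Author, does replacing the function of the same name, non_max_suppression, in YOLOv5's general.py with this code amount to using Cluster-NMS? I am using YOLOv5 version 5.0, whose default box loss is already CIoU. That is my first question.
My second question:
[image]
Which parameter in the YOLOv5 code does the circled parameter in the image correspond to?
My third question:
During YOLOv5 evaluation (that is, running test.py) I use a batch size of 16 and no TTA. With this setup, will replacing YOLOv5's original NMS with your Cluster-NMS improve mAP and the other metrics? In your experience, how should the hyperparameters be set for the best results? Thank you.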

@Zzh-tju
Owner

Zzh-tju commented Mar 14, 2022

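  1. This code is batch mode Weighted Cluster-NMS. Batch mode means NMS runs once over the whole batch; otherwise each image has to be processed in a loop. Because torchvision NMS is very fast and batch mode Weighted Cluster-NMS needs some preprocessing, batch mode Weighted Cluster-NMS only outpaces torchvision NMS when TTA is enabled. If you do not want batch mode Cluster-NMS, you can use https://github.com/Zzh-tju/ultralytics-YOLOv3-Cluster-NMS
    Whichever you use, note that variable names may differ from the latest YOLOv5, which is updated constantly, so a simple drop-in replacement may not run.

  2. The weighted threshold is the 0.8 in weights = (B*(B>0.8). It means that only neighboring boxes whose IoU with a kept box exceeds 0.8 are selected for coordinate weighting.

  3. According to the YOLOv5 author, YOLOv5's predicted boxes are already accurate enough, so coordinate weighting in NMS, DIoU-NMS and the like may bring only limited gains. For speed reasons he therefore keeps torchvision NMS (whose accuracy equals that of Original NMS). My suggestion is that Cluster-NMS and its variants may improve accuracy, but you need to try several parameter settings, such as the weighted threshold and the strength of the center-distance penalty in DIoU-NMS. With batch mode Cluster-NMS you can also tune the max-box parameter to trade off speed against accuracy. Of course, with TTA enabled, batch mode Cluster-NMS is the fastest. It is also worth noting that if you care about recall, DIoU-NMS is sure to improve it.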
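
To make the role of the weighted threshold concrete, the following is a minimal single-image sketch of Weighted Cluster-NMS, written as a simplified counterpart to the batch version posted above. The names `pairwise_iou`, `weighted_cluster_nms` and `weighted_thres` are hypothetical, the sketch assumes boxes with positive area in xyxy format, and it uses `<=` for both the iteration and the final keep mask, which is a choice of this sketch rather than something taken from the repository.

```python
import torch


def pairwise_iou(boxes: torch.Tensor) -> torch.Tensor:
    """IoU matrix [n, n] for xyxy boxes [n, 4] (assumes positive areas)."""
    tl = torch.max(boxes[:, None, :2], boxes[None, :, :2])
    br = torch.min(boxes[:, None, 2:], boxes[None, :, 2:])
    inter = (br - tl).clamp(min=0).prod(dim=2)
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area[:, None] + area[None, :] - inter)


def weighted_cluster_nms(boxes, scores, iou_thres=0.6, weighted_thres=0.8):
    """Single-image sketch; returns (merged boxes, scores) of the kept detections."""
    n = boxes.shape[0]
    if n == 0:
        return boxes, scores
    order = scores.argsort(descending=True)
    boxes, scores = boxes[order], scores[order]

    # Upper triangle: iou[i, j] is non-zero only when box i outranks box j
    iou = pairwise_iou(boxes).triu(diagonal=1)
    B = iou
    for _ in range(n):
        A = B
        max_a = A.max(dim=0)[0]  # per box: largest overlap with a still-active higher-scored box
        E = (max_a <= iou_thres).float().unsqueeze(1).expand_as(A)
        B = iou * E  # zero the rows of suppressed boxes so they cannot suppress others
        if A.equal(B):
            break
    keep = max_a <= iou_thres

    # A kept box is averaged with the neighbours whose IoU exceeds weighted_thres,
    # each weighted by IoU * score; the box itself gets weight 1 * score
    eye = torch.eye(n, device=boxes.device, dtype=boxes.dtype)
    weights = (B * (B > weighted_thres) + eye) * scores[None, :]
    merged = weights @ boxes / weights.sum(dim=1, keepdim=True)
    return merged[keep], scores[keep]


boxes = torch.tensor([[0.0, 0.0, 10.0, 10.0], [0.0, 0.0, 10.0, 11.0], [20.0, 20.0, 30.0, 30.0]])
scores = torch.tensor([0.9, 0.6, 0.8])
merged, kept_scores = weighted_cluster_nms(boxes, scores)
print(merged, kept_scores)
```

In this toy input the second box overlaps the first with IoU of about 0.91, so it is suppressed, and because that IoU is above `weighted_thres` it also pulls the bottom edge of the kept box slightly toward 11. Raising `weighted_thres` makes fewer neighbours take part in the averaging.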

@yilifan

yilifan commented Jun 27, 2022

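Hello, I have a small question about how the source code computes CIoU. It applies bboxes1 = torch.sigmoid(bboxes1) and bboxes2 = torch.sigmoid(bboxes2), and later computes the width and height with w1 = torch.exp(bboxes1[:, 2]). I do not quite understand why the boxes are passed through an activation first. Other IoU implementations I have looked at do not apply this activation and compute directly on the boxes.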
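
For comparison, here is a minimal sketch of CIoU computed directly on decoded corner boxes, with no activation applied, which is the style of implementation the comment contrasts with. It follows the published CIoU definition (IoU minus a normalized center-distance term minus an aspect-ratio term). The function name `ciou` and the `eps` guard are assumptions of this sketch, and it does not explain why the repository's code applies sigmoid and exp first.

```python
import math

import torch


def ciou(b1: torch.Tensor, b2: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """CIoU between paired xyxy boxes b1[i] and b2[i], both of shape [n, 4]."""
    w1, h1 = b1[:, 2] - b1[:, 0], b1[:, 3] - b1[:, 1]
    w2, h2 = b2[:, 2] - b2[:, 0], b2[:, 3] - b2[:, 1]

    # Plain IoU
    inter_w = (torch.min(b1[:, 2], b2[:, 2]) - torch.max(b1[:, 0], b2[:, 0])).clamp(min=0)
    inter_h = (torch.min(b1[:, 3], b2[:, 3]) - torch.max(b1[:, 1], b2[:, 1])).clamp(min=0)
    inter = inter_w * inter_h
    iou = inter / (w1 * h1 + w2 * h2 - inter + eps)

    # Squared center distance over the squared diagonal of the smallest enclosing box
    cw = torch.max(b1[:, 2], b2[:, 2]) - torch.min(b1[:, 0], b2[:, 0])
    ch = torch.max(b1[:, 3], b2[:, 3]) - torch.min(b1[:, 1], b2[:, 1])
    c2 = cw ** 2 + ch ** 2 + eps
    rho2 = ((b1[:, 0] + b1[:, 2] - b2[:, 0] - b2[:, 2]) ** 2
            + (b1[:, 1] + b1[:, 3] - b2[:, 1] - b2[:, 3]) ** 2) / 4

    # Aspect-ratio consistency term and its trade-off weight
    v = (4 / math.pi ** 2) * (torch.atan(w2 / (h2 + eps)) - torch.atan(w1 / (h1 + eps))) ** 2
    with torch.no_grad():
        alpha = v / (1 - iou + v + eps)
    return iou - rho2 / c2 - alpha * v


pred = torch.tensor([[0.0, 0.0, 2.0, 2.0], [0.0, 0.0, 2.0, 2.0]])
target = torch.tensor([[0.0, 0.0, 2.0, 2.0], [1.0, 1.0, 3.0, 3.0]])
values = ciou(pred, target)
print(values)
```

For the second pair the IoU is 1/7, the center-distance term is 2/18, and the aspect ratios match, so CIoU is 1/7 - 1/9. When used as a loss, the usual form is 1 - CIoU.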
