
RetinaNet object detection (take 2) #2784

Merged
merged 41 commits into pytorch:master on Oct 13, 2020

Conversation

@fmassa (Member) commented Oct 10, 2020

This is entirely based on top of the great work from @hgaiser in #1697

I'm creating a new PR because there are some minor things that could be fixed in that PR, but I don't have rights to push to it. To move faster I'm opening a new PR; all of the history is kept here.

Here is the mAP for the uploaded model:

IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.364
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.558
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.383
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.193
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.400
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.490
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.315
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.506
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.558
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.386
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.595
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.699

Hans Gaiser and others added 4 commits October 9, 2020 14:09
@@ -205,7 +205,7 @@ def compute_loss(self, targets, head_outputs, anchors, matched_idxs):
             target_regression = self.box_coder.encode_single(matched_gt_boxes_per_image, anchors_per_image)

             # compute the loss
-            losses.append(det_utils.smooth_l1_loss(
+            losses.append(torch.nn.functional.l1_loss(
Contributor
Why switch it to regular l1 loss?

Member Author

This gives a 1 mAP improvement on the models, and we were still lagging a bit behind on the mAP compared to detectron2 (which has now adopted the L1 loss by default as well, see facebookresearch/detectron2@b0e2687)
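For context, a minimal sketch of the two regression losses being compared; the beta value below is illustrative, not necessarily the one used by the RetinaNet head (the `beta` kwarg on `F.smooth_l1_loss` also requires a recent PyTorch):

```python
import torch
import torch.nn.functional as F

pred = torch.randn(8, 4)    # predicted box regression deltas
target = torch.randn(8, 4)  # encoded ground-truth deltas

# Smooth L1 (Huber-like): quadratic for |error| < beta, linear beyond,
# so it down-weights small errors relative to plain L1.
loss_smooth = F.smooth_l1_loss(pred, target, beta=1.0 / 9, reduction="sum")

# Plain L1, as adopted in this PR (and by detectron2):
loss_l1 = F.l1_loss(pred, target, reduction="sum")
```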

Contributor

Ah nice, I wasn't expecting such a difference. What is the mAP that you are getting now? Also, wow, 37.4 mAP. That's impressive.

@fmassa (Member Author) commented Oct 13, 2020

Here are the mAP scores for the model I'll be uploading, with L1 loss:

IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.364
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.558
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.383
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.193
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.400
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.490
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.315
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.506
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.558
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.386
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.595
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.699

It is still lagging a bit behind D2, but it's closer. We might revisit the models in torchvision in the near future to improve mAP with the latest training tricks.

@@ -100,7 +100,8 @@ def compute_loss(self, targets, head_outputs, matched_idxs):
             foreground_idxs_per_image = matched_idxs_per_image >= 0
             num_foreground = foreground_idxs_per_image.sum()
             # no matched_idxs means there were no annotations in this image
-            if matched_idxs_per_image.numel() == 0:
+            if False:  # matched_idxs_per_image.numel() == 0:
+                # TODO: enable support for images without annotations that works on distributed
Contributor

What is the problem with images without annotations on distributed?

Member Author

There might be cases where on one GPU no images have annotations while on another they do. Part of the computation graph would then not be executed on one GPU, leading to synchronization issues (and even deadlocks).

So for now I'm disabling support for this to move forward, and we will enable it in a later PR (after the release).
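To illustrate the failure mode, a hypothetical sketch (the names `per_image_loss` and `regression_head` are stand-ins, not the real functions): DDP all-reduces gradients per parameter, so if a branch that touches some parameters runs on one rank but not another, the collectives no longer line up.

```python
import torch

def per_image_loss(features, matched_idxs, regression_head):
    # Hypothetical sketch: if this rank has no annotated images, the
    # regression head never runs here and its parameters get no grads...
    foreground = matched_idxs >= 0
    if foreground.sum() == 0:
        return features.sum() * 0
    # ...while ranks that do have annotations execute this branch, so
    # DDP's per-parameter gradient all-reduce fires on some ranks but
    # not others, and the job can hang or deadlock.
    return regression_head(features[foreground]).abs().mean()
```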

Member Author

Note that we are not passing find_unused_parameters to DDP, which you were probably using (because I had to change something else in resnet_fpn_backbone for it to work). With find_unused_parameters=True this might not be an issue, but I prefer to be on the safe side and keep the computation graph the same on every GPU.
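For reference, a minimal sketch of the DDP option being discussed; the model and rank here are illustrative placeholders:

```python
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# Placeholder model and rank; in a real script these come from your
# training setup after torch.distributed.init_process_group().
local_rank = 0
model = nn.Linear(4, 4).to(local_rank)

# find_unused_parameters=True makes DDP detect, each iteration, which
# parameters did not contribute to the loss, so the gradient all-reduce
# does not wait on them -- at some extra traversal cost per step.
model = DDP(model, device_ids=[local_rank], find_unused_parameters=True)
```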

@fmassa fmassa merged commit 5bb81c8 into pytorch:master Oct 13, 2020
@fmassa fmassa deleted the retinanet branch October 13, 2020 11:02
@liminghu

Thanks a lot. Any tutorial on how to train the RetinaNet on COCO or other datasets?

@fmassa (Member Author) commented Nov 12, 2020

@liminghu We have training scripts for all detection models in https://github.com/pytorch/vision/tree/master/references/detection and a finetuning tutorial (for Mask R-CNN) in https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html
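As a quick-start complement to those scripts, a minimal sketch of constructing and running the model added in this PR (the pretrained flag follows torchvision's detection API at the time of this PR):

```python
import torch
import torchvision

# RetinaNet added in this PR, with COCO-pretrained weights.
model = torchvision.models.detection.retinanet_resnet50_fpn(pretrained=True)
model.eval()

# Inference takes a list of CHW float tensors in [0, 1] and returns one
# dict per image with 'boxes', 'scores', and 'labels'.
images = [torch.rand(3, 480, 640)]
predictions = model(images)

# For training, also pass targets (boxes + labels); the model then
# returns a dict of losses ('classification', 'bbox_regression').
```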

bryant1410 pushed a commit to bryant1410/vision-1 that referenced this pull request Nov 22, 2020
* Add rough implementation of RetinaNet.

* Move AnchorGenerator to a separate file.

* Move box similarity to Matcher.

* Expose extra blocks in FPN.

* Expose retinanet in __init__.py.

* Use P6 and P7 in FPN for retinanet.

* Use parameters from retinanet for anchor generation.

* General fixes for retinanet model.

* Implement loss for retinanet heads.

* Output reshaped outputs from retinanet heads.

* Add postprocessing of detections.

* Small fixes.

* Remove unused argument.

* Remove python2 invocation of super.

* Add postprocessing for additional outputs.

* Add missing import of ImageList.

* Remove redundant import.

* Simplify class correction.

* Fix pylint warnings.

* Remove the label adjustment for background class.

* Set default score threshold to 0.05.

* Add weight initialization for regression layer.

* Allow training on images with no annotations.

* Use smooth_l1_loss with beta value.

* Add more typehints for TorchScript conversions.

* Fix linting issues.

* Fix type hints in postprocess_detections.

* Fix type annotations for TorchScript.

* Fix inconsistency with matched_idxs.

* Add retinanet model test.

* Add missing JIT annotations.

* Remove redundant model construction

Make tests pass

* Fix bugs during training on newer PyTorch and unused params in DDP

Needs cleanup and to add back support for images with no annotations

* Cleanup resnet_fpn_backbone

* Use L1 loss for regression

Gives 1mAP improvement over smooth l1

* Disable support for images with no annotations

Need to fix distributed first

* Fix retinanet tests

Need to deduplicate those box checks

* Fix Lint

* Add pretrained model

* Add training info for retinanet

Co-authored-by: Hans Gaiser <[email protected]>
vfdev-5 pushed a commit to Quansight/vision that referenced this pull request Dec 4, 2020