
Refactor the backbone builders of detection #4656

Merged
merged 8 commits into pytorch:main from models/detection_refactoring on Oct 20, 2021

Conversation

@datumbox (Contributor) commented on Oct 19, 2021

Fixes #4676

Implements some of the same ideas in the detection space. More specifically, we move the CNN backbone initialization into the model builder and pass the backbone directly to the method that implements the feature extractor for the specific task.

This PR will enable the work on #4611. The changes are backwards compatible, though we are forced to update an expected file because of the different order of initialization. See #4656 (comment) for details.

Accuracy remains the same for all pre-trained models.
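
To make the pattern concrete, here is a minimal sketch of the two steps (it mirrors the faster R-CNN hunks reviewed below; _resnet_fpn_extractor is a private helper, so treat the exact arguments as illustrative):

    import torch
    from torchvision.models import resnet50
    from torchvision.models.detection.backbone_utils import _resnet_fpn_extractor
    from torchvision.ops import misc as misc_nn_ops

    # Step 1: the CNN backbone is fully initialized in the model builder.
    backbone = resnet50(pretrained=False, norm_layer=misc_nn_ops.FrozenBatchNorm2d)
    # Step 2: it is handed to the helper that builds the task-specific feature extractor.
    backbone = _resnet_fpn_extractor(backbone, 3)  # 3 trainable layers
    features = backbone(torch.rand(1, 3, 224, 224))  # OrderedDict of FPN feature maps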

@datumbox changed the title from "Refactor the backbone builders of detection" to "[WIP] Refactor the backbone builders of detection" on Oct 19, 2021

@datumbox commented:

It seems that the test for retinanet_resnet50_fpn() is failing. This is not because of a breaking change but because of the different order of initialization.

If we move the extra_block initialization from after the backbone:

    backbone = resnet50(pretrained=pretrained_backbone, progress=progress, norm_layer=misc_nn_ops.FrozenBatchNorm2d)
    backbone = _resnet_fpn_extractor(
        backbone, trainable_backbone_layers, returned_layers=[2, 3, 4], extra_blocks=LastLevelP6P7(256, 256)
    )

to before it, all tests pass:

    extra_block = LastLevelP6P7(256, 256)
    backbone = resnet50(pretrained=pretrained_backbone, progress=progress, norm_layer=misc_nn_ops.FrozenBatchNorm2d)
    backbone = _resnet_fpn_extractor(
        backbone, trainable_backbone_layers, returned_layers=[2, 3, 4], extra_blocks=extra_block
    )

So clearly this is a problem caused by the order of initialization given the seed, so we should be OK to replace the expected file.
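
A minimal, self-contained sketch of the effect (nothing torchvision-specific; the modules are just illustrative):

    import torch
    from torch import nn

    torch.manual_seed(0)
    first = nn.Conv2d(3, 8, 3)       # drawn from the RNG first
    second = nn.Conv2d(3, 8, 3)

    torch.manual_seed(0)
    second_alt = nn.Conv2d(3, 8, 3)  # same modules, created in the opposite order
    first_alt = nn.Conv2d(3, 8, 3)

    # Same seed, different creation order: the per-module weights no longer match.
    print(torch.allclose(first.weight, first_alt.weight))  # False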

@datumbox left a review:

Adding some explanations to assist reviewing:



def _resnet_backbone_config(
def _resnet_fpn_extractor(

Rename for consistency.


def _mobilenet_extractor(
backbone: Union[mobilenet.MobileNetV2, mobilenet.MobileNetV3],

Pass a full pre-initialized MobileNetV* backbone instead of a backbone_name.
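
Schematically, the calling convention changes as below (the argument lists are illustrative only; the point is the shift from a name string to a pre-built module):

    # before (illustrative): the helper received a backbone_name and built the model internally
    # backbone = _mobilenet_extractor("mobilenet_v3_large", ...)

    # after: the model builder constructs the MobileNetV3 itself and hands the module over
    backbone = mobilenet.mobilenet_v3_large(pretrained=pretrained_backbone, norm_layer=norm_layer)
    backbone = _mobilenet_extractor(backbone, ...)  # remaining arguments elided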

@@ -385,7 +388,9 @@ def fasterrcnn_resnet50_fpn(
if pretrained:
# no need to download the backbone if pretrained is set
pretrained_backbone = False
backbone = resnet_fpn_backbone("resnet50", pretrained_backbone, trainable_layers=trainable_backbone_layers)

backbone = resnet50(pretrained=pretrained_backbone, progress=progress, norm_layer=misc_nn_ops.FrozenBatchNorm2d)

The pretrained backbone is always initialized beforehand in the builder.

backbone = resnet_fpn_backbone("resnet50", pretrained_backbone, trainable_layers=trainable_backbone_layers)

backbone = resnet50(pretrained=pretrained_backbone, progress=progress, norm_layer=misc_nn_ops.FrozenBatchNorm2d)
backbone = _resnet_fpn_extractor(backbone, trainable_backbone_layers)

The backbone is then passed to the *_extractor method, which builds the task-specific feature extractor.

extra_blocks=LastLevelP6P7(256, 256),
trainable_layers=trainable_backbone_layers,
backbone = _resnet_fpn_extractor(
backbone, trainable_backbone_layers, returned_layers=[2, 3, 4], extra_blocks=LastLevelP6P7(256, 256)

As commented earlier, because the initialization of the backbone now happens before that of LastLevelP6P7, the expected values of the test need to change. I tried moving the initialization earlier and it works with the previous expected file. So all good.

backbone = vgg.vgg16(pretrained=False, progress=progress)
if pretrained_backbone:
state_dict = load_state_dict_from_url(backbone_urls["vgg16_features"], progress=progress)
backbone.load_state_dict(state_dict)

By initializing the backbone in the builder, the complexity is reduced. Moreover, the multi-pretrained weights project will allow us to remove the if statement and pass the exact weights we want to load directly to vgg16.

@@ -23,7 +23,8 @@
backbone_urls = {
# We port the features of a VGG16 backbone trained by amdegroot because unlike the one on TorchVision, it uses the
# same input standardization method as the paper. Ref: https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth
"vgg16_features": "https://download.pytorch.org/models/vgg16_features-amdegroot.pth"
# Only the `features` weights have proper values, those on the `classifier` module are filled with nans.
"vgg16_features": "https://download.pytorch.org/models/vgg16_features-amdegroot-88682ab5.pth"

The previous weights only contained the values of vgg.features. Now they contain the values of the entire vgg model. Though the size increases, this allows us to load the weights with the standard mechanism instead of relying on hacks (this will become possible in the multi-pretrained weights PR).
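
A short sketch of what the standard mechanism looks like with the new checkpoint (URL taken from the diff above; the assertions just illustrate that, as described in the comment, only the features weights carry real values):

    import torch
    from torch.hub import load_state_dict_from_url
    from torchvision.models import vgg16

    backbone = vgg16(pretrained=False)
    state_dict = load_state_dict_from_url(
        "https://download.pytorch.org/models/vgg16_features-amdegroot-88682ab5.pth", progress=False
    )
    backbone.load_state_dict(state_dict)
    assert not torch.isnan(backbone.features[0].weight).any()  # real convolutional weights
    assert torch.isnan(backbone.classifier[0].weight).all()    # NaN placeholders, as documented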

@@ -183,7 +173,7 @@ def _mobilenet_extractor(
for parameter in b.parameters():
parameter.requires_grad_(False)

return SSDLiteFeatureExtractorMobileNet(backbone, stage_indices[-2], norm_layer, **kwargs)

Safe. No kwargs were used.

@@ -235,14 +225,16 @@ def ssdlite320_mobilenet_v3_large(
if norm_layer is None:
norm_layer = partial(nn.BatchNorm2d, eps=0.001, momentum=0.03)

backbone = mobilenet.mobilenet_v3_large(
pretrained=pretrained_backbone, progress=progress, norm_layer=norm_layer, reduced_tail=reduce_tail, **kwargs

The reduced_tail and kwargs were supposed to be passed to the mobilenet_v3_large builder.

) -> BackboneWithFPN:

backbone = resnet.__dict__[backbone_name](weights=weights, norm_layer=norm_layer)
return _resnet_backbone_config(backbone, trainable_layers, returned_layers, extra_blocks)

We no longer need to copy these files into the prototype area.

@datumbox changed the title from "[WIP] Refactor the backbone builders of detection" to "Refactor the backbone builders of detection" on Oct 19, 2021

@NicolasHug (Member) left a review:

Thanks @datumbox, LGTM! The many comments really help.

> different order of initialization

Do you think this is something worth mentioning in the future release notes, or nah?

@datumbox replied:

@NicolasHug Thanks for the help!

> Do you think this is something worth mentioning in the future release notes, or nah?

Hm... I don't think we have done this before. This is probably because PyTorch doesn't guarantee getting the same random behaviour across versions. @fmassa any thoughts on this based on history?

We have 3 options here:

  1. Say nothing.
  2. Just mention it on the notes.
  3. Ensure that the behaviour is identical to before. For that I just need to initialize the LastLevelP6P7(256, 256) before the backbone and we are good to go.

I don't mind going for any of the above solutions. I'll merge now but happy to follow up with option 3 if needed.

@datumbox commented:

I've tested again all models and their metrics look fine. I'm merging this.

@datumbox merged commit d18c487 into pytorch:main on Oct 20, 2021
@datumbox deleted the models/detection_refactoring branch on October 20, 2021 at 13:17
facebook-github-bot pushed a commit that referenced this pull request Oct 26, 2021
Reviewed By: NicolasHug

Differential Revision: D31916220

fbshipit-source-id: bc8316bab03ed73b32cc78ae520f32938d6746cb
cyyever pushed a commit to cyyever/vision that referenced this pull request Nov 16, 2021
* Refactoring resnet_fpn backbone building.

* Passing the change to *_rcnn and retinanet.

* Applying for faster_rcnn + mobilenetv3

* Applying for ssdlite + mobilenetv3

* Applying for ssd + vgg16

* Update the expected file of retinanet_resnet50_fpn to fix order of initialization.

* Adding full model weights for the VGG16 features.
Development

Successfully merging this pull request may close these issues.

Refactor Models builders to make them reusable on Prototype