Add Quantizable MobilenetV3 architecture for Classification #3323

Merged

merged 10 commits into pytorch:master from mobilenetv3_quantized Feb 2, 2021

Conversation

Contributor

@datumbox datumbox commented Jan 29, 2021

The pre-trained model was trained with:

python -m torch.distributed.launch --nproc_per_node=8 --use_env train_quantization.py\
--model='mobilenet_v3_large' --wd 0.00001 --lr 0.001

Submitted batch job 35496554

Validated with:

python train_quantization.py --device='cpu' --model='mobilenet_v3_large' --test-only

Accuracy metrics (Epoch 89):
Acc@1 73.004 Acc@5 90.858

Speed Benchmark: 0.0162 sec per image on CPU
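For reference, a minimal usage sketch of the resulting model, assuming the entry point mirrors the existing quantized mobilenet_v2 (i.e. torchvision.models.quantization.mobilenet_v3_large with a quantize flag):

import time
import torch
from torchvision.models.quantization import mobilenet_v3_large

# Load the INT8 model with the pre-trained quantized weights (assumed API,
# mirroring the existing quantized mobilenet_v2 entry point).
model = mobilenet_v3_large(pretrained=True, quantize=True)
model.eval()

# Rough per-image CPU latency check on a dummy input.
x = torch.rand(1, 3, 224, 224)
with torch.no_grad():
    start = time.time()
    for _ in range(100):
        model(x)
    print('sec per image:', (time.time() - start) / 100)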

@datumbox datumbox force-pushed the mobilenetv3_quantized branch from 1ab143e to a4ec036 on January 29, 2021 11:26

codecov bot commented Jan 29, 2021

Codecov Report

Merging #3323 (bc27744) into master (859a535) will increase coverage by 0.13%.
The diff coverage is 92.85%.


@@            Coverage Diff             @@
##           master    #3323      +/-   ##
==========================================
+ Coverage   73.90%   74.04%   +0.13%     
==========================================
  Files         104      105       +1     
  Lines        9618     9692      +74     
  Branches     1544     1554      +10     
==========================================
+ Hits         7108     7176      +68     
- Misses       2028     2033       +5     
- Partials      482      483       +1     
Impacted Files                                    Coverage Δ
torchvision/models/quantization/mobilenetv3.py    92.42% <92.42%> (ø)
torchvision/models/mobilenetv3.py                  92.42% <93.33%> (-0.38%) ⬇️
torchvision/models/quantization/mobilenet.py      100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@datumbox datumbox force-pushed the mobilenetv3_quantized branch from 44dd528 to f3ddbf5 on January 29, 2021 12:29
@datumbox datumbox force-pushed the mobilenetv3_quantized branch from f3ddbf5 to 4e03a0b on January 29, 2021 12:32
@datumbox datumbox mentioned this pull request Jan 29, 2021
@datumbox datumbox changed the title [WIP] Add Quantizable MobilenetV3 architecture for Classification Add Quantizable MobilenetV3 architecture for Classification Feb 2, 2021
@datumbox datumbox requested a review from fmassa February 2, 2021 15:31
Member

@fmassa fmassa left a comment

Looks great, thanks a lot!

I have a few questions for @raghuramank100 which are not blocking for merging this PR, but I would love to get his thoughts on a couple of points.

Comment on lines +101 to +107
model.qconfig = torch.quantization.get_default_qat_qconfig(backend)
torch.quantization.prepare_qat(model, inplace=True)

if pretrained:
    _load_weights(arch, model, quant_model_urls.get(arch + '_' + backend, None), progress)

torch.quantization.convert(model, inplace=True)
Member

@raghuramank100 the approach we had to follow here for loading the pre-trained weights for a quantized model is different from what we did for the other models. Would you happen to know why we can't use the previous approach?

Contributor

Can you elaborate on why the other approach didn't work? For QAT, we should load the fp32 model (prior to prepare) and then start training with that.

Contributor Author

For QAT, we should load the fp32 model (prior to prepare) and then start training with that.

We did not face any issues during the training process and what you describe is the approach we used for training the model.

Can you elaborate on why the other approach didn't work?

After the training was completed, we tried to load the weights of the quantized model (key "model_eval" of the checkpoint) to do inference. Unfortunately, doing so leads to extremely low accuracy (less than 1%). We are certain that the weights are loaded onto the model.

As a workaround, we opted for loading the weights of the QAT model (key "model" of the checkpoint) and then converting it. This works fine and gives the same inference accuracy as observed during training.

We are trying to understand what could be the reason for this behaviour. I can provide a demo script if that helps.
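For illustration, here is a rough sketch of that workaround, assuming the checkpoint layout produced by references/classification/train_quantization.py and the quantizable entry point added in this PR (the checkpoint path is hypothetical):

import torch
from torchvision.models.quantization import mobilenet_v3_large

# Hypothetical checkpoint produced by train_quantization.py, with the QAT
# state dict under the "model" key and the converted one under "model_eval".
checkpoint = torch.load('model_89.pth', map_location='cpu')

# Rebuild the float model and prepare it for QAT exactly as during training.
model = mobilenet_v3_large(pretrained=False, quantize=False)
model.fuse_model()
model.qconfig = torch.quantization.get_default_qat_qconfig('qnnpack')
torch.quantization.prepare_qat(model, inplace=True)

# Workaround: load the QAT weights (key "model") and convert afterwards,
# instead of loading "model_eval" into an already converted model.
model.load_state_dict(checkpoint['model'])
model.eval()
torch.quantization.convert(model, inplace=True)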

Model                  Acc@1     Acc@5
====================   =======   =======
MobileNet V2           71.658    90.150
MobileNet V3 Large     73.004    90.858
Member

@raghuramank100 this is ~ 1 acc@1 point drop compared to the fp32 reference. Would you have any tips on how to make this gap smaller?

Contributor Author

@fmassa The non-quantized version of MobileNet V3 Large uses averaging of checkpoints, which I don't do here. That's possibly one of the reasons we get lower accuracy.
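For illustration, a minimal sketch of such a post-training averaging step (the checkpoint files and the averaging window are hypothetical, not the exact recipe used for the fp32 model):

import torch

# Hypothetical list of late-epoch checkpoints to average.
paths = ['model_85.pth', 'model_86.pth', 'model_87.pth', 'model_88.pth', 'model_89.pth']
state_dicts = [torch.load(p, map_location='cpu')['model'] for p in paths]

# Average each parameter/buffer element-wise across the checkpoints.
averaged = {}
for key in state_dicts[0]:
    stacked = torch.stack([sd[key].float() for sd in state_dicts])
    averaged[key] = stacked.mean(dim=0).to(state_dicts[0][key].dtype)

torch.save({'model': averaged}, 'model_averaged.pth')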

Contributor

If you start quantization aware training from the averaged checkpoint, you should get better accuracy as the starting point is better.

Contributor

@raghuramank100 raghuramank100 Feb 3, 2021

Also, one additional hyper-parameter that helps is to turn on QAT in steps: we first turn observers on (i.e. collect statistics), then turn fake-quantization on, and after some time we turn batch norm off. Currently, in train_quantization, steps 1 and 2 are combined. We have seen that separating them helps with QAT accuracy in some models. You could try something like:

# Initially only turn on observers, disable fake quant
model.apply(torch.quantization.enable_observer)
model.apply(torch.quantization.disable_fake_quant)
...

# Inside the training loop:
if epoch >= args.num_fake_quant_start_epochs:
    model.apply(torch.quantization.enable_fake_quant)
if epoch >= args.num_observer_update_epochs:
    print('Disabling observer for subseq epochs, epoch = ', epoch)
    model.apply(torch.quantization.disable_observer)
if epoch >= args.num_batch_norm_update_epochs:
    print('Freezing BN for subseq epochs, epoch = ', epoch)
    model.apply(torch.nn.intrinsic.qat.freeze_bn_stats)

Contributor Author

If you start quantization aware training from the averaged checkpoint, you should get better accuracy as the starting point is better.

We indeed start from an averaged checkpoint but that's not what I mean here. I'm referring to the post-training averaging step which is missing.

We first turn observers on (i.e. collect statistics) and then turn fake-quantization on.

That's worth integrating into the new quant training script.

I believe the key reason why the accuracy is lagging is that the quant training script does not currently support all the enhancements made to the classification training script. These enhancements (multiple restarts, optimizer tuning, data augmentation, model averaging at the end, etc.) helped me push the accuracy up by 2 points.

Member

fmassa commented Feb 2, 2021

BTW, test failures seem to be related

Contributor Author

datumbox commented Feb 2, 2021

@fmassa Thanks for flagging. I fixed the issues and did 1 epoch of training plus 1 validation run to ensure everything still works fine. The rest of the failing tests are not related to this PR, so we should be OK.

@datumbox datumbox merged commit 8317295 into pytorch:master Feb 2, 2021
@datumbox datumbox deleted the mobilenetv3_quantized branch February 2, 2021 17:36
backend = 'qnnpack'

model.fuse_model()
model.qconfig = torch.quantization.get_default_qat_qconfig(backend)
Contributor

One suggestion for improvement is to change the qconfig to the following. This configuration uses per-channel quantization for weights with qnnpack, which is now supported:

from torch.quantization import (QConfig, FakeQuantize, MovingAverageMinMaxObserver,
                                default_per_channel_weight_fake_quant)

qconfig = QConfig(
    activation=FakeQuantize.with_args(
        observer=MovingAverageMinMaxObserver,
        quant_min=0,
        quant_max=255,
        reduce_range=False),
    weight=default_per_channel_weight_fake_quant)
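For context, this qconfig would be assigned in place of the get_default_qat_qconfig call shown above (e.g. model.qconfig = qconfig before torch.quantization.prepare_qat(model, inplace=True)).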

facebook-github-bot pushed a commit that referenced this pull request Feb 4, 2021
Summary:
* Refactoring mobilenetv3 to make code reusable.

* Adding quantizable MobileNetV3 architecture.

* Fix bug on reference script.

* Moving documentation of quantized models in the right place.

* Update documentation.

* Workaround for loading correct weights of quant model.

* Update weight URL and readme.

* Adding eval.

Reviewed By: datumbox

Differential Revision: D26226613

fbshipit-source-id: 050d53d91abf68975f2dc3ede8db633a08b33a25