Replace MobileNetV3's SqueezeExcitation with EfficientNet's one #4487
Conversation
Force-pushed from 69d462e to e269817
Highlighting some interesting bits of the implementation.
```diff
@@ -107,13 +110,13 @@ def _mobilenet_v3_model(
         torch.quantization.prepare_qat(model, inplace=True)

     if pretrained:
-        _load_weights(arch, model, quant_model_urls.get(arch + '_' + backend, None), progress)
+        _load_weights(arch, model, quant_model_urls.get(arch + '_' + backend, None), progress, False)
```
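For context, here is a hedged sketch of what a `_load_weights` helper with a trailing `strict` argument might look like. Only the function name and the call site come from the diff above; the body is an assumption, not the actual torchvision code:

```python
from torch import nn
from torch.hub import load_state_dict_from_url

def _load_weights(arch: str, model: nn.Module, model_url: str,
                  progress: bool, strict: bool = True) -> None:
    # Sketch: fetch the pretrained checkpoint and load it into the model.
    # strict=False tolerates keys missing from older checkpoints (e.g. the
    # QAT observer state of the new nn.Module-based SE layers).
    if model_url is None:
        raise ValueError(f"No checkpoint is available for {arch}")
    state_dict = load_state_dict_from_url(model_url, progress=progress)
    model.load_state_dict(state_dict, strict=strict)
```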
Earlier versions of the SqueezeExcitation class used F.adaptive_avg_pool2d() and F.hardsigmoid() instead of their nn.Module equivalents. Using the latter is advised because QAT can further optimize them. Loading the old weights is still possible, but the QAT bits of the above two layers will be missing. Passing strict=False allows us to use the previous weights and achieve the same accuracy.
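For illustration, a minimal sketch of the module-based SE block described above, with the pooling and gating layers as nn.Module instances so QAT observers can hook into them. The signature and defaults here are assumptions and may differ from the exact torchvision code:

```python
import torch
from torch import nn

class SqueezeExcitation(nn.Module):
    # Pooling and gating as nn.Module layers (rather than F.adaptive_avg_pool2d
    # and F.hardsigmoid) so that QAT can attach observers to them.
    def __init__(self, input_channels: int, squeeze_channels: int,
                 activation=nn.ReLU, scale_activation=nn.Sigmoid):
        super().__init__()
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.fc1 = nn.Conv2d(input_channels, squeeze_channels, 1)
        self.fc2 = nn.Conv2d(squeeze_channels, input_channels, 1)
        self.activation = activation()
        self.scale_activation = scale_activation()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scale = self.avgpool(x)
        scale = self.fc1(scale)
        scale = self.activation(scale)
        scale = self.fc2(scale)
        scale = self.scale_activation(scale)
        return scale * x
```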
Force-pushed from 8d86c45 to eedb939
LGTM overall. Thanks for working on this!
I only have a question about the backwards compatibility (BC) of the quantizable module.
Thanks for consolidating the SE layers!
Thanks for the PR!
I've left one comment describing what I think would be a better way of handling the BC in the quantized model.
I'm approving the PR now, as I'd be OK merging it as it currently stands.
Force-pushed from b143cf8 to 1851e55
Force-pushed from 1851e55 to 24ce2bd
Replace MobileNetV3's SqueezeExcitation with EfficientNet's one (#4487)

Summary:
* Reuse EfficientNet SE layer.
* Deprecating the mobilenetv3.SqueezeExcitation layer.
* Passing the right activation on quantization.
* Making strict named param.
* Set default params if missing.
* Fixing typos.

Reviewed By: datumbox
Differential Revision: D31270916
fbshipit-source-id: bd10285771f12f61f9b0d0a5487e8ae7aae0a2fc
Replace MobileNetV3's SqueezeExcitation with EfficientNet's one (pytorch#4487)

* Reuse EfficientNet SE layer.
* Deprecating the mobilenetv3.SqueezeExcitation layer.
* Passing the right activation on quantization.
* Making strict named param.
* Set default params if missing.
* Fixing typos.
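To illustrate the "passing the right activation on quantization" bullet above, here is a hedged sketch of a quantizable SE variant building on the SqueezeExcitation sketch earlier in this thread. The class name, the skip_mul attribute, and the exact wiring are assumptions, not the actual torchvision code:

```python
import torch
from torch import nn

class QuantizableSqueezeExcitation(SqueezeExcitation):
    # SqueezeExcitation here refers to the hypothetical sketch shown earlier.
    def __init__(self, *args, **kwargs):
        # MobileNetV3 gates with Hardsigmoid rather than the default Sigmoid,
        # so the right scale activation must be passed through explicitly.
        kwargs.setdefault("scale_activation", nn.Hardsigmoid)
        super().__init__(*args, **kwargs)
        # Route the elementwise multiply through FloatFunctional so that
        # quantization observers can be attached to the multiplication too.
        self.skip_mul = nn.quantized.FloatFunctional()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scale = self.avgpool(x)
        scale = self.fc1(scale)
        scale = self.activation(scale)
        scale = self.fc2(scale)
        scale = self.scale_activation(scale)
        return self.skip_mul.mul(scale, x)
```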
Fixes #4455
Partially resolves #4333
All validation stats of the models remain the same:
* mobilenet_v3_large
* mobilenet_v3_small
* quantized mobilenet_v3_large
* ssd300_vgg16
* lraspp_mobilenet_v3_large