Add RegNet Architecture in TorchVision #4403
Conversation
use_se=False, **kwargs)
return _regnet("regnet_x_32gf", params, pretrained, progress, **kwargs)

# TODO(kazhang): Add RegNetZ_500MF and RegNetZ_4GF
RegNetZ is from the paper Fast and Accurate Model Scaling and is available in Classy Vision.
We would need a ConvBNReLU before the last FC layer. I will address this in the next commit.
Publishing this now to gather early feedback.
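For context, here is a minimal sketch of the kind of head change this refers to (the class name, argument names, and widths are hypothetical, not the actual ClassyVision or torchvision code): an extra Conv-BN-ReLU block sitting between the trunk output and the final FC layer.

```python
import torch
from torch import nn

# Hypothetical sketch of a RegNetZ-style head that inserts a Conv-BN-ReLU
# block before the final FC layer. Channel sizes are illustrative only.
class HeadWithConvBNReLU(nn.Module):
    def __init__(self, trunk_width: int, head_width: int, num_classes: int) -> None:
        super().__init__()
        # 1x1 conv expands the trunk output before global pooling
        self.pre_head = nn.Sequential(
            nn.Conv2d(trunk_width, head_width, kernel_size=1, bias=False),
            nn.BatchNorm2d(head_width),
            nn.ReLU(inplace=True),
        )
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(head_width, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pre_head(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        return self.fc(x)
```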
@kazhang thanks for the contribution!
You didn't specify if you wanted us to focus on the ML side or on the architectural side of the code. Given that in the past you have implemented this model on ClassyVision, below I focus only on high-level comments about the architecture. Hope it's OK.
My comments below can be grouped into 3 categories:
- I highlight whenever you introduce a block for which we already have one or more copies. This is in order to discuss whether we want to share an existing module. In the past we favoured keeping the models with minimal inheritance so that people can copy-paste and edit them easily, but this does not scale very well.
- I let you know when you use a pattern/idiom different from the one usually found in vision. All patterns have pros and cons, your usages are valid, and tastes are highly subjective. So I think this will boil down to the question: "Do we want to keep the entire codebase consistent?"
- General questions / clarifications.
Happy to provide more detailed input once we discuss these. Let me know your thoughts, happy to jump in a call to speed this up as well and later summarize in this PR.
Thanks a ton for the PR, this is looking good!
I did an initial pass and I have a few high-level comments.
Maybe we could have a joint chat with Vasilis to try to converge on some specific points raised by our comments. What do you think?
High level comment for the group - do we want to support the following?
- Fast and Accurate Model Scaling proposes a way to scale models. Once a model is scaled, it cannot be defined by w_a, w_m, etc. anymore and instead needs to be specified by the stage widths, group widths, etc., which means the RegNetParams API does not suffice. We already have models like these being trained currently, with multiple downstream applications (happy to share details internally). In Classy Vision we had an AnyNetParams API for this (https://github.com/facebookresearch/ClassyVision/blob/b9ea86b39cc469867aabb76ada33bf5a8530eb19/classy_vision/models/anynet.py#L364), which RegNetParams derived from.
- The Designing Network Design Spaces paper specified AnyNets as fairly general networks, and RegNetX, RegNetY, RegNetZ are instantiations of these. That is why we have multiple block types - a user can specify very general networks with these settings.

IMO at the very least we should support instantiating scaled models. Supporting general AnyNets might be out of scope for torchvision, I'm not sure (they're quite useful too). Answering these two questions is important for resolving some of the comments in the code currently!
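To make the distinction concrete, here is a rough sketch of the two parameterizations (the class names and fields are hypothetical, not the actual AnyNetParams/RegNetParams definitions): scaled models need the explicit per-stage form, while the generative form only covers regular RegNets.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical illustration only; the real classes live in Classy Vision and
# in this PR, and may differ in names and fields.

@dataclass
class AnyNetStyleParams:
    # Explicit per-stage configuration: this is what a scaled model needs,
    # since its widths no longer follow the generative RegNet rule.
    depths: List[int]                    # number of blocks per stage
    widths: List[int]                    # output width of each stage
    group_widths: List[int]              # group width used by each stage
    bottleneck_multipliers: List[float]
    strides: List[int]

@dataclass
class RegNetStyleParams:
    # Generative parameterization from "Designing Network Design Spaces":
    # per-stage widths are derived from (w_0, w_a, w_m) and the total depth.
    depth: int
    w_0: int
    w_a: float
    w_m: float
    group_width: int
    bottleneck_multiplier: float = 1.0
    se_ratio: Optional[float] = None
```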
cc @pdollar
We could support both scaled and regular RegNet models if we init the … Regarding the various block types, I'm leaning towards not including unused block types. From an engineering point of view, if a component is not actively used within the code base, it is likely to be ignored in regular maintenance and eventually abandoned. Additionally, it's much easier to add a component than to delete one in a BC-compatible manner.
Thanks for the updates @kazhang. I think we are getting very close, I really like the changes.
I left a couple of comments; feel free to ignore the nits, as we can address them optionally in a separate PR. Let me know what you think.
Agree on both counts! Thanks Kai!
LGTM, thanks @kazhang.
The comments I left are optional nits or discussion points, none of them are blocking. IMO you should be able to merge this once you add the weights.
What's the plan for the pre-trained models? There are lots of new variants being added; do you foresee having weights for all of them prior to the release? Alternatively, do you propose releasing the methods without the weights (it's not common, but it has been done before)? Another option is to offer builder methods only for models that have weights.
if block_type is None:
    block_type = ResBottleneckBlock
if activation is None:
    activation = nn.ReLU
I agree with @kazhang's earlier comment that this looks unnecessarily busy. It's true that in most places in vision, the idiom used is to pass None for the default layer and then assign a value. That seems not great IMO.
@fmassa any specific historical reason why we did this? Any concerns with adopting Kai's proposal and doing:
stem_type: Callable[..., nn.Module] = SimpleStemIN,
block_type: Callable[..., nn.Module] = ResBottleneckBlock,
norm_layer: Callable[..., nn.Module] = nn.BatchNorm2d,
activation: Callable[..., nn.Module] = nn.ReLU,
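To illustrate the two idioms under discussion, here is a small sketch (the block names and channel count are made up): the first defaults to None and assigns inside the constructor, the second puts the callables directly in the signature, which is safe because classes like nn.ReLU and nn.BatchNorm2d are immutable defaults.

```python
from functools import partial
from typing import Callable, Optional
from torch import nn

# Idiom currently common in torchvision: default to None, then assign.
class BlockA(nn.Module):
    def __init__(self, norm_layer: Optional[Callable[..., nn.Module]] = None,
                 activation: Optional[Callable[..., nn.Module]] = None) -> None:
        super().__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        if activation is None:
            activation = nn.ReLU
        self.norm = norm_layer(64)   # 64 is an illustrative channel count
        self.act = activation(inplace=True)

# Idiom proposed in this thread: put the callable defaults in the signature.
class BlockB(nn.Module):
    def __init__(self, norm_layer: Callable[..., nn.Module] = nn.BatchNorm2d,
                 activation: Callable[..., nn.Module] = nn.ReLU) -> None:
        super().__init__()
        self.norm = norm_layer(64)
        self.act = activation(inplace=True)

# Either style lets callers customize layers, e.g. with functools.partial:
block = BlockB(norm_layer=partial(nn.GroupNorm, 8))
```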
current_width = width_out

self.trunk_output = nn.Sequential(OrderedDict(blocks))
Nit: In other models this is called features
self.trunk_output = nn.Sequential(OrderedDict(blocks))

self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
self.fc = nn.Linear(in_features=current_width, out_features=num_classes)
Nit: In some models (not all), we have a classifier attribute (see here).
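For reference, a tiny sketch of the naming alternative mentioned in the nit (the class name and widths are made up, and this is not what the PR ended up doing): the pooled features feed a layer exposed as classifier rather than fc.

```python
import torch
from torch import nn

# Illustrative only: a head that exposes `classifier` instead of `fc`,
# as some other torchvision models do. Widths are made up.
class HeadAsClassifier(nn.Module):
    def __init__(self, width: int = 512, num_classes: int = 1000) -> None:
        super().__init__()
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.classifier = nn.Linear(width, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)
```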
docs/source/models.rst
regnet_x_3_2gf = models.regnet_x_3_2gf(pretrained=True)
regnet_x_8gf = models.regnet_x_8gf(pretrained=True)
regnet_x_16gf = models.regnet_x_16gf(pretrained=True)
regnet_x_32gf = models.regnet_x_32gf(pretrained=True)
Before merging, remember to remove from this list all the models for which we don't offer weights. :)
This reverts commit 850f5f3.
@kazhang I did a final check and the PR looks good to go for me. We can follow up on the nits and todos in another PR. Two quick questions:
@datumbox yes,
I've tested the pretrained model on ImageNet:
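The measured numbers are not reproduced here. For readers who want to run a similar check, a generic top-1 accuracy loop might look like the sketch below (the dataset path, batch size, and chosen variant are placeholders; this is not the exact script or result from this PR).

```python
import torch
from torchvision import datasets, models, transforms

# Standard ImageNet validation preprocessing.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# "/path/to/imagenet/val" is a placeholder for a local ImageNet copy.
val_set = datasets.ImageFolder("/path/to/imagenet/val", transform=preprocess)
val_loader = torch.utils.data.DataLoader(val_set, batch_size=64, num_workers=8)

# Pick whichever variant has released weights; the builder name comes from the docs snippet above.
model = models.regnet_x_16gf(pretrained=True).eval()

correct = total = 0
with torch.no_grad():
    for images, targets in val_loader:
        preds = model(images).argmax(dim=1)
        correct += (preds == targets).sum().item()
        total += targets.numel()
print(f"top-1 accuracy: {correct / total:.4f}")
```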
Summary:
* initial code
* add SqueezeExcitation
* initial code
* add SqueezeExcitation
* add SqueezeExcitation
* regnet blocks, stems and model definition
* nit
* add fc layer
* use Callable instead of Enum for block, stem and activation
* add regnet_x and regnet_y model build functions, add docs
* remove unused depth
* use BN/activation constructor and ConvBNActivation
* add expected test pkl files
* allow custom activation in SqueezeExcitation
* use ReLU as the default activation
* initial code
* add SqueezeExcitation
* initial code
* add SqueezeExcitation
* add SqueezeExcitation
* regnet blocks, stems and model definition
* nit
* add fc layer
* use Callable instead of Enum for block, stem and activation
* add regnet_x and regnet_y model build functions, add docs
* remove unused depth
* use BN/activation constructor and ConvBNActivation
* reuse SqueezeExcitation from efficientnet
* refactor RegNetParams into BlockParams
* use nn.init, replace np with torch
* update README
* construct model with stem, block, classifier instances
* Revert "construct model with stem, block, classifier instances" (This reverts commit 850f5f3.)
* remove unused blocks
* support scaled model
* fuse into ConvBNActivation
* make reset_parameters private
* fix type errors
* fix for unit test
* add pretrained weights for 6 variant models, update docs

Reviewed By: prabhat00155, NicolasHug
Differential Revision: D31309546
fbshipit-source-id: 56e01f105279c3d3c5514607c23d2835896b3d03
Resolves #2655
cc @datumbox