[Feature] Refactor Estimator for computing FLOPs/Params/Latency. #230

Merged
merged 22 commits into from
Aug 23, 2022

gaoyang07
Contributor

@gaoyang07 gaoyang07 commented Aug 15, 2022

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry: just make the pull request and seek help from maintainers.

Motivation

Refactor Estimator for computing FLOPs/Params/Latency.

Modification

  1. Add ResourceEstimator to estimate model resources.
  2. Refactor mmcv.flops_counter as flops_params_counter.
  3. Add latency_counter.
  4. Add counters for common ops, e.g. ConvCounter.
  5. Add EstimateResourcesHook.
  6. Add UT for flops_params_counter & ResourceEstimator.
  7. Remove old FlopsEstimator in mmrazor.
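
As a rough illustration of what an op counter such as ConvCounter might compute, the sketch below estimates multiply-accumulate operations and parameters for a 2D convolution from its shape. The function name `conv2d_cost` and its arguments are hypothetical and are not taken from this PR; the formula is the usual one for a dense convolution, and whether a tool reports the result as FLOPs or MACs is a convention that varies.

```python
from math import prod


def conv2d_cost(in_channels, out_channels, kernel_size, output_size,
                groups=1, bias=True, batch_size=1):
    """Return (macs, params) for a 2D convolution.

    Hypothetical helper for illustration only; not part of mmrazor.
    """
    # Each output element needs one multiply-accumulate per kernel weight
    # connected to it: k_h * k_w * (in_channels / groups).
    macs_per_output = prod(kernel_size) * (in_channels // groups)
    num_outputs = batch_size * out_channels * prod(output_size)
    macs = macs_per_output * num_outputs
    if bias:
        # One extra addition per output element.
        macs += num_outputs

    params = out_channels * (in_channels // groups) * prod(kernel_size)
    if bias:
        params += out_channels
    return macs, params


# A 7x7 convolution from 3 to 64 channels producing a 112x112 output map.
macs, params = conv2d_cost(3, 64, (7, 7), (112, 112), bias=False)
print(macs, params)  # 118013952 9408
```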

BC-breaking (Optional)

Does the modification introduce changes that break the backward compatibility of the downstream repositories?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here and update the documentation.

Checklist

Before PR:

  • Pre-commit or other linting tools are used to fix the potential lint issues.
  • Bug fixes are fully covered by unit tests, the case that causes the bug should be added in the unit tests.
  • The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness.
  • The documentation has been modified accordingly, like docstring or example tutorials.

After PR:

  • If the modification has potential influence on downstream or other related projects, this PR should be tested with those projects, like MMDet or MMSeg.
  • CLA has been signed and all committers have signed the CLA in this PR.

1. add EvaluatorLoop in engine.runners;
2. add estimator for structures (both subnet & supernet);
3. add layer_counter for each op.
@codecov

codecov bot commented Aug 15, 2022

Codecov Report

Merging #230 (77e8095) into dev-1.x (57aec1f) will decrease coverage by 0.03%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           dev-1.x    #230      +/-   ##
==========================================
- Coverage     0.48%   0.44%   -0.04%     
==========================================
  Files          144     159      +15     
  Lines         5943    6454     +511     
  Branches       959    1059     +100     
==========================================
  Hits            29      29              
- Misses        5909    6420     +511     
  Partials         5       5              
Flag Coverage Δ
unittests 0.44% <0.00%> (-0.04%) ⬇️

Impacted Files Coverage Δ
mmrazor/engine/__init__.py 0.00% <0.00%> (ø)
mmrazor/engine/hooks/__init__.py 0.00% <0.00%> (ø)
mmrazor/engine/hooks/estimate_resources_hook.py 0.00% <0.00%> (ø)
mmrazor/engine/runner/autoslim_val_loop.py 0.00% <0.00%> (ø)
mmrazor/engine/runner/evolution_search_loop.py 0.00% <0.00%> (ø)
mmrazor/engine/runner/slimmable_val_loop.py 0.00% <0.00%> (ø)
mmrazor/engine/runner/subnet_sampler_loop.py 0.00% <0.00%> (ø)
mmrazor/models/__init__.py 0.00% <0.00%> (ø)
mmrazor/models/task_modules/__init__.py 0.00% <0.00%> (ø)
mmrazor/models/task_modules/estimators/__init__.py 0.00% <0.00%> (ø)
... and 17 more

@@ -111,3 +111,6 @@ def build_razor_model_from_cfg(
VISUALIZERS = Registry('visualizer', parent=MMENGINE_VISUALIZERS)
# manage visualizer backend
VISBACKENDS = Registry('vis_backend', parent=MMENGINE_VISBACKENDS)

ESTIMATOR = Registry('estimator')
Contributor

ESTIMATOR -> ESTIMATORS

Contributor Author

Done.

if (i + 1) == max_iter:
    fps = (i + 1 - num_warmup) / pure_inf_time
    if PRINT:
        print(
Contributor

Use the logger instead of print, at the debug level.

Contributor Author

Done.

mean_times_pre_image_ = sum(times_pre_image_list_) / len(
    times_pre_image_list_)
if PRINT:
    print(
Contributor

Use the logger instead of print, at the debug level.

Contributor Author

Done.
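
The two logging suggestions above can be sketched with the standard `logging` module. This is a minimal stand-in, not the PR's latency counter: the function `measure_fps`, the logger name, and the `run_once` callable are assumptions made for illustration; only the warm-up arithmetic mirrors the quoted hunk.

```python
import logging
import time

logger = logging.getLogger('latency_sketch')  # hypothetical logger name


def measure_fps(run_once, max_iter=20, num_warmup=5):
    """Time `run_once` and return iterations per second after warm-up."""
    pure_inf_time = 0.0
    fps = 0.0
    for i in range(max_iter):
        start = time.perf_counter()
        run_once()
        elapsed = time.perf_counter() - start

        # Skip the first `num_warmup` iterations, which tend to be slower.
        if i >= num_warmup:
            pure_inf_time += elapsed

        if (i + 1) == max_iter:
            fps = (i + 1 - num_warmup) / pure_inf_time
            # Debug-level record: silent unless the logger is configured
            # to show DEBUG messages, so no PRINT flag is needed.
            logger.debug('Overall fps: %.1f', fps)
    return fps


if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG)
    measure_fps(lambda: time.sleep(0.001), max_iter=10, num_warmup=2)
```

With the logger left at its default level the function stays quiet, which is the practical difference from guarding `print` calls with a module-level flag.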


@ESTIMATOR.register_module()
class BaseEstimator(metaclass=ABCMeta):
    """Evaluator for calculating the accuracy and resources consume. Accuracy
Contributor

update docstring, including necessary Notes.

Contributor Author

Add docstring in ResourceEstimator, showing 3 cases when using it.

        self.units = units
        self.disabled_counters = disabled_counters

    def evaluate(
Contributor

evaluate -> estimate

Contributor Author

Done.

@gaoyang07 gaoyang07 mentioned this pull request Aug 15, 2022
6 tasks
gaoyang07 added 2 commits August 15, 2022 17:38
1. add ResourceEstimator based on BaseEstimator;
2. add notes & examples for ResourceEstimator & EvaluatorLoop usage;
3. fix a bug of latency test.
4. minor changes according to comments.
        return resource_results

    def export_subnet(self, model):
        """Export current best subnet."""
Contributor

update docstring

Contributor Author

Done.



@LOOPS.register_module()
class EvaluatorLoop(ValLoop):
Contributor

-> ResourceEvaluatorLoop would be better?

Contributor Author

Shall the file name be changed?

Contributor

@sunnyxiaohu sunnyxiaohu Aug 17, 2022

Yes, and so should the related UTs.

Contributor Author

Done.


        return resource_results

    def export_subnet(self, model):
Contributor

Is export_subnet suitable for all NAS algorithms?

Contributor Author

This method is called for NAS algorithms that require building a supernet for training. For those algorithms, measuring the subnet's resources during validation is more meaningful than measuring the supernet's, so this method is needed to get the currently searched subnet from the supernet.

@@ -93,12 +93,12 @@ class FlopsEstimator:
    def get_model_complexity_info(
        model: Module,
        fix_mutable: Optional[ValidFixMutable] = None,
        input_shape: Iterable[int] = (3, 224, 224),
Contributor

delete the directory: subnet/estimators and update the corresponding refs.

Contributor Author

Done.

{'flops': 1.0, 'params': 0.7, 'latency': 0.0}

>>> # calculate mmrazor.model flops
NOTE: check 'EvaluatorLoop' in engine.runner.evaluator_val_loop
Contributor

add more details for disabled_counters

Contributor Author

Done.

from abc import ABCMeta, abstractclassmethod


class BaseCounter(object, metaclass=ABCMeta):
Contributor

Point out that XXModuleCounter is responsible for XXModule; see flops_params_counter::get_counter_type().

Contributor Author

Done.
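
The naming convention discussed in this thread (XXModuleCounter is responsible for XXModule) can be sketched as follows. The body of `get_counter_type` is an assumption, since the thread names the function but does not show it, and the module and counter classes below are empty stand-ins rather than the PR's classes.

```python
class Conv2d:
    """Stand-in for a framework module such as a 2D convolution."""


class Linear:
    """Stand-in for a fully connected layer."""


class Conv2dCounter:
    """Counter responsible for Conv2d modules."""


class LinearCounter:
    """Counter responsible for Linear modules."""


# Hypothetical lookup table keyed by counter class name.
COUNTERS = {cls.__name__: cls for cls in (Conv2dCounter, LinearCounter)}


def get_counter_type(module):
    """Map a module instance to its counter's name by convention."""
    return type(module).__name__ + 'Counter'


def find_counter(module):
    """Return the counter class for `module`, or None if unsupported."""
    return COUNTERS.get(get_counter_type(module))


print(get_counter_type(Conv2d()))   # Conv2dCounter
print(find_counter(Linear()))       # the LinearCounter class
print(find_counter(object()))       # None: no 'objectCounter' is registered
```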

Collaborator

@humu789 humu789 left a comment

  1. It seems to lack support for counting FLOPs within a specified scope.
  2. Better not to add new registries without parents; they cannot be used by other OpenMMLab repos. You can use TASK_UTILS directly instead of estimator and op_counter.
  3. estimator/ does not belong under structures/. Suggested location: mmrazor/models/task_modules/.
  4. The file structure of estimator/ could be optimized. Suggestion:
    a. add counters/ in estimator
    b. move flops_params_counter.py, latency.py, op_spec_counters/ to counters/
    c. rename latency.py to latency_counter.py
    d. rename estimator/ to estimators/

@@ -111,3 +111,6 @@ def build_razor_model_from_cfg(
VISUALIZERS = Registry('visualizer', parent=MMENGINE_VISUALIZERS)
# manage visualizer backend
VISBACKENDS = Registry('vis_backend', parent=MMENGINE_VISBACKENDS)

ESTIMATORS = Registry('estimator')
OP_SPEC_COUNTERS = Registry('op_counter')
Collaborator

Better not to add new registries without parents; they cannot be used by other OpenMMLab repos. You can use TASK_UTILS directly instead of 'estimator' and 'op_counter'.

Contributor Author

Done.
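
To illustrate why a registry without a parent is isolated, here is a toy registry written from scratch. It is not mmengine's Registry and ignores scopes entirely; `ToyRegistry`, `ROOT_TASK_UTILS`, and `SharedHelper` are hypothetical names. It only shows that a lookup can walk up to a shared root when a parent is set, and cannot when it is not.

```python
class ToyRegistry:
    """Minimal stand-in for a registry with an optional parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self._modules = {}

    def register_module(self):
        """Return a class decorator that records the class by name."""
        def decorator(cls):
            self._modules[cls.__name__] = cls
            return cls
        return decorator

    def get(self, key):
        """Look up `key` here first, then in each ancestor in turn."""
        registry = self
        while registry is not None:
            if key in registry._modules:
                return registry._modules[key]
            registry = registry.parent
        return None


ROOT_TASK_UTILS = ToyRegistry('task util')                     # shared root
TASK_UTILS = ToyRegistry('task util', parent=ROOT_TASK_UTILS)  # has a parent
ESTIMATORS = ToyRegistry('estimator')                          # orphan


@ROOT_TASK_UTILS.register_module()
class SharedHelper:
    """Something registered once in the shared root."""


@TASK_UTILS.register_module()
class ResourceEstimator:
    """Registered in the child registry."""


print(TASK_UTILS.get('SharedHelper') is SharedHelper)  # True
print(ESTIMATORS.get('SharedHelper'))                  # None
```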


>>> # calculate resources of mmrazor.models
NOTE: check 'ResourceEvaluatorLoop' in
engine.runner.resource_evaluator_val_loop for more details.
Collaborator

engine.runner.resource_evaluator_val_loop -> mmrazor.engine.runner.resource_evaluator_val_loop
to avoid ambiguity

Contributor Author

Done.



@LOOPS.register_module()
class ResourceEvaluatorLoop(ValLoop):
Collaborator

This loop is meant for a specific algorithm, so it would be better to name it after that algorithm. Otherwise it is easy to mistake ResourceEvaluatorLoop for a universal loop.

Collaborator

ResourceEvaluatorLoop looks like it could be replaced with a Hook, so we would not need to maintain a separate val loop.

Contributor Author

Made this part a hook. Done.

@gaoyang07
Contributor Author

Counting FLOPs within a specified scope is now supported. Users must provide a list of scope names.

copied_model = copy.deepcopy(self.model)
load_fix_subnet(copied_model, fix_mutable)

estimator = ResourceEstimator()
Contributor

Use estimator_cfg to build ResourceEstimator, and do not pass input_shape as a fixed kwarg to ::estimate().

Contributor Author

Done.

copied_model = copy.deepcopy(self.model)
load_fix_subnet(copied_model, fix_mutable)

estimator = ResourceEstimator()
Contributor

Use estimator_cfg to build ResourceEstimator, and do not pass input_shape as a fixed kwarg to ::estimate().

Contributor Author

Done.


def get_model_complexity_info(model,
                              input_shape,
                              spec_modules=[],
Contributor

spec_modules -> custom_keys to support prefixes; see mmcv::mmcv/runner/optimizer/default_constructor.py for reference.

Contributor Author

Now support counting flops with a specified scope, e.g. spec_modules = ['backbone']
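
A minimal sketch of scope-based counting, assuming per-module FLOPs are already available as a mapping from dotted module names to numbers. The helper `flops_by_scope` and the sample data are hypothetical; the mapping is assumed to hold leaf modules only, so that a parent and its children are not both counted.

```python
def flops_by_scope(per_module_flops, spec_modules):
    """Sum FLOPs of modules whose dotted name falls under each scope."""
    results = {scope: 0 for scope in spec_modules}
    for name, flops in per_module_flops.items():
        for scope in spec_modules:
            # Match the scope itself or any of its descendants, so that
            # 'backbone' does not accidentally match 'backbone_aux'.
            if name == scope or name.startswith(scope + '.'):
                results[scope] += flops
    return results


# Hypothetical leaf-module FLOPs keyed by dotted module name.
per_module_flops = {
    'backbone.conv1': 120,
    'backbone.layer1.conv': 300,
    'backbone_aux.conv': 50,
    'head.fc': 10,
}

print(flops_by_scope(per_module_flops, ['backbone', 'head']))
# {'backbone': 420, 'head': 10}
```

Matching on `scope + '.'` rather than a bare string prefix is the design choice worth noting: it keeps sibling modules with similar names out of the total.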

if len(spec_modules):
    spec_modules_resources = dict()
    accumulate_sub_module_flops_params(flops_params_model)
    for name, module in flops_params_model.architecture.named_modules():
Contributor

Not every flops_params_model has the architecture attribute.

Contributor Author

Done.
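
One common way to handle the concern raised above is to fall back to the model itself when no `architecture` attribute exists. The sketch below uses toy classes in place of real models, and `modules_to_scan` is a hypothetical helper; how the PR actually resolved this is not shown in the thread.

```python
class PlainModel:
    """Toy model exposing named_modules() directly."""

    def named_modules(self):
        return [('', self), ('conv', 'conv-module')]


class WrappedAlgorithm:
    """Toy wrapper that keeps the real model in `.architecture`."""

    def __init__(self, architecture):
        self.architecture = architecture


def modules_to_scan(model):
    """Return named modules, unwrapping `.architecture` when present."""
    target = getattr(model, 'architecture', model)
    return list(target.named_modules())


plain = PlainModel()
wrapped = WrappedAlgorithm(plain)
# Both calls walk the same underlying model.
print([name for name, _ in modules_to_scan(plain)])    # ['', 'conv']
print([name for name, _ in modules_to_scan(wrapped)])  # ['', 'conv']
```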

precision=precision)) + ' ' + units + 'FLOPs'
params_string = str(
params_units_convert(
accumulated_num_params, units='M',
Contributor

Unify accumulated_flops_cost and accumulated_num_params with units

Contributor Author

A unit pair with FLOPs as 'G' and params as 'M' may be better.

import sys
from functools import partial

import torch
Contributor

Unify units for params and flops under the scope of flops_params_counter.py

Contributor Author

Removed the unit conversion in accumulate_sub_module_flops_params; the rest remains as in the original version.
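
The unit discussion in the last two threads (FLOPs in 'G', params in 'M') can be sketched as one conversion helper applied to both quantities. The names `UNIT_SCALE`, `convert_units`, and `format_resources` are hypothetical and are not the PR's functions; the sample numbers are made up for the demonstration.

```python
# Hypothetical scale table shared by FLOPs and parameter counts.
UNIT_SCALE = {'K': 1e3, 'M': 1e6, 'G': 1e9}


def convert_units(value, units, precision=2):
    """Scale a raw count into the requested unit and round it."""
    if units not in UNIT_SCALE:
        raise ValueError(f'Unsupported units: {units!r}')
    return round(value / UNIT_SCALE[units], precision)


def format_resources(flops, params, flops_units='G', params_units='M',
                     precision=2):
    """Format raw FLOPs and parameter counts as display strings."""
    flops_value = convert_units(flops, flops_units, precision)
    params_value = convert_units(params, params_units, precision)
    return (f'{flops_value} {flops_units}FLOPs',
            f'{params_value} {params_units}')


print(format_resources(4_120_000_000, 25_560_000))
# ('4.12 GFLOPs', '25.56 M')
```

Keeping raw counts until the final formatting step, as the author's reply describes for accumulate_sub_module_flops_params, avoids rounding the same number more than once.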

@sunnyxiaohu sunnyxiaohu merged commit 4b3f8ab into open-mmlab:dev-1.x Aug 23, 2022
humu789 pushed a commit to humu789/mmrazor that referenced this pull request Feb 13, 2023