[Feature] Add EvalHook which will be used in downstream projects #739
Conversation
It cannot be merged until #695 is merged, since it uses …
A related discussion: open-mmlab/mmaction2#395 (comment). Maybe we need to create rather than symlink …
How about adding an option for such a case? By default, we create a symlink, but in some cases we can `cp` a new ckpt. We should also ensure that only one symlink is created over the whole training run.
Now I think open-mmlab/mmaction2#395 is a good proposal, which copies the ckpt. Since the mmcv hook has …
I am OK with that.
Please enrich the PR message for this PR, as this is a big refactoring for downstream tasks, to allow more discussion and for future reference, including its intention, modification, and consequences.
LGTM if the comments can all be resolved.
Done
See if @hellock has any comments.
Validation has been done in mmcls and the performance remains the same as before. Thank you.
Validation has been successfully done in mmdet3d with mmdet's PR: open-mmlab/mmdetection#4806 |
Have tested EvalHook & DistEvalHook on MMAction2.
The related PR "Use mmcv eval hook in mmaction2" will be raised after the release of the next version of mmcv.
Validated in MMSeg. Thx!
This PR is merged as many downstream repos have verified its correctness. Thanks for the efforts of all.
Overview
This PR adds the `mmcv/runner/hooks/eval.py` file and `EvalHook` and `DistEvalHook` to MMCV, together with the related unit tests. Since these two hooks have been widely used in many MM repos, this PR implements and unifies the common functions used in the evaluation part.
Design
Model evaluation in OpenMMLab uses `EvalHook` and `DistEvalHook` during training. In detail, these hooks are commonly registered in `apis/train.py` by `runner.register_hook(eval_hook(val_dataloader, **eval_cfg))`. Users set up evaluation with the following steps: build the validation dataset from `cfg.data.val`, build `val_dataloader`, and register `EvalHook` using `runner.register_hook`, as shown in the sketch below. Once it is registered, the runner periodically calls the evaluate function to perform evaluation in a fixed mode (by epoch or by iteration, etc.). The high-level workflow of `EvalHook` in OpenMMLab is: register `EvalHook` -> define the evaluation setting (`start`, `interval`, `by_epoch`, `save_best`, comparison rules, etc.) -> perform evaluation in `after_train_iter` or `after_train_epoch` -> (save the best checkpoint) -> loop back to `after_train_xxx`.
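For illustration, a minimal sketch of how a downstream `apis/train.py` might do this registration; `build_dataset`/`build_dataloader` and the `validate`/`distributed` flags are downstream-repo conventions assumed here, not part of this PR:

```python
# Hypothetical registration flow, modeled on a downstream apis/train.py.
from mmcv.runner import DistEvalHook, EvalHook

if validate:  # typically set by the --validate command-line flag
    # build_dataset/build_dataloader are the downstream repo's own helpers.
    val_dataset = build_dataset(cfg.data.val, dict(test_mode=True))
    val_dataloader = build_dataloader(
        val_dataset,
        samples_per_gpu=1,
        workers_per_gpu=cfg.data.workers_per_gpu,
        dist=distributed,
        shuffle=False)
    eval_cfg = cfg.get('evaluation', {})
    eval_hook = DistEvalHook if distributed else EvalHook
    runner.register_hook(eval_hook(val_dataloader, **eval_cfg))
```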
APIs
`EvalHook` is the base module, and `DistEvalHook` is a child class of it. They both inherit `Hook`. Here is the API explanation for `EvalHook`.
Initialization
- `dataloader`: The PyTorch dataloader for evaluation. It can be built with `build_dataloader` from the provided dataloader setting.
- `start`: Evaluation starting epoch. It enables evaluation before training starts if `start` <= the resuming epoch. If set to `None`, whether to evaluate is decided by `interval` alone.
- `interval`: Evaluation interval.
- `by_epoch`: Determines whether to perform evaluation by epoch or by iteration. If set to `True`, evaluation is performed by epoch; otherwise, by iteration.
- `save_best`: If a metric is specified, the best checkpoint will be tracked during evaluation, and information about it will be saved in `runner.meta['hook_msgs']`. Options are the evaluation metrics of the validation dataset, e.g., `bbox_mAP` and `segm_mAP` for bbox detection and instance segmentation, or `AR@100` for proposal recall. If `save_best` is `auto`, the first key of the returned `OrderedDict` result will be used. The interval of `CheckpointHook` should be divisible by that of `EvalHook`.
- `rule`: Comparison rule for the best score. If set to `None`, a reasonable rule will be inferred: keys such as `acc`, `top`, etc. are inferred with the `greater` rule, while keys containing `loss` are inferred with the `less` rule. Options are `greater`, `less`, and `None`. Note: since the rules differ across downstream repos, they may need to overwrite `self.greater_keys` and `self.less_keys`.
- `eval_kwargs`: Keyword arguments for the dataset's evaluate function (`def evaluate`), which will be fed into the evaluate function of the dataset.
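As a hedged illustration of these arguments (the dataloader and the `bbox_mAP`/`bbox` metric names are placeholders borrowed from a detection-style task):

```python
from mmcv.runner import EvalHook

# Evaluate once per epoch and keep the checkpoint with the highest
# bbox_mAP; `metric='bbox'` is forwarded to dataset.evaluate() through
# **eval_kwargs.
eval_hook = EvalHook(
    val_dataloader,       # a torch DataLoader built beforehand
    start=None,           # timing is decided by `interval` alone
    interval=1,
    by_epoch=True,
    save_best='bbox_mAP',
    rule='greater',       # could be omitted: a 'mAP' key infers `greater`
    metric='bbox')
runner.register_hook(eval_hook)
```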
Other hardcoded inner variables: `rule_map`, `init_value_map`, `greater_keys` and `less_keys`. Note: since the rules differ across downstream repos, these variables may need to be overwritten for the specific task.
- `rule_map`: A dict containing the comparison functions, defaulting to `{'greater': lambda x, y: x > y, 'less': lambda x, y: x < y}`.
- `init_value_map`: The initial values for comparison, defaulting to `{'greater': -inf, 'less': inf}`.
- `greater_keys`: A list of rule keys handled by the `greater` function, meaning that for these keys `EvalHook` regards a greater number as better. Note: if one of these strings is a substring of a metric key, the `greater` rule applies to that key as well; e.g., for `acc`: `top1_acc`, `top5_acc`, and `mean_acc` all follow the `greater` rule.
- `less_keys`: Similar to `greater_keys`, but for the `less` rule; e.g., for `loss`: `bce_loss` and `bmn_loss` both follow the `less` rule.
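To make the substring matching concrete, here is a standalone sketch of the inference logic described above (the key lists are illustrative, not the exact defaults):

```python
from math import inf

rule_map = {'greater': lambda x, y: x > y, 'less': lambda x, y: x < y}
init_value_map = {'greater': -inf, 'less': inf}
greater_keys = ['acc', 'top', 'AR@', 'mAP']   # illustrative values
less_keys = ['loss']

def infer_rule(key_indicator):
    """Infer the comparison rule for a metric name by substring match."""
    if any(key in key_indicator for key in greater_keys):
        return 'greater'
    if any(key in key_indicator for key in less_keys):
        return 'less'
    raise ValueError(f'Cannot infer the rule for {key_indicator!r}; '
                     'please set `rule` explicitly.')

assert infer_rule('top1_acc') == 'greater'  # matched by 'acc' (and 'top')
assert infer_rule('bce_loss') == 'less'     # matched by 'loss'
```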
before_run
This part initializes `runner.meta['hook_msgs']` if users choose to save the best checkpoint.
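Roughly, the initialization amounts to the following (an approximation, assuming `runner.meta` may still be `None` at this point):

```python
def before_run(self, runner):
    # Tracking the best checkpoint requires a place to store its info.
    if self.save_best is not None:
        if runner.meta is None:
            runner.meta = dict()
        # 'hook_msgs' later holds entries such as the best score so far.
        runner.meta.setdefault('hook_msgs', dict())
```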
before_train_xxx
This covers the `before_train_iter` and `before_train_epoch` parts. They mainly determine whether it is the right time to perform evaluation by examining:
- `by_epoch`
- whether the current epoch/iteration has reached `start`
They use `self.initial_flag` to indicate whether `EvalHook` has entered the normal evaluation loop; after entering the normal evaluation loop, `before_train_xxx` is skipped (a simplified sketch follows).
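A simplified sketch of the epoch-mode check, assuming `self.start` and `self.initial_flag` behave as described above (not the exact implementation):

```python
def before_train_epoch(self, runner):
    # Only relevant in epoch mode and before the normal loop starts.
    if not (self.by_epoch and self.initial_flag):
        return
    if self.start is not None and runner.epoch >= self.start:
        # Resuming at or past `start`: evaluate once before training goes on.
        self.after_train_epoch(runner)
    # From now on, before_train_epoch is effectively skipped.
    self.initial_flag = False
```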
after_train_xxx
This covers the `after_train_iter` and `after_train_epoch` parts. They mainly run inference with the model and perform evaluation, as well as save the best checkpoint if `save_best` is specified. In detail, they call `_do_evaluate()` and `_save_best()`:
- `_do_evaluate()`: runs inference with the model by calling `single_gpu_test()`, performs evaluation, and calls `_save_best()`.
- `_save_best()`: compares the score according to the rule, writes the info into `runner.meta['hook_msgs']`, and saves the best checkpoint in the work_dir.
For `DistEvalHook`, besides the variables and functions mentioned above, it runs inference by calling `multi_gpu_test()` and assigns `tmpdir` and `gpu_collect` for `multi_gpu_test()`.
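Condensing the above, a sketch of the single-GPU path; the provenance of `single_gpu_test` and the checkpoint-saving details are assumptions, not the PR's exact code:

```python
def _do_evaluate(self, runner):
    # Run inference over the validation dataloader.
    results = single_gpu_test(runner.model, self.dataloader)
    # Delegate metric computation to the dataset's evaluate() and get
    # the score of the key indicated by `save_best`.
    key_score = self.evaluate(runner, results)
    if self.save_best:
        self._save_best(runner, key_score)

def _save_best(self, runner, key_score):
    best_score = runner.meta['hook_msgs'].get(
        'best_score', self.init_value_map[self.rule])
    compare = self.rule_map[self.rule]  # e.g. lambda x, y: x > y
    if compare(key_score, best_score):
        runner.meta['hook_msgs']['best_score'] = key_score
        # Save the new best checkpoint into runner.work_dir.
        runner.save_checkpoint(runner.work_dir,
                               filename_tmpl='best_{}.pth')
```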
Usages
Users adopt `EvalHook` by adding `--validate` to the training command, which makes `EvalHook` be called during training.
Config
To use `EvalHook` with a specific setting, users need to modify the `evaluation` variable in config files; the key-value pairs in the dict correspond to the `EvalHook` API, as in the example below.
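For example, a plausible `evaluation` dict for a detection-style config (the metric names are placeholders for whatever the downstream dataset's `evaluate()` accepts):

```python
# In the config file: the keys map onto the EvalHook arguments, and
# `metric` is passed through **eval_kwargs to dataset.evaluate().
evaluation = dict(
    interval=1,            # evaluate every epoch (by_epoch=True default)
    save_best='bbox_mAP',  # track the checkpoint with the highest bbox_mAP
    rule='greater',
    metric=['bbox', 'segm'])
```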
Migration
Since the keys that determine `greater` and `less` are related to the downstream task, downstream repos may need to overwrite `self.greater_keys` and `self.less_keys`, for example:
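A hedged sketch of such an override in a downstream repo (the key lists are illustrative, not the actual values of any particular repo):

```python
from mmcv.runner import EvalHook as BaseEvalHook

class EvalHook(BaseEvalHook):
    """Downstream EvalHook with task-specific comparison keys."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Metrics where a larger value is better for this task.
        self.greater_keys = ['acc', 'top', 'mIoU', 'mAP']
        # Metrics where a smaller value is better.
        self.less_keys = ['loss']
```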