add benchmark regression test script with tmux #849
Codecov Report
@@ Coverage Diff @@
## master #849 +/- ##
==========================================
+ Coverage 83.59% 83.64% +0.05%
==========================================
Files 176 178 +2
Lines 14145 14195 +50
Branches 2364 2367 +3
==========================================
+ Hits 11824 11874 +50
- Misses 1713 1714 +1
+ Partials 608 607 -1
modify the config and rename the filename
modify the script and rename the filename
using mmcv.load to avoid introducing the extra dependency on yaml
…st_benchmark_script
eval: mAP # evaluation metric, which depends on the dataset, e.g., "mAP" for MSCOCO
fuse-conv-bn:
gpu_collect:
P0: # the priority of the models, P0: core, P1: important, P2: less important, P3: least important
A few suggestions about the config file:
- Rename this file "benchmark_regression_cfg_tmpl.yaml" or something, which serves as a template to show the full content that a config file could include. We will add a more compact config file with a full model list and only necessary arguments.
- Use `test` instead of `infer` as the mode name. `gpus_per_node` can be set to 8 for all models and modes.
Got these suggestions.
* Fix import and deprecation issues in unit tests (#871) * fix some bugs in the unit test of smpl model. * reorganize `tests/` to solve importing issue (PEP 420) * fix deprecation warnings in unit tests Co-authored-by: ly015 <[email protected]> * add benchmark regression test script with tmux (#849) * test the simple case using tmux to run multiple benchmark regression test tasks * modify and rename the config file and script * Delete config_list.yaml * modify the config and rename the filename * Delete test_benchmark_tmux.py * modify the script and rename the filename * Update setup.cfg * using mmcv.load to avoid introducing the extra dependency on yaml * fix some typo * refactor the config file and modify the script accordingly * modify the config and script * rename the config file * Correct dataset preparation guide of WFLW (#873) * add pr template (#875) * add CITATION.cff and update setup.py (#876) * Add copyright header and pre-commit hook (#872) * Add pre-commit hook to automatically add copyright file header * update files with copyright header * Limit copyright checking in the first 2 lines of a file * Exclude configs in demo/ * set max-header-lines as 5 * rebase to master and add copyright to new files * move benchmark_regression into .dev_scripts/benchmark * Translate tasks/2d_body_keypoint.md (#842) * 2rd PR remove poseval * fix lint * revise the CN version Co-authored-by: ly015 <[email protected]> * fix some bugs in the unit test of smpl model. 
* * reorganiz `tests/` to solve importing issue (PEP 420) * add dataset info * fix lint * * fix wrongly modified parts in previous rebase * fix lint * rename datasets/_base_ as datasets/base * resolve compatibility of pose_limb_color * Add dummy dataset base classes with old names for compatibility * * Rewrite relative unittest based on dataset_info * Add bc-breaking test for functions related to dataset_info * Rename DatasetInfo.dataset_info as DatasetInfo._dataset_info * Fix dataset_info of h36m dataset * Handle breaking change pose_limb_color -> pose_link_color * add unittest for old-fashioned dataset initialization without dataset_info * resolve naming conflict in unittests Co-authored-by: zengwang430521 <[email protected]> Co-authored-by: ly015 <[email protected]>
* add dataset info (#663) * Fix import and deprecation issues in unit tests (#871) * fix some bugs in the unit test of smpl model. * reorganize `tests/` to solve importing issue (PEP 420) * fix deprecation warnings in unit tests Co-authored-by: ly015 <[email protected]> * add benchmark regression test script with tmux (#849) * test the simple case using tmux to run multiple benchmark regression test tasks * modify and rename the config file and script * Delete config_list.yaml * modify the config and rename the filename * Delete test_benchmark_tmux.py * modify the script and rename the filename * Update setup.cfg * using mmcv.load to avoid introducing the extra dependency on yaml * fix some typo * refactor the config file and modify the script accordingly * modify the config and script * rename the config file * Correct dataset preparation guide of WFLW (#873) * add pr template (#875) * add CITATION.cff and update setup.py (#876) * Add copyright header and pre-commit hook (#872) * Add pre-commit hook to automatically add copyright file header * update files with copyright header * Limit copyright checking in the first 2 lines of a file * Exclude configs in demo/ * set max-header-lines as 5 * rebase to master and add copyright to new files * move benchmark_regression into .dev_scripts/benchmark * Translate tasks/2d_body_keypoint.md (#842) * 2rd PR remove poseval * fix lint * revise the CN version Co-authored-by: ly015 <[email protected]> * fix some bugs in the unit test of smpl model. 
* * reorganiz `tests/` to solve importing issue (PEP 420) * add dataset info * fix lint * * fix wrongly modified parts in previous rebase * fix lint * rename datasets/_base_ as datasets/base * resolve compatibility of pose_limb_color * Add dummy dataset base classes with old names for compatibility * * Rewrite relative unittest based on dataset_info * Add bc-breaking test for functions related to dataset_info * Rename DatasetInfo.dataset_info as DatasetInfo._dataset_info * Fix dataset_info of h36m dataset * Handle breaking change pose_limb_color -> pose_link_color * add unittest for old-fashioned dataset initialization without dataset_info * resolve naming conflict in unittests Co-authored-by: zengwang430521 <[email protected]> Co-authored-by: ly015 <[email protected]> * fix typo * fix typo Co-authored-by: Jas <[email protected]> Co-authored-by: zengwang430521 <[email protected]>
* test the simple case using tmux to run multiple benchmark regression test tasks * modify and rename the config file and script * Delete config_list.yaml * modify the config and rename the filename * Delete test_benchmark_tmux.py * modify the script and rename the filename * Update setup.cfg * using mmcv.load to avoid introducing the extra dependency on yaml * fix some typo * refactor the config file and modify the script accordingly * modify the config and script * rename the config file
* add dataset info (open-mmlab#663) * Fix import and deprecation issues in unit tests (open-mmlab#871) * fix some bugs in the unit test of smpl model. * reorganize `tests/` to solve importing issue (PEP 420) * fix deprecation warnings in unit tests Co-authored-by: ly015 <[email protected]> * add benchmark regression test script with tmux (open-mmlab#849) * test the simple case using tmux to run multiple benchmark regression test tasks * modify and rename the config file and script * Delete config_list.yaml * modify the config and rename the filename * Delete test_benchmark_tmux.py * modify the script and rename the filename * Update setup.cfg * using mmcv.load to avoid introducing the extra dependency on yaml * fix some typo * refactor the config file and modify the script accordingly * modify the config and script * rename the config file * Correct dataset preparation guide of WFLW (open-mmlab#873) * add pr template (open-mmlab#875) * add CITATION.cff and update setup.py (open-mmlab#876) * Add copyright header and pre-commit hook (open-mmlab#872) * Add pre-commit hook to automatically add copyright file header * update files with copyright header * Limit copyright checking in the first 2 lines of a file * Exclude configs in demo/ * set max-header-lines as 5 * rebase to master and add copyright to new files * move benchmark_regression into .dev_scripts/benchmark * Translate tasks/2d_body_keypoint.md (open-mmlab#842) * 2rd PR remove poseval * fix lint * revise the CN version Co-authored-by: ly015 <[email protected]> * fix some bugs in the unit test of smpl model. 
* * reorganiz `tests/` to solve importing issue (PEP 420) * add dataset info * fix lint * * fix wrongly modified parts in previous rebase * fix lint * rename datasets/_base_ as datasets/base * resolve compatibility of pose_limb_color * Add dummy dataset base classes with old names for compatibility * * Rewrite relative unittest based on dataset_info * Add bc-breaking test for functions related to dataset_info * Rename DatasetInfo.dataset_info as DatasetInfo._dataset_info * Fix dataset_info of h36m dataset * Handle breaking change pose_limb_color -> pose_link_color * add unittest for old-fashioned dataset initialization without dataset_info * resolve naming conflict in unittests Co-authored-by: zengwang430521 <[email protected]> Co-authored-by: ly015 <[email protected]> * fix typo * fix typo Co-authored-by: Jas <[email protected]> Co-authored-by: zengwang430521 <[email protected]>
…lab#849) * [Enhance] Ensure metrics is not empty when saving best ckpts * fix warn to warning * delete a unnecessary method
Motivation
When we release a new version of the codebase, monthly or quarterly, we would like to run benchmark regression tests on the previously released models and algorithms, with support for different priority levels.
The basic feature is to read a config file containing a model list and runtime parameters, then automatically run multiple tasks in separate panes and windows controlled by tmux.
The priority of the models is as follows. P0: core, P1: important, P2: less important, P3: least important. You can assign a priority to each model and set the priority level separately for inference and training tasks.
The script runs multiple benchmark regression tasks without the need to start many terminals manually, and it avoids the inconvenience caused by network interruptions, which are quite common when running tasks on remote servers.
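The priority scheme above can be sketched in a few lines of Python. This is only an illustration, not the code in `benchmark_regression.py`: the function name `select_models`, the placeholder model names, and the rule that a task runs when its priority number is at most the chosen cap are all assumptions.

```python
# Hypothetical sketch: pick the models to run for a given priority cap.
# Assumption: a smaller number means a more important model, and every
# level whose number is <= max_priority is included.

def select_models(model_list, max_priority):
    """Return (level, model) pairs for all levels P0..P<max_priority>."""
    selected = []
    # Sort numerically so that 'P10' would come after 'P2'.
    for level in sorted(model_list, key=lambda name: int(name[1:])):
        if int(level[1:]) <= max_priority:
            selected.extend((level, model) for model in model_list[level])
    return selected


# Placeholder model names, not taken from the real config file.
model_list = {
    "P0": ["model_a", "model_b"],
    "P1": ["model_c"],
    "P2": ["model_d"],
}

# Separate caps for test and train tasks, as described above.
test_tasks = select_models(model_list, max_priority=1)
train_tasks = select_models(model_list, max_priority=0)
print(test_tasks)
print(train_tasks)
```

Keeping the cap separate for test and train tasks means the cheaper inference checks can cover more models than the training runs.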
Modification
We added the folder `.dev_scripts` containing two files: `benchmark_regression_cfg_tmpl.yaml` and `benchmark_regression.py`. Besides, in order to specify the `work-dir` of the inference task, we added an additional argument `--work-dir` to the script `$mmpose/tools/test.py` and modified the code accordingly.
Arguments
The script is based on `$mmpose/tools/slurm_test.sh` and `$mmpose/tools/slurm_train.sh`. It supports running test and train tasks with custom priorities and runtime settings, which can be specified in the config file.
To run the script, a config file listing multiple models is required, for example `benchmark_regression_cfg_tmpl.yaml` under the directory `$mmpose/.dev_scripts`. This file serves as a template showing the available fields. It has a `model_list` field that contains the priority levels, and each priority level contains multiple models.
Specifically, the config file must indicate the model priorities and, for each model, the path to its config file and to the corresponding checkpoint file. For example,
The priority fields, such as `P0` and `P1`, are added so that you can assign different priorities to different models. You can add more models as needed under the corresponding priority field.
For a more detailed description of the arguments, please refer to the script `$mmpose/.dev_scripts/benchmark_regression.py`.
Usage
Here is a simple example to run the script.
Note that `${TEST_PRIORITY}` and `${TRAIN_PRIORITY}` specify the largest priority number for test and train tasks, respectively.
Running the above script with default parameters starts a new tmux session in which each pane runs one task independently. Enjoy it!
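To make the pane-per-task behaviour concrete, here is a minimal sketch of how such a session could be assembled from standard tmux commands (`new-session`, `new-window`, `split-window`, `select-layout`, `send-keys`). It only builds the command strings and does not run them. The function name, the window naming scheme `w0`, `w1`, ..., the session name, and the `panes_per_window` parameter are assumptions, not the script's real interface.

```python
import shlex


def build_tmux_commands(session, tasks, panes_per_window=4):
    """Build tmux command lines that give each task its own pane.

    Assumes `session` is a plain name without spaces or shell
    metacharacters. Nothing is executed here.
    """
    commands = []
    for i, task in enumerate(tasks):
        window = f"w{i // panes_per_window}"
        target = f"{session}:{window}"
        if i == 0:
            # Detached session whose first window is named w0.
            commands.append(f"tmux new-session -d -s {session} -n {window}")
        elif i % panes_per_window == 0:
            # Current window is full: open the next named window.
            commands.append(f"tmux new-window -t {session}: -n {window}")
        else:
            # Split the window, then re-tile so panes stay usable.
            commands.append(f"tmux split-window -t {target}")
            commands.append(f"tmux select-layout -t {target} tiled")
        # The newest pane is the active one in its window, so targeting
        # the window sends the keys to that pane. C-m presses Enter.
        commands.append(f"tmux send-keys -t {target} {shlex.quote(task)} C-m")
    return commands


commands = build_tmux_commands(
    "bench", ["echo a", "echo b", "echo c"], panes_per_window=2
)
for line in commands:
    print(line)
```

Because the session is created detached, the tasks keep running on the server if the SSH connection drops, and `tmux attach -t bench` reconnects to them later.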