Lightweight Super-Resolution Head for Human Pose Estimation
Accepted by ACM MM 2023
Haonan Wang, Jie Liu, Jie Tang, Gangshan Wu
- [2023.08.03] The pretrained models are released on Google Drive!
- [2023.07.30] The code for SRPose is released!
- [2023.07.29] Our paper "Lightweight Super-Resolution Head for Human Pose Estimation" has been accepted by ACM MM 2023. If you find this repository useful, please give it a star 🌟.
This is the official implementation of Lightweight Super-Resolution Head for Human Pose Estimation. We present a lightweight super-resolution (SR) head, which uses super-resolution to predict heatmaps at a spatial resolution higher than that of the input feature maps (or even matching the input image). This reduces the quantization error and the dependence on further post-processing. We also propose SRPose, which gradually recovers high-resolution (HR) heatmaps from low-resolution (LR) heatmaps and degraded features in a coarse-to-fine manner. To make the HR heatmaps easier to train, SRPose applies SR heads to supervise the intermediate features in each stage. The SR head is lightweight and generic, and it applies to both top-down and bottom-up methods.
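The quantization error mentioned above can be illustrated with a small NumPy experiment. This is only a sketch and not code from this repository: the function names, the Gaussian width, and the 1/4-resolution baseline (a 64x48 heatmap for a 256x192 input) are assumptions made for illustration. It renders an ideal heatmap for a random keypoint, decodes it with a plain argmax, and compares the average error in input-image pixels at two heatmap resolutions.

```python
import numpy as np

def render_heatmap(x, y, width, height, sigma):
    """Render a 2D Gaussian heatmap centred at continuous coordinates (x, y)."""
    xs = np.arange(width)[None, :]
    ys = np.arange(height)[:, None]
    return np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))

def decode_argmax(heatmap, stride):
    """Take the argmax pixel and map it back to input-image coordinates."""
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return col * stride, row * stride

def mean_decoding_error(stride, img_w=192, img_h=256, n=200, seed=0):
    """Average argmax decoding error (input pixels) for a given heatmap stride."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(n):
        # Ground-truth keypoint in input-image pixels, kept away from the border.
        gx = rng.uniform(8, img_w - 8)
        gy = rng.uniform(8, img_h - 8)
        hm = render_heatmap(gx / stride, gy / stride,
                            img_w // stride, img_h // stride, sigma=2.0)
        px, py = decode_argmax(hm, stride)
        errors.append(np.hypot(px - gx, py - gy))
    return float(np.mean(errors))

err_quarter = mean_decoding_error(stride=4)  # assumed 1/4-resolution heatmap (64x48)
err_full = mean_decoding_error(stride=1)     # heatmap at input resolution (256x192)
print(f"stride 4: {err_quarter:.2f} px, stride 1: {err_full:.2f} px")
```

Even with a perfect heatmap, argmax decoding can only return grid positions, so the worst-case error grows in proportion to the stride. Predicting the heatmap at a higher resolution shrinks this error without any extra refinement step.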
Backbone | Scheme | GFLOPs | Params (Backbone) | Params (Other) | AP (w/ Post.) | AR (w/ Post.) | AP (w/o Post.) | AR (w/o Post.) |
---|---|---|---|---|---|---|---|---|
Top-down methods | | | | | | | | |
Resnet-50 | Simple head | 5.46 | 23.51M | 10.49M | 71.7 | 77.3 | 69.8 | 75.8 |
Resnet-50 | SR head (ours) | 5.77 | 23.51M | 10.59M | 72.4 | 77.9 | 72.2 | 77.7 |
Resnet-50 | SRPose (ours) | 4.61 | 23.51M | 1.29M | 73.3 | 78.8 | 73.1 | 78.6 |
HRNet-W32 | Simple head | 7.70 | 28.54M | 0.00M | 74.5 | 79.9 | 72.3 | 78.2 |
HRNet-W32 | SR head (ours) | 7.98 | 28.54M | 0.09M | 75.6 | 80.6 | 75.4 | 80.5 |
HRNet-W32 | SRPose (ours) | 8.28 | 29.30M | 0.65M | 75.9 | 81.0 | 75.7 | 80.9 |
TransPose-R-A4 | Simple head | 8.91 | 4.93M | 1.06M | 71.8 | 77.3 | 69.7 | 75.5 |
TransPose-R-A4 | SR head (ours) | 9.23 | 4.93M | 1.16M | 73.2 | 78.4 | 73.1 | 78.3 |
TransPose-R-A4 | SRPose (ours) | 6.26 | 4.93M | 0.55M | 73.5 | 78.9 | 73.4 | 78.7 |
HRFormer-S | Simple head | 2.82 | 7.89M | 0.00M | 74.0 | 79.2 | 72.1 | 77.6 |
HRFormer-S | SR head (ours) | 3.09 | 7.89M | 0.09M | 75.0 | 80.1 | 74.8 | 80.0 |
HRFormer-S | SRPose (ours) | 3.34 | 8.21M | 0.65M | 75.6 | 80.7 | 75.5 | 80.6 |
Bottom-up methods | | | | | | | | |
Resnet-50 | Simple head | 29.20 | 23.51M | 10.49M | 46.7 | 55.1 | - | - |
Resnet-50 | SR head (ours) | 30.86 | 23.51M | 10.60M | 48.4 | 56.6 | - | - |
HRNet-W32 | Simple head | 41.10 | 28.54M | 0.00M | 65.3 | 70.9 | - | - |
HRNet-W32 | SR head (ours) | 42.57 | 28.54M | 0.09M | 67.1 | 71.7 | - | - |
- The input resolution is 256x192 for top-down methods and 512x512 for bottom-up methods.
- Flip test is used.
- For top-down methods, the person detector has a person AP of 56.4 on the COCO val2017 dataset.
- Post. = extra post-processing (empirical shift) that refines the predicted keypoint coordinates.
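For readers unfamiliar with the "empirical shift" mentioned in the notes above, the sketch below shows the form it commonly takes in heatmap-based pose codebases: after taking the argmax, the coordinate is moved a quarter of a pixel toward the higher of its two neighbours along each axis. The function name and the 0.25 offset are assumptions based on that common practice, and this is not necessarily the exact routine used to produce the "w/ Post." columns.

```python
import numpy as np

def decode_with_empirical_shift(heatmap):
    """Argmax decoding plus a quarter-pixel shift toward the higher neighbour."""
    height, width = heatmap.shape
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    x, y = float(col), float(row)
    # The shift needs both neighbours, so skip it on the heatmap border.
    if 0 < col < width - 1 and 0 < row < height - 1:
        dx = heatmap[row, col + 1] - heatmap[row, col - 1]
        dy = heatmap[row + 1, col] - heatmap[row - 1, col]
        x += 0.25 * np.sign(dx)
        y += 0.25 * np.sign(dy)
    return float(x), float(y)

# Toy heatmap: a Gaussian whose true peak (10.3, 20.7) lies between pixels.
xs = np.arange(48)[None, :]
ys = np.arange(64)[:, None]
toy = np.exp(-((xs - 10.3) ** 2 + (ys - 20.7) ** 2) / (2 * 2.0 ** 2))
x_shifted, y_shifted = decode_with_empirical_shift(toy)
print(x_shifted, y_shifted)
```

On this toy heatmap the plain argmax is (10, 21), and the shift moves the estimate to (10.25, 20.75), which is closer to the true peak. The "w/o Post." columns correspond to skipping a refinement of this kind.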
Method | Backbone | PCKh@0.5 |
---|---|---|
SimBa | Resnet-50 | 88.2 |
HRNet | HRNet-W32 | 90.1 |
SimCC | HRNet-W32 | 90.0 |
SRPose (ours) | Resnet-50 | 89.1 |
SRPose (ours) | HRNet-W32 | 90.5 |
- Flip test is used.
Method | Backbone | AP | AP_E | AP_M | AP_H |
---|---|---|---|---|---|
SimBa | Resnet-50 | 63.7 | 73.9 | 65.0 | 50.6 |
HRNet | HRNet-W32 | 66.4 | 74.0 | 67.4 | 55.6 |
SimCC | HRNet-W32 | 66.7 | 74.1 | 67.8 | 56.2 |
SRPose (ours) | Resnet-50 | 64.7 | 74.9 | 65.8 | 52.3 |
SRPose (ours) | HRNet-W32 | 67.8 | 77.5 | 69.1 | 55.6 |
- Flip test is used.
Please refer to THIS to prepare the environment step by step.
Pretrained models are provided in our model zoo.
# for single machine
bash tools/dist_train.sh <Config PATH> <NUM GPUs> --cfg-options model.pretrained=<Pretrained PATH> --seed 0
# for multiple machines
python -m torch.distributed.launch --nnodes <Num Machines> --node_rank <Rank of Machine> --nproc_per_node <GPUs Per Machine> --master_addr <Master Addr> --master_port <Master Port> tools/train.py <Config PATH> --cfg-options model.pretrained=<Pretrained PATH> --launcher pytorch --seed 0
To test the performance of the pretrained models, run:
bash tools/dist_test.sh <Config PATH> <Checkpoint PATH> <NUM GPUs>
We acknowledge the excellent implementation from mmpose, HRNet and HRFormer.
If you use our code or models in your research, please cite with:
@article{wang2023lightweight,
title={Lightweight Super-Resolution Head for Human Pose Estimation},
author={Wang, Haonan and Liu, Jie and Tang, Jie and Wu, Gangshan},
journal={arXiv preprint arXiv:2307.16765},
year={2023}
}