Add MobileNetv2 config for YOLOv3 #5510

Merged: 10 commits into open-mmlab:master on Jul 24, 2021

Conversation

ElectronicElephant
Contributor

Thanks for your contribution and we appreciate it a lot. The following instructions will help make your pull request healthier and get feedback more easily. If you do not understand some items, don't worry; just make the pull request and seek help from the maintainers.

Motivation

In #5450, many people expressed interest in a YOLOv3 based on a MobileNetV2 backbone, so I created such a config file a few days ago and trained the model on 4 A100 cards.

It's just a config file. No harm done. :-)

Pretrained models and log files will be added later.
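
For context, a minimal sketch of what such a config could look like, inheriting the existing Darknet-53 YOLOv3 config and swapping in the MobileNetV2 backbone (the base file name and channel numbers below are illustrative assumptions, not the exact contents of this PR):

_base_ = './yolov3_d53_mstrain-608_273e_coco.py'  # assumed base config
model = dict(
    backbone=dict(
        _delete_=True,  # discard the inherited Darknet-53 settings entirely
        type='MobileNetV2',
        out_indices=(2, 4, 6)),  # feature maps at strides 8/16/32
    neck=dict(in_channels=[320, 96, 32]))  # channels of those stages, deepest first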

@ElectronicElephant
Contributor Author

ElectronicElephant commented Jul 2, 2021

@codecov

codecov bot commented Jul 2, 2021

Codecov Report

Merging #5510 (eda096d) into master (475c6be) will increase coverage by 0.34%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##           master    #5510      +/-   ##
==========================================
+ Coverage   66.02%   66.36%   +0.34%     
==========================================
  Files         279      279              
  Lines       21889    21962      +73     
  Branches     3629     3650      +21     
==========================================
+ Hits        14452    14576     +124     
+ Misses       6675     6635      -40     
+ Partials      762      751      -11     
Flag      | Coverage Δ
unittests | 66.34% <ø> (+0.35%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files | Coverage Δ
mmdet/models/detectors/detr.py | 45.45% <0.00%> (-11.69%) ⬇️
mmdet/models/utils/normed_predictor.py | 83.33% <0.00%> (-9.53%) ⬇️
mmdet/datasets/pipelines/instaboost.py | 15.00% <0.00%> (ø)
mmdet/models/roi_heads/sparse_roi_head.py | 94.23% <0.00%> (+0.23%) ⬆️
mmdet/models/roi_heads/mask_heads/fcn_mask_head.py | 70.34% <0.00%> (+0.58%) ⬆️
mmdet/models/roi_heads/scnet_roi_head.py | 86.03% <0.00%> (+1.82%) ⬆️
mmdet/models/detectors/two_stage.py | 70.58% <0.00%> (+2.35%) ⬆️
mmdet/models/roi_heads/htc_roi_head.py | 71.61% <0.00%> (+2.49%) ⬆️
mmdet/models/roi_heads/standard_roi_head.py | 65.21% <0.00%> (+3.10%) ⬆️
mmdet/models/roi_heads/grid_roi_head.py | 70.65% <0.00%> (+4.34%) ⬆️
... and 3 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 475c6be...eda096d.

@ZwwWayne
Collaborator

ZwwWayne commented Jul 3, 2021

Hi @ElectronicElephant ,
Thanks for your kind contribution! Would you also provide the JSON log?
And would you also update the README.md for YOLOv3 and its corresponding metafile?

@ZwwWayne
Collaborator

ZwwWayne commented Jul 3, 2021

> 20210627_170852.log
> weights: https://drive.google.com/drive/folders/11WLBnaof-8ita0B2eSI_Wb7xbYVkxNOT?usp=sharing

I have requested access to the Drive folder. I will upload the weights to our server.

@ElectronicElephant
Contributor Author

> 20210627_170852.log
> weights: https://drive.google.com/drive/folders/11WLBnaof-8ita0B2eSI_Wb7xbYVkxNOT?usp=sharing
>
> I have requested access to the Drive folder. I will upload the weights to our server.

My bad, I have granted the permission.

@ZwwWayne ZwwWayne mentioned this pull request Jul 5, 2021
@ElectronicElephant
Contributor Author

ElectronicElephant commented Jul 6, 2021

Hi @ZwwWayne , I have updated the README file and fixed the lint issue. However, I do not have V100s, so I left the memory and inference time blank.

Also, IMHO, the batch-size setting may make a difference: 8 GPUs × 8 images vs. 4 GPUs × 16 images.

A personal question: are there "standard" ImageNet-pretrained weights for MobileNet?

@RangiLyu
Member

RangiLyu commented Jul 7, 2021

> A personal question: are there "standard" ImageNet-pretrained weights for MobileNet?

MobileNetV2 pretrained weights have been uploaded. Waiting for open-mmlab/mmcv#1177 to be merged.
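
With those weights in MMCV, the backbone could load them roughly like this (a sketch; the exact checkpoint alias is an assumption based on the open-mmlab model-zoo naming):

model = dict(
    backbone=dict(
        type='MobileNetV2',
        init_cfg=dict(
            type='Pretrained',  # mmcv-style weight initialization from a checkpoint
            checkpoint='open-mmlab://mmdet/mobilenet_v2')))  # assumed model-zoo alias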

@ZwwWayne ZwwWayne requested a review from RangiLyu July 9, 2021 01:15
@RangiLyu
Member

RangiLyu commented Jul 9, 2021

Hi, many thanks for your contribution! MobileNetV2 YOLOv3 is a very popular model for mobile devices, and we have decided to support it in the next version.
I've read your config and training log and have some advice:

  1. 608x608 may be too large for a MobileNet YOLO: the FLOPs are high and will slow down inference on edge devices. 320x320 or 416x416 would be better.
  2. The channel widths of the neck and head are unbalanced across levels, which may degrade performance.
  3. RepeatDataset can be used to speed up training and to avoid evaluating and saving checkpoints too frequently (see the sketch after this list).
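
On point 3, a minimal sketch of wrapping the training set in RepeatDataset in an mmdet config (the dataset paths, batch size, and repeat factor are illustrative assumptions):

data = dict(
    samples_per_gpu=24,
    workers_per_gpu=4,
    train=dict(
        type='RepeatDataset',
        times=10,  # one logged "epoch" now covers the dataset 10 times,
                   # so evaluation/checkpoint hooks fire 10x less often
        dataset=dict(
            type='CocoDataset',
            ann_file='data/coco/annotations/instances_train2017.json',
            img_prefix='data/coco/train2017/',
            pipeline=train_pipeline)))  # train_pipeline defined elsewhere in the config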

I modified your config and trained on 416x416 input size:

model = dict(
    type='YOLOV3',
    backbone=dict(
        type='MobileNetV2',
        out_indices=(2, 4, 6),  # stage outputs at strides 8/16/32
        act_cfg=dict(type='LeakyReLU', negative_slope=0.1)),
    neck=dict(
        type='YOLOV3Neck',
        num_scales=3,
        in_channels=[320, 96, 32],   # channels of the selected stages, deepest first
        out_channels=[96, 96, 96]),  # a uniform width across all levels
    bbox_head=dict(
        type='YOLOV3Head',
        num_classes=80,
        in_channels=[96, 96, 96],
        out_channels=[96, 96, 96]))
# ...

The FLOPs and Params are lower and mAP is higher:

Input shape: (3, 416, 416)
Flops: 2.86 GFLOPs
Params: 3.74 M
mAP: 23.9
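
For reference, a sketch of how such FLOPs/Params numbers can be reproduced with mmcv's complexity tool (the config path below is an assumption):

from mmcv import Config
from mmcv.cnn import get_model_complexity_info
from mmdet.models import build_detector

cfg = Config.fromfile('configs/yolo/yolov3_mobilenetv2_416.py')  # hypothetical path
model = build_detector(cfg.model)
model.forward = model.forward_dummy  # complexity analysis needs a plain tensor forward
flops, params = get_model_complexity_info(model, (3, 416, 416))
print(flops, params)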

I think there is still room for improvement.
Would you allow us to continue working on this PR? We will upload a well-tuned baseline soon.

@ElectronicElephant
Contributor Author

Hi @RangiLyu ,

I'm glad to hear that. Please go ahead.

@RangiLyu
Member

Updates:

I added MobileNetV2 YOLOv3 at 416 and 320 input sizes:

Backbone    | Scale | Lr schd | Mem (GB) | Inf time (fps) | box AP
MobileNetV2 | 416   | 300e    | 5.3      | -              | 23.9
MobileNetV2 | 320   | 300e    | 3.2      | -              | 22.2

I'll test inference time later.

@ZwwWayne ZwwWayne merged commit 31b3a58 into open-mmlab:master Jul 24, 2021
@wjw6692353

YOLOv3-MobileNetV2 can be converted to ONNX, but the result is wrong.

2021-12-28 10:12:17.186626216 [W:onnxruntime:, graph.cc:1237 Graph] Initializer 1824 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
Traceback (most recent call last):
  File "tools/deployment/pytorch2onnx.py", line 345, in <module>
    skip_postprocess=args.skip_postprocess)
  File "tools/deployment/pytorch2onnx.py", line 206, in pytorch2onnx
    o_res, p_res, rtol=1e-03, atol=1e-05, err_msg=err_msg)
  File "/home/wjw/.conda/envs/mmdetection/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 1531, in assert_allclose
    verbose=verbose, header=header, equal_nan=equal_nan)
  File "/home/wjw/.conda/envs/mmdetection/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 763, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=0.001, atol=1e-05
The numerical values are different between Pytorch and ONNX, but it does not necessarily mean the exported ONNX model is problematic.
(shapes (100, 5), (3, 5) mismatch)
 x: array([[ 3.109553e+01,  3.694163e+02,  4.543646e+01,  3.847349e+02,
          2.622679e-01],
        [ 3.687222e+02,  2.582237e+02,  4.469355e+02,  4.047763e+02,...
 y: array([[3.109553e+01, 3.694163e+02, 4.543646e+01, 3.847349e+02,
         2.622660e-01],
        [3.687222e+02, 2.582236e+02, 4.469356e+02, 4.047764e+02,...

The correct result is:
result [array([[1.8301238e+03, 6.4244287e+02, 2.4239041e+03, 1.2643510e+03,
9.3440014e-01],
[1.1269495e+03, 9.3734717e+02, 1.5567422e+03, 1.4623030e+03,
1.1949890e-02]], dtype=float32)] (2, 5)

My conversion command:

python tools/deployment/pytorch2onnx.py \
    hand_detect_dir/second_train/wjw_2_yolov3_mobilenetv2_320_300e_coco.py \
    hand_detect_dir/second_train/epoch_30.pth \
    --output-file my_hand_new.onnx \
    --input-img demo/hand1.jpg \
    --test-img tests/data/color.jpg \
    --shape 320 320 \
    --verify \
    --dynamic-export \
    --cfg-options model.test_cfg.deploy_nms_pre=-1

  • Linux 20.04
  • Python 3.7.11
  • PyTorch 1.8.0
  • CUDA 10.2
  • GCC 9.3.0
  • mmcv 1.4.1
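
To rule out a runtime-side issue, a minimal sketch of running the exported model directly with onnxruntime; the (dets, labels) output layout is an assumption based on how mmdet detectors are typically exported:

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession('my_hand_new.onnx')
input_name = sess.get_inputs()[0].name
# Dummy input; real preprocessing must match the mmdet test pipeline
# (resize to 320x320, normalize with the config's img_norm_cfg).
img = np.random.rand(1, 3, 320, 320).astype(np.float32)
dets, labels = sess.run(None, {input_name: img})
print(dets.shape, labels.shape)  # assumed shapes: (1, N, 5) and (1, N)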

@ElectronicElephant
Contributor Author

> YOLOv3-MobileNetV2 can be converted to ONNX, but the result is wrong. […]

Hi, thanks for your report. I do not have time to look into it at the moment, but I would suggest opening a new issue anyway.
