From fc514fdc1418c5d404c202f0c3dba557330ddc62 Mon Sep 17 00:00:00 2001
From: Tai-Wang
Date: Mon, 1 Nov 2021 11:11:00 +0800
Subject: [PATCH] [Feature] PGD Benchmark on KITTI (#1014)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* [Refactor] Main code modification for coordinate system refactor (#677)
* [Enhance] Add script for data update (#774)
* Fixed wrong config paths and fixed a bug in test
* Fixed metafile
* Coord sys refactor (main code)
* Update test_waymo_dataset.py
* Manually resolve conflict
* Removed unused lines and fixed imports
* remove coord2box and box2coord
* update dir_limit_offset
* Some minor improvements
* Removed some \s in comments
* Revert a change
* Change Box3DMode to Coord3DMode where points are converted
* Fix points_in_bbox function
* Fix Imvoxelnet config
* Revert adding a line
* Fix rotation bug when batch size is 0
* Keep sign of dir_scores as before
* Fix several comments
* Add a comment
* Fix docstring
* Add data update scripts
* Fix comments
* fix import (#839)
* [Enhance] refactor iou_neg_piecewise_sampler.py (#842)
* [Refactor] Main code modification for coordinate system refactor (#677)
* [Enhance] Add script for data update (#774)
* Fixed wrong config paths and fixed a bug in test
* Fixed metafile
* Coord sys refactor (main code)
* Update test_waymo_dataset.py
* Manually resolve conflict
* Removed unused lines and fixed imports
* remove coord2box and box2coord
* update dir_limit_offset
* Some minor improvements
* Removed some \s in comments
* Revert a change
* Change Box3DMode to Coord3DMode where points are converted
* Fix points_in_bbox function
* Fix Imvoxelnet config
* Revert adding a line
* Fix rotation bug when batch size is 0
* Keep sign of dir_scores as before
* Fix several comments
* Add a comment
* Fix docstring
* Add data update scripts
* Fix comments
* fix import
* refactor iou_neg_piecewise_sampler.py
* add docstring
* modify docstring

Co-authored-by: Yezhen Cong <52420115+THU17cyz@users.noreply.github.com>
Co-authored-by: THU17cyz

* [Feature] Add roipooling cuda ops (#843)
* [Refactor] Main code modification for coordinate system refactor (#677)
* [Enhance] Add script for data update (#774)
* Fixed wrong config paths and fixed a bug in test
* Fixed metafile
* Coord sys refactor (main code)
* Update test_waymo_dataset.py
* Manually resolve conflict
* Removed unused lines and fixed imports
* remove coord2box and box2coord
* update dir_limit_offset
* Some minor improvements
* Removed some \s in comments
* Revert a change
* Change Box3DMode to Coord3DMode where points are converted
* Fix points_in_bbox function
* Fix Imvoxelnet config
* Revert adding a line
* Fix rotation bug when batch size is 0
* Keep sign of dir_scores as before
* Fix several comments
* Add a comment
* Fix docstring
* Add data update scripts
* Fix comments
* fix import
* add roipooling cuda ops
* add roi extractor
* add test_roi_extractor unittest
* Modify setup.py to install roipooling ops
* modify docstring
* remove enlarge bbox in roipoint pooling
* add_roipooling_ops
* modify docstring

Co-authored-by: Yezhen Cong <52420115+THU17cyz@users.noreply.github.com>
Co-authored-by: THU17cyz

* [Refactor] Refactor code structure and docstrings (#803)
* refactor points_in_boxes
* Merge same functions of three boxes
* More docstring fixes and unify x/y/z size
* Add "optional" and fix "Default"
* Add "optional" and fix "Default"
* Add "optional" and fix "Default"
* Add "optional" and fix "Default"
* Add "optional" and fix "Default"
* Remove None in function param type
* Fix unittest
* Add comments for NMS functions
* Merge methods of Points
* Add unittest
* Add optional and default value
* Fix box conversion and add unittest
* Fix comments
* Add unit test
* Indent
* Fix CI
* Remove useless \\
* Remove useless \\
* Remove useless \\
* Remove useless \\
* Remove useless \\
* Add unit test for box bev
* More unit tests and refine docstrings in box_np_ops
* Fix comment
* Add deprecation warning
* [Feature] PointXYZWHLRBBoxCoder (#856)
* support PointBasedBoxCoder
* fix unittest bug
* support unittest in gpu
* support unittest in gpu
* modified docstring
* add args
* add args
* [Enhance] Change Groupfree3D config (#855)
* All mods
* PointSample
* PointSample
* [Doc] Add tutorials/data_pipeline Chinese version (#827)
* [Doc] Add tutorials/data_pipeline Chinese version
* refine doc
* Use the absolute link
* Use the absolute link

Co-authored-by: Tai-Wang

* [Doc] Add Chinese doc for `scannet_det.md` (#836)
* Part
* Complete
* Fix comments
* Fix comments
* [Doc] Add Chinese doc for `waymo_det.md` (#859)
* Add complete translation
* Refinements
* Fix comments
* Fix a minor typo

Co-authored-by: Tai-Wang

* Remove 2D annotations on Lyft (#867)
* Add header for files (#869)
* Add header for files
* Add header for files
* Add header for files
* Add header for files
* [fix] fix typos (#872)
* Fix 3 unworking configs (#882)
* [Fix] Fix `index.rst` for Chinese docs (#873)
* Fix index.rst for zh docs
* Change switch language
* [Fix] Centerpoint head nested list transpose (#879)
* FIX Transpose nested lists without Numpy
* Removed unused Numpy import
* [Enhance] Update PointFusion (#791)
* update point fusion
* remove LIDAR hardcode
* move get_proj_mat_by_coord_type to utils
* fix lint
* remove todo
* fix lint
* [Doc] Add nuscenes_det.md Chinese version (#854)
* add nus chinese doc
* add nuScenes Chinese doc
* fix typo
* fix typo
* fix typo
* fix typo
* fix typo
* [Fix] Fix RegNet pretrained weight loading (#889)
* Fix regnet pretrained weight loading
* Remove unused file
* Fix centerpoint tta (#892)
* [Enhance] Add benchmark regression script (#808)
* Initial commit
* [Feature] Support DGCNN (v1.0.0.dev0) (#896)
* support dgcnn
* support dgcnn
* support dgcnn
* support dgcnn
* support dgcnn
* support dgcnn
* support dgcnn
* support dgcnn
* support dgcnn
* support dgcnn
* fix typo
* fix typo
* fix typo
* del gf&fa registry (wo reuse pointnet module)
* fix typo
* add benchmark and add copyright header (for DGCNN only)
* fix typo
* fix typo
* fix typo
* fix typo
* fix typo
* support dgcnn
* Change cam rot_3d_in_axis (#906)
* [Doc] Add coord sys tutorial pic and change links to dev branch (#912)
* Modify link branch and add pic
* Fix pic
* [Feature] add kitti AP40 evaluation metric (v1.0.0.dev0) (#927)
* Add citation (#901)
* [Feature] Add python3.9 in CI (#900)
* Add python3.0 in CI
* Add python3.0 in CI
* Bump to v0.17.0 (#898)
* Update README.md
* Update README_zh-CN.md
* Update version.py
* Update getting_started.md
* Update getting_started.md
* Update changelog.md
* Remove "recent" in the news
* Remove "recent" in the news
* Fix comments
* [Docs] Fix the version of sphinx (#902)
* Fix sphinx version
* Fix sphinx version
* Fix sphinx version
* Fix sphinx version
* Fix sphinx version
* Fix sphinx version
* Fix sphinx version
* Fix sphinx version
* Fix sphinx version
* Fix sphinx version
* Fix sphinx version
* Fix sphinx version
* add AP40
* add unitest
* add unitest
* seperate AP11 and AP40
* fix some typos

Co-authored-by: dingchang
Co-authored-by: Tai-Wang

* [Feature] add smoke backbone neck (#939)
* add smoke detecotor and it's backbone and neck
* typo fix
* fix typo
* add docstring
* fix typo
* fix comments
* fix comments
* fix comments
* fix typo
* fix typo
* fix
* fix typo
* fix docstring
* refine feature
* fix typo
* use Basemodule in Neck
* [Refactor] Refactor the transformation from image to camera coordinates (#938)
* Refactor points_img2cam
* Refine docstring
* Support array converter and add unit tests
* [Feature] FCOS3D BBox Coder (#940)
* FCOS3D BBox Coder
* Add unit tests
* Change the value from long to float/double
* Rename bbox_out as bbox
* Add comments to forward returns
* Support PGD BBox Coder
* Refine docstring
* Add uncertain l1 loss and its unit tests
* [Feature] PGD BBox Coder (#948)
* Support PGD BBox Coder
* Refine docstring
* PGD Head initialized
* Refactor init methods, fix legacy variable names
* [Feature] Support Uncertain L1 Loss (#950)
* Add uncertain l1 loss and its unit tests
* Remove mmcv.jit and refine docstrings
* [Fix] Fix visualization in KITTI dataset (#956)
* fix bug to support kitti vis
* fix
* Refine variable names and docstrings
* Add unit tests and fix some minor bugs
* Refine assertion messages
* Fix typo in the docs_zh-CN
* Use Pretrain init and remove unused init_cfg in FCOS3D
* Fix the comments for the input_modality in the dataset config
* Fix minor bugs in pgd_bbox_coder and incorrect setting for uncertain loss, use original init
* Add explanations for code_weights
* Add PGD README
* Adjust the unit test for pgd bbox coder
* Remove unused codes
* Add mono3d metric into the gather_models and fix bugs
* Update README.md
* Update links
* Update links
* Involve the value assignment of loss_dict into the computing procedure
* Fix incorrect loss_depth
* Update README.md
* Update README_zh-CN.md
* Update PGD in the model_zoo.md
* Update PGD in the model_zoo.md
* Update metafiles

Co-authored-by: Yezhen Cong <52420115+THU17cyz@users.noreply.github.com>
Co-authored-by: Xi Liu <75658786+xiliu8006@users.noreply.github.com>
Co-authored-by: THU17cyz
Co-authored-by: Wenhao Wu <79644370+wHao-Wu@users.noreply.github.com>
Co-authored-by: dingchang
Co-authored-by: 谢恩泽
Co-authored-by: Robin Karlsson <34254153+robin-karlsson0@users.noreply.github.com>
Co-authored-by: Danila Rukhovich
Co-authored-by: ChaimZhu
---
 README.md                  |  2 ++
 README_zh-CN.md            |  2 ++
 configs/pgd/README.md      | 37 +++++++++++++++++++++++++++++++++++++
 configs/pgd/metafile.yml   | 29 +++++++++++++++++++++++++++++
 configs/smoke/metafile.yml |  4 ++--
 docs/model_zoo.md          |  4 ++++
 docs_zh-CN/model_zoo.md    |  6 +++++-
 7 files changed, 81 insertions(+), 3 deletions(-)
 create mode 100644 configs/pgd/README.md
 create mode 100644 configs/pgd/metafile.yml

diff --git a/README.md b/README.md
index 0981aa0d0e..d0fbc0880d 100644
--- a/README.md
+++ b/README.md
@@ -106,6 +106,7 @@ Support methods
 - [x] [PAConv (CVPR'2021)](configs/paconv/README.md)
 - [x] [DGCNN (TOG'2019)](configs/dgcnn/README.md)
 - [x] [SMOKE (CVPRW'2020)](configs/smoke/README.md)
+- [x] [PGD (CoRL'2021)](configs/pgd/README.md)
 
 | | ResNet | ResNeXt | SENet |PointNet++ |DGCNN | HRNet | RegNetX | Res2Net | DLA |
 |--------------------|:--------:|:--------:|:--------:|:---------:|:---------:|:-----:|:--------:|:-----:|:---:|
@@ -127,6 +128,7 @@ Support methods
 | PAConv | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗
 | DGCNN | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗
 | SMOKE | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓
+| PGD | ✓ | ☐ | ☐ | ✗ | ✗ | ☐ | ☐ | ☐ | ✗
 
 Other features
 - [x] [Dynamic Voxelization](configs/dynamic_voxelization/README.md)
diff --git a/README_zh-CN.md b/README_zh-CN.md
index e9d8a9aa52..3cc894478c 100644
--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -105,6 +105,7 @@ MMDetection3D 是一个基于 PyTorch 的目标检测开源工具箱, 下一代
 - [x] [PAConv (CVPR'2021)](configs/paconv/README.md)
 - [x] [DGCNN (TOG'2019)](configs/dgcnn/README.md)
 - [x] [SMOKE (CVPRW'2020)](configs/smoke/README.md)
+- [x] [PGD (CoRL'2021)](configs/pgd/README.md)
 
 | | ResNet | ResNeXt | SENet |PointNet++ |DGCNN | HRNet | RegNetX | Res2Net | DLA |
 |--------------------|:--------:|:--------:|:--------:|:---------:|:---------:|:-----:|:--------:|:-----:|:---:|
@@ -126,6 +127,7 @@ MMDetection3D 是一个基于 PyTorch 的目标检测开源工具箱, 下一代
 | PAConv | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗
 | DGCNN | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗
 | SMOKE | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓
+| PGD | ✓ | ☐ | ☐ | ✗ | ✗ | ☐ | ☐ | ☐ | ✗
 
 其他特性
 - [x] [Dynamic Voxelization](configs/dynamic_voxelization/README.md)
diff --git a/configs/pgd/README.md b/configs/pgd/README.md
new file mode 100644
index 0000000000..50bd212097
--- /dev/null
+++ b/configs/pgd/README.md
@@ -0,0 +1,37 @@
+# Probabilistic and Geometric Depth: Detecting Objects in Perspective
+
+## Introduction
+
+
+PGD, also can be regarded as FCOS3D++, is a simple yet effective monocular 3D detector. It enhances the FCOS3D baseline by involving local geometric constraints and improving instance depth estimation.
+
+We first release the code and model for KITTI benchmark, which is a good supplement for the original FCOS3D baseline (only supported on nuScenes). Models for nuScenes will be released soon.
+
+For clean implementation, our preliminary release supports base models with proposed local geometric constraints and the probabilistic depth representation. We will involve the geometric graph part in the future.
+
+```
+@inproceedings{wang2021pgd,
+  title={Probabilistic and Geometric Depth: Detecting Objects in Perspective},
+  author={Wang, Tai and Zhu, Xinge and Pang, Jiangmiao and Lin, Dahua},
+  booktitle={Conference on Robot Learning (CoRL) 2021},
+  year={2021}
+}
+```
+
+## Results
+
+### KITTI
+
+| Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAP_11 / mAP_40 | Download |
+| :---------: | :-----: | :------: | :------------: | :----: | :------: |
+|[ResNet101](./pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d.py)|4x|9.07||18.33 / 13.23|[model](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d_20211022_102608-8a97533b.pth) | [log](https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d_20211022_102608.log.json)
+
+Detailed performance on KITTI 3D detection (3D/BEV) is as follows, evaluated by AP11 and AP40 metric:
+
+| | Easy | Moderate | Hard |
+|-------------|:-------------:|:--------------:|:-------------:|
+| Car (AP11) | 24.09 / 30.11 | 18.33 / 23.46 | 16.90 / 19.33 |
+| Car (AP40) | 19.27 / 26.60 | 13.23 / 18.23 | 10.65 / 15.00 |
+
+Note: mAP represents Car moderate 3D strict AP11 / AP40 results. Because of the limited data for pedestrians and cyclists, the detection performance for these two classes is usually unstable. Therefore, we only list car detection results here. In addition, AP40 is a more recommended metric for reference due to its much better stability.
diff --git a/configs/pgd/metafile.yml b/configs/pgd/metafile.yml
new file mode 100644
index 0000000000..1f85083f7d
--- /dev/null
+++ b/configs/pgd/metafile.yml
@@ -0,0 +1,29 @@
+Collections:
+  - Name: PGD
+    Metadata:
+      Training Data: KITTI
+      Training Techniques:
+        - SGD
+      Training Resources: 4x TITAN XP
+      Architecture:
+        - PGDHead
+    Paper:
+      URL: https://arxiv.org/abs/2107.14160
+      Title: 'Probabilistic and Geometric Depth: Detecting Objects in Perspective'
+    README: configs/pgd/README.md
+    Code:
+      URL: https://github.com/open-mmlab/mmdetection3d/blob/v1.0.0.dev0/mmdet3d/models/dense_heads/pgd_head.py#17
+      Version: v1.0.0
+
+Models:
+  - Name: pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d
+    In Collection: PGD
+    Config: configs/pgd/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d.py
+    Metadata:
+      Training Memory (GB): 9.1
+    Results:
+      - Task: 3D Object Detection
+        Dataset: KITTI
+        Metrics:
+          mAP: 18.33
+    Weights: https://download.openmmlab.com/mmdetection3d/v1.0.0_models/pgd/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d/pgd_r101_caffe_fpn_gn-head_3x4_4x_kitti-mono3d_20211022_102608-8a97533b.pth
diff --git a/configs/smoke/metafile.yml b/configs/smoke/metafile.yml
index 86f1305f5a..df956e4963 100644
--- a/configs/smoke/metafile.yml
+++ b/configs/smoke/metafile.yml
@@ -13,8 +13,8 @@ Collections:
       Title: 'SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation'
     README: configs/smoke/README.md
     Code:
-      URL: https://github.com/open-mmlab/mmdetection3d/blob/master/mmdet3d/models/detectors/smoke_mono3d.py#L7
-      Version: v0.17.1
+      URL: https://github.com/open-mmlab/mmdetection3d/blob/v1.0.0.dev0/mmdet3d/models/detectors/smoke_mono3d.py#L7
+      Version: v1.0.0
 
 Models:
   - Name: smoke_dla34_pytorch_dlaneck_gn-all_8x4_6x_kitti-mono3d
diff --git a/docs/model_zoo.md b/docs/model_zoo.md
index a8b495686f..9fa86d41ce 100644
--- a/docs/model_zoo.md
+++ b/docs/model_zoo.md
@@ -85,3 +85,7 @@ Please refer to [DGCNN](https://github.com/open-mmlab/mmdetection3d/tree/v1.0.0.
 ### SMOKE
 
 Please refer to [SMOKE](https://github.com/open-mmlab/mmdetection3d/tree/v1.0.0.dev0/configs/smoke) for details. We provide SMOKE baselines on KITTI dataset.
+
+### PGD
+
+Please refer to [PGD](https://github.com/open-mmlab/mmdetection3d/tree/v1.0.0.dev0/configs/pgd) for details. We provide PGD baselines on KITTI dataset.
diff --git a/docs_zh-CN/model_zoo.md b/docs_zh-CN/model_zoo.md
index 6ca0d011db..6e3dcf6e75 100644
--- a/docs_zh-CN/model_zoo.md
+++ b/docs_zh-CN/model_zoo.md
@@ -86,4 +86,8 @@
 
 ### SMOKE
 
-请考考 [SMOKE](https://github.com/open-mmlab/mmdetection3d/tree/v1.0.0.dev0/configs/smoke) 获取更多细节,我们在 KITTI 数据集上给出了相应的结果.
+请参考 [SMOKE](https://github.com/open-mmlab/mmdetection3d/tree/v1.0.0.dev0/configs/smoke) 获取更多细节,我们在 KITTI 数据集上给出了相应的结果.
+
+### PGD
+
+请参考 [PGD](https://github.com/open-mmlab/mmdetection3d/tree/v1.0.0.dev0/configs/pgd) 获取更多细节,我们在 KITTI 数据集上给出了相应的结果.
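
Reviewer note on the AP11 / AP40 columns in `configs/pgd/README.md`: the sketch below shows how the two metrics can differ on the same precision-recall curve. It assumes the commonly used interpolated-AP definition, with 11 recall positions (0, 0.1, ..., 1.0) for AP11 and 40 recall positions (1/40, 2/40, ..., 1.0) for AP40. The function names are hypothetical and this is not the evaluation code in mmdet3d, which also handles difficulty levels and IoU matching.

```python
from typing import Sequence


def interpolated_ap(recalls: Sequence[float],
                    precisions: Sequence[float],
                    sample_points: Sequence[float]) -> float:
    """Mean over sample_points of the best precision at recall >= that point.

    Hypothetical helper: `recalls` and `precisions` are paired points of a
    precision-recall curve for a single class and difficulty level.
    """
    eps = 1e-9  # tolerance so a recall equal to the sample point counts
    total = 0.0
    for r in sample_points:
        reachable = [p for rec, p in zip(recalls, precisions) if rec >= r - eps]
        total += max(reachable, default=0.0)  # 0 if the recall is never reached
    return total / len(sample_points)


def ap11(recalls: Sequence[float], precisions: Sequence[float]) -> float:
    # Assumed AP11 definition: recall positions 0, 0.1, ..., 1.0
    return interpolated_ap(recalls, precisions, [i / 10 for i in range(11)])


def ap40(recalls: Sequence[float], precisions: Sequence[float]) -> float:
    # Assumed AP40 definition: recall positions 1/40, 2/40, ..., 1.0
    return interpolated_ap(recalls, precisions, [i / 40 for i in range(1, 41)])


if __name__ == "__main__":
    # Toy curve, not real KITTI numbers.
    toy_recalls = [0.25, 0.5, 0.75, 1.0]
    toy_precisions = [1.0, 0.8, 0.6, 0.4]
    print("AP11:", ap11(toy_recalls, toy_precisions))
    print("AP40:", ap40(toy_recalls, toy_precisions))

    # A single correct detection at very low recall still earns credit at the
    # recall-0 sample point of AP11, but none under AP40.
    print("AP11 (one hit):", ap11([0.01], [1.0]))
    print("AP40 (one hit):", ap40([0.01], [1.0]))
```

Under these assumptions, dropping the recall-0 position and sampling more densely is one plausible reason the README describes AP40 as the more stable metric of the two.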