Update dataset_prepare.md to fix download path for NYU dataset

Add a simple warmup strategy for sigloss as discussed in Issue #20
Enhance DPT and fix bugs reported in Issue #23
Fix typos in docs and add several introductions
zhyever · Jun 5, 2022 · 383bb89
1 parent df9fe14

Showing 30 changed files with 5,848 additions and 91 deletions.
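
Note on the sigloss warmup: the commit message mentions a warmup strategy for sigloss, but the loss code is not among the hunks shown here. The sketch below is only an assumption about what such a warmup could look like: a scale-invariant log loss whose variance term is skipped for the first `warmup_steps` iterations. The function name `sig_loss`, the parameters `step`, `warmup_steps`, `lam`, and `eps`, and the choice to drop the variance term are all hypothetical and not taken from the repository.

```python
import math


def sig_loss(pred, gt, step, warmup_steps=0, lam=0.15, eps=1e-3):
    """Scale-invariant log loss with an optional warmup phase (illustrative)."""
    # Per-pixel log error over valid pixels only (ground-truth depth > 0).
    g = [math.log(p + eps) - math.log(t + eps)
         for p, t in zip(pred, gt) if t > 0]
    if not g:
        return 0.0
    mean_g = sum(g) / len(g)
    if step < warmup_steps:
        # Assumed warmup behaviour: keep only the mean term, skip the variance.
        return math.sqrt(lam * mean_g ** 2)
    # Population variance of the log error (hypothetical choice for this sketch).
    var_g = sum((x - mean_g) ** 2 for x in g) / len(g)
    return math.sqrt(var_g + lam * mean_g ** 2)


# Two pixels whose log errors cancel out on average.
pred, gt = [2.0, 1.0], [1.0, 2.0]
warm = sig_loss(pred, gt, step=0, warmup_steps=10, eps=0.0)   # warmup phase
full = sig_loss(pred, gt, step=10, warmup_steps=10, eps=0.0)  # after warmup
```

In this toy input the mean log error is zero, so the warmup phase sees no loss, while the full loss still penalizes the spread of the errors. Whether the toolbox behaves this way should be checked against its loss implementation.
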
diff --git a/README.md b/README.md
@@ -26,7 +26,7 @@ Thanks to MMSeg, we own these major features. :blush:
 
 ## Benchmark and model zoo
 
-Results and models are available in the [model zoo (TODO)](docs/model_zoo.md).
+Results and models are available in the [model zoo](docs/model_zoo.md).
 
 Supported backbones (partially release):
 - [x] ResNet (CVPR'2016)
@@ -68,20 +68,34 @@ This project is released under the [Apache 2.0 license](LICENSE).
 This repo benefits from awesome works of [mmsegmentation](https://github.com/open-mmlab/mmsegmentation), [Adabins](https://github.com/shariqfarooq123/AdaBins),
 [BTS](https://github.com/cleinc/bts). Please also consider citing them.
 
-
-## TODO
-
-- Some annotations in codes are futile, waiting to be rewritten.
-- I will release codes of BinsFormer soon.
-- I would like to include self-supervised depth estimation methods, such as MonoDepth2.
-
 ## Cite
+If you find this toolbox helpful for your projects or research, consider citing one of our works listed below. I may conduct a technique report based on this toolbox to discuss training details for supervised monocular depth estimation in the future.
 
 ```bibtex
+@article{li2022binsformer,
+  title={BinsFormer: Revisiting Adaptive Bins for Monocular Depth Estimation},
+  author={Li, Zhenyu and Wang, Xuyang and Liu, Xianming and Jiang, Junjun},
+  journal={arXiv preprint arXiv:2204.00987},
+  year={2022}
+}
+@article{li2022depthformer,
+  title={DepthFormer: Exploiting Long-Range Correlation and Local Information for Accurate Monocular Depth Estimation},
+  author={Li, Zhenyu and Chen, Zehui and Liu, Xianming and Jiang, Junjun},
+  journal={arXiv preprint arXiv:2203.14211},
+  year={2022}
+}
 @article{li2021simipu,
   title={SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-Training for Spatial-Aware Visual Representations},
   author={Li, Zhenyu and Chen, Zehui and Li, Ang and Fang, Liangji and Jiang, Qinhong and Liu, Xianming and Jiang, Junjun and Zhou, Bolei and Zhao, Hang},
   journal={arXiv preprint arXiv:2112.04680},
   year={2021}
 }
 ```
+
+## Changelog
+- **Jun. 5, 2022**: Add support for custom dataset training. Add a warmup interface for sigloss to help convergence as discussed in Issue [#20](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/issues/20). Enhance the DPT support and fix bugs in provided pre-trained models as reported in Issue [#23](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/issues/23). 
+- **Apr. 16, 2022**: Finish most of docs and provide all pre-trained parameters. Release codes about BTS, Adabins, DPT, SimIPU, and DepthFormer. Support KITTI, NYU-v2, SUN RGB-D(eval), and CityScapes.
+
+## TODO
+- I will release codes of BinsFormer soon (Delaying).
+- I would like to include self-supervised depth estimation methods, such as MonoDepth2.
diff --git a/configs/_base_/datasets/kitti_benchmark.py b/configs/_base_/datasets/kitti_benchmark.py
@@ -7,6 +7,7 @@
 train_pipeline = [
     dict(type='LoadImageFromFile'),
     dict(type='DepthLoadAnnotations'),
+    dict(type='LoadKITTICamIntrinsic'),
     dict(type='KBCrop', depth=True),
     dict(type='RandomRotate', prob=0.5, degree=2.5),
     dict(type='RandomFlip', prob=0.5),
@@ -71,18 +72,6 @@
         eigen_crop=False,
         min_depth=1e-3,
         max_depth=88),
-    # test=dict(
-    #     type=dataset_type,
-    #     data_root=data_root,
-    #     img_dir='input',
-    #     ann_dir='gt_depth',
-    #     depth_scale=256,
-    #     split='benchmark_val.txt',
-    #     pipeline=test_pipeline,
-    #     garg_crop=True,
-    #     eigen_crop=False,
-    #     min_depth=1e-3,
-    #     max_depth=88)
     test=dict(
         type=dataset_type,
         data_root=data_root,

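Note on the pipeline change above: the hunk inserts `LoadKITTICamIntrinsic` after `DepthLoadAnnotations` and before `KBCrop`, so it runs before any cropping or augmentation. The sketch below illustrates, under stated assumptions, how a list of `dict(type=...)` entries can be turned into an ordered chain of callables. The registry, `make_step`, and `build_pipeline` are hypothetical stand-ins written for this example; they are not the toolbox's or MMSeg's actual classes, and the transform bodies only record their own names.

```python
def make_step(name):
    """Return a placeholder transform that records its name in the results."""
    def step(results, **params):
        results.setdefault('trace', []).append(name)
        return results
    return step


# Hypothetical registry mapping a config `type` string to a callable.
REGISTRY = {
    name: make_step(name)
    for name in ('LoadImageFromFile', 'DepthLoadAnnotations',
                 'LoadKITTICamIntrinsic', 'KBCrop')
}


def build_pipeline(cfgs):
    """Resolve each config dict to (callable, params) and chain them in order."""
    steps = []
    for cfg in cfgs:
        params = dict(cfg)          # copy so the config itself is not mutated
        name = params.pop('type')
        steps.append((REGISTRY[name], params))

    def run(results):
        for fn, params in steps:
            results = fn(results, **params)
        return results
    return run


# First four entries of the train pipeline as it appears after this commit.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='DepthLoadAnnotations'),
    dict(type='LoadKITTICamIntrinsic'),
    dict(type='KBCrop', depth=True),
]
trace = build_pipeline(train_pipeline)({})['trace']
```

Because the steps run in list order, the position of an entry in the config decides what data it sees, which is why the new loader sits ahead of `KBCrop`.
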
diff --git a/configs/adabins/README.md b/configs/adabins/README.md
@@ -33,15 +33,15 @@ We address the problem of estimating a high quality dense depth map from a singl
 | Method | Backbone | Train Epoch | Abs Rel (+flip) | RMSE (+flip) | Config | Download |
 | ------ | :--------: | :----: | :--------------: | :------: | :------: | :--------: |
 | Official | EfficientNetB5-AP   |  25   | 0.058 | 2.36 |  - | -
-| Adabins  |  EfficientNetB5-AP  |  24   | 0.058 | 2.33 |  [config](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/blob/main/configs/adabins/adabins_efnetb5ap_kitti_24e.py) | [log](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/blob/main/configs/adabins/resources/logs/adabins_efnetb5ap_kitti_24e.txt)
+| Adabins  |  EfficientNetB5-AP  |  24   | 0.058 | 2.33 |  [config](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/blob/main/configs/adabins/adabins_efnetb5ap_kitti_24e.py) | [log](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/blob/main/configs/adabins/resources/logs/adabins_efnetb5ap_kitti_24e.txt) \| [model](https://drive.google.com/file/d/17srI3mFoYLdnN1As4a2fRGrHA0UHuujX/view?usp=sharing)
 
 
 ### NYU
 
 | Method | Backbone | Train Epoch | Abs Rel (+flip) | RMSE (+flip) | Config | Download |
 | ------ | :--------: | :----: | :--------------: | :------: |  :------: | :--------: |
 | Official | EfficientNetB5-AP   |  25   | 0.103 | 0.364 |  - | -
-| Adabins  | EfficientNetB5-AP   |  24   | 0.106 | 0.368 |  [config](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/blob/main/configs/adabins/adabins_efnetb5ap_nyu_24e.py) | [log](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/blob/main/configs/adabins/resources/logs/adabins_efnetb5ap_nyu_24e.txt)
-| Adabins  | ResNet-50   |  24   | 0.141 | 0.451 |  [config](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/blob/main/configs/adabins/adabins_r50_nyu_24e.py) | log
+| Adabins  | EfficientNetB5-AP   |  24   | 0.106 | 0.368 |  [config](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/blob/main/configs/adabins/adabins_efnetb5ap_nyu_24e.py) | [log](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/blob/main/configs/adabins/resources/logs/adabins_efnetb5ap_nyu_24e.txt) \| [model](https://drive.google.com/file/d/1NRTWApIrxOjeeN7FdNTTOXV3KOuo_-aC/view?usp=sharing)
+| Adabins  | ResNet-50   |  24   | 0.141 | 0.451 |  [config](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/blob/main/configs/adabins/adabins_r50_nyu_24e.py) | [log](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/blob/main/configs/adabins/resources/logs/adabins_r50_nyu_24e.txt) \| [model](https://drive.google.com/file/d/1cVvmJjot1rLk06FkAQl_PCeMOApt6-7x/view?usp=sharing)
 
 
diff --git a/configs/adabins/adabins_efnetb5ap_nyu_24e.py b/configs/adabins/adabins_efnetb5ap_nyu_24e.py
@@ -23,11 +23,9 @@
     weight_decay=0.1,
     paramwise_cfg=dict(
         custom_keys={
-            'decode_head': dict(lr_mult=10), # 10 lr
-            # 'adaptive_bins_layer': dict(lr_mult=10), # 10 lr
-            # 'decoder': dict(lr_mult=10), # 10 lr
-            # 'conv_out': dict(lr_mult=10), # 10 lr
+            'decode_head': dict(lr_mult=10), # x10 lr
         }))
+
 # learning policy
 lr_config = dict(
     policy='OneCycle',