Jung-Hee Kim*, Junhwa Hur*, Tien Nguyen, and Seong-Gyun Jeong - NeurIPS 2022
Link to the paper: Link
@inproceedings{kimself,
  title={Self-supervised surround-view depth estimation with volumetric feature fusion},
  author={Kim, Jung Hee and Hur, Junhwa and Nguyen, Tien Phuoc and Jeong, Seong-Gyun},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2022},
}
We introduce a volumetric feature representation for self-supervised surround-view depth estimation, which not only outputs metric-scale depth and canonical camera motion, but also synthesizes a depth map at a novel viewpoint.
- Install the required libraries using the requirements.txt file. (Note that packnet-sfm and dgp are used as submodules, so the libraries required by those submodules must be installed as well.)
- To install both the required libraries and the submodules, follow the instructions below:
git submodule init
git submodule update
pip install -r requirements.txt
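Alternatively, a minimal sketch of fetching everything in one pass (the repository URL and directory name are placeholders, not values taken from this README):
# clone the repository together with its registered submodules in a single step
git clone --recurse-submodules <repository-url>
cd <repository-directory>
# install the Python dependencies for the main repository and its submodules
pip install -r requirements.txt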
DDAD
- The DDAD dataset can be downloaded by running:
curl -s https://tri-ml-public.s3.amazonaws.com/github/DDAD/datasets/DDAD.tar
- Place the dataset in input_data/DDAD/ (one way to download and extract it in a single step is sketched after this list).
- We manually created mask images for the scenes of the DDAD dataset; they are provided in dataset/ddad_mask.
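One way to download and unpack the archive directly into the expected directory (the extraction target and tar options are assumptions; adjust them to your setup):
# create the target directory and stream the archive straight into it
mkdir -p input_data/DDAD
curl -s https://tri-ml-public.s3.amazonaws.com/github/DDAD/datasets/DDAD.tar | tar -xvf - -C input_data/DDAD/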
NuScenes
- Download the official NuScenes dataset.
- Place the dataset in input_data/nuscenes/ (one way to link an existing download into place is sketched after this list).
- Scenes with backward and forward contexts are listed in dataset/nuscenes/.
- Scenes with low visibility are filtered out in dataset/nuscenes/val.txt.
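If the official NuScenes download already exists elsewhere on disk, a symlink is one simple way to place it (the source path is illustrative):
# link an existing NuScenes download into the expected location
mkdir -p input_data
ln -s /path/to/nuscenes input_data/nuscenes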
Data should be as follows:
├── input_data
│ ├── DDAD
│ │ ├── ddad_train_val
│ │ ├── ddad_test
│ ├── nuscenes
│ │ ├── maps
│ │ ├── samples
│ │ ├── sweeps
│ │ ├── v1.0-test
│ │ ├── v1.0-trainval
| Model | Scale | Abs.Rel. | Sq.Rel. | RMSE | RMSE log | δ < 1.25 | δ < 1.25² | δ < 1.25³ |
|---|---|---|---|---|---|---|---|---|
| DDAD | Metric | 0.221 | 4.001 | 13.406 | 0.340 | 0.688 | 0.868 | 0.932 |
| DDAD | Median | 0.221 | 3.884 | 13.225 | 0.328 | 0.692 | 0.877 | 0.939 |
| NuScenes | Metric | 0.285 | 6.662 | 7.472 | 0.347 | 0.741 | 0.883 | 0.936 |
| NuScenes | Median | 0.258 | 4.282 | 7.226 | 0.329 | 0.735 | 0.883 | 0.937 |
The surround-view fusion depth estimation model can be trained from scratch.
- By default, results are saved under results/<config-name>, including the trained model and TensorBoard logs for both training and validation (see the TensorBoard example below).
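Training and validation curves can then be inspected with TensorBoard (assuming TensorBoard is available in the environment, e.g. installed alongside the requirements):
# launch TensorBoard on the saved logs (replace <config-name> with the config actually used)
tensorboard --logdir results/<config-name>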
Single-GPU
Training the model using a single GPU:
(Note that the packnet-sfm submodule repeatedly raises UserWarnings during training, so warnings are suppressed with -W ignore.)
python -W ignore train.py --config_file='./configs/ddad/ddad_surround_fusion.yaml'
python -W ignore train.py --config_file='./configs/nuscenes/nusc_surround_fusion.yaml'
Multi-GPU
Training the model using multiple GPUs:
- Enable distributed data parallel (DDP) by setting ddp: ddp_enable to True in the config file.
- The GPUs and the world size (number of GPUs) must be specified (e.g., gpus = [0, 1, 2, 3], worldsize = 4).
- The DDP address and port settings can be configured in ddp.py.
python -W ignore train.py --config_file='./configs/ddad/ddad_surround_fusion_ddp.yaml'
python -W ignore train.py --config_file='./configs/nuscenes/nusc_surround_fusion_ddp.yaml'
To evaluate a model trained from scratch, run:
python -W ignore eval.py --config_file='./configs/<config-name>'
- The model weights must be specified under load: weights in the config file.
Evaluation results using a pretrained model can be obtained with the following command:
python -W ignore eval.py --config_file='./configs/<config-name>' \
--weight_path='<pretrained-weight-path>'
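For instance (the checkpoint filename below is hypothetical; substitute the path of the weight file you actually downloaded):
# example: evaluate the DDAD model with a downloaded checkpoint (path is hypothetical)
python -W ignore eval.py --config_file='./configs/ddad/ddad_surround_fusion.yaml' \
    --weight_path='./weights/ddad_surround_fusion.ckpt'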
To obtain synthesized depth results, train the model from scratch by running:
python -W ignore train.py --config_file='./configs/ddad/ddad_surround_fusion_augdepth.yaml'
Then evaluate the model by running:
python -W ignore eval.py --config_file='./configs/ddad/ddad_surround_fusion_augdepth.yaml'
- The synthesized results are stored in results/<config-name>/syn_results.
This repository is released under the Apache 2.0 license.