This repository provides the code for our CVPR 2021 paper Deep Two-View Structure-from-Motion Revisited.
We provide functions for training, validation, and visualization.
Updates:
- [Apr 23, 2024] We now have a fully batched, differentiable implementation of essential matrix estimation. It is implemented entirely in PyTorch and supports LO-RANSAC. Check VGGSfM for details.
Python = 3.6.x
PyTorch >= 1.6.0
CUDA >= 10.1
The other dependencies can be installed with
pip install -r requirements.txt
PyTorch versions from 1.1.0 up to, but not including, 1.6.0 should also work, but mixed-precision training will be disabled. We have not tested these versions.
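The version requirement above can be checked before enabling mixed precision. The following is only a minimal sketch: the helper name `supports_mixed_precision` is hypothetical and not part of this repository, and it assumes the only condition is the 1.6.0 threshold stated above.

```python
import re


def supports_mixed_precision(version: str) -> bool:
    """Return True if a PyTorch version string is at least 1.6.0.

    Hypothetical helper for illustration; not part of this repository.
    """
    # Keep only the leading numeric components, e.g. "1.6.0+cu101" -> (1, 6, 0).
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(g) if g is not None else 0 for g in match.groups())
    return (major, minor, patch) >= (1, 6, 0)


# Typical use (requires PyTorch to be installed):
#   import torch
#   use_amp = supports_mixed_precision(torch.__version__)
```

Comparing integer tuples rather than raw strings avoids ordering "1.10.0" before "1.6.0".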
To use the RANSAC five-point algorithm, you also need to build and install its CUDA extension:
cd RANSAC_FiveP
python setup.py install --user
The CUDA extension will be installed as 'essential_matrix'. Tested under Ubuntu with CUDA 10.1.
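One quick way to confirm the installation is to ask Python whether a module named 'essential_matrix' can be found. This is a sketch only: the function name `extension_available` is hypothetical, and a positive result shows that the module is discoverable, not that it loads correctly against your CUDA and PyTorch versions.

```python
import importlib.util


def extension_available(module_name: str = "essential_matrix") -> bool:
    """Return True if a top-level module with this name can be located.

    Hypothetical helper for illustration; it does not import the module.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if extension_available():
        print("essential_matrix is installed")
    else:
        print("essential_matrix not found; rerun the install step in RANSAC_FiveP")
```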
Pretrained models are provided here.
To reproduce our results, first download the KITTI raw data and the official depth maps (14 GB). Unzip the official depth maps (train and val) into a folder, and set the flag cfg.GT_DEPTH_DIR in kitti.yml to that folder. Also download the split files we provide, and unzip them into the root of the KITTI raw data.
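Before starting a long run, it can help to confirm that both directories from the step above exist. The sketch below is illustrative only: `missing_paths` and its arguments are hypothetical names, and it checks nothing about the contents of the folders or the split files.

```python
import os


def missing_paths(raw_root: str, gt_depth_dir: str) -> list:
    """Return a description of each given directory that does not exist.

    Hypothetical helper: raw_root is the folder passed to --data and
    gt_depth_dir is the value you set for cfg.GT_DEPTH_DIR.
    """
    candidates = {
        "KITTI raw data root": raw_root,
        "cfg.GT_DEPTH_DIR": gt_depth_dir,
    }
    return [
        f"{label}: {path}"
        for label, path in candidates.items()
        if not os.path.isdir(path)
    ]
```

An empty list means both folders were found.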
For training,
python main.py -b 32 --lr 0.0005 --nlabel 128 --fix_flownet \
--data PATH/TO/YOUR/KITTI/DATASET --cfg cfgs/kitti.yml \
--pretrained-depth depth_init.pth.tar --pretrained-flow flow_init.pth.tar
For evaluation,
python main.py -v -b 1 -p 1 --nlabel 128 \
--data PATH/TO/YOUR/KITTI/DATASET --cfg cfgs/kitti.yml \
--pretrained kitti.pth.tar
The default evaluation split is Eigen, where the metric abs_rel should be around 0.053 and rmse should be close to 2.22 (if 'loading official ground truth depth').
If you would like to use the Eigen SfM split, please set cfg.EIGEN_SFM = True and cfg.KITTI_697 = False.
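For reference, the abs_rel and rmse metrics quoted above are commonly defined as the mean absolute relative error and the root-mean-square error over pixels with valid ground truth. The sketch below assumes those standard definitions. The function name `depth_metrics` is hypothetical, and the repository's evaluation code may apply extra masking, cropping, or depth caps that are not shown here.

```python
import math


def depth_metrics(pred, gt):
    """Compute (abs_rel, rmse) over pixels whose ground-truth depth is positive.

    Hypothetical helper using the common definitions:
      abs_rel = mean(|pred - gt| / gt)
      rmse    = sqrt(mean((pred - gt)^2))
    """
    # Ground-truth depth maps are sparse, so skip pixels without a valid value.
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth depths")
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / len(pairs)
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / len(pairs))
    return abs_rel, rmse
```

Both inputs are flat sequences of depths in the same units, so rmse is reported in those units while abs_rel is unitless.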
For a fair comparison, we use the KITTI odometry evaluation toolbox provided here. Please generate poses for each sequence, and evaluate the results accordingly.
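The KITTI odometry ground-truth pose files store one pose per line as the 12 values of a 3x4 matrix [R | t] in row-major order. Assuming the evaluation toolbox expects predictions in the same layout (check its documentation), a pose can be serialized as sketched below. The function name `pose_to_kitti_line` is hypothetical, and chaining two-view relative poses into per-sequence absolute poses is not shown.

```python
def pose_to_kitti_line(rotation, translation):
    """Flatten a 3x3 rotation and a 3-vector translation into one pose line.

    Hypothetical helper: emits the 12 row-major values of the 3x4 matrix [R | t].
    """
    if (
        len(rotation) != 3
        or any(len(row) != 3 for row in rotation)
        or len(translation) != 3
    ):
        raise ValueError("expected a 3x3 rotation and a 3-vector translation")
    values = []
    for row, t in zip(rotation, translation):
        # Each output row is three rotation entries followed by one translation entry.
        values.extend(list(row) + [t])
    return " ".join(f"{v:.6e}" for v in values)
```

Writing one such line per frame, in frame order, gives one text file per sequence.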
We thank Shihao Jiang and Dylan Campbell for sharing their implementation of the GPU-accelerated RANSAC five-point algorithm. We really appreciate the valuable feedback from our area chairs and reviewers. We would also like to thank Charles Loop for helpful discussions and Ke Chen for providing field test images from NVIDIA AV cars.
@article{wang2021deep,
title={Deep Two-View Structure-from-Motion Revisited},
author={Wang, Jianyuan and Zhong, Yiran and Dai, Yuchao and Birchfield, Stan and Zhang, Kaihao and Smolyanskiy, Nikolai and Li, Hongdong},
journal={CVPR},
year={2021}
}