PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation
Sida Peng, Yuan Liu, Qixing Huang, Xiaowei Zhou, Hujun Bao
CVPR 2019 oral
Project Page
Questions and discussions are welcome!
Thanks to Haotong Lin for providing the clean version of PVNet and reproducing the results.
The structure of this project is described in project_structure.md.
- Set up the python environment:
conda create -n pvnet python=3.7
conda activate pvnet
# install torch 1.1 built from cuda 9.0
pip install torch==1.1.0 -f https://download.pytorch.org/whl/cu90/stable
pip install Cython==0.28.2
sudo apt-get install libglfw3-dev libglfw3
pip install -r requirements.txt
- Compile CUDA extensions under lib/csrc:
ROOT=/path/to/clean-pvnet
cd $ROOT/lib/csrc
export CUDA_HOME="/usr/local/cuda-9.0"
cd dcn_v2
python setup.py build_ext --inplace
cd ../ransac_voting
python setup.py build_ext --inplace
cd ../nn
python setup.py build_ext --inplace
cd ../fps
python setup.py build_ext --inplace
# If you want to use the uncertainty-driven PnP
cd ../uncertainty_pnp
sudo apt-get install libgoogle-glog-dev
sudo apt-get install libsuitesparse-dev
sudo apt-get install libatlas-base-dev
python setup.py build_ext --inplace
- Set up datasets:
ROOT=/path/to/clean-pvnet
cd $ROOT/data
ln -s /path/to/linemod linemod
ln -s /path/to/linemod_orig linemod_orig
ln -s /path/to/occlusion_linemod occlusion_linemod
# the following is used for tless
ln -s /path/to/tless tless
ln -s /path/to/cache cache
ln -s /path/to/SUN2012pascalformat sun
Download datasets which are formatted for this project:
- linemod
- linemod_orig: The dataset includes the depth for each image.
- occlusion linemod
- truncation linemod: Check TRUNCATION_LINEMOD.md for the information about the Truncation LINEMOD dataset.
- Tless:
cat tlessa* | tar xvf - -C .
- Tless cache data: It is used for training and testing on Tless.
- SUN2012pascalformat
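Once the datasets are downloaded and linked, it can help to confirm that every link under `data/` resolves before running anything. The following is a minimal sketch, not part of the project: the function name `missing_dataset_links` is hypothetical, and the list of link names is copied from the `ln -s` commands above.

```python
from pathlib import Path

# Link names taken from the `ln -s` commands in the dataset setup step.
EXPECTED_LINKS = ["linemod", "linemod_orig", "occlusion_linemod", "tless", "cache", "sun"]


def missing_dataset_links(data_dir, names=EXPECTED_LINKS):
    """Return the names under data_dir that do not resolve to an existing directory.

    Hypothetical helper for illustration; it is not provided by the project.
    """
    data_dir = Path(data_dir)
    # Path.is_dir() follows symlinks, so a dangling link is reported as missing.
    return [name for name in names if not (data_dir / name).is_dir()]
```

For example, `missing_dataset_links("/path/to/clean-pvnet/data")` returns an empty list when all six links are in place. If you only work with Linemod, pass a shorter `names` list.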
We provide pretrained models of the objects in Linemod, which can be found here.
Take the testing on cat as an example.
- Prepare the data related to cat:
python run.py --type linemod cls_type cat
- Download the pretrained model of cat and put it at $ROOT/data/model/pvnet/cat/199.pth.
- Test:
python run.py --type evaluate --cfg_file configs/linemod.yaml model cat cls_type cat
python run.py --type evaluate --cfg_file configs/linemod.yaml test.dataset LinemodOccTest model cat cls_type cat
- Test with icp:
python run.py --type evaluate --cfg_file configs/linemod.yaml model cat cls_type cat test.icp True
python run.py --type evaluate --cfg_file configs/linemod.yaml test.dataset LinemodOccTest model cat cls_type cat test.icp True
- Test with the uncertainty-driven PnP:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:./lib/csrc/uncertainty_pnp/lib
python run.py --type evaluate --cfg_file configs/linemod.yaml model cat cls_type cat test.un_pnp True
python run.py --type evaluate --cfg_file configs/linemod.yaml test.dataset LinemodOccTest model cat cls_type cat test.un_pnp True
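The evaluation commands above differ only in a few trailing options, so they can be assembled programmatically when testing several variants. This is a rough sketch under that assumption: `build_eval_command` is a hypothetical helper, not something `run.py` provides, and it only reproduces the argument order shown in this README.

```python
def build_eval_command(cls_type, occlusion=False, icp=False, un_pnp=False,
                       cfg_file="configs/linemod.yaml"):
    """Assemble the argument list for one evaluation run.

    Hypothetical helper: it mirrors the commands listed in this README and
    does not validate that the options are accepted by run.py.
    """
    cmd = ["python", "run.py", "--type", "evaluate", "--cfg_file", cfg_file]
    if occlusion:
        # Evaluate on Occlusion Linemod instead of the default test set.
        cmd += ["test.dataset", "LinemodOccTest"]
    # The README uses the object name for both the model and the class.
    cmd += ["model", cls_type, "cls_type", cls_type]
    if icp:
        cmd += ["test.icp", "True"]
    if un_pnp:
        cmd += ["test.un_pnp", "True"]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from `$ROOT`. For the uncertainty-driven PnP variant, `LD_LIBRARY_PATH` still has to be exported as shown above before the command is run.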
We provide pretrained models of the objects in Tless, which can be found here.
- Download the pretrained models and put them in $ROOT/data/model/pvnet/.
- Test:
python run.py --type evaluate --cfg_file configs/tless/tless_01.yaml
# or
python run.py --type evaluate --cfg_file configs/tless/tless_01.yaml test.vsd True
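Testing fails early if the pretrained weights are not where the configuration expects them, so a quick listing of what is under `$ROOT/data/model/pvnet/` can save a run. The sketch below assumes one sub-directory per model containing `.pth` files, as in the `cat/199.pth` example; whether the Tless models follow the same layout is an assumption, and `find_checkpoints` is a hypothetical name.

```python
from pathlib import Path


def find_checkpoints(model_dir):
    """Map each model sub-directory to the sorted .pth file names it contains.

    Hypothetical helper; assumes a <model_dir>/<name>/<epoch>.pth layout.
    """
    model_dir = Path(model_dir)
    found = {}
    if not model_dir.is_dir():
        return found
    for sub in sorted(p for p in model_dir.iterdir() if p.is_dir()):
        names = sorted(f.name for f in sub.glob("*.pth"))
        if names:  # skip sub-directories without any checkpoint
            found[sub.name] = names
    return found
```

With the cat model in place, `find_checkpoints("data/model/pvnet")` would include an entry `"cat"` listing `199.pth`.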
Take cat as an example.
- Prepare the data related to cat:
python run.py --type linemod cls_type cat
- Download the pretrained model of cat and put it at $ROOT/data/model/pvnet/cat/199.pth.
- Visualize:
python run.py --type visualize --cfg_file configs/linemod.yaml model cat cls_type cat
If set up correctly, the output will look like
Visualize:
python run.py --type visualize --cfg_file configs/tless/tless_01.yaml
# or
python run.py --type visualize --cfg_file configs/tless/tless_01.yaml test.det_gt True
- Prepare the data related to cat:
python run.py --type linemod cls_type cat
- Train:
python train_net.py --cfg_file configs/linemod.yaml model mycat cls_type cat
The training parameters can be found in project_structure.md.
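The commands in this README pass configuration overrides as trailing `key value` pairs after `--cfg_file`, for example `model mycat cls_type cat` or `train.batch_size 4`, with dots selecting nested options. The sketch below illustrates how such pairs map onto a nested configuration. It is an illustration of the convention only, not the project's actual parser, and `parse_overrides` is a hypothetical name.

```python
def parse_overrides(args):
    """Turn ['train.batch_size', '4', 'cls_type', 'cat'] into a nested dict.

    Illustrative only: values are kept as strings and no type conversion
    or validation against the YAML config is attempted.
    """
    if len(args) % 2 != 0:
        raise ValueError("overrides must come in key/value pairs")
    config = {}
    for key, value in zip(args[0::2], args[1::2]):
        node = config
        *parents, leaf = key.split(".")
        for part in parents:
            # Descend into (or create) the nested section for this key.
            node = node.setdefault(part, {})
        node[leaf] = value
    return config
```

For instance, `parse_overrides(["model", "mycat", "cls_type", "cat"])` yields a flat dict with the two keys, while `["train.batch_size", "4"]` produces a `train` section containing `batch_size`.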
Train:
python train_net.py --cfg_file configs/tless/tless_01.yaml
tensorboard --logdir data/record/pvnet
If set up correctly, the output will look like
An example dataset can be downloaded here.
- Create a dataset using https://github.com/F2Wang/ObjectDatasetTools
- Organize the dataset as the following structure:
├── /path/to/dataset
│   ├── model.ply
│   ├── camera.txt
│   ├── diameter.txt  // the object diameter, whose unit is meter
│   ├── rgb/
│   │   ├── 0.jpg
│   │   ├── ...
│   │   ├── 1234.jpg
│   │   ├── ...
│   ├── mask/
│   │   ├── 0.png
│   │   ├── ...
│   │   ├── 1234.png
│   │   ├── ...
│   ├── pose/
│   │   ├── pose0.npy
│   │   ├── ...
│   │   ├── pose1234.npy
│   │   ├── ...
│   │   └──
- Create a soft link pointing to the dataset:
ln -s /path/to/custom_dataset data/custom
- Process the dataset:
python run.py --type custom
- Train:
python train_net.py --cfg_file configs/custom.yaml train.batch_size 4
- Watch the training curve:
tensorboard --logdir data/record/pvnet
- Visualize:
python run.py --type visualize --cfg_file configs/custom.yaml
- Test:
python run.py --type evaluate --cfg_file configs/custom.yaml
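Before processing a custom dataset, it may be worth checking that it matches the directory structure listed in the steps above. The sketch below is one way to do that, assuming that every `rgb/<i>.jpg` needs a matching `mask/<i>.png` and `pose/pose<i>.npy`; that one-to-one correspondence is inferred from the listing rather than stated by the project, and `check_custom_dataset` is a hypothetical helper.

```python
from pathlib import Path


def check_custom_dataset(root):
    """Return a list of problems found in a custom dataset directory.

    Hypothetical helper; an empty list means the layout matches the
    structure described in this README.
    """
    root = Path(root)
    problems = []
    for name in ("model.ply", "camera.txt", "diameter.txt"):
        if not (root / name).is_file():
            problems.append(f"missing file: {name}")
    for sub in ("rgb", "mask", "pose"):
        if not (root / sub).is_dir():
            problems.append(f"missing directory: {sub}/")

    # Frame indices are read from rgb/<i>.jpg (assumed to be plain integers).
    rgb_dir = root / "rgb"
    indices = []
    if rgb_dir.is_dir():
        indices = sorted(int(p.stem) for p in rgb_dir.glob("*.jpg") if p.stem.isdigit())

    for i in indices:
        if not (root / "mask" / f"{i}.png").is_file():
            problems.append(f"missing mask/{i}.png")
        if not (root / "pose" / f"pose{i}.npy").is_file():
            problems.append(f"missing pose/pose{i}.npy")
    return problems
```

Running `check_custom_dataset("data/custom")` after creating the soft link gives a quick report. The check does not open any file, so it cannot detect a wrong unit in `diameter.txt` or malformed poses.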
If you find this code useful for your research, please use the following BibTeX entry.
@inproceedings{peng2019pvnet,
title={PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation},
author={Peng, Sida and Liu, Yuan and Huang, Qixing and Zhou, Xiaowei and Bao, Hujun},
booktitle={CVPR},
year={2019}
}
This work is affiliated with ZJU-SenseTime Joint Lab of 3D Vision, and its intellectual property belongs to SenseTime Group Ltd.
Copyright (c) ZJU-SenseTime Joint Lab of 3D Vision. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.