[ArXiv][Project page][Video][Poster] [Open Access]
This repository is the implementation of our ICCV 2023 paper: 2D-3D Interlaced Transformer for Point Cloud Segmentation with Scene-Level Supervision.
Build the conda environment with:
conda env create -f mit_env.yaml
We implement MIT using MinkowskiEngine; please follow the installation instructions in its GitHub repository. We also use the third-party point cloud processing library from Ji-Jia Wu.
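After installation, a quick sanity check can confirm that the dependencies are visible from the active conda environment. The snippet below is a minimal sketch, not part of this repository: the helper name `missing_packages` is hypothetical, and the default package list assumes the usual import names `torch` and `MinkowskiEngine`.

```python
import importlib.util


def missing_packages(names=("torch", "MinkowskiEngine")):
    """Return the top-level package names that cannot be found in this environment."""
    # find_spec returns None when a top-level package is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("all found" if not missing else "missing: " + ", ".join(missing))
```

This only checks that the packages can be located; it does not verify that MinkowskiEngine was compiled against the installed PyTorch and CUDA versions.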
Download the ScanNet dataset here.
- We follow BPNet to prepare the 2D and 3D data.
- Download the unsupervised pre-computed supervoxels provided by WYPR.
The data structure should look like this:
├── data_root
│ ├── train
│ │ ├── scene0000_00.pth
│ │ ├── scene0000_01.pth
│ ├── val
│ │ ├── scene0011_00.pth
│ │ ├── scene0011_01.pth
│ ├── 2D
│ │ ├── scene0000_00
│ │ │ ├── color
│ │ │ ├── label
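Before training, it can help to verify that the prepared data matches the layout above. The following is a minimal sketch under stated assumptions: the function name `check_data_root` is hypothetical and not part of this repository, and it assumes that every `scene*.pth` file in `train` or `val` has a matching folder under `2D` containing `color` and `label`.

```python
from pathlib import Path


def check_data_root(data_root):
    """Return a list of human-readable problems found under data_root."""
    root = Path(data_root)
    problems = []

    # Top-level folders taken from the layout above.
    for name in ("train", "val", "2D"):
        if not (root / name).is_dir():
            problems.append(f"missing directory: {name}")

    # Assumption: each 3D scene file has a 2D folder with color/ and label/.
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            continue
        for scene_file in sorted(split_dir.glob("scene*.pth")):
            for sub in ("color", "label"):
                if not (root / "2D" / scene_file.stem / sub).is_dir():
                    problems.append(
                        f"missing directory: 2D/{scene_file.stem}/{sub}"
                    )
    return problems
```

Calling `check_data_root("data_root")` returns an empty list when the layout is complete; otherwise each entry names one missing directory.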
Start training: sh tool/train.sh $EXP_NAME$ $/PATH/TO/CONFIG$ $NUMBER_OF_THREADS$
sh tool/train.sh configs/ICCV23/config.yaml mit 8
Our code is based on MinkowskiEngine. We also referred to BPNet.
If you find our work useful in your research, please consider citing our paper:
@inproceedings{yang20232d,
title={2D-3D Interlaced Transformer for Point Cloud Segmentation with Scene-Level Supervision},
author={Yang, Cheng-Kun and Chen, Min-Hung and Chuang, Yung-Yu and Lin, Yen-Yu},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={977--987},
year={2023}
}