This repository contains the code for CMDFusion, introduced in the paper "CMDFusion: Bidirectional Fusion Network with Cross-modality Knowledge Distillation for LIDAR Semantic Segmentation" by Jun CEN, Shiwei Zhang, Yixuan Pei, Kun Li, Hang Zheng, Maochun Luo, Yingya Zhang, and Qifeng Chen.
- pytorch >= 1.8
- yaml
- easydict
- pyquaternion
- lightning (tested with pytorch_lightning==1.3.8 and torchmetrics==0.5)
- torch-scatter (pip install torch-scatter -f https://data.pyg.org/whl/torch-1.9.0+${CUDA}.html)
- nuScenes-devkit (optional for nuScenes)
- spconv (tested with spconv==2.1.16 and cuda==11.1, pip install spconv-cu111==2.1.16)
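Before training, it can help to confirm which of the packages above are actually installed. The snippet below is a minimal sketch using only the Python standard library. The distribution names in `REQUIRED` are assumptions derived from the list above (for example, `yaml` is assumed to come from the `PyYAML` distribution and spconv from `spconv-cu111`); adjust them to match your CUDA version and environment.

```python
from importlib import metadata

# Distribution names are assumptions based on the requirement list above.
REQUIRED = [
    "torch",
    "PyYAML",
    "easydict",
    "pyquaternion",
    "pytorch-lightning",
    "torchmetrics",
    "torch-scatter",
    "spconv-cu111",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name:20s} {version or 'NOT INSTALLED'}")
```

The script only reports versions; it does not check them against the tested versions listed above.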
Please download the files from the SemanticKITTI website, plus the color data from the KITTI Odometry website, and extract everything into the same folder.
./dataset/
├── ...
└── SemanticKitti/
    └── sequences/
        ├── 00/
        │   ├── velodyne/
        │   │   ├── 000000.bin
        │   │   ├── 000001.bin
        │   │   └── ...
        │   ├── labels/
        │   │   ├── 000000.label
        │   │   ├── 000001.label
        │   │   └── ...
        │   ├── image_2/
        │   │   ├── 000000.png
        │   │   ├── 000001.png
        │   │   └── ...
        │   └── calib.txt
        ├── 08/ # for validation
        ├── 11/ # 11-21 for testing
        └── 21/
            └── ...
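A quick sanity check of one extracted sequence can catch a missing color download early. The following is a minimal sketch, not part of the repository: `check_sequence` is a hypothetical helper that assumes the folder and file names shown in the layout above and only compares file stems. It skips the label check when `labels/` is absent, since the test sequences (11-21) are distributed without labels.

```python
from pathlib import Path


def _stems(folder, pattern):
    """Return the file stems matching `pattern`, or an empty set if the folder is missing."""
    return {p.stem for p in folder.glob(pattern)} if folder.is_dir() else set()


def check_sequence(seq_dir):
    """Return a list of problems found in one SemanticKITTI sequence folder."""
    seq_dir = Path(seq_dir)
    problems = []
    if not (seq_dir / "calib.txt").is_file():
        problems.append("missing calib.txt")
    scans = _stems(seq_dir / "velodyne", "*.bin")
    images = _stems(seq_dir / "image_2", "*.png")
    if not scans:
        problems.append("no scans in velodyne/")
    if scans - images:
        problems.append(f"{len(scans - images)} scan(s) without an image in image_2/")
    labels_dir = seq_dir / "labels"
    # Test sequences ship without labels, so only compare when the folder exists.
    if labels_dir.is_dir():
        labels = _stems(labels_dir, "*.label")
        if scans - labels:
            problems.append(f"{len(scans - labels)} scan(s) without a label in labels/")
    return problems
```

For example, `check_sequence("./dataset/SemanticKitti/sequences/00")` returns an empty list when every scan has a matching image and label and `calib.txt` is present.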
Please download the Full dataset (v1.0) from the NuScenes website with lidarseg and extract it.
./dataset/
├── ...
└── nuscenes/
    ├── v1.0-trainval
    ├── v1.0-test
    ├── samples
    ├── sweeps
    ├── maps
    └── lidarseg
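To confirm that the nuScenes archive and the lidarseg package were extracted into the same root, the listed folders can be checked directly. This is a small sketch under the assumption that the layout above is exactly what the data loader expects; `missing_nuscenes_dirs` is a hypothetical helper, not a function from this repository or from the nuScenes devkit.

```python
from pathlib import Path

# Folder names taken from the layout shown above.
EXPECTED_NUSCENES_DIRS = [
    "v1.0-trainval",
    "v1.0-test",
    "samples",
    "sweeps",
    "maps",
    "lidarseg",
]


def missing_nuscenes_dirs(root):
    """Return the expected sub-folders that are not present under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_NUSCENES_DIRS if not (root / name).is_dir()]
```

An empty result from `missing_nuscenes_dirs("./dataset/nuscenes")` means all six folders exist; it says nothing about whether their contents are complete.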
You can run the training with
cd <root dir of this repo>
python main.py --log_dir CMDFusion_semkitti --config config/CMDFusion-semantickitti-g.yaml --gpu 0 1
The output will be written to logs/SemanticKITTI/CMDFusion_semkitti by default.
After training the model on the SemanticKITTI dataset, we use additional instance-level augmentation from PolarMix to finetune the model for better performance. Note that, following 2DPASS, the validation set is also included in the training set during finetuning.
cd <root dir of this repo>
python main.py --log_dir CMDFusion_semkitti_ft --config config/CMDFusion-semantickitti-g-all-ft.yaml --gpu 0 1 --checkpoint /path_to_trained_checkpoint --fine_tune
For SemanticKITTI-O, uncomment Line 180 and comment out Line 181 in network/arch_cmd_fusion.py, and open Line 171 in network/base_model_distil.py. Then run the training with
cd <root dir of this repo>
python main.py --log_dir CMDFusion_semkitti_o --config config/CMDFusion-semantickitti-g.yaml --gpu 0 1
For NuScenes, uncomment Line 180 and comment out Line 181 in network/arch_cmd_fusion.py. Then run the training with
cd <root dir of this repo>
python main.py --log_dir CMDFusion_nusc --config config/CMDFusion-nuscenese-g-L.yaml --gpu 0 1
We use two A100 (80G) GPUs and train for 4 days.
You can run the validation with
cd <root dir of this repo>
python main.py --config config/CMDFusion-semantickitti-g.yaml --gpu 0 --test --checkpoint <dir for the pytorch checkpoint>
Note that for SemanticKITTI-O, open Line 171 in network/base_model_distil.py.
You can run the testing with
cd <root dir of this repo>
python main.py --config config/CMDFusion-semantickitti-g.yaml --gpu 0 --test --num_vote 12 --checkpoint <dir for the pytorch checkpoint> --submit_to_server
Here, num_vote is the number of views used for test-time augmentation (TTA). We set this value to 12 by default, following 2DPASS.
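To illustrate what voting over several views means, here is a minimal, framework-free sketch of test-time augmentation. It assumes each view is a rotation of the point cloud about the vertical axis and that per-point class scores are summed across views before taking the argmax; the augmentations and aggregation actually used in this repository may differ. `predict_scores` is a hypothetical stand-in for the network.

```python
import math


def rotate_z(points, angle):
    """Rotate (x, y, z) points about the vertical axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


def vote_predictions(points, predict_scores, num_vote=12):
    """Sum per-point class scores over `num_vote` rotated views, then take the argmax.

    `predict_scores` is a hypothetical callable standing in for the model: it
    takes a list of points and returns one list of class scores per point.
    """
    if num_vote < 1:
        raise ValueError("num_vote must be at least 1")
    totals = None
    for k in range(num_vote):
        # Assumption: views are evenly spaced rotations around the z axis.
        view = rotate_z(points, 2.0 * math.pi * k / num_vote)
        scores = predict_scores(view)
        if totals is None:
            totals = [list(row) for row in scores]
        else:
            for total_row, row in zip(totals, scores):
                for cls, value in enumerate(row):
                    total_row[cls] += value
    # Pick the class with the highest accumulated score for every point.
    return [max(range(len(row)), key=row.__getitem__) for row in totals]
```

With `num_vote=1` this reduces to a single forward pass; larger values trade inference time for predictions that are less sensitive to the orientation of the scan.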
You can download the models with the scores below from this Google Drive folder.
Model (validation) | Val (mIoU) | Test (mIoU) |
---|---|---|
CMDFusion-SemanticKITTI-O | 67.6% | - |
CMDFusion-SemanticKITTI | 66.2% | 68.6% |
CMDFusion-SemanticKITTI-polarmix | 86.4% | 71.6% |
CMDFusion-NuScenes | 77.3% | 80.8% |
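
For reference, mIoU in the table is the mean over classes of intersection-over-union. The sketch below shows the standard definition computed from a confusion matrix; it is not the evaluation code of this repository, and it assumes the ignored (unlabeled) class has already been removed from the matrix.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix (rows: ground truth, columns: prediction)."""
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # class c points predicted as another class
        fp = sum(confusion[r][c] for r in range(n)) - tp  # other points predicted as class c
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both ground truth and prediction
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0
```

For the two-class matrix `[[3, 1], [1, 5]]`, the per-class IoUs are 3/5 and 5/7, so the mean is their average.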
Code is built based on 2DPASS, SPVNAS, Cylinder3D, xMUDA and SPCONV.
See Apache-2.0 License.