A Multi-Modal Feature Fusion Network for 3D Object Detection
- Linux (tested on Ubuntu 22.04)
- Python 3.8
- PyTorch 1.10 with CUDA 11.3
To deploy this project, run:
git clone https://github.com/faziii0/LumiNet
conda create -n liard python=3.8
conda activate liard
conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge
conda install -c conda-forge cudatoolkit-dev
pip install -r requirements.txt
sh build_and_install.sh
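After installation, a quick sanity check (a minimal sketch, not part of the repository) can confirm that PyTorch sees the GPU and the CUDA 11.3 toolkit:

```python
# sanity_check.py -- hypothetical helper, not included in this repo
import torch

print(torch.__version__)          # expected: 1.10.0
print(torch.version.cuda)         # expected: 11.3
print(torch.cuda.is_available())  # should be True on a CUDA-capable machine
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```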
We use the MiDaS pretrained model to convert image_2 into depth images (or download the precomputed depth maps from the provided Google Drive link). You can clone the MiDaS repo and run this command:
python run.py --model_type dpt_beit_large_512 --input_path image_2 --output_path depth
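After MiDaS finishes, it is worth verifying that every image in image_2 received a depth map. The snippet below is an illustrative check only; the paths and the assumption that MiDaS names each output after its input stem are mine, not from this repo:

```python
# check_depth_coverage.py -- illustrative; adjust paths to your layout
from pathlib import Path

image_dir = Path("data/KITTI/object/training/image_2")
depth_dir = Path("data/KITTI/object/training/depth")

missing = [img.stem for img in sorted(image_dir.glob("*.png"))
           if not any(depth_dir.glob(img.stem + ".*"))]

print(f"{len(missing)} image(s) without a matching depth map")
if missing:
    print("examples:", missing[:5])
```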
Please download the official KITTI 3D object detection dataset and the train mask from EPNet++, then organize the files as follows:
LiARD
├── data
│   ├── KITTI
│   │   ├── ImageSets
│   │   ├── object
│   │   │   ├── training
│   │   │   │   ├── calib & velodyne & label_2 & image_2 & depth & train_mask
│   │   │   ├── testing
│   │   │   │   ├── calib & velodyne & image_2 & depth
├── lib
├── pointdep_lirad
├── tools
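Before training, a short script can confirm that the directories above are in place (a minimal sketch; the folder names come from the tree above, but the script itself is not part of the repo):

```python
# verify_kitti_layout.py -- illustrative layout check
import os

ROOT = "data/KITTI/object"
REQUIRED = {
    "training": ["calib", "velodyne", "label_2", "image_2", "depth", "train_mask"],
    "testing":  ["calib", "velodyne", "image_2", "depth"],
}

for split, subdirs in REQUIRED.items():
    for sub in subdirs:
        path = os.path.join(ROOT, split, sub)
        print(f"{path}: {'ok' if os.path.isdir(path) else 'MISSING'}")
```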
| Objects    | Easy   | Moderate | Hard   |
|------------|--------|----------|--------|
| Car        | 91.67% | 83.32%   | 78.29% |
| Pedestrian | 0.00%  | 0.00%    | 0.00%  |
| Cyclist    | 0.00%  | 0.00%    | 0.00%  |
3D predicted labels are available from the Google Drive link above.
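The predicted labels follow the standard KITTI detection format (class, truncation, occlusion, alpha, 2D box, 3D dimensions, 3D location, rotation_y, and an optional score). A minimal reader sketch, assuming one .txt file per frame; the function name is illustrative, not from this repo:

```python
# read_kitti_predictions.py -- minimal KITTI-format label reader (illustrative)
def read_predictions(label_file):
    """Return one dict per detected object in a KITTI-format .txt file."""
    objects = []
    with open(label_file) as f:
        for line in f:
            parts = line.split()
            if not parts:
                continue
            objects.append({
                "type": parts[0],
                "bbox_2d": [float(v) for v in parts[4:8]],      # left, top, right, bottom (px)
                "dimensions": [float(v) for v in parts[8:11]],  # height, width, length (m)
                "location": [float(v) for v in parts[11:14]],   # x, y, z in camera coords (m)
                "rotation_y": float(parts[14]),
                "score": float(parts[15]) if len(parts) > 15 else None,
            })
    return objects

# usage: detections = read_predictions("000001.txt")
```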
Thanks to all the contributors and authors of the projects PointRCNN, EPNet++, EPNet, and MiDaS.