This repository contains the official implementation of the ECCV 2024 paper *T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning* by Weijie Wei, Fatemeh Karimi Najadasl, Theo Gevers, and Martin R. Oswald.
- [2024/09/19] The code will be released soon.
- [2024/09/22] Released the evaluation code on the ONCE dataset.
- [2024/09/25] Released the training code on the ONCE dataset, as well as the pretrained and finetuned weights.
- Release ONCE evaluation code.
- Release ONCE training code.
- Release Waymo training code and inference code.
We tested this environment on NVIDIA A100 GPUs running Linux RHEL 8.
conda create -n t-mae python=3.8
conda activate t-mae
conda install -y pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
conda install -y -c fvcore -c iopath -c conda-forge fvcore iopath
pip install "git+https://github.com/facebookresearch/[email protected]"
pip install numpy==1.19.5 protobuf==3.19.4 scikit-image==0.19.2 spconv-cu113 numba scipy pyyaml easydict fire tqdm shapely matplotlib opencv-python addict pyquaternion awscli open3d pandas future pybind11 tensorboardX tensorboard Cython
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.11.0+cu113.html
pip install pycocotools
pip install SharedArray
pip install tensorflow-gpu==2.5.0
pip install protobuf==3.20
git clone https://github.com/codename1995/T-MAE
cd T-MAE && python setup.py develop --user
cd pcdet/ops/dcn && python setup.py develop --user
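As an optional sanity check (our suggestion, not part of the original instructions), the one-liner below verifies that PyTorch, spconv, and pcdet import correctly and that the CUDA build is visible:
# Optional: confirm that the core dependencies import and CUDA is available
python -c "import torch, spconv, pcdet; print(torch.__version__, torch.cuda.is_available())"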
Please follow the instructions of OpenPCDet to prepare the datasets. For the Waymo dataset, we use the evaluation toolkit to evaluate detection results; the `compute_detection_metrics_main` binary comes from the Waymo Open Dataset API (March 2023) and is compiled from its C++ source (see the usage sketch after the directory layout below). Organize the data as follows:
data
│── waymo
│ │── ImageSets/
│ │── raw_data
│ │ │── segment-xxxxxxxx.tfrecord
│ │ │── ...
│ │── waymo_processed_data
│ │ │── segment-xxxxxxxx/
│ │ │── ...
│ │── waymo_processed_data_gt_database_train_sampled_1/
│ │── waymo_processed_data_waymo_dbinfos_train_sampled_1.pkl
│ │── waymo_processed_data_infos_test.pkl
│ │── waymo_processed_data_infos_train.pkl
│ │── waymo_processed_data_infos_val.pkl
│ │── compute_detection_metrics_main
│ │── gt.bin
│── once
│ │── ImageSets/
│ │── data
│ │ │── 000000/
│ │ │── ...
│ │── gt_database/
│ │── once_dbinfos_train.pkl
│ │── once_infos_raw_large.pkl
│ │── once_infos_raw_medium.pkl
│ │── once_infos_raw_small.pkl
│ │── once_infos_train.pkl
│ │── once_infos_val.pkl
│── ckpts
│ │── once_tmae_weights.pth
│ │── ...
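For reference, here is a hedged sketch of how the C++ metrics binary is typically invoked; the two-argument interface (predictions first, ground truth second) follows the Waymo Open Dataset tools, and `detections.bin` is only a placeholder for your exported predictions, so check the scripts in this repository for the exact call.
# Sketch only: evaluate exported Waymo predictions with the C++ tool (paths follow the layout above)
cd data/waymo
./compute_detection_metrics_main detections.bin gt.bin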
# T-MAE pretraining and finetuning on ONCE
bash scripts/once_train.sh
# Load the provided pretrained model and finetune on ONCE
bash scripts/once_finetune_only.sh
# Test on ONCE
bash scripts/once_test.sh
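The scripts above wrap OpenPCDet-style entry points. Assuming this repository keeps OpenPCDet's `tools/test.py` interface and config layout (an assumption on our part; check the scripts above for the exact arguments), evaluating a provided checkpoint directly would look roughly like:
# Hypothetical direct call; the config name is a placeholder and the flags assume OpenPCDet's test.py
cd tools
python test.py --cfg_file cfgs/once_models/<your_config>.yaml --ckpt ../data/ckpts/once_tmae_weights.pth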
Reproduced results will be updated soon. We cannot provide the Waymo pretrained weights due to the Waymo Dataset License Agreement.
| Model | mAP | Vehicle | Pedestrian | Cyclist | Weights |
|---|---|---|---|---|---|
| T-MAE (Pretrained) | - | - | - | - | once_tmae_pretrained.pth |
| T-MAE (Finetuned) | 67.41 | 77.53 | 54.81 | 69.90 | once_tmae_weights.pth |
If you find this repository useful, please consider citing our paper.
@inproceedings{wei2024tmae,
title={T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning},
author={Weijie Wei and Fatemeh Karimi Najadasl and Theo Gevers and Martin R. Oswald},
booktitle={European Conference on Computer Vision (ECCV)},
year={2024}
}
This project is mainly built upon OpenPCDet and other open-source repositories. We would like to thank the authors for their great work.