Official code repository for Frequency-Adaptive Low-Latency Object Detection Using Events and Frames.
💌 Some of the data is still being uploaded and will be available by December 15th. If you encounter any problems reproducing our code, please don't hesitate to raise an issue and we will resolve it as soon as possible.
## ⭐ Advantages of this repository for object detection using both Events and Frames

- We follow the data format of RVT, so all datasets are easier to handle, smaller, and faster to read and write. We appreciate the excellent work of @magehrig and the RPG. If you are familiar with RVT, it will be easy to follow this project.
- Our model is lightweight, small, and fast, and can be trained end-to-end on a single GPU with 24 GB of memory.
- We do not perform any additional post-processing (except NMS) during training and testing, and we use all categories of the dataset during training to ensure fair evaluation.
- We provide all the data files, including the files before and after frame building, as well as the pre-trained models. You can flexibly adjust them and add your own design (a quick way to inspect the provided files is sketched below).
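If you want a quick look at the provided data files, a minimal sketch such as the following can help. It assumes the RVT-style HDF5 layout; the file path and name (`events.h5`) are placeholders, so adapt them to the files you actually downloaded.

```bash
# Hypothetical file path; replace with one of the provided HDF5 files.
# hdf5plugin must be imported so compressed datasets can be decoded.
python -c "
import h5py, hdf5plugin
with h5py.File('PKU-H5-Process/freq_1_1/test/001_test_low_light/events.h5', 'r') as f:
    f.visititems(lambda name, obj: print(name, getattr(obj, 'shape', '')))
"
```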
## Installation

We recommend CUDA 11.8 to avoid unnecessary environment problems.

```bash
conda create -y -n faod python=3.11
conda activate faod
pip install torch==2.1.1 torchvision==0.16.1 torchdata==0.7.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
pip install wandb pandas plotly opencv-python tabulate pycocotools bbox-visualizer StrEnum hydra-core einops torchdata tqdm numba h5py hdf5plugin lovely-tensors tensorboardX pykeops scikit-learn ipdb timm opencv-python-headless pytorch_lightning==1.8.6 numpy==1.26.3
pip install openmim
mim install mmcv
```
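After installation, a quick sanity check like the one below (our suggestion, not part of the official setup) confirms that the CUDA build of PyTorch was picked up:

```bash
# Should print torch 2.1.1+cu118 and True if the CUDA wheels installed correctly.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```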
## Datasets

We provide datasets in a format similar to RVT for easy implementation.

Note that the following datasets are paired Event-RGB. Trying to evaluate **Event-RGB Mismatch** and **Train-Infer Mismatch**? Follow these instructions to create unpaired Event-RGB datasets.
| Download Links | PKU-DAVIS-SOD | DSEC-Detection |
| --- | --- | --- |

Pre-trained models and their performance:

| PKU-DAVIS-SOD (Time Shift) | PKU-DAVIS-SOD | DSEC-Detection |
| --- | --- | --- |
| mAP = 29.7 | mAP = 30.5 | mAP = 42.5 |
## Evaluation

Define `DATASET` (one of `['pku_fusion', 'dsec']`), `DATA_PATH`, `CHECKPOINT`, and `use_test_set` (`True` or `False`), and then run the following command:

```bash
python validation.py dataset={DATASET} dataset.path={DATA_PATH} checkpoint={CHECKPOINT} use_test_set={use_test_set} +experiment/{DATASET}='base.yaml'
```
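For instance, evaluating on PKU-DAVIS-SOD with a hypothetical data path and checkpoint location (replace both with your own) would look like:

```bash
# Placeholder paths; point them at your dataset root and downloaded checkpoint.
python validation.py dataset=pku_fusion dataset.path=/data/PKU-H5-Process checkpoint=./checkpoints/faod_pku.ckpt use_test_set=True +experiment/pku_fusion='base.yaml'
```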
Other settings such as `use_test_set`, `training.precision`, `batch_size.eval`, and `hardware.num_workers` can be conveniently set in `config/val.yaml` and `config/experiment/{DATASET}/default.yaml`.
## Training

Define `DATASET` and `DATA_PATH`, and then run the following command:

```bash
python train.py dataset={DATASET} dataset.path={DATA_PATH} +experiment/{DATASET}='base.yaml'
```
Other settings such as `training.precision`, `batch_size.train`, and `hardware.num_workers` can be conveniently set in `config/train.yaml` and `config/experiment/{DATASET}/default.yaml`, or overridden on the command line as sketched below.
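Since the configuration is managed with Hydra, these settings should also be overridable directly on the command line in the usual `key=value` style. A sketch with hypothetical values (the keys are the ones named above; the data path is a placeholder):

```bash
# Hypothetical override values; adjust to your hardware and dataset location.
python train.py dataset=pku_fusion dataset.path=/data/PKU-H5-Process batch_size.train=8 hardware.num_workers=4 training.precision=16 +experiment/pku_fusion='base.yaml'
```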
Training FAOD with/without Time Shift? Follow this instruction.
## Visualization

The relevant content is in `demo.py`. You need to set `mode` (one of `['pre', 'gt']`) and `show_mode` (one of `['event', 'rgb', 'mixed']`), and indicate the sequence you want to visualize, e.g., `PKU-H5-Process/freq_1_1/test/001_test_low_light`. Then run:

```bash
python demo.py dataset={DATASET} dataset.path={DATA_PATH} checkpoint={CHECKPOINT} +experiment/{DATASET}='base.yaml'
```

The results will be saved in `./gt` or `./predictions`. You can also adjust the destination path yourself.
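As a concrete example, visualizing the sequence mentioned above on PKU-DAVIS-SOD with hypothetical data and checkpoint paths (replace with your own) might look like:

```bash
# Assumes mode, show_mode, and the target sequence have been set inside demo.py
# as described above; the paths here are placeholders.
python demo.py dataset=pku_fusion dataset.path=/data/PKU-H5-Process checkpoint=./checkpoints/faod_pku.ckpt +experiment/pku_fusion='base.yaml'
```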
## Citation

Please cite our paper if you find it useful in your research:

```bibtex
@misc{FAOD_2024_zhang,
  title={Frequency-Adaptive Low-Latency Object Detection Using Events and Frames},
  author={Haitian Zhang and Xiangyuan Wang and Chang Xu and Xinya Wang and Fang Xu and Huai Yu and Lei Yu and Wen Yang},
  year={2024},
  eprint={2412.04149},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2412.04149},
}
```