Cheng-Che Cheng1 Min-Xuan Qiu1 Chen-Kuo Chiang2 Shang-Hong Lai1
1National Tsing Hua University, Taiwan 2National Chung Cheng University, Taiwan
- 2023.8 Code release
- 2023.7 Our paper is accepted to ICCV 2023!
ReST is a novel reconfigurable graph model that first associates all detected objects across cameras spatially, then reconfigures the resulting spatial graph into a temporal graph for Temporal Association. This two-stage association approach enables us to extract robust spatial- and temporal-aware features and to address the problem of fragmented tracklets. Furthermore, our model is designed for online tracking, making it suitable for real-world applications. Experimental results show that the proposed graph model extracts more discriminative features for object tracking, and our model achieves state-of-the-art performance on several public datasets.
- Clone the project and create a virtual environment:

  ```bash
  git clone https://github.com/chengche6230/ReST.git
  conda create --name ReST python=3.8
  conda activate ReST
  ```
- Install the following (follow each project's instructions):
  - torchreid
  - DGL (also check the PyTorch/CUDA compatibility table below)
  - warmup_scheduler
  - py-motmetrics

  Reference commands:

  ```bash
  # torchreid
  git clone https://github.com/KaiyangZhou/deep-person-reid.git
  cd deep-person-reid/
  pip install -r requirements.txt
  conda install pytorch torchvision cudatoolkit=9.0 -c pytorch
  python setup.py develop

  # other packages (in /ReST)
  conda install -c dglteam/label/cu117 dgl
  pip install git+https://github.com/ildoonet/pytorch-gradual-warmup-lr.git
  pip install motmetrics
  ```
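  After installing, you can sanity-check the environment with a short snippet. This is a minimal sketch, not part of the repo; it only confirms that the packages installed above import correctly and that CUDA is visible:

  ```python
  # Quick environment sanity check (a minimal sketch, not part of the repo).
  import torch
  import dgl
  import torchreid  # provided by deep-person-reid (installed via `python setup.py develop`)

  print("PyTorch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
  print("DGL:", dgl.__version__)
  print("torchreid:", torchreid.__version__)
  ```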
- Install the remaining requirements:

  ```bash
  pip install -r requirements.txt
  ```
- Download the pre-trained ReID model.
- Place datasets in `./datasets/` as follows:

  ```
  ./datasets/
  ├── CAMPUS/
  │   ├── Garden1/
  │   │   └── view-{}.txt
  │   ├── Garden2/
  │   │   └── view-HC{}.txt
  │   ├── Parkinglot/
  │   │   └── view-GL{}.txt
  │   └── metainfo.json
  ├── PETS09/
  │   ├── S2L1/
  │   │   └── View_00{}.txt
  │   └── metainfo.json
  ├── Wildtrack/
  │   ├── sequence1/
  │   │   └── src/
  │   │       ├── annotations_positions/
  │   │       └── Image_subsets/
  │   └── metainfo.json
  └── {DATASET_NAME}/             # for customized dataset
      ├── {SEQUENCE_NAME}/
      │   └── {ANNOTATION_FILE}.txt
      └── metainfo.json
  ```
- Prepare all `metainfo.json` files (e.g. frames, file pattern, homography); a hypothetical example follows this list.
- Run for each dataset:

  ```bash
  python ./src/datasets/preprocess.py --dataset {DATASET_NAME}
  ```

- Check `./datasets/{DATASET_NAME}/{SEQUENCE_NAME}/output` if there is anything missing:

  ```
  /output/
  ├── gt_MOT/                # for motmetrics
  │   └── c{CAM}.txt
  ├── gt_train.json
  ├── gt_eval.json
  ├── gt_test.json
  └── {DETECTOR}_test.json   # if you want to use another detector, e.g. yolox_test.json
  ```
- Prepare all image frames as `{FRAME}_{CAM}.jpg` in `/output/frames` (see the renaming sketch after this list).
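As promised above, here is a hypothetical `metainfo.json` writer. Every field name below is an illustrative assumption based only on the hints given (frames, file pattern, homography); consult the `metainfo.json` files of the provided datasets for the actual schema.

```python
# Hypothetical metainfo.json writer -- every key is an illustrative guess,
# NOT the repo's actual schema; check a shipped metainfo.json for the real one.
import json

metainfo = {
    "frames": 2000,                 # total frames per camera (assumed key)
    "file_pattern": "view-{}.txt",  # annotation file pattern (assumed key)
    "homography": [                 # per-camera 3x3 image-to-ground matrices (assumed key)
        [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    ],
}

with open("./datasets/CAMPUS/metainfo.json", "w") as f:
    json.dump(metainfo, f, indent=2)
```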
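How you produce the `{FRAME}_{CAM}.jpg` frames depends on how your dataset ships its images. Below is a minimal stdlib sketch assuming one folder of sequentially ordered JPEGs per camera under a hypothetical `raw/` directory; that source layout is an assumption about your data, not something the repo prescribes.

```python
# Minimal sketch: materialize frames as {FRAME}_{CAM}.jpg under output/frames.
# The source layout (one folder of JPEGs per camera, under "raw/") is assumed.
import shutil
from pathlib import Path

seq_dir = Path("./datasets/CAMPUS/Garden1")   # example sequence
out_dir = seq_dir / "output" / "frames"
out_dir.mkdir(parents=True, exist_ok=True)

for cam_id, cam_dir in enumerate(sorted((seq_dir / "raw").iterdir())):
    for frame_id, img in enumerate(sorted(cam_dir.glob("*.jpg"))):
        shutil.copy(img, out_dir / f"{frame_id}_{cam_id}.jpg")
```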
Download the trained weights if you need them, and modify `TEST.CKPT_FILE_SG` & `TEST.CKPT_FILE_TG` in `./configs/{DATASET_NAME}.yml` (a scripted config edit is sketched after the table).
| Dataset | Spatial Graph | Temporal Graph |
| --- | --- | --- |
| Wildtrack | sequence1 | sequence1 |
| CAMPUS | Garden1<br>Garden2<br>Parkinglot | Garden1<br>Garden2<br>Parkinglot |
| PETS-09 | S2L1 | S2L1 |
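If you would rather script the checkpoint paths than edit the YAML by hand, here is a minimal sketch. It assumes PyYAML is installed and that the dotted key names map to nested YAML sections (e.g. `TEST.CKPT_FILE_SG` under a top-level `TEST` block); the weight file paths are placeholders, not actual release file names.

```python
# Minimal sketch: point TEST.CKPT_FILE_SG / TEST.CKPT_FILE_TG at downloaded weights.
# Assumes PyYAML and that dotted key names map to nested YAML sections;
# the checkpoint paths below are placeholders.
import yaml

cfg_path = "./configs/Wildtrack.yml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

cfg.setdefault("TEST", {})
cfg["TEST"]["CKPT_FILE_SG"] = "./weights/Wildtrack_SG.pth"  # placeholder path
cfg["TEST"]["CKPT_FILE_TG"] = "./weights/Wildtrack_TG.pth"  # placeholder path

with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)
```

Note that `yaml.safe_dump` does not preserve comments or key order, so for a one-off change, editing the file by hand is simpler.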
To train the model, run:

```bash
python main.py --config_file ./configs/{DATASET_NAME}.yml
```

In `{DATASET_NAME}.yml`:

- Modify `MODEL.MODE` to `'train'`.
- Modify `SOLVER.TYPE` to select which graph (spatial or temporal) to train.
- Make sure all settings are suitable for your device, e.g. `DEVICE_ID`, `BATCH_SIZE`.
- For convenience, you can also append config attributes directly after the command, e.g.:

  ```bash
  python main.py --config_file ./configs/Wildtrack.yml MODEL.DEVICE_ID "('1')" SOLVER.TYPE "SG"
  ```
To evaluate the model, run:

```bash
python main.py --config_file ./configs/{DATASET_NAME}.yml
```

In `{DATASET_NAME}.yml`:

- Modify `MODEL.MODE` to `'test'`.
- Select the input detections you want and modify `MODEL.DETECTION` accordingly.
  - You need to prepare `{DETECTOR}_test.json` in `./datasets/{DATASET_NAME}/{SEQUENCE_NAME}/output/` on your own first.
- Make sure all settings in `TEST` are configured.
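The `gt_MOT/c{CAM}.txt` files produced by preprocessing are MOT-format ground truth for py-motmetrics. If you want to score per-camera tracker output yourself, a minimal sketch looks like this (the tracker-output path is a placeholder; adapt the ground-truth path to your dataset):

```python
# Minimal sketch: score one camera's tracker output against gt_MOT with py-motmetrics.
# The tracker-output path is a placeholder; both files are MOT-format text files.
import motmetrics as mm

gt = mm.io.loadtxt("./datasets/PETS09/S2L1/output/gt_MOT/c0.txt", fmt="mot15-2D")
ts = mm.io.loadtxt("./my_results/c0.txt", fmt="mot15-2D")  # placeholder path

acc = mm.utils.compare_to_groundtruth(gt, ts, "iou", distth=0.5)
mh = mm.metrics.create()
summary = mh.compute(acc, metrics=["idf1", "idp", "idr", "mota", "motp"], name="c0")
print(mm.io.render_summary(summary, formatters=mh.formatters,
                           namemap=mm.io.motchallenge_metric_names))
```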
If you find this code useful for your research, please cite our paper:

```bibtex
@InProceedings{Cheng_2023_ICCV,
    author    = {Cheng, Cheng-Che and Qiu, Min-Xuan and Chiang, Chen-Kuo and Lai, Shang-Hong},
    title     = {ReST: A Reconfigurable Spatial-Temporal Graph Model for Multi-Camera Multi-Object Tracking},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {10051-10060}
}
```