PyTorch implementation of "Modeling the Relative Visual Tempo for Self-supervised Skeleton-based Action Recognition" (ICCV 2023).
- We use the NTU RGB+D and NTU RGB+D 120 datasets. Generate and preprocess the data as follows; a quick sanity check of the output is sketched after the commands.
```bash
# generate raw database for NTU RGB+D
python tools/ntu_gendata.py --data_path <path to nturgbd+d_skeletons>
# preprocess the above data for our method
python feeder/preprocess_ntu.py
```
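After preprocessing, it is worth verifying that the data loads and the labels line up. A minimal sketch is below; the file paths and the (N, C, T, V, M) array layout follow the common ST-GCN convention and are assumptions, not guarantees about this repo's exact output.

```python
import pickle
import numpy as np

# Paths are hypothetical; adjust to wherever the preprocessing step writes.
data = np.load('data/ntu60/xsub/train_position.npy')
with open('data/ntu60/xsub/train_label.pkl', 'rb') as f:
    sample_names, labels = pickle.load(f)

# Expected layout (assumed): N samples, C=3 coordinates, T frames,
# V=25 joints, M=2 bodies.
print(data.shape, len(labels))
assert data.shape[0] == len(labels)
```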
- Example of unsupervised pretraining for RVTCLR. You can adjust the .yaml files in the config/ntu60/pretext folder; the bone and motion inputs are derived from the joint stream, as sketched after the commands.
```bash
# train on NTU RGB+D xsub joint stream
python main.py pretrain_skeletonclr --config config/ntu60/pretext/pretext_skeletonclr.yaml
# train on NTU RGB+D xsub bone stream
python main.py pretrain_skeletonclr --config config/ntu60/pretext/pretext_skeletonclr_bone.yaml
# train on NTU RGB+D xsub motion stream
python main.py pretrain_skeletonclr --config config/ntu60/pretext/pretext_skeletonclr_motion.yaml
```
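The bone and motion streams are standard skeleton modalities computed from the joint coordinates: bones are vectors from each joint to its parent, and motion is the frame-to-frame difference. A minimal sketch of these transforms follows; the bone-pair list and function names are illustrative of the common convention, not this repo's exact code.

```python
import numpy as np

# (child, parent) pairs for the 25 NTU RGB+D joints (0-indexed), following
# the commonly used bone definition; treat this list as an assumption.
NTU_PAIRS = [(0, 1), (1, 20), (2, 20), (3, 2), (4, 20), (5, 4), (6, 5),
             (7, 6), (8, 20), (9, 8), (10, 9), (11, 10), (12, 0), (13, 12),
             (14, 13), (15, 14), (16, 0), (17, 16), (18, 17), (19, 18),
             (21, 22), (22, 7), (23, 24), (24, 11)]

def to_bone(joint):
    """Bone stream: vector from each joint to its parent. joint: (N, C, T, V, M)."""
    bone = np.zeros_like(joint)
    for v, parent in NTU_PAIRS:
        bone[:, :, :, v, :] = joint[:, :, :, v, :] - joint[:, :, :, parent, :]
    return bone

def to_motion(x):
    """Motion stream: temporal difference between consecutive frames."""
    motion = np.zeros_like(x)
    motion[:, :, :-1] = x[:, :, 1:] - x[:, :, :-1]
    return motion
```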
- Example of linear evaluation for RVTCLR. You can adjust the .yaml files in the config/ntu60/linear_eval folder; the evaluation protocol is sketched after the commands.
```bash
# linear evaluation on NTU RGB+D xsub joint stream
python main.py linear_evaluation --config config/ntu60/linear_eval/linear_eval_skeleton.yaml
# linear evaluation on NTU RGB+D xsub bone stream
python main.py linear_evaluation --config config/ntu60/linear_eval/linear_eval_skeleton_bone.yaml
# linear evaluation on NTU RGB+D xsub motion stream
python main.py linear_evaluation --config config/ntu60/linear_eval/linear_eval_skeleton_motion.yaml
```
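Linear evaluation freezes the pretrained encoder and trains only a linear classifier on the frozen features. A minimal PyTorch sketch of that protocol is below; `encoder`, `train_loader`, `feat_dim`, and the optimizer settings are placeholders, as the actual values live in the linear_eval .yaml configs.

```python
import torch
import torch.nn as nn

def linear_eval(encoder, train_loader, feat_dim=256, num_classes=60, epochs=90):
    """Freeze the pretrained encoder and fit a single linear layer on top.
    All hyperparameters here are placeholders, not this repo's settings."""
    encoder.eval()
    for p in encoder.parameters():
        p.requires_grad = False  # encoder stays frozen throughout

    classifier = nn.Linear(feat_dim, num_classes)
    optimizer = torch.optim.SGD(classifier.parameters(), lr=0.1, momentum=0.9)
    criterion = nn.CrossEntropyLoss()

    for _ in range(epochs):
        for x, y in train_loader:
            with torch.no_grad():
                feat = encoder(x)  # (batch, feat_dim) frozen representations
            loss = criterion(classifier(feat), y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return classifier
```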
This repo is based on
Please cite this work if you find it useful:
```bibtex
@InProceedings{Zhu_2023_ICCV,
    author    = {Zhu, Yisheng and Han, Hu and Yu, Zhengtao and Liu, Guangcan},
    title     = {Modeling the Relative Visual Tempo for Self-supervised Skeleton-based Action Recognition},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {13913-13922}
}
```