This is the official implementation of our paper SiT-MLP: A Simple MLP with Point-wise Topology Feature Learning for Skeleton-based Action Recognition
Note: Our approach is MLP-based and GCN-free. The graph folder is only used to derive the different input modalities.
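For reference, a common way these skeleton graphs are used to derive the bone modality is to difference each joint with its parent. The sketch below is a minimal illustration under that assumption; the toy (child, parent) pairs are made up and do not match the skeleton definitions shipped in the graph folder.

```python
import numpy as np

# Toy (child, parent) pairs for illustration only; the real pairs come from
# the skeleton definitions in the graph folder.
TOY_PAIRS = [(1, 0), (2, 1), (3, 2)]

def joints_to_bones(joints):
    """Derive the bone modality from joint coordinates.

    joints: (T, V, 3) array of 3D joint positions per frame.
    Returns bones of the same shape: each bone is the vector from a joint's
    parent to the joint; joints without a parent stay zero.
    """
    bones = np.zeros_like(joints)
    for child, parent in TOY_PAIRS:
        bones[:, child] = joints[:, child] - joints[:, parent]
    return bones
```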
Graph convolution networks (GCNs) have achieved remarkable performance in skeleton-based action recognition. However, previous GCN-based methods rely excessively on elaborate human priors and construct complex feature aggregation mechanisms, which limits the generalizability and effectiveness of the networks. To solve these problems, we propose a novel Spatial Topology Gating Unit (STGU), an MLP-based variant without extra priors, to capture the co-occurrence topology features that encode the spatial dependency across all joints. In STGU, to learn the point-wise topology features, a new gate-based feature interaction mechanism is introduced to activate the features point-to-point with the attention map generated from the input sample. Based on the STGU, we propose SiT-MLP, the first MLP-based model for skeleton-based action recognition. Compared with previous methods on three large-scale datasets, SiT-MLP achieves competitive performance while reducing the number of parameters significantly.
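To make the gate-based interaction concrete, here is a minimal PyTorch sketch of the general idea: features are mixed along the joint axis with a plain linear layer (no predefined graph), and a sample-dependent map gates them point-to-point. This is an illustrative sketch only, not the exact STGU from the paper; the layer shapes and the sigmoid gate are assumptions.

```python
import torch
import torch.nn as nn

class SpatialGatingSketch(nn.Module):
    """Illustrative gate-based spatial interaction (not the exact STGU).

    x: (N, C, T, V) -- batch, channels, frames, joints.
    """
    def __init__(self, channels, num_joints):
        super().__init__()
        self.mix = nn.Linear(num_joints, num_joints)  # joint-axis MLP, no graph prior
        self.gate = nn.Conv2d(channels, channels, 1)  # per-point map from the sample

    def forward(self, x):
        mixed = self.mix(x)                  # mix features across all joints
        attn = torch.sigmoid(self.gate(x))   # attention map generated from the input
        return mixed * attn                  # activate features point-to-point
```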
Model | Parameters | FLOPs | Accuracy (%) |
---|---|---|---|
2S-AGCN | 3.5M | 3.9G | 95.1 |
CTR-GCN | 1.4M | 1.8G | 96.8 |
InfoGCN | 1.5M | 1.7G | 96.7 |
SiT-MLP | 0.6M | 0.7G | 96.8 |
Comparison of performance and parameter size on the X-sub benchmark of the NTU RGB+D 60 dataset.
You can install all dependencies by running `pip install -r requirements.txt`.
Then, install torchlight by running `pip install -e torchlight`.
- NTU RGB+D 60 Skeleton
- NTU RGB+D 120 Skeleton
- NW-UCLA
- Request dataset here: https://rose1.ntu.edu.sg/dataset/actionRecognition
- Download the skeleton-only datasets:
  - `nturgbd_skeletons_s001_to_s017.zip` (NTU RGB+D 60)
  - `nturgbd_skeletons_s018_to_s032.zip` (NTU RGB+D 120)
- Extract the above files to `./data/nturgbd_raw`
- Download the NW-UCLA dataset from here
- Move `all_sqe` to `./data/NW-UCLA`
Put the downloaded data into the following directory structure:
- data/
  - NW-UCLA/
    - all_sqe
      ... # raw data of NW-UCLA
  - ntu/
  - ntu120/
  - nturgbd_raw/
    - nturgb+d_skeletons/ # from `nturgbd_skeletons_s001_to_s017.zip`
      ...
    - nturgb+d_skeletons120/ # from `nturgbd_skeletons_s018_to_s032.zip`
      ...
- Generate NTU RGB+D 60 or NTU RGB+D 120 dataset:
cd ./data/ntu # or cd ./data/ntu120
# Get skeleton of each performer
python get_raw_skes_data.py
# Remove the bad skeleton
python get_raw_denoised_data.py
# Transform the skeleton to the center of the first frame
python seq_transformation.py
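As a rough illustration of the final step, `seq_transformation.py` centers each sequence on its first frame. Below is a minimal sketch of that centering idea; the choice of body center (mean of the first frame's joints) is an assumption for illustration, not read from the script.

```python
import numpy as np

def center_on_first_frame(skes):
    """Translate a skeleton sequence to the center of its first frame.

    skes: (T, V, 3) array of 3D joint coordinates over T frames and V joints.
    The body center here is the mean of the first frame's joints -- an
    assumption; the actual script may use a specific joint (e.g., the spine).
    """
    center = skes[0].mean(axis=0)  # (3,) body center in the first frame
    return skes - center           # shift every frame by the same offset
```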
- To train the model on NTU RGB+D 60/120:
# Example: training SiT-MLP on NTU RGB+D cross subject joint modality
CUDA_VISIBLE_DEVICES=0,1 python main.py --config config/nturgbd-cross-subject/mlp_joint.yaml
# Example: training SiT-MLP on NTU RGB+D cross subject bone modality
CUDA_VISIBLE_DEVICES=0,1 python main.py --config config/nturgbd-cross-subject/mlp_bone.yaml
# Example: training SiT-MLP on NTU RGB+D 120 cross subject joint modality
CUDA_VISIBLE_DEVICES=0,1 python main.py --config config/nturgbd120-cross-subject/mlp_joint.yaml
# Example: training SiT-MLP on NTU RGB+D 120 cross subject bone modality
CUDA_VISIBLE_DEVICES=0,1 python main.py --config config/nturgbd120-cross-subject/mlp_bone.yaml
- To train the model on NW-UCLA:
CUDA_VISIBLE_DEVICES=0,1 python main.py --config config/ucla/mlp_joint.yaml
- To test the trained models saved in <work_dir>, run the following command:
python main.py --config <work_dir>/config.yaml --work-dir <work_dir> --phase test --save-score True --weights <work_dir>/xxx.pt
- To ensemble the results of different modalities, run
# Example: ensemble four modalities of SiT-MLP on NTU RGB+D 120 cross subject
python ensemble.py --datasets ntu120/xsub --position_ckpts work_dir/ntu120/xsub/mlp/joint work_dir/ntu120/xsub/mlp/bone --motion_ckpts work_dir/ntu120/xsub/mlp/joint_vel work_dir/ntu120/xsub/mlp/bone_vel
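For intuition, ensembling here is score-level fusion: the per-sample class scores saved by each modality's test run (via `--save-score True`) are weighted, summed, and argmaxed. The sketch below illustrates that idea; the pickle layout and the weighting scheme are assumptions, not read from `ensemble.py`.

```python
import pickle
import numpy as np

def fuse_scores(score_paths, weights, labels):
    """Score-level ensemble over modalities (illustrative, not ensemble.py).

    score_paths: pickles of per-sample class scores saved during testing.
    Assumes each pickle maps sample id -> score vector, in a common order.
    """
    fused = None
    for path, w in zip(score_paths, weights):
        with open(path, 'rb') as f:
            scores = np.array(list(pickle.load(f).values()))  # (N, num_classes)
        fused = w * scores if fused is None else fused + w * scores
    preds = fused.argmax(axis=1)
    return (preds == np.asarray(labels)).mean()  # top-1 accuracy
```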
This repo is based on CTR-GCN. The data processing is borrowed from SGN and HCN. The code for generating the different modalities is adapted from InfoGCN.
Please cite this work if you find it useful:
@article{zhang2023sitmlp,
author={Zhang, Shaojie and Yin, Jianqin and Dang, Yonghao and Fu, Jiajun},
journal={IEEE Transactions on Circuits and Systems for Video Technology},
title={SiT-MLP: A Simple MLP with Point-wise Topology Feature Learning for Skeleton-based Action Recognition},
year={2024},
doi={10.1109/TCSVT.2024.3386553}
}