This is the official implementation of the GASLT paper, "Gloss Attention for Gloss-free Sign Language Translation" (CVPR 2023).
git clone https://github.com/YinAoXiong/GASLT
cd GASLT
conda env create -f env.yaml
conda activate gaslt
For the RWTH-PHOENIX-Weather 2014 T dataset, we provide processed data for download.
The public link expires after a period of time. If it no longer works, please contact me by email at [email protected] to get a new access link.
We do not have permission to distribute the other datasets, so please process them yourself by following the steps below.
- For the RWTH-PHOENIX-Weather 2014 T dataset, download the visual features released by the TSPNet project and select the version with a window size of 8 and a stride of 2.
- For the CSL-Daily and SP-10 datasets, download the pre-trained I3D model weights and the feature extraction code from the WLASL project, then extract features with a sliding window of size 8 and a stride of 2.
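To make the window and stride settings concrete, the sketch below lists the frame ranges that a sliding window of size 8 and stride 2 would cover. The function name `sliding_windows` is hypothetical, and dropping a trailing incomplete window is an assumption; the TSPNet and WLASL extraction code may pad or handle the tail differently.

```python
from typing import List, Tuple


def sliding_windows(num_frames: int, window: int = 8, stride: int = 2) -> List[Tuple[int, int]]:
    """Return (start, end) frame index pairs, with `end` exclusive.

    Assumption: windows that would run past the last frame are dropped.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    # The last valid start index is num_frames - window.
    return [(start, start + window) for start in range(0, num_frames - window + 1, stride)]


if __name__ == "__main__":
    # A 14-frame clip yields four overlapping 8-frame windows.
    for start, end in sliding_windows(14):
        print(f"frames [{start}, {end})")
```

Each window would then be passed through the I3D model to produce one feature vector, so a video with `num_frames` frames yields roughly `(num_frames - 8) // 2 + 1` feature vectors under this assumption.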
Package the visual features in the format used by the slt project: the Python list object is first serialized with pickle and then compressed with gzip.
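The packaging step can be sketched with the standard library alone. The helper names `save_samples` and `load_samples` are hypothetical, and the per-sample keys (`name`, `signer`, `gloss`, `text`, `sign`) are an assumption modeled on the slt project's data loader; check them against the loader you actually use. In practice `sign` would hold a feature tensor rather than the nested lists used here.

```python
import gzip
import os
import pickle
import tempfile
from typing import Any, Dict, List


def save_samples(samples: List[Dict[str, Any]], path: str) -> None:
    """Pickle the list of samples and write it through a gzip stream."""
    with gzip.open(path, "wb") as f:
        pickle.dump(samples, f)


def load_samples(path: str) -> List[Dict[str, Any]]:
    """Read back a gzip-compressed pickle written by save_samples."""
    with gzip.open(path, "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    # Illustrative sample; the keys and values are assumptions, not real data.
    samples = [
        {
            "name": "train/example_video",
            "signer": "Signer01",
            "gloss": "",
            "text": "ein beispielsatz",
            "sign": [[0.1, 0.2], [0.3, 0.4]],  # (num_windows, feature_dim)
        }
    ]
    with tempfile.TemporaryDirectory() as tmp:
        out_path = os.path.join(tmp, "example.train")
        save_samples(samples, out_path)
        print(load_samples(out_path)[0]["name"])
```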
We use the distiluse-base-multilingual-cased-v1 model from the Sentence-Transformers project to calculate the similarity between texts.
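As a rough illustration of the similarity computation, the sketch below embeds sentences with the model named above and builds a pairwise cosine similarity matrix. The helper names are hypothetical, `embed_sentences` requires the `sentence-transformers` package and downloads the model on first use, and the exact layout this project expects in `cos_sim.pkl` is not specified here, so treat the output format as an assumption.

```python
import math
from typing import List, Sequence


def cosine_similarity_matrix(embeddings: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pairwise cosine similarity between all embedding vectors."""
    norms = [math.sqrt(sum(x * x for x in vec)) for vec in embeddings]
    matrix = []
    for i, u in enumerate(embeddings):
        row = []
        for j, v in enumerate(embeddings):
            dot = sum(a * b for a, b in zip(u, v))
            denom = norms[i] * norms[j]
            # Zero vectors have no direction; report similarity 0.0 for them.
            row.append(dot / denom if denom > 0 else 0.0)
        matrix.append(row)
    return matrix


def embed_sentences(sentences: List[str]) -> List[List[float]]:
    """Embed sentences with Sentence-Transformers (imported lazily)."""
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("distiluse-base-multilingual-cased-v1")
    return model.encode(sentences).tolist()


if __name__ == "__main__":
    # Toy vectors stand in for real embeddings so the demo runs offline.
    toy = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
    for row in cosine_similarity_matrix(toy):
        print(row)
```

With the package installed, `cosine_similarity_matrix(embed_sentences(texts))` would give the sentence-level similarities for a list of target sentences `texts`.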
First, make sure that the project's data folder has the following structure:
data
└── pht
    ├── bpe
    │   ├── de.wiki.bpe.vs25000.d300.w2v.txt
    │   ├── de.wiki.bpe.vs25000.d300.w2v.txt.pt
    │   └── de.wiki.bpe.vs25000.model
    ├── data
    │   ├── phoenix14t.pami0.dev
    │   ├── phoenix14t.pami0.test
    │   └── phoenix14t.pami0.train
    └── sim
        ├── cos_sim.pkl
        └── name_to_video_id.json
...
Then run the following command to train the model.
python -m signjoey train configs/train_pht.yaml --gpu_id 0
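Before launching the training command, it can help to confirm that the data folder matches the layout listed above. The following is a minimal sketch, not part of the repository: the function name `missing_files` is hypothetical, and the file list simply mirrors the tree shown in this README.

```python
from pathlib import Path
from typing import List

# Paths relative to the data folder, copied from the tree in this README.
EXPECTED_FILES = [
    "pht/bpe/de.wiki.bpe.vs25000.d300.w2v.txt",
    "pht/bpe/de.wiki.bpe.vs25000.d300.w2v.txt.pt",
    "pht/bpe/de.wiki.bpe.vs25000.model",
    "pht/data/phoenix14t.pami0.dev",
    "pht/data/phoenix14t.pami0.test",
    "pht/data/phoenix14t.pami0.train",
    "pht/sim/cos_sim.pkl",
    "pht/sim/name_to_video_id.json",
]


def missing_files(data_root: str) -> List[str]:
    """Return the expected relative paths that are absent under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files("data")
    if missing:
        print("Missing files:")
        for rel in missing:
            print(f"  data/{rel}")
    else:
        print("Data folder layout looks complete.")
```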
Run the following command to test the model.
python -m signjoey test configs/test_pht.yaml --ckpt <path_to_ckpt> --output_path <path_to_output> --gpu_id 0
If you find this project useful, please cite our paper:
@inproceedings{yin2023gloss,
title={Gloss attention for gloss-free sign language translation},
author={Yin, Aoxiong and Zhong, Tianyun and Tang, Li and Jin, Weike and Jin, Tao and Zhao, Zhou},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={2551--2562},
year={2023}
}
Our code is based on the following repositories: