Speech2AffectiveGestures: Synthesizing Co-Speech Gestures with Generative Adversarial Affective Expression Learning
This is the README for the official code of the paper Speech2AffectiveGestures: Synthesizing Co-Speech Gestures with Generative Adversarial Affective Expression Learning. Please use the following citation if you find our work useful:
@inproceedings{bhattacharya2021speech2affectivegestures,
author = {Bhattacharya, Uttaran and Childs, Elizabeth and Rewkowski, Nicholas and Manocha, Dinesh},
title = {Speech2AffectiveGestures: Synthesizing Co-Speech Gestures with Generative Adversarial Affective Expression Learning},
year = {2021},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
booktitle = {Proceedings of the 29th ACM International Conference on Multimedia},
series = {MM '21}
}
Our scripts have been tested on Ubuntu 18.04 LTS with
- Python 3.7
- CUDA 10.2
- cuDNN 7.6.5
- PyTorch 1.5
- Clone this repository. We use $BASE to refer to the base directory for this project (the directory containing main_v2.py). Change the present working directory to $BASE.
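For example (the clone URL and directory name below are placeholders; substitute the actual ones for this repository):
git clone <repository-url> speech2affective-gestures
cd speech2affective-gestures  # this directory is now $BASE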
- [Optional but recommended] Create a conda environment for the project and activate it.
conda create -n s2ag-env python=3.7
conda activate s2ag-env
- Install espeak.
sudo apt-get update && sudo apt-get install espeak
- Install PyTorch following the official instructions.
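For reference, the command below should match the versions we tested with (PyTorch 1.5 built against CUDA 10.2); the exact command depends on your platform, so prefer the one generated on the official PyTorch installation page.
conda install pytorch==1.5.0 torchvision==0.6.0 cudatoolkit=10.2 -c pytorch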
- Install all other package requirements.
pip install -r requirements.txt
Note: You might need to manually uninstall and reinstall numpy for torch to work, and similarly matplotlib and kiwisolver for them to work.
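If you hit import errors from these packages, a clean reinstall usually resolves them, e.g.:
pip uninstall -y numpy && pip install numpy
pip uninstall -y matplotlib kiwisolver && pip install matplotlib kiwisolver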
- The TED Gestures dataset is available for download here, originally hosted at https://github.com/ai4r/Gesture-Generation-from-Trimodal-Context.
- The Trinity Gesture dataset is available for download on submitting an access request here.
Run the main_v2.py file with the appropriate command-line arguments.
python main_v2.py <args list>
The full list of arguments is available inside main_v2.py. For any argument not specified on the command line, the code uses the default value for that argument. On running main_v2.py, the code will train the network and generate sample gestures post-training.
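For example, assuming the script uses a standard argument parser, you can list all available arguments with
python main_v2.py --help
and start a training run with the default values for all arguments with
python main_v2.py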
We also provide a pretrained model for download. If using this model, save it inside the directory $BASE/models/ted_db (create the directory if it does not exist). Set the command-line argument --train-s2ag to False to skip training and use this model directly for evaluation. The generated samples are stored in the automatically created render directory.
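For example, evaluation with the pretrained model and default values for all other arguments would look like the following (assuming the flag takes a boolean-style value as described above):
python main_v2.py --train-s2ag False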
Additionally, we provide the pre-trained weights of the embedding network required to estimate the Fréchet Gesture Distance between the ground-truth and the synthesized gestures. If using these weights, store them in the directory $BASE/outputs.
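For reference, the Fréchet Gesture Distance compares the Gaussian statistics of embedded ground-truth and synthesized gestures, analogous to the Fréchet Inception Distance for images. The sketch below is not the implementation used in this repository; it is a generic illustration of the metric, assuming both gesture sets have already been mapped to embedding vectors with the provided network.
import numpy as np
from scipy import linalg

def frechet_distance(real_embeddings, generated_embeddings):
    # Both inputs have shape (num_samples, embedding_dim), e.g., the outputs of the
    # embedding network on the ground-truth and the synthesized gestures.
    mu_r = real_embeddings.mean(axis=0)
    mu_g = generated_embeddings.mean(axis=0)
    sigma_r = np.cov(real_embeddings, rowvar=False)
    sigma_g = np.cov(generated_embeddings, rowvar=False)
    diff = mu_r - mu_g
    # Matrix square root of the product of the covariances; numerical noise can
    # introduce a small imaginary component, which we discard.
    covmean, _ = linalg.sqrtm(sigma_r.dot(sigma_g), disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    return diff.dot(diff) + np.trace(sigma_r + sigma_g - 2.0 * covmean)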