This repo uses the WLASL dataset along with Google's MediaPipe repo to train a model that can convert speech to sign language.
You can do this part on a desktop, but for speed's sake I used an IBM Virtual Server with 8 vCPUs.
- Create Virtual Server
- Follow the instructions here to install bazel
- Install opencv:
$ sudo apt update
$ sudo apt-get install libopencv-core-dev libopencv-highgui-dev \
libopencv-calib3d-dev libopencv-features2d-dev \
libopencv-imgproc-dev libopencv-video-dev
- Install pip:
sudo apt-get install python3-pip
- Install opencv-python:
pip3 install opencv-python
- Install youtube-dl
- Install ffmpeg:
sudo apt install ffmpeg
- Depending on your system, make sure python symlinks to python3 (e.g. on Ubuntu):
sudo ln -s /usr/bin/python3 /usr/local/bin/python
- Clone the repo:
git clone
- Run the download script. This will download and preprocess the videos using the WLASL repo:
$ cd speach-to-sign-language/
$ bash scripts/
- Install tensorflow:
pip3 install tensorflow
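
Before running the download script, it can help to confirm that the command-line tools from the steps above are actually on your PATH. The snippet below is a minimal sketch and is not part of this repo: `check_tools` is a hypothetical helper name, and the tool list assumes each program is installed under its default executable name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repo): print whether each named
# tool is on PATH and return the number of tools that were not found.
check_tools() {
  local missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
      missing=$((missing + 1))
    fi
  done
  return "$missing"
}

# Assumed executable names for the tools installed in the steps above.
check_tools bazel python python3 pip3 youtube-dl ffmpeg \
  || echo "Install the missing tools before running the download script."
```

This only checks executables. It does not verify that the Python packages (opencv-python, tensorflow) import correctly, so a failed import can still show up later.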