dsalaj/common-voice-tf

common-voice-tf

TensorFlow implementation of models trained for language classification on Mozilla Common Voice, using the tf.data.Dataset API.

TFRecord method

The offline_process.py script converts the dataset's .mp3 files into one .tfrecord file per language. Each .tfrecord file contains a sequence of records, each pairing the MFCC spectrogram of an audio clip with its label string.

This method saves a lot of computation, since mp3 decoding and pre-processing are done only once.
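
As a rough illustration of the idea, the sketch below writes (MFCC, label) pairs to a .tfrecord file and reads them back with tf.data.TFRecordDataset. It is not taken from offline_process.py: the feature keys ("mfcc", "shape", "label") and the helper names are assumptions made for this example, and the actual script may lay out its records differently.

```python
import numpy as np
import tensorflow as tf


def make_example(mfcc, label):
    """Pack one (MFCC matrix, label string) pair into a tf.train.Example.

    The keys "mfcc", "shape" and "label" are hypothetical names for this sketch.
    """
    mfcc = np.asarray(mfcc, dtype=np.float32)
    feature = {
        # Flatten the 2-D matrix; its shape is stored separately for decoding.
        "mfcc": tf.train.Feature(
            float_list=tf.train.FloatList(value=mfcc.ravel().tolist())),
        "shape": tf.train.Feature(
            int64_list=tf.train.Int64List(value=list(mfcc.shape))),
        "label": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[label.encode("utf-8")])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))


def write_tfrecord(path, pairs):
    """Serialize an iterable of (mfcc, label) pairs into one .tfrecord file."""
    with tf.io.TFRecordWriter(path) as writer:
        for mfcc, label in pairs:
            writer.write(make_example(mfcc, label).SerializeToString())


def parse_example(serialized):
    """Decode one serialized record back into (mfcc, label) tensors."""
    spec = {
        "mfcc": tf.io.VarLenFeature(tf.float32),   # clips differ in length
        "shape": tf.io.FixedLenFeature([2], tf.int64),
        "label": tf.io.FixedLenFeature([], tf.string),
    }
    parsed = tf.io.parse_single_example(serialized, spec)
    mfcc = tf.reshape(tf.sparse.to_dense(parsed["mfcc"]), parsed["shape"])
    return mfcc, parsed["label"]


def read_tfrecord(path):
    """Build a tf.data pipeline that yields (mfcc, label) pairs from a file."""
    return tf.data.TFRecordDataset(path).map(parse_example)
```

Because the expensive decoding happens before write_tfrecord is called, every later training run only pays for parsing the stored floats.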

Online method using tf.data.Dataset.list_files

dataset.py implements the dataset pipeline in which mp3 files are decoded and processed into features on demand. This is computationally very expensive. If you plan to do multiple training runs or hyper-parameter tuning, please use the TFRecord method above.
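
A minimal sketch of such an on-demand pipeline is shown below. It is not the code in dataset.py. It assumes a hypothetical directory layout of `<root>/<language>/<clip>.mp3`, takes the mp3 decoder as a caller-supplied `decode_fn` (TensorFlow's core API has no mp3 decoder, so this is left open), and uses illustrative MFCC parameters (16 kHz audio, 25 ms frames, 13 coefficients) that the repository may not share.

```python
import os

import tensorflow as tf


def waveform_to_mfcc(waveform, sample_rate=16000, num_mfcc=13):
    """Turn a 1-D float32 waveform into a [frames, num_mfcc] MFCC matrix."""
    stft = tf.signal.stft(
        waveform, frame_length=400, frame_step=160, fft_length=512)
    spectrogram = tf.abs(stft)
    mel_matrix = tf.signal.linear_to_mel_weight_matrix(
        num_mel_bins=40,
        num_spectrogram_bins=spectrogram.shape[-1],
        sample_rate=sample_rate,
        lower_edge_hertz=20.0,
        upper_edge_hertz=7600.0,
    )
    log_mel = tf.math.log(tf.matmul(spectrogram, mel_matrix) + 1e-6)
    return tf.signal.mfccs_from_log_mel_spectrograms(log_mel)[..., :num_mfcc]


def build_online_dataset(file_pattern, decode_fn):
    """List audio files and compute (mfcc, label) pairs lazily.

    decode_fn is a hypothetical callable mapping a file-path tensor to a
    1-D float32 waveform; the label is taken from the parent directory name.
    """
    def load(path):
        label = tf.strings.split(path, os.sep)[-2]
        return waveform_to_mfcc(decode_fn(path)), label

    files = tf.data.Dataset.list_files(file_pattern)
    # Decoding and feature extraction run again on every pass over the data.
    return files.map(load, num_parallel_calls=tf.data.experimental.AUTOTUNE)
```

A call might look like `build_online_dataset(os.path.join(root, "*", "*.mp3"), decode_fn)`, where `decode_fn` could wrap an external decoder through tf.py_function.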

Example

Tested on tensorflow==2.1.0. To train an LSTM model, run:

python3 train.py

This should achieve ~85% accuracy in 10 epochs.
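
For orientation, an LSTM classifier over variable-length MFCC sequences can be sketched in Keras as follows. This is not the model defined in train.py: the layer sizes, the zero-padding mask, and the function name `build_lstm_classifier` are assumptions chosen for the example.

```python
import tensorflow as tf


def build_lstm_classifier(num_languages, num_mfcc=13, units=64):
    """Hypothetical LSTM language classifier over [frames, num_mfcc] inputs."""
    inputs = tf.keras.Input(shape=(None, num_mfcc))   # variable clip length
    # Skip frames that are all zeros, assuming batches are zero-padded.
    x = tf.keras.layers.Masking(mask_value=0.0)(inputs)
    x = tf.keras.layers.LSTM(units)(x)                # final hidden state
    outputs = tf.keras.layers.Dense(num_languages, activation="softmax")(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",       # integer language ids
        metrics=["accuracy"],
    )
    return model
```

With this loss, the label strings stored alongside the features would first need to be mapped to integer class ids before calling `model.fit`.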
