common-voice-tf

TensorFlow implementation of models trained for language classification on Mozilla Common Voice, using the `tf.data.Dataset` API.

TFRecord method

The offline_process.py script converts the .mp3 files of the dataset into one .tfrecord file per language. Each .tfrecord file contains an array of tuples, one per audio clip, pairing the clip's MFCC spectrogram with the corresponding label string.

This method saves a lot of computation, since the mp3 decoding and pre-processing are done only once.
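The round trip described above can be sketched as follows. This is a minimal illustration, not the code in offline_process.py: the feature keys `mfcc` and `label`, the helper names `make_example` and `parse_example`, the file name `en.tfrecord`, and the random arrays standing in for real MFCC spectrograms are all assumptions made for the example.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def make_example(mfcc, label):
    """Pack one (MFCC matrix, label string) pair into a tf.train.Example."""
    # Serialize the whole matrix to bytes so its shape survives the round trip.
    mfcc_bytes = tf.io.serialize_tensor(tf.convert_to_tensor(mfcc, tf.float32)).numpy()
    feature = {
        "mfcc": tf.train.Feature(bytes_list=tf.train.BytesList(value=[mfcc_bytes])),
        "label": tf.train.Feature(bytes_list=tf.train.BytesList(value=[label.encode("utf-8")])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))


def parse_example(serialized):
    """Inverse of make_example: recover the MFCC tensor and the label string."""
    spec = {
        "mfcc": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.FixedLenFeature([], tf.string),
    }
    parsed = tf.io.parse_single_example(serialized, spec)
    mfcc = tf.io.parse_tensor(parsed["mfcc"], out_type=tf.float32)
    return mfcc, parsed["label"]


# Stand-in features: 3 clips of 50 frames x 13 coefficients (hypothetical sizes).
rng = np.random.default_rng(0)
clips = [rng.standard_normal((50, 13)).astype(np.float32) for _ in range(3)]

# Write one file for a single language, then read it back as a dataset.
path = os.path.join(tempfile.mkdtemp(), "en.tfrecord")
with tf.io.TFRecordWriter(path) as writer:
    for mfcc in clips:
        writer.write(make_example(mfcc, "en").SerializeToString())

dataset = tf.data.TFRecordDataset(path).map(parse_example)
records = [(m.numpy(), l.numpy().decode("utf-8")) for m, l in dataset]
```

Once the files exist, every training run only pays for parsing the records, not for decoding audio.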

Online method using `tf.data.Dataset.list_files`

dataset.py implements the dataset pipeline in which mp3 files are decoded and processed into features on demand. This is very computationally expensive, since the decoding and pre-processing are repeated instead of being done only once. If you plan to do multiple training runs or hyper-parameter tuning, please use the TFRecord method above.
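An on-demand pipeline of this kind might look like the sketch below. It is not the code in dataset.py. Core TensorFlow has no mp3 decoder, so the example uses short generated WAV files as stand-ins. The `<root>/<language>/<clip>.wav` directory layout, the helper names `write_tone` and `load_example`, the 16 kHz sample rate, and the MFCC parameters are all assumptions.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

SAMPLE_RATE = 16000  # assumed sample rate for the stand-in clips


def write_tone(path, freq):
    """Write a one-second sine tone as a mono WAV file (stand-in for a real clip)."""
    t = np.arange(SAMPLE_RATE, dtype=np.float32) / SAMPLE_RATE
    audio = (0.5 * np.sin(2 * np.pi * freq * t)).astype(np.float32)
    tf.io.write_file(path, tf.audio.encode_wav(audio[:, None], SAMPLE_RATE))


def load_example(path):
    """Decode one file and compute its MFCCs; runs each time the file is read."""
    audio, _ = tf.audio.decode_wav(tf.io.read_file(path), desired_channels=1)
    waveform = tf.squeeze(audio, axis=-1)
    stft = tf.signal.stft(waveform, frame_length=400, frame_step=160, fft_length=512)
    # fft_length=512 yields 512 // 2 + 1 = 257 spectrogram bins.
    mel_matrix = tf.signal.linear_to_mel_weight_matrix(40, 257, SAMPLE_RATE, 80.0, 7600.0)
    log_mel = tf.math.log(tf.matmul(tf.abs(stft), mel_matrix) + 1e-6)
    mfcc = tf.signal.mfccs_from_log_mel_spectrograms(log_mel)[..., :13]
    # The label is the name of the parent directory (assumed layout).
    label = tf.strings.split(path, os.sep)[-2]
    return mfcc, label


# Build a tiny fake dataset: one clip per language.
root = tempfile.mkdtemp()
for lang, freq in [("en", 440.0), ("de", 880.0)]:
    os.makedirs(os.path.join(root, lang))
    write_tone(os.path.join(root, lang, "clip0.wav"), freq)

dataset = (
    tf.data.Dataset.list_files(os.path.join(root, "*", "*.wav"), shuffle=True)
    .map(load_example, num_parallel_calls=2)
    .prefetch(1)
)
examples = [(m.numpy(), l.numpy().decode("utf-8")) for m, l in dataset]
```

Because `load_example` sits inside `map`, the decoding and feature extraction run again on every pass over the dataset, which is the cost the TFRecord method avoids.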

Example

Tested on tensorflow==2.1.0. To train an LSTM model, run:

python3 train.py

This should achieve ~85% accuracy in 10 epochs.
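The architecture used by train.py is not described here. As a rough illustration only, an LSTM classifier over MFCC sequences could be set up as in the following sketch. The layer sizes, the number of languages, the 13-coefficient input, and the random training data are all assumptions, and this toy setup says nothing about the accuracy quoted above.

```python
import numpy as np
import tensorflow as tf

NUM_LANGUAGES = 4   # hypothetical number of classes
NUM_COEFFS = 13     # hypothetical number of MFCC coefficients per frame

# Variable-length sequences of MFCC frames; zero-padded frames are masked out.
inputs = tf.keras.Input(shape=(None, NUM_COEFFS))
x = tf.keras.layers.Masking(mask_value=0.0)(inputs)
x = tf.keras.layers.LSTM(64)(x)
outputs = tf.keras.layers.Dense(NUM_LANGUAGES, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Random stand-in batch: 8 clips, 50 frames each, with integer language labels.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 50, NUM_COEFFS)).astype(np.float32)
labels = rng.integers(0, NUM_LANGUAGES, size=8)

model.fit(features, labels, epochs=1, verbose=0)
probs = model.predict(features, verbose=0)
```

In a real run, the random arrays would be replaced by batches drawn from one of the two dataset pipelines described above.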

