Issues applying fine-tuning to existing public model #2121

GeckoEidechse · 2022-02-22T17:19:21Z

(Question originally asked on Gitter; posting it here together with the answer for the sake of future documentation.)

I'm trying to fine-tune an existing STT model to my own voice for better recognition but I'm struggling to "import" it.

The model I'm trying to fine-tune is German STT v0.9.0 (Aashish Agarwal).

As described in the docs, to fine-tune an existing model one simply has to pass --checkpoint_dir and point it to the directory containing the checkpoints. So in my case that would be:

python -m coqui_stt_training.train --auto_input_dataset /mnt/mydata/data.csv --checkpoint_dir /mnt/mydata/German\ STT\ v0.9.0\ \(Aashish\ Agarwal\)/

Running this inside the Docker container for training, I get:

root@60888aad4bf8:/code# python -m coqui_stt_training.train --auto_input_dataset /mnt/mydata/data.csv --checkpoint_dir /mnt/mydata/pre-existing-model/
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
I Processing --auto_input_dataset input: /mnt/mydata/data.csv...
I Saved generated alphabet with characters ([' ', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'k', 'l', 'm', 'n', 'o', 'p', 'r', 's', 't', 'u', 'v', 'w', 'z', 'ä', 'ö', 'ü']) into /mnt/mydata/alphabet.txt
I Generated train set size: 3 samples.
I Generated validation set size: 2 samples.
I Generated test set size: 2 samples.
I Writing train set to /mnt/mydata/train.csv
I Writing dev set to /mnt/mydata/dev.csv
I Writing test set to /mnt/mydata/test.csv
I Performing dummy training to check for memory problems.
I If the following process crashes, you likely have batch sizes that are too big for your available system memory (or GPU memory).
I Could not find best validating checkpoint.
I Could not find most recent checkpoint.
I Initializing all variables.
I STARTING Optimization
...

where

I Could not find best validating checkpoint.
I Could not find most recent checkpoint.

are the lines of interest.
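
For anyone scanning a longer training log for the same symptom, here is a minimal sketch that checks for those two lines. It assumes the messages appear verbatim as in the log above, and that seeing both of them means no checkpoint of either kind was loaded. The function name `no_checkpoint_found` is made up for illustration and is not part of Coqui STT.

```python
# Messages copied verbatim from the training log above.
MISSING_BEST = "Could not find best validating checkpoint."
MISSING_LAST = "Could not find most recent checkpoint."


def no_checkpoint_found(log_text):
    """Return True if the log reports that neither checkpoint kind was found.

    Assumption: both messages together mean training started from freshly
    initialized variables rather than from an existing checkpoint.
    """
    return MISSING_BEST in log_text and MISSING_LAST in log_text


if __name__ == "__main__":
    sample = (
        "I Could not find best validating checkpoint.\n"
        "I Could not find most recent checkpoint.\n"
        "I Initializing all variables.\n"
    )
    print(no_checkpoint_found(sample))
```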

As far as I understand, it fails to detect the existing model because the release is an exported model, which no longer contains any checkpoint information.
For comparison, here is the release folder structure

$ tree German\ STT\ v0.9.0\ \(Aashish\ Agarwal\)/
German STT v0.9.0 (Aashish Agarwal)/
├── alphabet.txt
├── de-aashishag-1-prune-kenlm.scorer
├── LICENSE
├── MODEL_CARD
├── model.pbmm
├── model.tflite
└── scorer.LICENSE

vs the checkpoint directory I get after starting from scratch and training for 1-2 epochs

$ tree my-sample-checkpoints/
my-sample-checkpoints/
├── alphabet.txt
├── best_dev-3.data-00000-of-00001
├── best_dev-3.index
├── best_dev-3.meta
├── best_dev_checkpoint
├── checkpoint
├── flags.txt
├── summaries
│   ├── dev
│   │   └── events.out.tfevents.1645541148.60888aad4bf8
│   ├── metrics
│   └── train
│       └── events.out.tfevents.1645540958.60888aad4bf8
├── train-3.data-00000-of-00001
├── train-3.index
├── train-3.meta
├── train-6.data-00000-of-00001
├── train-6.index
└── train-6.meta
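
To make the difference between the two listings concrete, here is a small sketch that classifies a directory by file names alone. It assumes that a usable checkpoint directory contains a state file (`checkpoint` or `best_dev_checkpoint`) plus at least one `*.index` file, as in the second listing, and that an exported release is recognizable by `*.pbmm` or `*.tflite` files, as in the first. The helper `describe_model_dir` is hypothetical and not part of Coqui STT; it does not open or validate any of the files.

```python
from pathlib import Path

# State files seen in the checkpoint directory listing above.
STATE_FILES = ("checkpoint", "best_dev_checkpoint")


def describe_model_dir(path):
    """Classify a directory as 'checkpoint', 'exported', or 'unknown'.

    The decision is based on file names only; file contents are not read.
    """
    names = {p.name for p in Path(path).iterdir() if p.is_file()}
    has_state = any(name in names for name in STATE_FILES)
    has_index = any(name.endswith(".index") for name in names)
    has_export = any(name.endswith((".pbmm", ".tflite")) for name in names)
    if has_state and has_index:
        return "checkpoint"
    if has_export:
        return "exported"
    return "unknown"


if __name__ == "__main__":
    import sys

    # Usage: python describe_model_dir.py <directory> [<directory> ...]
    for directory in sys.argv[1:]:
        print(directory, "->", describe_model_dir(directory))
```

Run against the two directories above, this should report the release folder as exported and my sample directory as a checkpoint, which matches what the training log suggests.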

So with all that said, how would I go about fine-tuning the German STT v0.9.0 (Aashish Agarwal) model to my own voice samples? Is that even an option or would I have to start from scratch? Or can I simply "import" the model to somehow get the necessary checkpoint information?

TL;DR: How do I fine-tune an exported model that contains no checkpoint information? Coqui STT fails to detect it.

GeckoEidechse · 2022-02-22T17:19:34Z

(Answer from reuben on gitter, slightly modified by me)

You can't. For the Coqui English models we release the checkpoints on the STT release page on GitHub, and many model creators who contributed their models to the Model Zoo also have checkpoints available somewhere, but unfortunately we don't yet host the checkpoints or link to them consistently.

In this case the best bet is to reach out to Aashish (the author of the model described above). I think the German STT model might have its own repo as well: https://github.com/AASHISHAG/deepspeech-german

Looks like there's a link to the checkpoint on GDrive there.
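
Once a checkpoint directory has been obtained, the command from the question should work with --checkpoint_dir pointed at it. Below is a minimal sketch that builds that command as an argument list, which avoids having to escape spaces and parentheses in paths by hand. The two flags are the ones used in the question; the helper name `build_finetune_command` and the example checkpoint path are assumptions for illustration only.

```python
import shlex


def build_finetune_command(dataset_csv, checkpoint_dir):
    """Build the training command from the question as an argument list."""
    return [
        "python", "-m", "coqui_stt_training.train",
        "--auto_input_dataset", str(dataset_csv),
        "--checkpoint_dir", str(checkpoint_dir),
    ]


if __name__ == "__main__":
    # Hypothetical location of the downloaded checkpoint directory.
    cmd = build_finetune_command(
        "/mnt/mydata/data.csv",
        "/mnt/mydata/german-checkpoint/",
    )
    # shlex.join (Python 3.8+) renders a copy-pasteable shell command.
    print(shlex.join(cmd))
    # To actually start training: subprocess.run(cmd, check=True)
```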
