WARNING:vits.dataset:Skipped X utterance(s) #663

coffeecodeconverter opened this issue Nov 30, 2024
If you get a warning when loading the dataset reporting "Skipped X utterance(s)", like these (see the last line of each snippet):

DEBUG:piper_train:Checkpoints will be saved every 5 epoch(s)
DEBUG:piper_train:0 Checkpoints will be saved
DEBUG:vits.dataset:Loading dataset: /content/drive/MyDrive/colab/piper/Jarvis/dataset.jsonl
WARNING:vits.dataset:Skipped 5 utterance(s)

DEBUG:piper_train:Checkpoints will be saved every 100 epoch(s)
DEBUG:piper_train:0 Checkpoints will be saved
DEBUG:vits.dataset:Loading dataset: /testing/piper-training/dataset.jsonl
WARNING:vits.dataset:Skipped 31 utterance(s)

it means you have a formatting problem with your dataset.
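As a first sanity check, it can help to confirm every metadata line even parses. This is only a sketch, assuming a pipe-separated ljspeech-style metadata.csv (the path and expected field counts here are assumptions, not something piper documents):

# Flag any line whose pipe-separated field count looks wrong
# (ljspeech-style lines are typically "id|text" or "id|text|normalized_text").
awk -F'|' 'NF < 2 || NF > 3 { printf "line %d has %d fields: %s\n", NR, NF, $0 }' ~/dataprep/metadata.csv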

NOTE:
If ALL utterances in your dataset are skipped, you get this error instead:

Trainer.fit stopped: No training batches.

(because nothing loaded correctly for it to train on)

You might find only some of your utterances are skipped, in which case training carries on.
However, you might miss the "Skipped X utterance(s)" warning and later wonder why your resulting voice model is poor quality.
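One way to avoid missing the warning is to capture the training output and search it afterwards. A minimal sketch ("train.log" is just an example name, and the training flags are trimmed down here):

# Redirect stdout and stderr into a log file while still printing them,
# then search the log for the skip warning after the run.
python3 -m piper_train --dataset-dir ~/train-me 2>&1 | tee train.log
grep -i "skipped" train.log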

Now, this might not fix everyone's issue, but it did fix mine...

My problem was due to the size (length) of each transcription/wav line being too long.
If I viewed my metadata.csv in Notepad,
my transcriptions were wrapping around and taking up 4-5 lines, while still technically being a single huge line per utterance.
I chopped my wavs and dataset into smaller pieces,
breaking them up by natural pauses or single sentences, whichever was shorter (see the sketch below).
(Beforehand, I was keeping them between 10-15 seconds in length, but it was a fast speaker, so even 10 seconds contained a lot of words, and I think that was the issue.)
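If you want to automate the chopping, sox can split a wav on silences. This is only a sketch, and the silence thresholds/durations are assumptions you will need to tune for your speaker:

# Split in.wav into out001.wav, out002.wav, ... wherever sox detects
# at least 0.3 s of audio below 1% amplitude (tune both values).
sox in.wav out.wav silence 1 0.3 1% 1 0.3 1% : newfile : restart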

I re-transcribed everything,
and made sure that I didn't have any one line long enough to wrap around.
(I'm not 100% sure this is an exact prerequisite of the data, but it fixed my issue at least.)
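To double-check for wrap-prone lines without eyeballing Notepad, something like this works; the 200-character threshold is purely an assumption, not a documented piper limit:

# Print the number and length of any metadata.csv line longer than
# 200 characters (threshold chosen arbitrarily; adjust to taste).
awk 'length($0) > 200 { printf "line %d: %d chars\n", NR, length($0) }' ~/dataprep/metadata.csv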

I deleted the previous cache, lightning_logs, config.json, and dataset.jsonl (if applicable),
as these may contain data from when your dataset was skipping utterances, and they'll be garbage.
Start fresh.
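For example (a sketch only; these paths assume the training dir from the commands below, and your layout may differ):

# Remove stale artifacts from the training output dir before re-preprocessing.
rm -rf ~/train-me/cache ~/train-me/lightning_logs
rm -f ~/train-me/config.json ~/train-me/dataset.jsonl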

Re-ran piper_train.preprocess:

python3 -m piper_train.preprocess --language en --input-dir ~/dataprep --output-dir ~/train-me --dataset-format ljspeech --single-speaker --sample-rate 22050 --max-workers 1

Then re-ran the training:

python3 -m piper_train --dataset-dir ~/train-me --accelerator 'gpu' --gpus 1 --batch-size 32 --validation-split 0.0 --num-test-examples 0 --max_epochs 700 --resume_from_checkpoint "~/checkpoint-files/cori-med-640.ckpt" --checkpoint-epochs 100 --precision 32 --max-phoneme-ids 400 --quality medium

And FINALLY, it no longer skips any utterances:

GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
DEBUG:piper_train:Checkpoints will be saved every 100 epoch(s)
DEBUG:vits.dataset:Loading dataset: /testing/piper-training/dataset.jsonl
/PiperTTS/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py:1906: LightningDeprecationWarning: `trainer.resume_from_checkpoint` is deprecated in v1.5 and will be removed in v2.0. Specify the fit checkpoint path with `trainer.fit(ckpt_path=)` instead.
  rank_zero_deprecation(