Training and Testing

Setting up config.json

Everything you need to set about your model is defined in config.json, so set your data-related paths and the hyper-parameters there. Even though we provide default parameters, for the best performance you might need to run a hyper-parameter search on your dataset. Be aware that each dataset has its own unique attributes, so different datasets may perform differently with the same set of hyper-parameters.
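
As a rough sketch, a config.json excerpt might look like the following. run_eval is referenced later on this page, but treat the other key names and values as assumptions and check the default config.json shipped with the code for your version of the repository.

{
  "data_path": "/path/to/your/dataset/",
  "output_path": "/path/to/experiment/outputs/",
  "batch_size": 32,
  "lr": 0.0001,
  "wd": 0.000001,
  "run_eval": true
}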

How to Run TTS

  • Training

python train.py --config_path config.json

  • Finetuning: This continues the training of a previously saved model with the new parameters defined in the given config.json. If there is an architectural mismatch between the saved model and the new code, it initializes only the compatible layers and randomly initializes the rest.

python train.py --config_path config.json --restore_path your/model/path.pth.tar

  • Distributed training (in development): The following command uses, by default, all the GPUs made visible by CUDA_VISIBLE_DEVICES.

CUDA_VISIBLE_DEVICES="0,1,2" python distribute.py --config_path config.json

Inspecting Training

Throughout the training, there are different ways to inspect model performance. On the terminal, you see basic model stats such as loss values, step times, and gradient norms. However, the best way is to use Tensorboard. If you enable the validation iteration (run_eval in config.json), the first thing to watch is the validation losses. The second important indicator is the attention alignment: sometimes the model reduces the losses while the attention is still problematic. The third and best indicator is to listen to the test audio synthesis. Keep in mind that all other audio examples, except the test audios, are synthesized with teacher forcing; therefore, the test audios are the best indicators of real-life model performance.
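
For example, assuming the training run writes its Tensorboard event files under the experiment output folder (the exact location depends on your output_path setting), you can launch Tensorboard against it:

tensorboard --logdir=/path/to/experiment/outputs/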

Stopping Training

Stop the training if your model starts to overfit (the validation loss increases while the training loss stays the same or decreases). Sometimes the attention module overfits as well without this being noticeable in the loss plots; you observe it when the attention alignment is broken on test examples but not on train and validation examples. If your final model does not work well at this stage, you can retrain the model with a higher weight decay.
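
A minimal sketch of that retraining step, assuming weight decay is exposed through a wd key in config.json (verify the key name and a sensible value in your copy of the default config):

# Assumption: weight decay is the "wd" key in config.json;
# raise it (e.g. "wd": 0.000001 -> "wd": 0.00001), then retrain.
python train.py --config_path config.json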

Testing Model

Currently, there are two good ways to test your trained model.

  • Benchmark.ipynb: It runs the model on a set of benchmark sentences, which lets you compare against other available text2speech implementations. It also provides a useful set of visualizations of your model's performance.
  • Demo Server: You can run the demo server by setting up its config file and test your model through a simple web interface (see the example command after this list). It is also useful if you'd like to share your results with others on your team.
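
A sketch of what launching the demo server might look like, assuming the server script and its config live under a server/ folder; both the path and the flag here are assumptions, so check the repository for the actual entry point:

python server/server.py --config_path server/conf.json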