[TTS] Add tutorial for TTS data prep scripts #6922

rlangman · 2023-06-26T21:29:28Z

What does this PR do ?

Add a tutorial demonstrating end-to-end data preparation and training with the new TTS preprocessing scripts and data loader.

Collection: [TTS]

Changelog

Create tutorial

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you add or update any necessary documentation?
Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
- Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

New Feature
Bugfix
Documentation

tutorials/tts/FastPitch_Data_Preparation.ipynb

XuesongYang

Sorry for the late review. LGTM, added some nitpicks.

XuesongYang · 2023-07-08T06:49:02Z

tutorials/tts/FastPitch_Data_Preparation.ipynb

+      "source": [
+        "In this tutorial, we will prepare a dataset using our [TTS Dataset Processing Scripts](https://github.com/NVIDIA/NeMo/tree/main/scripts/dataset_processing/tts) and use it for training a FastPitch model.\n",
+        "\n",
+        "**This tutorial uses a different workflow than all other existing TTS tutorials. The scripts and classes used are all experimental and not yet ready for production**"


nit: missing a period at the end of the sentence.

XuesongYang · 2023-07-08T06:51:03Z

tutorials/tts/FastPitch_Data_Preparation.ipynb

+    {
+      "cell_type": "markdown",
+      "source": [
+        "# Dataset Prepration"


s/Prepration/Preparation/

XuesongYang · 2023-07-08T06:59:30Z

tutorials/tts/FastPitch_Data_Preparation.ipynb

+      "cell_type": "code",
+      "source": [
+        "import IPython.display as ipd\n",
+        "from matplotlib.pyplot import imshow"


Remove this import, since nothing else uses it.

XuesongYang · 2023-07-08T07:03:12Z

tutorials/tts/FastPitch_Data_Preparation.ipynb

+      "source": [
+        "We can use [create_speaker_map.py](https://github.com/NVIDIA/NeMo/blob/main/scripts/dataset_processing/tts/create_speaker_map.py) to easily create a mapping from speaker ID strings to integer indices that will be used at training time.\n",
+        "\n",
+        "The script will simply sort the speaker IDs and assign them numbers [0, num_speakers) in alphabetical order."


s/[0, num_speakers)/`[0, num_speakers)`/
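
For readers of this thread, the mapping described in the hunk above (sort the speaker IDs, then number them from 0 up to but not including `num_speakers`) can be sketched roughly as follows. This is only an illustration of the idea, not the code in `create_speaker_map.py`; the function name `build_speaker_map` and the example speaker IDs are hypothetical.

```python
import json


def build_speaker_map(speaker_ids):
    """Map speaker ID strings to integer indices in sorted order.

    Hypothetical helper for illustration; not taken from the NeMo script.
    """
    # Deduplicate, then sort so the assignment is deterministic.
    unique_ids = sorted(set(speaker_ids))
    # enumerate() yields indices 0 .. num_speakers - 1.
    return {speaker: index for index, speaker in enumerate(unique_ids)}


# Made-up speaker IDs, with one duplicate.
speaker_map = build_speaker_map(["p231", "p225", "p231", "p227"])
print(json.dumps(speaker_map, indent=2))
```

Sorting before numbering means the same set of speakers always gets the same indices, regardless of the order in which they appear in the manifest.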

XuesongYang · 2023-07-08T07:06:06Z

tutorials/tts/FastPitch_Data_Preparation.ipynb

+    {
+      "cell_type": "markdown",
+      "source": [
+        "Before training FastPitch, we need to compute some features for every audio file. The default [config file](https://github.com/NVIDIA/NeMo/blob/main/examples/tts/conf/feature/feature_44100.yaml) we will use has parameters for computing the **pitch** and **energy** of every audio frame."


nit: I saw inconsistent font formats for pitch and energy, and sometimes pitch and energy.

rlangman · 2023-07-11

I went through the tutorial to try to make the formatting more consistent.

Use bold when it is the first time an important vocab term is mentioned.

Use code when it refers to specific code, variable name, file, etc.

Use italics to emphasize any other key words.

XuesongYang · 2023-07-08T07:07:50Z

tutorials/tts/FastPitch_Data_Preparation.ipynb

+        "For training it is beneficial for us to *normalize* our features. The most standard approach is to apply **mean-variance normalization** so that each feature has a mean of 0 and variance of 1. To do this we need to compute the *dataset statistics* with the mean and variance of each feature.\n",
+        "\n",
+        "For TTS it also helps\n",
+        "*   Normalize features using speaker-level statistics\n",


missing a period.
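
As a rough illustration of the mean-variance normalization with speaker-level statistics mentioned in the hunk above, the sketch below computes a mean and standard deviation per speaker and then standardizes each speaker's values. It assumes features are plain lists of floats grouped by speaker; the helper names and the pitch values are hypothetical and do not come from the NeMo scripts.

```python
from statistics import fmean, pstdev


def compute_speaker_stats(features_by_speaker):
    """Return {speaker: (mean, std)} for each speaker's feature values."""
    return {
        speaker: (fmean(values), pstdev(values))
        for speaker, values in features_by_speaker.items()
    }


def normalize(values, mean, std, eps=1e-8):
    """Shift to mean 0 and scale to variance 1; eps guards against std == 0."""
    scale = max(std, eps)
    return [(value - mean) / scale for value in values]


# Made-up pitch values (Hz) for two speakers with different ranges.
pitch_by_speaker = {
    "speaker_a": [100.0, 120.0, 140.0],
    "speaker_b": [200.0, 220.0, 240.0],
}

stats = compute_speaker_stats(pitch_by_speaker)
normalized = {
    speaker: normalize(values, *stats[speaker])
    for speaker, values in pitch_by_speaker.items()
}
```

After per-speaker normalization, both speakers' values sit on the same scale even though their raw pitch ranges differ, which is the motivation for using speaker-level rather than dataset-level statistics.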

Signed-off-by: Ryan <[email protected]>

rlangman requested review from XuesongYang, redoctopus, racoiaws and subhankar-ghosh June 26, 2023 21:29

github-actions bot added the TTS label Jun 26, 2023

redoctopus reviewed Jun 30, 2023

XuesongYang previously approved these changes Jul 8, 2023

rlangman added 2 commits July 11, 2023 10:12

[TTS] Add tutorial for TTS data prep scripts

0de2dff

Signed-off-by: Ryan <[email protected]>

[TTS] Fix tutorial typos

49f01a3

Signed-off-by: Ryan <[email protected]>

rlangman dismissed XuesongYang’s stale review via 207a14f July 11, 2023 23:51

rlangman force-pushed the tts_tutorial branch 2 times, most recently from 207a14f to c3c0b5b Compare July 11, 2023 23:53

[TTS] Fix formatting, punctuation

a65ca3d

Signed-off-by: Ryan <[email protected]>

rlangman force-pushed the tts_tutorial branch from c3c0b5b to a65ca3d Compare July 11, 2023 23:56

XuesongYang approved these changes Jul 12, 2023

Merge branch 'main' into tts_tutorial

471e7ec

XuesongYang approved these changes Jul 12, 2023

XuesongYang merged commit 728403d into main Jul 12, 2023

XuesongYang deleted the tts_tutorial branch July 12, 2023 21:08
