
Inference of Multiband MelGAN (v2) with ForwardTacotron #346

Closed
prajwaljpj opened this issue Mar 21, 2022 · 4 comments
Labels
question Further information is requested

Comments

@prajwaljpj

I have trained a ForwardTacotron text2mel model and I would like to integrate it with ParallelWaveGAN.
For now we have extracted the generated mel_post (the mel spectrogram after the Postnet) from here and saved it as a .npy file (alifiya_esp_1.npy.zip). Then we use StandardScaler to normalize the data from here (1 and 2) and infer through here.
This is the sample output for the corresponding numpy file (alifiya_esp_1.wav.zip).
The same mel works fine with Griffin-Lim.

Where am I going wrong?

@kan-bayashi kan-bayashi added the question Further information is requested label Mar 22, 2022
@kan-bayashi
Owner

kan-bayashi commented Mar 22, 2022

Not sure, but the following points might differ:

  • log basis (I use log10 as the default)
  • fmin and fmax for the mel basis (I use 80-7600 Hz as the default)
  • normalization (I use mean-variance normalization with statistics computed over the training data)

If you use the same mel basis, maybe the log basis is different.
https://github.com/as-ideas/ForwardTacotron/blob/3bcaf3569ea2379ff995403b31f280720df3f03d/utils/dsp.py#L71-L87
https://github.com/as-ideas/ForwardTacotron/blob/3bcaf3569ea2379ff995403b31f280720df3f03d/utils/dsp.py#L105-L107

You can change the log basis to match the feature extraction condition.
Related: #169 (comment)
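For reference, converting a natural-log mel spectrogram to log10 is just a constant rescaling, since log10(exp(x)) = x / ln(10). A minimal sketch (the function name and shapes are illustrative, not from either codebase):

```python
import numpy as np

def ln_mel_to_log10(mel_ln: np.ndarray) -> np.ndarray:
    """Convert a natural-log mel spectrogram to log10.

    log10(exp(x)) = x / ln(10), so this is a constant rescaling
    that can be applied to the text2mel output at inference time.
    """
    return mel_ln / np.log(10.0)

# Sanity check: agrees with np.log10 on the same magnitudes.
mags = np.array([0.01, 0.1, 1.0, 10.0])
print(np.allclose(ln_mel_to_log10(np.log(mags)), np.log10(mags)))  # True
```

This is numerically equivalent to the `np.log10(np.exp(...))` trick mentioned later in the thread, but avoids the intermediate `exp`, which can overflow for large values.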

@prajwaljpj
Author

Is there a way to circumvent this without re-training the text2mel model?

@redhood95

  • log basis (I use log10 as the default)
  • normalization (I use mean-variance normalization with statistics computed over the training data)

We have tried to fix this by applying np.log10(np.exp(ft_mel_output)) and normalizing it with mean-variance normalization via StandardScaler.
Are there any other changes we can make at inference time to fix this?
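One thing worth double-checking on the normalization side: fitting a StandardScaler on the single utterance being synthesized yields different statistics than the mean and variance computed over the vocoder's training data. A minimal sketch of applying training-time statistics instead (the `mean` and `scale` arrays are assumed to come from the vocoder's saved stats; the function name is illustrative):

```python
import numpy as np

def normalize_mel(mel_log10: np.ndarray, mean: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Mean-variance normalize a (frames, n_mels) log10-mel spectrogram
    using statistics computed over the vocoder's training data,
    rather than refitting a scaler on the utterance being synthesized."""
    return (mel_log10 - mean) / scale

# Per-dimension stats broadcast across the frame axis.
mel = np.random.default_rng(0).normal(size=(120, 80))
mean, scale = mel.mean(axis=0), mel.std(axis=0)
print(normalize_mel(mel, mean, scale).shape)  # (120, 80)
```

The point of the design is that the vocoder was trained on features standardized with one fixed set of statistics, so inference must use those same statistics, not ones re-estimated per utterance.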

@kan-bayashi
Owner

If fmin / fmax are different, there is no way to use the pretrained model.
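To illustrate why: fmin and fmax determine where the mel filters are placed on the frequency axis, so an 80-dim mel extracted with one range is not a simple remapping of one extracted with another. A rough sketch using the HTK mel formula (an assumption; libraries differ in the exact mel scale they use):

```python
import numpy as np

def hz_to_mel(f):
    # HTK mel scale: m = 2595 * log10(1 + f / 700)
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def filter_edges(n_mels, fmin, fmax):
    # Edge frequencies (Hz) of the triangular mel filters:
    # n_mels + 2 points equally spaced on the mel scale.
    return mel_to_hz(np.linspace(hz_to_mel(fmin), hz_to_mel(fmax), n_mels + 2))

# Same n_mels, different fmin/fmax: the filters cover different
# frequency ranges, so the resulting 80-dim features are not interchangeable.
a = filter_edges(80, 80.0, 7600.0)
b = filter_edges(80, 0.0, 11025.0)
print(np.allclose(a, b))  # False
```

Since the filterbank projection is lossy, there is no exact way to convert features from one basis to the other at inference time without the original linear spectrogram.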
