
About the multi_speaker implementation #37

Open
LeoniusChen opened this issue Mar 3, 2021 · 4 comments

Comments

@LeoniusChen

Hi, I read about your multi_speaker implementation of Tacotron2. It means different speakers correspond to different text inputs, and you did not use a speaker embedding. Am I right? If so, the speaker information is entangled in the text input, which seems unnecessary.

@begeekmyfriend
Owner

I just expanded the symbol table, and each symbol offset represents one speaker as an implicit embedding.
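As a rough illustration of this idea (the names and dimensions below are assumptions, not the repository's actual identifiers): the embedding table is enlarged to cover every (symbol, speaker) pair, and each speaker's text uses its own block of symbol ids, so the speaker identity rides along with the text input itself.

```python
# Minimal sketch of the "expanded symbol table" trick, assuming a base symbol
# set of NUM_SYMBOLS entries and NUM_SPEAKERS speakers (both hypothetical here).
import torch
import torch.nn as nn

NUM_SYMBOLS = 150      # size of the base symbol set (assumption)
NUM_SPEAKERS = 4       # number of speakers in the corpus (assumption)
EMBED_DIM = 512

# One embedding table covering every (symbol, speaker) pair.
symbol_embedding = nn.Embedding(NUM_SYMBOLS * NUM_SPEAKERS, EMBED_DIM)

def shift_sequence(symbol_ids, speaker_id):
    """Offset each symbol id into the block reserved for this speaker,
    so the speaker identity is carried implicitly by the text input."""
    return [sid + speaker_id * NUM_SYMBOLS for sid in symbol_ids]

# The same text maps to different embedding rows for different speakers.
base_ids = [12, 7, 33]                        # pretend output of a text-to-sequence step
ids_spk0 = torch.tensor(shift_sequence(base_ids, 0))
ids_spk2 = torch.tensor(shift_sequence(base_ids, 2))
emb0 = symbol_embedding(ids_spk0)             # rows 12, 7, 33
emb2 = symbol_embedding(ids_spk2)             # rows 312, 307, 333
```

One consequence of this layout is that symbol embeddings are not shared across speakers, so each speaker's rows are trained only on that speaker's data.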

@LeoniusChen
Author

> I just expanded the symbol table, and each symbol offset represents one speaker as an implicit embedding.

Thanks for your reply! I understand what you have done. I think this implementation may introduce unnecessary trouble if I want to preserve the prosody of a reference utterance (from speaker A) while keeping the timbre of speaker B. Do you know of other implementations of multi_speaker Tacotron?

@begeekmyfriend
Owner

Well, this project does not implement a prosody memory for speakers. In other words, the prosody of each speaker is independent of the others. If you want to refer to the prosody of another speaker, an extra explicit prosody embedding is needed. Unfortunately, as far as I know, current deep learning implementations do not handle this issue perfectly. Global style tokens (aka GST) for Tacotron are one such approach. There is a good PyTorch project, though it is based on Tacotron 1. I do not know if this project suits you.
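For reference, a minimal, simplified sketch of the GST idea mentioned above: a reference encoder squeezes a mel spectrogram into one vector, and attention over a bank of learned "style tokens" yields a prosody embedding that can condition the encoder outputs. The module names and dimensions below are my own assumptions, not code from the PyTorch project referred to here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGST(nn.Module):
    def __init__(self, n_mels=80, ref_dim=128, num_tokens=10, token_dim=256):
        super().__init__()
        # Reference encoder: a GRU over mel frames -> fixed-size reference vector.
        self.ref_rnn = nn.GRU(n_mels, ref_dim, batch_first=True)
        # Bank of learnable style tokens.
        self.tokens = nn.Parameter(torch.randn(num_tokens, token_dim) * 0.3)
        # Project the reference vector into the token space to form the query.
        self.query_proj = nn.Linear(ref_dim, token_dim)

    def forward(self, mel):                      # mel: (batch, frames, n_mels)
        _, ref = self.ref_rnn(mel)                # ref: (1, batch, ref_dim)
        query = self.query_proj(ref.squeeze(0))   # (batch, token_dim)
        scores = query @ self.tokens.t()          # (batch, num_tokens)
        weights = F.softmax(scores, dim=-1)
        style = weights @ self.tokens             # (batch, token_dim)
        return style  # broadcast-add or concatenate to the text encoder outputs

# Usage sketch: extract a prosody embedding from reference utterances.
gst = SimpleGST()
mel_ref = torch.randn(2, 400, 80)                # reference mels from speaker A
style_emb = gst(mel_ref)                         # (2, 256) prosody embedding
```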

@LeoniusChen
Author

Thanks for your kind help! I've read this GST project before; it only uses a single-speaker dataset. My issue is that I need a multi_speaker Tacotron where the speaker embedding is explicitly given 🤣 Anyway, I'll try to implement it.
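A rough sketch of what an explicit speaker embedding could look like, assuming the usual recipe of a learned lookup table whose vector is broadcast and concatenated to every encoder time step before attention and decoding (all names and sizes below are illustrative):

```python
import torch
import torch.nn as nn

NUM_SPEAKERS = 4       # assumption
SPEAKER_DIM = 64       # assumption
ENCODER_DIM = 512      # assumption

speaker_embedding = nn.Embedding(NUM_SPEAKERS, SPEAKER_DIM)

def add_speaker(encoder_outputs, speaker_ids):
    """encoder_outputs: (batch, time, ENCODER_DIM); speaker_ids: (batch,)"""
    spk = speaker_embedding(speaker_ids)                       # (batch, SPEAKER_DIM)
    spk = spk.unsqueeze(1).expand(-1, encoder_outputs.size(1), -1)
    return torch.cat([encoder_outputs, spk], dim=-1)           # (batch, time, ENCODER_DIM + SPEAKER_DIM)

# Example: condition on speaker B's identity; the prosody could come from a
# GST-style reference embedding as sketched earlier in the thread.
enc = torch.randn(2, 100, ENCODER_DIM)
ids = torch.tensor([1, 3])
conditioned = add_speaker(enc, ids)   # fed to the attention/decoder
```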
