Chinese support #9

Open
MonolithFoundation opened this issue Dec 13, 2024 · 1 comment

Comments

@MonolithFoundation

Would you consider supporting Chinese?

@cantabile-kwok
Owner

That is a good question. We would like to see how this model works on Chinese, but the core problem is not the model or the dataset; it is the speech tokens. In the paper we use vq-wav2vec, which is trained only on the English LibriSpeech corpus, so we don't expect it to generalize well to Chinese. We would need to find another speech token that contains limited timbre information but enough prosody information for Chinese, which seems a bit hard. Training a vq-wav2vec on a Chinese dataset would also be a larger project. Hence, we will not train this on Chinese unless a satisfactory speech token is ready to use.

Nevertheless, the language restriction applies only to the source speech. For the target reference, any language is feasible (e.g., converting English content to a Chinese speaker's voice is no problem).
