Synthesis: VITS voices have various issues related to model training #14

Open
rotemdan opened this issue Jul 28, 2023 · 0 comments
Labels: bug (Something isn't working), external (Issues that are related to external sources), synthesis (Issue related to speech synthesis)

rotemdan commented Jul 28, 2023

For example, when the default English voice (Amy / Low) is given a single-word utterance such as "two", it seems to mispronounce it as something that sounds closer to "ten". Other voices have much more serious issues. The Greek voice, for instance, may produce bizarre, nonsensical output when given English text, most likely because it wasn't trained on English (or on Latin characters in general) and doesn't know how to handle them.
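To make the script-mismatch case concrete, here is a minimal sketch of a check a caller could run before sending text to a voice. It is not part of Echogarden or Piper: `Script`, `isInScript` and `scriptCoverage` are hypothetical names, and the code point ranges are a rough approximation of each script, not a complete definition.

```typescript
// Hypothetical helper, not an Echogarden or Piper API.
type Script = "greek" | "latin";

// Returns true when the UTF-16 code unit falls in a block commonly used by the script.
function isInScript(code: number, script: Script): boolean {
  if (script === "greek") {
    // Greek and Coptic (U+0370..U+03FF) and Greek Extended (U+1F00..U+1FFF)
    return (code >= 0x0370 && code <= 0x03ff) || (code >= 0x1f00 && code <= 0x1fff);
  }
  // Basic Latin letters plus Latin-1 Supplement and Latin Extended-A/B,
  // excluding the multiplication and division signs.
  return (
    (code >= 0x41 && code <= 0x5a) ||
    (code >= 0x61 && code <= 0x7a) ||
    (code >= 0xc0 && code <= 0x24f && code !== 0xd7 && code !== 0xf7)
  );
}

// Fraction of Greek-or-Latin characters in `text` that belong to `script`.
// Characters in neither range (digits, spaces, punctuation, other scripts) are ignored.
// Returns 1 when the text contains no such characters at all.
function scriptCoverage(text: string, script: Script): number {
  let inScript = 0;
  let letters = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    const greek = isInScript(code, "greek");
    const latin = isInScript(code, "latin");
    if (!greek && !latin) continue;
    letters++;
    if (script === "greek" ? greek : latin) inScript++;
  }
  return letters === 0 ? 1 : inScript / letters;
}
```

With this sketch, `scriptCoverage("two", "greek")` is 0, so a caller could warn or switch voices before synthesis. It only flags text the voice was probably not trained on. It does nothing for mispronunciations within the voice's own language, such as the "two" example.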

These problems come from how the models were trained, not from the code itself.

These models are trained as part of the Piper speech system, mostly by Michael Hansen. You can check out the Piper issue tracker to give feedback on these sorts of problems.

Echogarden doesn't actually use the Piper system, but reimplements it in JavaScript, with several enhancements that are not present in the original C++ code. Only the ONNX models are shared.

The original ONNX models are published on the piper-voices Hugging Face repository. I repackage them as tar.gz archives and upload them to the echogarden-packages Hugging Face repository, from which they (and all other packages) are downloaded when needed.

rotemdan added the bug, synthesis, and external labels on Jul 28, 2023