Suggestion: support GPT-SoVITS as TTS (fast voice cloning, so users can talk to their favorite voice rather than a generic AI voice). #92
Comments
Did you ever take a look at StyleTTS2?
It looks promising. I am in dire need of voice cloning and multilingual support. Here is a supplement to the issue.

Use scenario

The TTS has to excel at voice cloning. A pre-trained voice won't do, because not every listener wants that voice; each wants their own particular favorite. The TTS should also support multilingual scenarios, especially Chinese, English, Italian (the game has a popular character with an Italian background) and, if possible, Hindi (for an AI bot; I don't know why a bot is popular in a gacha game, but it happens).

To expand on this topic a bit: for professional use cases, such as medical consulting, a pre-trained voice will do, because the key is not the voice but the accuracy of the content. For everyday use cases, however, emotional engagement comes in, and this is not limited to gacha games.

Limitation

Current Solution

Current options

It looks like StyleTTS2 could be my savior after all.
Hey, I would be more than OK with adding support for this TTS. If you want to do it, I think it would be cool, and I would review it 👍 We are still discussing where to take this library next. Thank you for sharing your ideas!
Right now I am using SillyTavern, Kobold, and GPT-SoVITS to do a kind of speech-to-speech (with the voice I cloned). But it's slow even on a 3090; maybe a 4090 can do better? I have tried this HF speech-to-speech on a Mac, and it is a much better experience. Wherever you are heading, may fortune favor your path.
Thanks for this awesome project. Based on a similar pipeline, we have released a Chinese speech-to-speech project named CleanS2S, which supports more interesting and streaming interactions. Here is a snapshot of this project: Looking forward to more advice and feedback!
Congratulations, and many thanks! I think the project has great potential to become a popular foundation.
If you deem it appropriate, would you support GPT-SoVITS as well?
I know there is already a lot of TTS support so far, but GPT-SoVITS offers something different: it allows users to clone their favorite voice very efficiently.
Talking to an AI is inspiring, but hearing the response in a particular voice is what intrigues people, and it could be one of the ultimate goals when people are willing to talk to a machine. GPT-SoVITS can produce a decent voice clone from a few clips in a few minutes, which makes it an ideal addition to the existing TTS solutions.
Best wishes!