All voices have a "female pitch" #449

DaEmpty · 2024-12-07T14:11:17Z

Describe the bug
I started using the AllTalk beta in standalone mode last month and have been working with a commit from 29th November 2024 at 04:41.
Everything was fine.

Today I tried to package my projects and tested the current version (luckily in a new folder).
But all voices now sound female.
I have gone back to the old version, but the problem still exists with new "installations" of the old project (checking out, executing setup, starting).

I tried copying over the "confignew.json", but I had only changed delete_output_wavs anyway.

Testing was done with a simple request from Postman (identical for both instances).

It feels like a configuration error, but I'm not sure how to track down the problem, as I'm not able to get proper output from a fresh setup.

Text/logs
Here is the start-up output with the same version from November (the problem also occurs with the current version).
old:

[AllTalk TTS] Config file update: No Updates required
[AllTalk TTS] Start-up Mode : Standalone mode
[AllTalk TTS] WAV file deletion : Disabled
[AllTalk TTS] Github updated : 29th November 2024 at 04:41 Branch: alltalkbeta
[AllTalk ENG] Transcoding : ffmpeg found
[AllTalk ENG] DeepSpeed version : 0.14.0+ce78a63
[AllTalk ENG] Python Version : 3.11.10
[AllTalk ENG] PyTorch Version : 2.2.2+cu121
[AllTalk ENG] CUDA Version : 12.1
[AllTalk ENG]
[AllTalk ENG] Model/Engine : xttsv2_2.0.3 loading into cuda
[AllTalk ENG] Model License: https://coqui.ai/cpml.txt
[AllTalk ENG] Load time : 10.25 seconds.

new:
[AllTalk TTS] Config file update: No Updates required
[AllTalk TTS] Start-up Mode : Standalone mode
[AllTalk TTS] WAV file deletion : Disabled
[AllTalk TTS] Github updated : 29th November 2024 at 04:41 Branch: alltalkbeta
[AllTalk ENG] Transcoding : ffmpeg found
[AllTalk ENG] DeepSpeed version : 0.14.0+ce78a63
[AllTalk ENG] Python Version : 3.11.11
[AllTalk ENG] PyTorch Version : 2.2.1
[AllTalk ENG] CUDA Version : 12.1
[AllTalk ENG]
[AllTalk ENG] Model/Engine : xttsv2_2.0.3 loading into cuda
[AllTalk ENG] Model License: https://coqui.ai/cpml.txt
[AllTalk ENG] Load time : 9.92 seconds.

Desktop (please complete the following information):
AllTalk was updated: 29th November 2024 at 04:41
Also tested with the current version.
Custom Python environment: no
Text-generation-webUI was updated: standalone

Additional context
Windows Environment

erew123 · 2024-12-07T18:47:12Z

Hi @DaEmpty, I am away travelling; see #377 for details.

As such, I am currently unable to test, but I'm replying with a few things you can check.

Are you using the same XTTS model between the two versions? XTTS 2.0.2 and XTTS 2.0.3 do sound somewhat different. Check the model version in the models folder name.
Are you using the XTTS or API generation method on both versions? You can confirm this in the "load different model" dropdown: https://github.com/erew123/alltalk_tts/wiki/AllTalk-V2-QuickStart-Guide#4-generate-tts-tab
The people who actually maintain the Coqui TTS engine (https://github.com/idiap/coqui-ai-TTS/releases) did a version update a few days ago. You can generate a diagnostics log file for each of your installations with start_diagnostics. You can downgrade the Coqui TTS engine version with start_environment and then pip install --force-reinstall coqui-tts==0.24.2 and see if that makes a difference.
The only other possible difference I can think of is that recent versions only precompile the latents once rather than on every generation. This shouldn't cause a female sound, but you can delete the latent file that matches your wav file in the voices folder and it will be recalculated on the next TTS generation for that voice.
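
To narrow down what differs between two installs, it can help to compare the start-up output field by field. The following is a minimal Python sketch, assuming logs in the "[AllTalk ENG] key : value" shape posted earlier in this issue. parse_startup_log and diff_fields are hypothetical helper names, not part of AllTalk, and the sample values are copied from the two logs above. The start-up output does not list the coqui-tts package version, so a difference there would only show up in the diagnostics log or in pip.

```python
import re

# Matches lines such as "[AllTalk ENG] PyTorch Version : 2.2.2+cu121".
# The key is everything between the "[AllTalk ...]" tag and the first colon.
LINE_RE = re.compile(r"^\[AllTalk \w+\]\s*(?P<key>[^:]+?)\s*:\s*(?P<value>.+?)\s*$")


def parse_startup_log(text):
    """Return {field name: value} for every 'key : value' line in a start-up log."""
    fields = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            fields[match.group("key")] = match.group("value")
    return fields


def diff_fields(old, new):
    """Return {field: (old value, new value)} for the fields whose values differ."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


# Sample values copied from the "old" and "new" logs in this issue.
OLD_LOG = """\
[AllTalk ENG] DeepSpeed version : 0.14.0+ce78a63
[AllTalk ENG] Python Version : 3.11.10
[AllTalk ENG] PyTorch Version : 2.2.2+cu121
[AllTalk ENG] CUDA Version : 12.1
"""

NEW_LOG = """\
[AllTalk ENG] DeepSpeed version : 0.14.0+ce78a63
[AllTalk ENG] Python Version : 3.11.11
[AllTalk ENG] PyTorch Version : 2.2.1
[AllTalk ENG] CUDA Version : 12.1
"""

if __name__ == "__main__":
    changes = diff_fields(parse_startup_log(OLD_LOG), parse_startup_log(NEW_LOG))
    for field, (old_value, new_value) in changes.items():
        print(f"{field}: {old_value} -> {new_value}")
```

With the sample values above, only the Python and PyTorch versions are reported as different. That alone does not prove either one is the cause, but it shows the two environments were not built identically.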

Those are my only initial thoughts. AllTalk is effectively handing the text over to the Coqui TTS engine and, assuming you haven't enabled RVC voices or pitch adjustment, it should just generate absolutely normally.

As mentioned, I am travelling. If you look into the above possibilities but still feel there is an issue, would you please upload a diagnostics log for both your old and new builds of AllTalk? Also, if you wish to share what you consider bad generations, you can upload a couple of samples to https://easyupload.io/ for me to listen to.

Thanks

erew123 · 2024-12-08T03:09:56Z

Found the issue to be a problem with the latest Coqui TTS engine (not something I've done, thankfully).

Downgrade by running:

start_environment.bat (or .sh if on Linux)
pip install --force-reinstall coqui-tts==0.24.3

All should be working fine again after that. I have set this in the requirements files, so future installations shouldn't be affected.
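
To confirm the downgrade took effect, the installed package version can be read from inside the AllTalk Python environment. This is a minimal sketch using the standard library's importlib.metadata. check_pin is a hypothetical helper name, and the package name and version are taken from the pip command above.

```python
from importlib import metadata


def check_pin(package, expected, lookup=metadata.version):
    """Return (ok, message) describing whether `package` is installed at `expected`.

    `lookup` defaults to importlib.metadata.version and can be swapped for testing.
    """
    try:
        installed = lookup(package)
    except metadata.PackageNotFoundError:
        return False, f"{package} is not installed in this environment"
    if installed == expected:
        return True, f"{package} {installed} matches the pinned version"
    return False, f"{package} {installed} is installed, expected {expected}"


if __name__ == "__main__":
    # Run this after start_environment so it inspects AllTalk's own environment.
    ok, message = check_pin("coqui-tts", "0.24.3")
    print(message)
```

If the message reports a different version, the force-reinstall was probably run outside the AllTalk environment.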

Thanks

erew123 closed this as completed Dec 8, 2024
