Feel free to post any issues or questions about Cognitive TTS service here! #128

szhaomsft · 2019-08-21T06:17:15Z

We encourage developers to post issues and questions in this forum.

It is monitored regularly.

szhaomsft · 2019-08-21T06:19:41Z

Posts like the following are welcome:

Questions about the API
Feedback about the service
Feature requests
Sample requests

phly95 · 2019-09-09T21:23:27Z

It seems the output is limited to 10 minutes of audio (at least using the neural option). What if I want to process a long text file, like a required reading or a chapter of a book?
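
One possible way to stay under a per-request limit is to split the text at sentence boundaries and synthesize each chunk separately. The sketch below only does the splitting. The function name `split_into_chunks` and the `max_chars` budget are hypothetical, and the character count that corresponds to 10 minutes of audio is not something this thread states, so it would need to be tuned by experiment.

```python
import re


def split_into_chunks(text: str, max_chars: int = 3000) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    max_chars is an illustrative budget, not a documented service limit.
    A single sentence longer than max_chars is kept whole as its own chunk.
    """
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars or not current:
            # Still within budget, or the chunk is empty and must take the sentence.
            current = candidate
        else:
            # Budget exceeded: close the current chunk and start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be wrapped in its own SSML request, and the resulting audio files concatenated in order.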

bodyzatva · 2019-09-10T15:14:05Z

I'm using the voice pt-PT-HeliaRUS with language pt_PT for a chatbot in a client project.
We are facing issues when the bot speaks email addresses.
When I send this text using ssmlSpeak:

"O seu e-mail é <say-as interpret-as="characters">anarebelo@sapo.pt</say-as>"

The email address is not spelled out.

I tried other voices and languages, such as pt-BR-HeloisaRUS (pt_BR) and en-US-JessaRUS (en-US).

Only the voice "en-US-Jessa24kRUS" spells it out.

Can you tell me why?

By the way, we have a workaround that forces spelling: separating the characters of the email address with spaces:
"O seu e-mail é <say-as interpret-as="characters">a n a r e b e l o @ s a p o . p t</say-as>"

Is this a problem with the pt-PT language? And how can I get better results when spelling out email addresses?
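
The spacing workaround described above can be automated. This is a minimal sketch, assuming the only goal is to put a space between every character and wrap the result in the same `say-as` element. The helper name `spell_out_ssml` is hypothetical, and the sketch does not address why the voices behave differently.

```python
from xml.sax.saxutils import escape


def spell_out_ssml(text: str) -> str:
    """Wrap text in a say-as element with a space between every character."""
    # " ".join on a string inserts a space between each pair of characters.
    spaced = " ".join(text)
    # escape() protects '&', '<' and '>' so the SSML stays well-formed.
    return f'<say-as interpret-as="characters">{escape(spaced)}</say-as>'


# Example with a placeholder address:
fragment = "O seu e-mail é " + spell_out_ssml("user@example.com")
```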

shoutbomb · 2019-10-08T16:59:20Z

How do I control the pace of the generated speech? I need to slow it down by 10%.
(en-US, JessaNeural)
X-Microsoft-OutputFormat: riff-24khz-16bit-mono-pcm
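
SSML has a `prosody` element with a `rate` attribute, which is the usual place to express speaking pace. The sketch below builds such a request body, assuming the service accepts a relative percentage such as `-10%` for this voice. The helper name `build_rate_ssml` and the full voice name `en-US-JessaNeural` are assumptions based on the details given in the question.

```python
from xml.sax.saxutils import escape, quoteattr


def build_rate_ssml(
    text: str,
    voice: str = "en-US-JessaNeural",  # assumed full voice name
    rate: str = "-10%",                # relative slowdown; support is an assumption
    lang: str = "en-US",
) -> str:
    """Return an SSML document that speaks text at the given prosody rate."""
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>"
        # escape() keeps '&', '<' and '>' in the text from breaking the markup.
        f"<prosody rate={quoteattr(rate)}>{escape(text)}</prosody>"
        "</voice></speak>"
    )


ssml = build_rate_ssml("Your books are ready for pickup.")
```

The resulting string would be sent as the request body with the `Content-Type: application/ssml+xml` header.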

hannabonert · 2021-11-24T11:21:25Z

Hello,
I followed the sample from here, and can successfully send 16 kHz audio to the service, and receive a valid response.
How can I use a sampling rate of 44100 Hz?

I have tried both of the below, but I get "InitialSilenceTimeout" for every recording that I try.
connection.setRequestProperty("Content-Type", "audio/raw;encoding=signed-integer;bits=16;rate=44100");
connection.setRequestProperty("Content-Type", "audio/wav; codecs=audio/pcm; samplerate=44100");

As a test, I ran the same recording through at 16 kHz, and got a "RecognitionStatus" of "Success". I then resampled it to 44100 Hz, and I got "InitialSilenceTimeout".

I have an example using the speech SDK that works, but now I need to use 44 kHz audio data with the REST API.

Any advice would be greatly appreciated.

Thank you!

isabirahmed · 2022-11-20T13:12:11Z

Text to Speech does not set the right pitch if two pitches are set in one request.

Sample SSML:

<speak xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xmlns:emo="http://www.w3.org/2009/10/emotionml" version="1.0" xml:lang="en-US"><voice name="en-US-SaraNeural">Much against his will Reddy obeyed. <prosody rate="default" pitch="28%" volume="default">“It isn’t the least bit of use,”</prosody> he grumbled, as he trotted towards the Big River. <prosody rate="default" pitch="28%" volume="default">“There won’t be anything there. It is just a waste of time.”</prosody></voice></speak>

I have a sentence with two parts of it set to pitch=28%.
The first part, "It isn’t the least bit of use," sounds more like pitch=8% even though it's set to 28%.
The second part "There won’t be anything there. It is just a waste of time." sounds correct at pitch=28%

Please note that this happens with all voices and looks like a major bug.
It only happens when the pitch is set on more than one part of the text.

Please test this in the US East region.
Sample audio file: https://fliki.ai/share/audio/microsoft-pitch-issue-637b4b26dde64016ddbd2a51

gchiarapa · 2022-12-16T21:03:15Z

I'm using the REST API to synthesize text to speech, and I'd like to know how to play the response. Any ideas on how to convert and play the response?

My request:

    uri = 'https://brazilsouth.tts.speech.microsoft.com/cognitiveservices/v1';
    method = 'POST';
    $http({
        "method": method,
        "url": uri,
        "headers": {
            "Content-type": "application/ssml+xml",
            "X-Microsoft-OutputFormat": "audio-16khz-64kbitrate-mono-mp3",
            "Host": "brazilsouth.tts.speech.microsoft.com",
            "User-Agent": "",
            "Authorization": "Bearer " + res.data
        },
        data: ''
    });

Chukarslan · 2023-10-20T20:26:21Z

This pertains to commit: d457a6d
GPT Streaming response based on TTS text splitter for each sentence.

Is it possible to share the Python / Node version of the code?

Thanks!

szhaomsft added the Annoucement label Aug 21, 2019

boltomli mentioned this issue Sep 10, 2019

It seems the output is limited to 10 minutes of audio #132

Closed

boltomli mentioned this issue Sep 11, 2019

pt-PT and pt-BR say-as spell support issue #134

Closed

boltomli mentioned this issue Oct 9, 2019

[Neural] How do I control the pace of the generated speech #142

Closed

szhaomsft self-assigned this Dec 30, 2019
