Predictions Speech to text doesn't work in fr-FR, en-AU, en-GB and fr-CA #7808

Closed
SebSchwartz opened this issue Feb 23, 2021 · 5 comments
Labels
bug Something isn't working Predictions Related to Predictions category

Comments

@SebSchwartz
Contributor

Describe the bug
Speech to text cannot work in fr-FR, en-AU, en-GB and fr-CA: streaming transcription for these languages is limited to a sample rate of 8 kHz (as explained in the docs -> https://docs.aws.amazon.com/transcribe/latest/dg/streaming.html), but Amplify always sends 16 kHz in the request.

To Reproduce

  1. Take the example from -> https://docs.amplify.aws/lib/predictions/sample/q/platform/js#sample-react-app
  2. Set the language to one of those listed above.
  3. You will receive 2 errors:
    1. A missing return in the code, which should be fixed by #7803 (Missing return in AmazonAIConvertPredictionsProvider.ts).
    2. The real error -> The requested language doesn't support the specified sample rate. Use the correct sample rate then try again.
  4. You could try to resample to 8 kHz as requested in the AWS docs, for example with the wavefile library (this code works for en-US):
let wav = new WaveFile();
wav.fromBuffer(new Uint8Array(fileBlob));
wav.toSampleRate(8000);
  5. But the sample rate in the request to the Streaming Transcription endpoint is hardcoded to 16 kHz, as you can see in the code (L402):
    private generateTranscribeUrl({ credentials, region, languageCode }): string {
      const url = [
        `wss://transcribestreaming.${region}.amazonaws.com:8443`,
        '/stream-transcription-websocket?',
        `media-encoding=pcm&`,
        `sample-rate=16000&`,

Expected behavior
Amplify should detect the sample rate or allow the user to pass it as a parameter (which would be easier to do).
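
To illustrate the expected behavior, here is a minimal sketch of choosing the sample rate from the language code while still letting the caller override it. The names `LOW_SAMPLE_RATE_LANGUAGES`, `sampleRateFor` and `buildTranscribeQuery` are hypothetical and are not part of Amplify; the list of 8 kHz languages is taken from this report, not from an exhaustive check of the AWS docs.

```typescript
// Hypothetical helpers, not Amplify's actual API.
// Languages reported in this issue as limited to 8 kHz for streaming.
const LOW_SAMPLE_RATE_LANGUAGES = new Set(['fr-FR', 'en-AU', 'en-GB', 'fr-CA']);

// Pick a default sample rate (in Hz) for a given language code.
function sampleRateFor(languageCode: string): number {
  return LOW_SAMPLE_RATE_LANGUAGES.has(languageCode) ? 8000 : 16000;
}

// Build the part of the query string that depends on the sample rate.
// The caller may pass an explicit sampleRate to override the default.
function buildTranscribeQuery(
  languageCode: string,
  sampleRate: number = sampleRateFor(languageCode)
): string {
  return ['media-encoding=pcm', `sample-rate=${sampleRate}`].join('&');
}
```

With this shape, `buildTranscribeQuery('fr-FR')` would request 8000 Hz, while `buildTranscribeQuery('en-US', 8000)` shows the "pass it as a parameter" option.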

Code Snippet

const speechToTextOutput = await Predictions.convert({
  transcription: {
    source: {
      bytes: rawPcm,
    },
    language: 'fr-FR',
  },
});
@SebSchwartz SebSchwartz added the to-be-reproduced Used in order for Amplify to reproduce said issue label Feb 23, 2021
@iartemiev iartemiev added the Predictions Related to Predictions category label Feb 23, 2021
@wlee221 wlee221 added bug Something isn't working and removed to-be-reproduced Used in order for Amplify to reproduce said issue labels Feb 23, 2021
@wlee221
Contributor

wlee221 commented Feb 23, 2021

Hi @SebSchwartz, thanks for your research! You are right, and we should downsample the buffer to 8 kHz depending on the language code. We already have code for downsampling that should make this fix easier:

and we were hard-setting the target downsample sample rate to 8khz here

We would just need to set that to 8 kHz if the target language code is one of the four you mentioned (maybe add that as a parameter to downsampleBuffer). Then finally we would dynamically set the sample-rate value in the request URL to 8 kHz or 16 kHz. Do you think you can work on creating a PR for this?
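
For readers following along, here is a rough sketch of what a downsampling function with a configurable target rate could look like. This is not Amplify's actual `downsampleBuffer`; the signature and the block-averaging approach are assumptions made for illustration only.

```typescript
// Illustrative only: signature and algorithm are assumptions, not Amplify's code.
// Downsamples by averaging each block of input samples that maps to one output sample.
function downsampleBuffer(
  buffer: Float32Array,
  inputSampleRate: number,
  outputSampleRate: number
): Float32Array {
  // Nothing to do if the target rate is not lower than the source rate.
  if (outputSampleRate >= inputSampleRate) {
    return buffer;
  }

  const ratio = inputSampleRate / outputSampleRate;
  const newLength = Math.round(buffer.length / ratio);
  const result = new Float32Array(newLength);

  let offsetBuffer = 0;
  for (let offsetResult = 0; offsetResult < newLength; offsetResult++) {
    const nextOffsetBuffer = Math.round((offsetResult + 1) * ratio);
    let accum = 0;
    let count = 0;
    for (let i = offsetBuffer; i < nextOffsetBuffer && i < buffer.length; i++) {
      accum += buffer[i];
      count++;
    }
    result[offsetResult] = count > 0 ? accum / count : 0;
    offsetBuffer = nextOffsetBuffer;
  }
  return result;
}
```

The caller would then pass 8000 or 16000 as `outputSampleRate` depending on the language code, and use the same value for the sample rate in the request.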

@SebSchwartz
Contributor Author

@wlee221 I've pushed my PR #7835 ;)

@wlee221
Contributor

wlee221 commented Mar 3, 2021

Thanks a lot @SebSchwartz ! I'll take a look :)

@sammartinez
Contributor

Since the above pull request has been merged, I am going to close this issue.

@github-actions

This issue has been automatically locked since there hasn't been any recent activity after it was closed. Please open a new issue for related bugs.

Looking for a help forum? We recommend joining the Amplify Community Discord server *-help channels or Discussions for those types of questions.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 19, 2023