When Speech-to-Text remains silent for the first 10 minutes after starting, the library fails to callback when stopContinuousRecognitionAsync is called #658

Closed
shareefalis opened this issue Apr 4, 2023 · 5 comments

shareefalis commented Apr 4, 2023

Our team is currently developing an app that includes a feature which enables users to access Microsoft speech to text (STT) output. We have implemented a timeout mechanism in the app which triggers stopImpl() when there has been no STT output for a certain period of time while the Microsoft STT engine is running.

However, we have encountered an issue when the timeout is set to 10 minutes and there is no speech input during the first 10 minutes (i.e. complete silence) resulting in no STT output from the engine. In this scenario, the app becomes stuck in SpeechToTextClientStates.STOPPING_STATE. Although we have implemented a workaround by lowering the timeout to 9.5 minutes, we wanted to bring this issue to your attention.
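A minimal sketch of an inactivity timeout of the kind described above, set to the 9.5-minute workaround value. The names `InactivityTimer` and `INACTIVITY_LIMIT_MS` are hypothetical and are not taken from the app's code; the sketch only assumes that something calls `reset()` whenever STT output arrives.

```typescript
// Hypothetical constant: the 9.5-minute workaround value, in milliseconds.
const INACTIVITY_LIMIT_MS = 9.5 * 60 * 1000;

// Hypothetical helper: fires onTimeout if reset() is not called again
// within limitMs. Not part of the Speech SDK.
class InactivityTimer {
  private handle: ReturnType<typeof setTimeout> | undefined = undefined;
  private readonly onTimeout: () => void;
  private readonly limitMs: number;

  constructor(onTimeout: () => void, limitMs: number = INACTIVITY_LIMIT_MS) {
    this.onTimeout = onTimeout;
    this.limitMs = limitMs;
  }

  // Call on start and again on every STT result to restart the countdown.
  reset(): void {
    this.cancel();
    this.handle = setTimeout(() => {
      this.handle = undefined;
      this.onTimeout();
    }, this.limitMs);
  }

  // Call when recognition stops so the timer cannot fire afterwards.
  cancel(): void {
    if (this.handle !== undefined) {
      clearTimeout(this.handle);
      this.handle = undefined;
    }
  }

  // True while a countdown is pending.
  isArmed(): boolean {
    return this.handle !== undefined;
  }
}
```

In an app like the one described, `reset()` would presumably be called from the recognizer's result handlers, and the `onTimeout` callback would invoke `stopImpl()`.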

Here is our stopImpl():

stopImpl(): void {
  if (this.recognizer && (this.isStarted() || this.isStarting())) {
    this.setState(SpeechToTextClientStates.STOPPING_STATE);
    this.recognizer.stopContinuousRecognitionAsync(
      () => {
        this.setState(SpeechToTextClientStates.STOPPED_STATE);
        // more stuff here
      },
      (err: string) => {
        this.setState(SpeechToTextClientStates.STOPPED_STATE, new Error(err));
        // more stuff
      },
    );
  } else {
    this.setState(SpeechToTextClientStates.STOPPED_STATE);
    // more stuff
  }
}
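As an illustration of how a caller could avoid staying in STOPPING_STATE when neither callback fires, here is a minimal sketch of a watchdog around `stopContinuousRecognitionAsync`. This is an assumption-laden sketch, not an SDK feature or a fix from the maintainers: `stopWithWatchdog`, `StoppableRecognizer`, `Timers`, and the 5-second default are hypothetical names and values, and only the two-callback shape of `stopContinuousRecognitionAsync` comes from the snippet above.

```typescript
// Hypothetical minimal shape of the recognizer; only the method used in
// stopImpl() above is required.
interface StoppableRecognizer {
  stopContinuousRecognitionAsync(cb?: () => void, err?: (e: string) => void): void;
}

// Hypothetical timer abstraction so the watchdog can be tested without waiting.
interface Timers {
  set(fn: () => void, ms: number): unknown;
  clear(handle: unknown): void;
}

const realTimers: Timers = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

// Calls onStopped exactly once: on success, on an SDK error, or when the
// watchdog expires because neither SDK callback was invoked.
function stopWithWatchdog(
  recognizer: StoppableRecognizer,
  onStopped: (error?: Error) => void,
  watchdogMs: number = 5000, // assumed value, tune for the app
  timers: Timers = realTimers,
): void {
  let settled = false;
  let handle: unknown = undefined;

  const finish = (error?: Error): void => {
    if (settled) return; // ignore late or duplicate callbacks
    settled = true;
    timers.clear(handle);
    onStopped(error);
  };

  handle = timers.set(
    () => finish(new Error(`stop did not complete within ${watchdogMs} ms`)),
    watchdogMs,
  );

  recognizer.stopContinuousRecognitionAsync(
    () => finish(),
    (err: string) => finish(new Error(err)),
  );
}
```

Inside `stopImpl()`, the `onStopped` callback would be the place to call `setState(SpeechToTextClientStates.STOPPED_STATE, error)`, so the state machine always leaves STOPPING_STATE even if the SDK never calls back.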

rhurey (Member) commented Apr 11, 2023

Thanks for pointing this out.

There was an unfortunate circumstance at exactly the 10-minute mark: the Speech Service would ask the client to disconnect but not fully complete the disconnect handshake. If that also happened at the very end of the audio being sent to the service, the client would get stuck.

While we have a work item to harden the client against this in a future version, there is also a service fix rolling out to complete the handshake, which should mitigate this for existing SDKs.

glharper (Member) commented

Resolving, as the service team has indicated that this fix has rolled out.

shareefalis (Author) commented

@glharper @rhurey This still isn't fixed. Can you confirm that the fix has rolled out on the service side?

shareefalis (Author) commented

@glharper @rhurey Any update on this issue?

