[server][asr] more accurate decoding something #2128

Merged
merged 3 commits into from
Jul 12, 2022
2 changes: 2 additions & 0 deletions paddlespeech/__init__.py
@@ -14,3 +14,5 @@
import _locale

_locale._getdefaultlocale = (lambda *args: ['en_US', 'utf8'])


7 changes: 4 additions & 3 deletions paddlespeech/server/engine/asr/online/ctc_endpoint.py
@@ -39,10 +39,10 @@ class OnlineCTCEndpoingOpt:

# rule1 times out after 5 seconds of silence, even if we decoded nothing.
rule1: OnlineCTCEndpointRule = OnlineCTCEndpointRule(False, 5000, 0)
- # rule4 times out after 1.0 seconds of silence after decoding something,
+ # rule2 times out after 1.0 seconds of silence after decoding something,
# even if we did not reach a final-state at all.
rule2: OnlineCTCEndpointRule = OnlineCTCEndpointRule(True, 1000, 0)
- # rule5 times out after the utterance is 20 seconds long, regardless of
+ # rule3 times out after the utterance is 20 seconds long, regardless of
# anything else.
rule3: OnlineCTCEndpointRule = OnlineCTCEndpointRule(False, 0, 20000)
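
The three rules above can be read as independent checks, any of which may end the utterance. The following is a minimal sketch of that reading, not the PaddleSpeech implementation: `EndpointRule`, its field names, `rule_activated`, and `fired_rules` are hypothetical stand-ins, and the meaning of the three positional arguments (must have decoded something, minimum trailing silence in ms, minimum utterance length in ms) is inferred from the comments in the diff.

```python
from dataclasses import dataclass


@dataclass
class EndpointRule:
    # Hypothetical stand-in for OnlineCTCEndpointRule; field names are assumed.
    must_decode_something: bool
    min_trailing_silence_ms: int
    min_utterance_length_ms: int


# Same values as rule1, rule2, and rule3 in the diff.
RULES = {
    "rule1": EndpointRule(False, 5000, 0),
    "rule2": EndpointRule(True, 1000, 0),
    "rule3": EndpointRule(False, 0, 20000),
}


def rule_activated(rule, decoding_something, trailing_silence_ms,
                   utterance_length_ms):
    """Return True when every condition of a single rule is satisfied."""
    return ((decoding_something or not rule.must_decode_something)
            and trailing_silence_ms >= rule.min_trailing_silence_ms
            and utterance_length_ms >= rule.min_utterance_length_ms)


def fired_rules(decoding_something, trailing_silence_ms, utterance_length_ms):
    """List the names of all rules that would signal an endpoint."""
    return [
        name for name, rule in RULES.items()
        if rule_activated(rule, decoding_something, trailing_silence_ms,
                          utterance_length_ms)
    ]


if __name__ == "__main__":
    # 1 s of trailing silence after some decoded speech: only rule2 applies.
    print(fired_rules(True, 1000, 3000))
    # 1 s of silence with nothing decoded: no rule applies yet.
    print(fired_rules(False, 1000, 3000))
```

Under this reading, rule2 is the only rule that depends on whether something was decoded, which is why the accuracy of that flag matters for endpointing.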

@@ -102,7 +102,8 @@ def endpoint_detected(self,

assert self.num_frames_decoded >= self.trailing_silence_frames
assert self.frame_shift_in_ms > 0


+ decoding_something = (self.num_frames_decoded > self.trailing_silence_frames) and decoding_something
utterance_length = self.num_frames_decoded * self.frame_shift_in_ms
trailing_silence = self.trailing_silence_frames * self.frame_shift_in_ms
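
The effect of the `decoding_something` guard in this hunk can be illustrated in isolation. This is a sketch under assumptions: `endpoint_inputs` is a hypothetical free function that mirrors the lines of the hunk without the surrounding class, and the 40 ms frame shift in the example is an arbitrary illustrative value, not one taken from the PR.

```python
def endpoint_inputs(num_frames_decoded, trailing_silence_frames,
                    frame_shift_in_ms, decoding_something):
    """Mirror the hunk: refine the flag, then convert frame counts to ms."""
    assert num_frames_decoded >= trailing_silence_frames
    assert frame_shift_in_ms > 0

    # If every decoded frame so far is trailing silence, nothing has really
    # been decoded, whatever the caller passed in.
    decoding_something = (
        num_frames_decoded > trailing_silence_frames) and decoding_something
    utterance_length = num_frames_decoded * frame_shift_in_ms
    trailing_silence = trailing_silence_frames * frame_shift_in_ms
    return decoding_something, utterance_length, trailing_silence


if __name__ == "__main__":
    # 30 frames decoded, all of them silence: the flag is forced to False.
    print(endpoint_inputs(30, 30, 40, True))
    # 50 frames decoded, the last 30 silent: the flag stays True.
    print(endpoint_inputs(50, 30, 40, True))
```

In the first call the trailing silence is 1200 ms, so a rule that requires 1000 ms of silence after decoding something would have been satisfied had the flag been passed through unchanged; with the guard it is not.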
