Home
The main aim of this project is to build a user-friendly automatic speech recognition evaluation system that can evaluate multiple speech recognition engines on a given dataset.
The main window lets the user choose between two components:
- Recognize & Evaluate
- Performance calculator
A description of each component is also provided in the main window.
The Recognize & Evaluate component runs each speech recognition system on the audio files in the speech database and then evaluates the recognition output against the reference output. To perform a complete recognition and evaluation run, this component requires each speech recognition system’s SDK, its related models, and a speech database consisting of audio files and their respective transcriptions. How the models are configured depends entirely on each speech recognition system’s API. Once the recognition output is obtained, the recognized text is aligned with the reference text and the result is reported in terms of performance metrics.
The Performance calculator is used when a speech recognition system’s output and the reference text of the recognized speech files are already available. This component compares the reference text with the hypothesis text through an alignment process and reports the result in terms of performance metrics. Its functionality is a subset of the Recognize & Evaluate component.
In both components, the output text is evaluated against the reference text using a Viterbi alignment algorithm, which penalizes the substitution, deletion, and insertion of words relative to the reference text.
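The alignment step can be illustrated with a minimal Python sketch. It is not the project's own code: it assumes a standard dynamic-programming (minimum edit distance) word alignment with a unit penalty for each substitution, deletion, and insertion, and it assumes word error rate (WER) as one of the reported performance metrics. The names `AlignmentCounts`, `align_words`, and `word_error_rate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AlignmentCounts:
    """Word-level outcome of aligning a hypothesis to a reference."""
    hits: int = 0
    substitutions: int = 0
    deletions: int = 0
    insertions: int = 0


def align_words(reference: str, hypothesis: str) -> AlignmentCounts:
    """Align two whitespace-tokenized texts with unit edit penalties (assumed)."""
    ref = reference.split()
    hyp = hypothesis.split()

    # cost[i][j] = minimum number of edits turning ref[:i] into hyp[:j]
    cost = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        cost[i][0] = i  # delete every reference word
    for j in range(1, len(hyp) + 1):
        cost[0][j] = j  # insert every hypothesis word
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            mismatch = int(ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(
                cost[i - 1][j - 1] + mismatch,  # hit or substitution
                cost[i - 1][j] + 1,             # deletion
                cost[i][j - 1] + 1,             # insertion
            )

    # Trace one optimal path back from the end to count each edit type.
    counts = AlignmentCounts()
    i, j = len(ref), len(hyp)
    while i > 0 or j > 0:
        mismatch = int(i > 0 and j > 0 and ref[i - 1] != hyp[j - 1])
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + mismatch:
            if mismatch:
                counts.substitutions += 1
            else:
                counts.hits += 1
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            counts.deletions += 1
            i -= 1
        else:
            counts.insertions += 1
            j -= 1
    return counts


def word_error_rate(counts: AlignmentCounts) -> float:
    """WER = (S + D + I) / N, where N is the number of reference words."""
    n_ref = counts.hits + counts.substitutions + counts.deletions
    if n_ref == 0:
        raise ValueError("reference text contains no words")
    errors = counts.substitutions + counts.deletions + counts.insertions
    return errors / n_ref
```

Calling `word_error_rate(align_words(reference_text, hypothesis_text))` on one transcription pair gives a per-utterance score. When several optimal alignments exist, the split between substitutions, deletions, and insertions depends on the tie-breaking order in the traceback, but the total error count and the WER do not.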