
Create per.py #7538

Merged: 153 commits, Oct 7, 2023
Conversation

@ssh-meister (Collaborator)

Script for calculating Punctuation Error Rate (PER) and related rates (correct rate, deletion rate, etc.)
@karpnv @ekmb @vsl9 @KunalDhawan
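
For context, the core idea is WER-style alignment applied to punctuation marks only. A minimal, self-contained sketch (a simplification for illustration; the actual per.py tokenizes and aligns differently and also reports the related rates):

```python
# Simplified sketch of Punctuation Error Rate (PER): extract the
# punctuation marks from reference and hypothesis, align them with
# Levenshtein distance, and normalize by the reference count.
# Illustrative only; not the per.py implementation.

PUNCT = {".", ",", "?", "!"}

def punct_tokens(text: str) -> list:
    """Punctuation marks in order of appearance."""
    return [ch for ch in text if ch in PUNCT]

def edit_distance(ref: list, hyp: list) -> int:
    """Minimal number of substitutions, deletions, and insertions."""
    m, n = len(ref), len(hyp)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,           # deletion
                           dp[i][j - 1] + 1,           # insertion
                           dp[i - 1][j - 1] + cost)    # match / substitution
    return dp[m][n]

def per(ref_text: str, hyp_text: str) -> float:
    ref, hyp = punct_tokens(ref_text), punct_tokens(hyp_text)
    if not ref:
        return 0.0
    return edit_distance(ref, hyp) / len(ref)

print(per("Hello, world. How are you?", "Hello world. How are you."))  # 0.667
```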

@jubick1337 (Collaborator)

We have a package for ASR-related metrics. Why not move the metric there?

@karpnv (Collaborator) commented Sep 27, 2023

move it from nemo/collections/common/metrics/per.py to nemo/collections/asr/metrics/

@ssh-meister (Collaborator, Author)

@jubick1337 @karpnv Because this metric is applicable to both ASR and NLP PC models (the paper notes that PER allows comparing prediction accuracy across PC models in general).

@itzsimpl (Contributor)

> move it from nemo/collections/common/metrics/per.py to nemo/collections/asr/metrics/

Note that, conceptually, this implementation of PER will give useful results only for NLP, where the input is the text to be punctuated and is immutable (i.e. the reference is the same text punctuated by hand: a clear ground truth to compare against).

For the ASR case, e.g. with hybrid_pc models, the input is audio and the reference is the punctuated transcription (typically obtained by hand). If recognition is flawless, PER gives the correct information (the quality of the punctuation subtask). If recognition is not flawless, punctuation may be affected by WER, and PER with it (e.g. a word that is typically followed by a comma gets deleted). In some cases the punctuation of an imperfect transcription may still be flawless from the standpoint of punctuation, but because the metric is computed against the original punctuated reference, the PER score suffers anyway. There is no simple solution to this other than manually re-punctuating the transcription produced by the ASR task and using that as the reference, which may not be viable, but it is something to keep in mind when reporting or using the PER metric.
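
To make the failure mode concrete, a toy case (reusing the simplified per sketch from the PR description above; the real metric may score it differently):

```python
# Hypothetical illustration: an ASR word deletion propagating into PER.
reference  = "We met John, who was late."
hypothesis = "We met who was late."   # ASR dropped the word "John,"

# Reference punctuation: [",", "."]; hypothesis punctuation: ["."].
# The comma is scored as a PER deletion even though the punctuation
# subtask never saw the word it should have attached to.
print(per(reference, hypothesis))   # 0.5 under the simplified sketch
```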

github-actions bot added the ASR label on Sep 29, 2023
@jubick1337 (Collaborator)

Could you please add tests for PER here?

@jubick1337 (Collaborator) left a comment:

Minor format changes; otherwise looks good to me.

@ssh-meister (Collaborator, Author)

> Could you please add tests for PER here?

@jubick1337, I've added tests; could you please take another look?

@ekmb requested a review from @karpnv on October 2, 2023
@karpnv (Collaborator) left a comment:

Let's move rate_punctuation.py into https://github.com/NVIDIA/NeMo/blob/main/examples/asr/speech_to_text_eval.py by adding a use_per=False option.
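
For illustration, such an option might follow the existing use_cer pattern in the eval config, roughly like this (a hypothetical sketch; not the actual speech_to_text_eval.py API):

```python
# Hypothetical sketch of an evaluation config extended with a PER switch.
from dataclasses import dataclass

@dataclass
class EvaluationConfig:
    dataset_manifest: str = ""
    use_cer: bool = False   # existing character-error-rate switch style
    use_per: bool = False   # proposed: also report Punctuation Error Rate
```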

@ssh-meister (Collaborator, Author)

> Let's move rate_punctuation.py into https://github.com/NVIDIA/NeMo/blob/main/examples/asr/speech_to_text_eval.py by adding a use_per=False option.

This kind of modification would cut certain features. For instance, speech_to_text_eval.py currently has no provision for computing the supported metrics (WER/CER) separately for each sample and saving them to the manifest. That capability could be essential when using punctuation rates (correct rate / deletion rate / etc.) as thresholds for further sample filtering (e.g. in SDE).

It may be advisable to add a feature to speech_to_text_eval.py that calculates only the total PER value over a dataset, but in addition to the existing rate_punctuation.py functionality rather than as a replacement; a sketch of the per-sample output follows.
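
As a sketch of what that per-sample output could look like (hypothetical field names, reusing the simplified per function from the PR description; the actual rate_punctuation.py schema may differ):

```python
# Hypothetical sketch: annotating a JSON-lines manifest with per-sample PER.
import json

def annotate_manifest(in_path: str, out_path: str) -> None:
    with open(in_path, encoding="utf-8") as fin, \
         open(out_path, "w", encoding="utf-8") as fout:
        for line in fin:
            entry = json.loads(line)
            # "text" = reference, "pred_text" = prediction (assumed fields)
            entry["per"] = per(entry["text"], entry["pred_text"])
            fout.write(json.dumps(entry, ensure_ascii=False) + "\n")
```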
