Add Initial support for ContextNet Encoder and CTC Decoder #630
Signed-off-by: smajumdar <[email protected]>
few small comments
nemo/collections/asr/contextnet.py
logging = nemo.logging


class ContextNetEncoder(TrainableNM):
Should this inherit from JasperEncoder?
Good point
On second thought, it probably should not inherit from JasperEncoder. They currently share exactly the same functionality, but in the future they will not. At that point, the inherited __init__ call would instantiate multiple JasperBlocks before ContextNetEncoder starts to instantiate its own values.
There is some duplication for now, but it is cleaner to keep the two modules separate.
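The concern above can be illustrated with a minimal sketch. The classes below are hypothetical stand-ins, not the NeMo modules: `Block` plays the role of an expensive sub-module, and a counter records how many are constructed. The assumption is that the subclass eventually needs blocks that differ from the ones its parent builds.

```python
class Block:
    """Stand-in for an expensive sub-module (hypothetical, not a NeMo class)."""

    created = 0  # counts every Block ever constructed

    def __init__(self, name):
        Block.created += 1
        self.name = name


class BaseEncoder:
    """Plays the role of the parent encoder: builds its blocks in __init__."""

    def __init__(self, n_blocks):
        self.blocks = [Block(f"base-{i}") for i in range(n_blocks)]


class InheritingEncoder(BaseEncoder):
    """Subclass that needs different blocks than its parent."""

    def __init__(self, n_blocks):
        # The parent constructor runs first and builds its own blocks ...
        super().__init__(n_blocks)
        # ... which are then discarded and rebuilt by the subclass.
        self.blocks = [Block(f"own-{i}") for i in range(n_blocks)]


class StandaloneEncoder:
    """Separate module: some duplicated code, but no wasted construction."""

    def __init__(self, n_blocks):
        self.blocks = [Block(f"own-{i}") for i in range(n_blocks)]


def blocks_built(encoder_cls, n_blocks):
    """Return how many Block objects constructing encoder_cls creates."""
    Block.created = 0
    encoder_cls(n_blocks)
    return Block.created
```

In this sketch, `blocks_built(InheritingEncoder, 3)` counts twice as many blocks as `blocks_built(StandaloneEncoder, 3)`, which is the wasted work the comment describes.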
Fixed
This pull request introduces 1 alert when merging 8c81303 into a22d325 - view on LGTM.com new alerts:
# (ContextNet uses the Jasper baseline encoder and decoder)
encoder = nemo_asr.ContextNetEncoder(
    feat_in=contextnet_params["AudioToMelSpectrogramPreprocessor"]["features"],
Just a note that you can add this inside the yaml itself.
See https://confluence.atlassian.com/bitbucket/yaml-anchors-960154027.html
Thanks for the hint!
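A minimal sketch of the YAML anchor suggestion, as a config fragment. Only `AudioToMelSpectrogramPreprocessor` and `features` come from the snippet under review. The `ContextNetEncoder` section name, the anchor name `n_mels`, and the value 64 are assumptions made for illustration.

```yaml
AudioToMelSpectrogramPreprocessor:
  features: &n_mels 64   # placeholder value; "&n_mels" defines the anchor

ContextNetEncoder:       # hypothetical section name
  feat_in: *n_mels       # alias: resolves to the same value as "features"
```

With a layout like this, the encoder's input size would be read from its own config section rather than looked up in the preprocessor section from Python.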
This pull request introduces 1 alert when merging 81330ba into a22d325 - view on LGTM.com new alerts:
* Add SE + context SE support
* Add contextnet components
* Add ContextNet support
* Add config files
* Correct configs
* Add streaming speech command
* Add kernel size factor argument
* Add docstrings
* Update CHANGELOG.md
* Add integration tests
* Style fixes and add docstrings for se_reduction_ratio
* Style fixes in tests
* Correct CHANGELOG.md
* Corrections to docstrings
* Add WandB support to contextnet.py
* Style fixes
* Remove unused import
* Refactor ContextNetEncoder to subclass JasperEncoder
* Remove unused imports

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: ZeroCool <[email protected]>
Use a single jinja template for the prompts with and without a document. Also remove the conditionals checking for the presence of a document.
Fixes NVIDIA#629
Signed-off-by: Derek Higgins <[email protected]>
Changelog

Added
* stride_last flag which allows stride and repeat flags to be used simultaneously. It will perform the strided convolution at the final Conv-BN-ReLU sub-block.
* swish as optional activation function
* zero_infinity flag to CTCLoss, default to False.

Modified
* se_reduction_ratio to 8 instead of 16.
* SpecAugment now supports either an integer or floating point value for time_width.

Note: Currently, examples/asr/contextnet.py uses JasperDecoderForCTC instead of ContextNetDecoderForCTC. This will be updated in a future PR once full support is present.
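Two of the changelog entries above can be sketched in plain Python. This is an illustration only, not the code from this PR. `swish` follows the standard definition x * sigmoid(x). `resolve_time_width` is a hypothetical helper showing one plausible way to accept an integer or a floating point `time_width`: an integer is taken as a number of time steps and a float as a fraction of the sequence length. The PR text does not state how the float is interpreted, so that part is an assumption.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function for a single float."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def swish(x):
    """Swish activation: x * sigmoid(x)."""
    return x * sigmoid(x)


def resolve_time_width(time_width, length):
    """Hypothetical helper: turn time_width into a mask width in time steps.

    Assumption: an int is an absolute number of steps, while a float in
    [0, 1] is a fraction of the sequence length (at least one step).
    """
    if isinstance(time_width, int):
        return time_width
    if not 0.0 <= time_width <= 1.0:
        raise ValueError("a float time_width must lie in [0, 1]")
    return max(1, int(length * time_width))
```

For example, under these assumptions `resolve_time_width(10, 200)` keeps the width at 10 steps, while `resolve_time_width(0.25, 200)` scales the width with the sequence length.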