Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Initial support for ContextNet Encoder and CTC Decoder #630

Merged
merged 19 commits into from
May 16, 2020

Conversation

titu1994
Copy link
Collaborator

@titu1994 titu1994 commented May 13, 2020

Changelog

Added

  • Add ContextNetEncoder, ContextNetDecoderForCTC neural modules to ASR collection
  • Add stride_last flag which allows stride and repeat flags to be used simultaneously. It will perform the strided convolution at the final Conv-BN-ReLU sub-block.
  • Add swish as optional activation function
  • Add zero_infinity flag to CTCLoss, default to False.
  • Adds integration test for ContextNetEncoder and ContextNetDecoderForCTC

Modified

  • Update Squeeze and Excitation sub-module to support different context sizes, support different activation
    • Change default se_reduction_ratio to 8 instead of 16.
  • SpecAugment now supports either an integer or floating point value for time_width.
    • If float is passed, adaptively uses it as percentage of current timesteps that should be cut.

Note: Currently, examples/asr/contextnet.py uses JasperDecoderForCTC instead of ContextNetDecoderForCTC. This will be updated in a future PR once full support is present.

@okuchaiev okuchaiev requested review from blisc and okuchaiev May 13, 2020 21:19
Copy link
Member

@okuchaiev okuchaiev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

few small comments

logging = nemo.logging


class ContextNetEncoder(TrainableNM):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this inherit from JasperEncoder ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point

Copy link
Collaborator Author

@titu1994 titu1994 May 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, it probably should not inherit JasperEncoder. While yes currently they share exactly same functionality, in the future they will not. In that case, the __init__ call will instantiate multiple JasperBlocks before ContextNetEncoder starts to instantiate its own values.

While there is duplication for now, it is cleaner to separate the two modules

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

nemo/collections/asr/parts/jasper.py Show resolved Hide resolved
nemo/collections/asr/contextnet.py Show resolved Hide resolved
nemo/collections/asr/losses.py Show resolved Hide resolved
nemo/collections/asr/parts/jasper.py Outdated Show resolved Hide resolved
titu1994 added 3 commits May 13, 2020 15:35
Signed-off-by: smajumdar <[email protected]>
Signed-off-by: smajumdar <[email protected]>
@lgtm-com
Copy link

lgtm-com bot commented May 14, 2020

This pull request introduces 1 alert when merging 8c81303 into a22d325 - view on LGTM.com

new alerts:

  • 1 for Unused import

Signed-off-by: smajumdar <[email protected]>
blisc
blisc previously approved these changes May 14, 2020

# (ContextNet uses the Jasper baseline encoder and decoder)
encoder = nemo_asr.ContextNetEncoder(
feat_in=contextnet_params["AudioToMelSpectrogramPreprocessor"]["features"],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note that you can add this inside the yaml itself.
See https://confluence.atlassian.com/bitbucket/yaml-anchors-960154027.html

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the hint !

@lgtm-com
Copy link

lgtm-com bot commented May 14, 2020

This pull request introduces 1 alert when merging 81330ba into a22d325 - view on LGTM.com

new alerts:

  • 1 for Unused import

Signed-off-by: smajumdar <[email protected]>
@titu1994 titu1994 merged commit 99ef493 into NVIDIA:master May 16, 2020
@titu1994 titu1994 deleted the se_context_support branch May 16, 2020 05:16
ZeroCool940711 pushed a commit to ZeroCool940711/NeMo that referenced this pull request May 16, 2020
* Add SE + context SE support

Signed-off-by: smajumdar <[email protected]>

* Add contextnet components

Signed-off-by: smajumdar <[email protected]>

* Add ContextNet support

Signed-off-by: smajumdar <[email protected]>

* Add config files

Signed-off-by: smajumdar <[email protected]>

* Correct configs

Signed-off-by: smajumdar <[email protected]>

* Add streaming speech command

Signed-off-by: smajumdar <[email protected]>

* Add kernel size factor argument

Signed-off-by: smajumdar <[email protected]>

* Add docstrings

Signed-off-by: smajumdar <[email protected]>

* Update CHANGELOG.md

Signed-off-by: smajumdar <[email protected]>

* Add integration tests

Signed-off-by: smajumdar <[email protected]>

* Style fixes and add docstrings for se_reduction_ratio

Signed-off-by: smajumdar <[email protected]>

* Style fixes in tests

Signed-off-by: smajumdar <[email protected]>

* Correct CHANGELOG.md

Signed-off-by: smajumdar <[email protected]>

* Correctios to docstrings

Signed-off-by: smajumdar <[email protected]>

* Add WandB support to contextnet.py

Signed-off-by: smajumdar <[email protected]>

* Style fixes

Signed-off-by: smajumdar <[email protected]>

* Remove unused import

Signed-off-by: smajumdar <[email protected]>

* Refactor ContextNetEncoder to subclass JasperEncoder

Signed-off-by: smajumdar <[email protected]>

* Remove unused imports

Signed-off-by: smajumdar <[email protected]>
Signed-off-by: ZeroCool <[email protected]>
dcurran90 pushed a commit to dcurran90/NeMo that referenced this pull request Oct 15, 2024
Use a single jinja template for the prompts with and
without a document. Also remove the conditionals
checking for te presence of a document.

Fixes NVIDIA#629

Signed-off-by: Derek Higgins <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants