Text Normalization Update #2356
ekmb
commented
Jun 14, 2021
- added support for fractional numbers
- added support for Roman numerals up to 1000 (audio-based normalization only)
- parallel normalization of manifests
- bug fixes and pre/post-processing updates to improve normalization coverage
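As a rough illustration of the "parallel normalization of manifests" item above, here is a minimal sketch. It assumes a manifest is a list of JSON lines with a `text` field. `toy_normalize`, `normalize_line`, `normalize_manifest_lines`, the `n_jobs` parameter and the `normalized_text` key are hypothetical names chosen for this example and are not taken from the PR.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def toy_normalize(text: str) -> str:
    """Hypothetical stand-in for a real normalizer; it only expands '&'."""
    return text.replace("&", "and")


def normalize_line(line: str, normalize: Callable[[str], str] = toy_normalize) -> str:
    """Normalize the 'text' field of one JSON manifest line."""
    entry = json.loads(line)
    # 'normalized_text' is an assumed output key, used for illustration only.
    entry["normalized_text"] = normalize(entry["text"])
    return json.dumps(entry)


def normalize_manifest_lines(lines: Iterable[str], n_jobs: int = 4) -> List[str]:
    """Normalize manifest lines concurrently; map() preserves input order."""
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        return list(pool.map(normalize_line, lines))


manifest = [
    json.dumps({"audio_filepath": "a.wav", "text": "salt & pepper"}),
    json.dumps({"audio_filepath": "b.wav", "text": "R&D"}),
]
results = [json.loads(r) for r in normalize_manifest_lines(manifest, n_jobs=2)]
```

A thread pool keeps the sketch simple. For CPU-bound grammar-based normalization a process pool would likely be the better fit, at the cost of requiring picklable workers.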
Signed-off-by: ekmb <[email protected]>
This pull request introduces 2 alerts when merging 31c220a into fbfdc1b - view on LGTM.com new alerts:
@@ -1,6 +1,10 @@
Ph.D. p h d
Hon. honorable
& and
&Co. and
What happens to Co? Could you delete this entry?
updated and moved this to the alternative list
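To make the whitelist discussion concrete, here is a minimal sketch of how written-to-spoken entries like the ones in the diff could be applied. It assumes a tab-separated "written, spoken" layout and simple whitespace tokenization. `WHITELIST_TSV`, `load_whitelist` and `apply_whitelist` are hypothetical names for this example, not the grammar code from the PR.

```python
from typing import Dict

# Entries copied from the diff above; the tab separator is an assumption.
WHITELIST_TSV = "Ph.D.\tp h d\nHon.\thonorable\n&\tand\n"


def load_whitelist(tsv: str) -> Dict[str, str]:
    """Parse 'written<TAB>spoken' lines into a lookup table."""
    mapping: Dict[str, str] = {}
    for line in tsv.splitlines():
        if not line.strip():
            continue  # skip blank lines
        written, spoken = line.split("\t", maxsplit=1)
        mapping[written] = spoken
    return mapping


def apply_whitelist(text: str, whitelist: Dict[str, str]) -> str:
    """Replace whitespace-delimited tokens that have a whitelist entry."""
    return " ".join(whitelist.get(token, token) for token in text.split())


whitelist = load_whitelist(WHITELIST_TSV)
print(apply_whitelist("Hon. Smith & Jones", whitelist))  # prints "honorable Smith and Jones"
```

Under this toy lookup, an entry that maps `&Co.` to `and` would turn "Smith &Co." into "Smith and", silently dropping "Co", which appears to be the concern raised in the review comment.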
class RomanFst(GraphFst):
    """
    Finite state transducer for verbalizing electronic
        e.g. tokens { electronic { username: "cdf1" domain: "abc.edu" } } -> c d f one at a b c dot e d u
adjust doc
done
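For readers unfamiliar with what a Roman-numeral grammar has to cover, here is a minimal plain-Python sketch of the conversion for values up to 1000, the limit stated in the PR description. It is a stand-in written for illustration, not the `RomanFst` transducer itself. `ROMAN_VALUES`, `roman_to_int` and the `limit` parameter are hypothetical names.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral: str, limit: int = 1000) -> int:
    """Convert a Roman numeral to an integer, rejecting values above `limit`.

    Note: this does not check that the numeral is well formed
    (for example, "IIII" is accepted and read as 4).
    """
    total = 0
    prev = 0
    # Scan right to left: a symbol smaller than its right neighbour is subtracted.
    for ch in reversed(numeral.upper()):
        if ch not in ROMAN_VALUES:
            raise ValueError(f"not a Roman numeral symbol: {ch!r}")
        value = ROMAN_VALUES[ch]
        if value < prev:
            total -= value
        else:
            total += value
        prev = value
    if not 1 <= total <= limit:
        raise ValueError(f"value {total} is outside the supported range 1..{limit}")
    return total


print(roman_to_int("XIV"))     # prints 14
print(roman_to_int("CMXCIX"))  # prints 999
```

A grammar-based implementation would typically enumerate the valid numerals rather than compute them, which is one reason an upper bound such as 1000 keeps the transducer small.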
Signed-off-by: ekmb <[email protected]>
This pull request introduces 3 alerts when merging bd37b1e into fbfdc1b - view on LGTM.com new alerts:
Signed-off-by: ekmb <[email protected]>
could you please fix lgtm?
fixed already
* upper cased date support
* update whitelist, change roman weights
* docstrings, space fix, init file
* lgtm
* fraction with measure class

Signed-off-by: ekmb <[email protected]>
Signed-off-by: Mike Chrzanowski <[email protected]>
Signed-off-by: Micha Livne <[email protected]>
Signed-off-by: mchrzanowski <[email protected]>
TextNormalizationTestDataset and testing/evaluation code Signed-off-by: Tuan Lai <[email protected]> * Add TextNormalizationTaggerDataset and training code for tagger Signed-off-by: Tuan Lai <[email protected]> * Restore from local nemo ckpts Signed-off-by: Tuan Lai <[email protected]> * Add TextNormalizationDecoderDataset Signed-off-by: Tuan Lai <[email protected]> * Add interactive mode for neural_text_normalization_test.py Signed-off-by: Tuan Lai <[email protected]> * Add options to do training or not for tagger/decoder Signed-off-by: Tuan Lai <[email protected]> * Renamed Signed-off-by: Tuan Lai <[email protected]> * Implemented setup dataloader for decoder Signed-off-by: Tuan Lai <[email protected]> * Implemented training and validation for decoder Signed-off-by: Tuan Lai <[email protected]> * Data augmentation for decoder training Signed-off-by: Tuan Lai <[email protected]> * Config change Signed-off-by: Tuan Lai <[email protected]> * add blossom-ci.yml (#2401) Signed-off-by: ericharper <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * Merge r1.1 bugfixes into main (#2407) * Update notebook branch and Jenkinsfile for 1.1.0 testing (#2378) * update branch Signed-off-by: ericharper <[email protected]> * update jenkinsfile Signed-off-by: ericharper <[email protected]> * [BUGFIX] NMT Multi-node was incorrectly computing num_replicas (#2380) * fix property when not using model parallel Signed-off-by: ericharper <[email protected]> * fix property when not using model parallel Signed-off-by: ericharper <[email protected]> * add debug statement Signed-off-by: ericharper <[email protected]> * add debug statement Signed-off-by: ericharper <[email protected]> * instantiate with NLPDDPPlugin with num_nodes from trainer config Signed-off-by: ericharper <[email protected]> * Update ASR scripts for tokenizer building and tarred dataset building (#2381) * Update ASR scripts for tokenizer building and tarred dataset building Signed-off-by: smajumdar <[email 
protected]> * Update container Signed-off-by: smajumdar <[email protected]> * Add STT Zh Citrinet 1024 Gamma 0.25 model Signed-off-by: smajumdar <[email protected]> * Update notebook (#2391) Signed-off-by: smajumdar <[email protected]> * ASR Notebooks fix for 1.1.0 (#2395) * nb fix for spring clean Signed-off-by: fayejf <[email protected]> * remove outdated instruction Signed-off-by: fayejf <[email protected]> * Mean normalization (#2397) * norm embeddings Signed-off-by: nithinraok <[email protected]> * move to utils Signed-off-by: nithinraok <[email protected]> * Bugfix adaptive spec augment time masking (#2398) * bugfix adaptive spec augment Signed-off-by: smajumdar <[email protected]> * Revert freq mask guard Signed-off-by: smajumdar <[email protected]> * Revert freq mask guard Signed-off-by: smajumdar <[email protected]> * Remove static time width clamping Signed-off-by: smajumdar <[email protected]> * Correct typos and issues with notebooks (#2402) * Fix Primer notebook Signed-off-by: smajumdar <[email protected]> * Typo Signed-off-by: smajumdar <[email protected]> * remove accelerator=DDP in tutorial notebooks to avoid errors. 
(#2403) Signed-off-by: Hoo Chang Shin <[email protected]> Co-authored-by: Hoo Chang Shin <[email protected]> * style Signed-off-by: ericharper <[email protected]> * update jenkins branch Signed-off-by: ericharper <[email protected]> * update notebook branch to main Signed-off-by: ericharper <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: fayejf <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: khcs <[email protected]> Co-authored-by: Hoo Chang Shin <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * Remove unused imports Signed-off-by: Tuan Lai <[email protected]> * Add initial doc for text_normalization Signed-off-by: Tuan Lai <[email protected]> * Fixed imports warnings Signed-off-by: Tuan Lai <[email protected]> * Minor Fix Signed-off-by: Tuan Lai <[email protected]> * Renamed Signed-off-by: Tuan Lai <[email protected]> * Allowed duplex modes Signed-off-by: Tuan Lai <[email protected]> * Minor Fix Signed-off-by: Tuan Lai <[email protected]> * Add docs for duplex_text_normalization_train and duplex_text_normalization_test Signed-off-by: Tuan Lai <[email protected]> * docstrings for model codes + minor fix Signed-off-by: Tuan Lai <[email protected]> * Add more comments and doc strings Signed-off-by: Tuan Lai <[email protected]> * Add doc for datasets + Use time.perf_counter() Signed-off-by: Tuan Lai <[email protected]> * Add code for preprocessing Google TN data Signed-off-by: Tuan Lai <[email protected]> * Add more docs and comments + Minor Fixes Signed-off-by: Tuan Lai <[email protected]> * Add more licenses + Fixed comments + Minors Signed-off-by: Tuan Lai <[email protected]> * Moved evaluation logic to DuplexTextNormalizationModel Signed-off-by: Tuan Lai <[email protected]> * Add logging errors Signed-off-by: Tuan Lai <[email protected]> * Updated validation code of tagger + Minors Signed-off-by: Tuan Lai <[email protected]> * Also write tag preds to log file 
Signed-off-by: Tuan Lai <[email protected]> * Add data augmentation for tagger dataset Signed-off-by: Tuan Lai <[email protected]> * Added experimental decorators Signed-off-by: Tuan Lai <[email protected]> * Updated docs Signed-off-by: Tuan Lai <[email protected]> * Updated duplex_tn_config.yaml Signed-off-by: Tuan Lai <[email protected]> * Compute token precision of tagger using NeMo metrics Signed-off-by: Tuan Lai <[email protected]> * Fixed saving issue when using ddp accelerator Signed-off-by: Tuan Lai <[email protected]> * Refactoring Signed-off-by: Tuan Lai <[email protected]> * Add option to keep punctuations in TextNormalizationTestDataset Signed-off-by: Tuan Lai <[email protected]> * Changes to input preprocessing + decoder's postprocessing Signed-off-by: Tuan Lai <[email protected]> * Fixed styles + Add references Signed-off-by: Tuan Lai <[email protected]> * Renamed examples/nlp/duplex_text_normalization/utils.py to helpers.py Signed-off-by: Tuan Lai <[email protected]> Co-authored-by: Jagadeesh Balam <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Co-authored-by: Oktai Tatanov <[email protected]> Co-authored-by: Jason <[email protected]> Co-authored-by: Mike Chrzanowski <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: mchrzanowski <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Boris Fomitchev <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: fayejf <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: khcs <[email protected]> 
Co-authored-by: Hoo Chang Shin <[email protected]>
* upper cased date support
  Signed-off-by: ekmb <[email protected]>
* update whitelist, change roman weights
  Signed-off-by: ekmb <[email protected]>
* docstrings, space fix, init file
  Signed-off-by: ekmb <[email protected]>
* lgtm
  Signed-off-by: ekmb <[email protected]>
* fraction with measure class
  Signed-off-by: ekmb <[email protected]>
* Audio Norm (NVIDIA#2285) * add jenkins test, refactoring Signed-off-by: ekmb <[email protected]> * update test Signed-off-by: ekmb <[email protected]> * fix new test Signed-off-by: ekmb <[email protected]> * add serial to the default normalizer, add tests Signed-off-by: ekmb <[email protected]> * manifest test added Signed-off-by: ekmb <[email protected]> * expose more params, new test cases Signed-off-by: ekmb <[email protected]> * fix jenkins, serial clean, exclude range from cardinal Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * addressed review comments Signed-off-by: ekmb <[email protected]> * fix decimal in measure Signed-off-by: ekmb <[email protected]> * move serial in cardinal Signed-off-by: ekmb <[email protected]> * clean up Signed-off-by: ekmb <[email protected]> * update for SH zero -> oh Signed-off-by: ekmb <[email protected]> * change n_tagger default Signed-off-by: ekmb <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * bumping version to 1.0.1 Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Add check for numba regardless of device Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * upper bound for webdataset Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Correct Dockerfile Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update readmes Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update README (NVIDIA#2332) Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * ddp translate GPU allocation fix 
(NVIDIA#2312) * fixed branch in IR tutorial Signed-off-by: AlexGrinch <[email protected]> * ddp translate GPU allocation fix Signed-off-by: AlexGrinch <[email protected]> * map_location instead of set_device Signed-off-by: AlexGrinch <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Shallow fusion (NVIDIA#2315) * fixed branch in IR tutorial Signed-off-by: AlexGrinch <[email protected]> * shallow fusion init commit Signed-off-by: AlexGrinch <[email protected]> * debug info removed Signed-off-by: AlexGrinch <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * [BUGFIX] Add upper bound to hydra for 1.0.x (NVIDIA#2337) * upper bound hydra Signed-off-by: ericharper <[email protected]> * upper bound hydra Signed-off-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update version number Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update package version Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * sparrowhawk tests + punctuation post processing for pynini TN (NVIDIA#2320) * add jenkins test, refactoring Signed-off-by: ekmb <[email protected]> * update test Signed-off-by: ekmb <[email protected]> * fix new test Signed-off-by: ekmb <[email protected]> * add serial to the default normalizer, add tests Signed-off-by: ekmb <[email protected]> * manifest test added Signed-off-by: ekmb <[email protected]> * expose more params, new test cases Signed-off-by: ekmb <[email protected]> * fix jenkins, serial clean, exclude range from cardinal Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email 
protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * addressed review comments Signed-off-by: ekmb <[email protected]> * fix decimal in measure Signed-off-by: ekmb <[email protected]> * move serial in cardinal Signed-off-by: ekmb <[email protected]> * sh tests init Signed-off-by: ekmb <[email protected]> * sparrowhawk container tests support added Signed-off-by: ekmb <[email protected]> * add post process to normalize.py, update tests Signed-off-by: ekmb <[email protected]> * remove duplication Signed-off-by: ekmb <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update notebooks to 1.0.2 release (NVIDIA#2338) Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update ranges for omegaconf and hydra (NVIDIA#2336) * Update ranges Signed-off-by: smajumdar <[email protected]> * Updates for Hydra and OmegaConf updates Signed-off-by: smajumdar <[email protected]> * Style fixes Signed-off-by: smajumdar <[email protected]> * Correct tests and revert patch for model utils Signed-off-by: smajumdar <[email protected]> * Correct docstring Signed-off-by: smajumdar <[email protected]> * Revert unnecessary change Signed-off-by: smajumdar <[email protected]> * Revert unnecessary change Signed-off-by: smajumdar <[email protected]> * Guard scheduler for None Signed-off-by: smajumdar <[email protected]> * default to 0.0 if bpe_dropout is None Signed-off-by: ericharper <[email protected]> * Correctly log class that was restored Signed-off-by: smajumdar <[email protected]> * Root patch *bpe_dropout Signed-off-by: smajumdar <[email protected]> Co-authored-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update FastPitch Export (NVIDIA#2355) Signed-off-by: Jason <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * byt5 unicode implementation, first cut Signed-off-by: mchrzanowski <[email 
protected]> * add bytelevel tokenizer Signed-off-by: mchrzanowski <[email protected]> * update out_dir to not collide (NVIDIA#2358) Signed-off-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update container version to 21.05 (NVIDIA#2309) * Update container version Signed-off-by: smajumdar <[email protected]> * Temporarily change export format of waveglow Signed-off-by: smajumdar <[email protected]> * Add conda update for numba Signed-off-by: smajumdar <[email protected]> * Update numba compat via global flag for strictness level `--relax_numba_compat`, remove pytorchlightning.metrics, refactor out numba utils to core, update tests Signed-off-by: smajumdar <[email protected]> * Correct order of numba minimum verion, remove wrong flag from test Signed-off-by: smajumdar <[email protected]> * Double test of cuda numba Signed-off-by: smajumdar <[email protected]> * Double test of cuda numba Signed-off-by: smajumdar <[email protected]> * Enable RNNT tests Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Text Normalization Update (NVIDIA#2356) * upper cased date support Signed-off-by: ekmb <[email protected]> * update whitelist, change roman weights Signed-off-by: ekmb <[email protected]> * docstrings, space fix, init file Signed-off-by: ekmb <[email protected]> * lgtm Signed-off-by: ekmb <[email protected]> * fraction with measure class Signed-off-by: ekmb <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * address comment Signed-off-by: mchrzanowski <[email protected]> * Add ASR CTC tutorial on fine-tuning on another language (NVIDIA#2346) * Add ASR CTC Language finetuning notebook Signed-off-by: smajumdar <[email protected]> * Add to documentation Signed-off-by: smajumdar <[email protected]> * Improve documentation Signed-off-by: smajumdar <[email protected]> * Correct name of the dataset Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email 
protected]> * Correct colab link to notebook (NVIDIA#2366) Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * sgdqa update data directories for testing (NVIDIA#2323) * sgdqa update data directories for testing Signed-off-by: Yang Zhang <[email protected]> * fix syntax Signed-off-by: Yang Zhang <[email protected]> * check if data dir exists Signed-off-by: Yang Zhang <[email protected]> * fix Signed-off-by: Yang Zhang <[email protected]> * adding pretrained model Signed-off-by: Yang Zhang <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Added documentation for export() (NVIDIA#2330) * Added export document Signed-off-by: Boris Fomitchev <[email protected]> * Addressed review comments Signed-off-by: Boris Fomitchev <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update Citrinet model card info (NVIDIA#2369) * Update model card info Signed-off-by: smajumdar <[email protected]> * Cleanup Docs Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * [NMT] Model Parallel Megatron Encoders (NVIDIA#2238) * add megatron encoder Signed-off-by: ericharper <[email protected]> * added megatron to get_nmt_tokenizer Signed-off-by: ericharper <[email protected]> * add vocab_size and hidden_size to megatron bert Signed-off-by: ericharper <[email protected]> * add megatron encoder module Signed-off-by: ericharper <[email protected]> * fixed horrible typo Signed-off-by: ericharper <[email protected]> * fix typo and add default Signed-off-by: ericharper <[email protected]> * updating nlp overrides for mp nmt Signed-off-by: ericharper <[email protected]> * move some logic back to nlpmodel from overrides Signed-off-by: ericharper <[email protected]> * add checkpoint_file property Signed-off-by: ericharper <[email protected]> * fix property Signed-off-by: ericharper <[email protected]> * num_tokentypes=0 Signed-off-by: 
ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * find_unused_parameters=True Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * style Signed-off-by: ericharper <[email protected]> * get instead of pop Signed-off-by: ericharper <[email protected]> * remove token type ids from megatron input example Signed-off-by: ericharper <[email protected]> * pop vocab_size Signed-off-by: ericharper <[email protected]> * fix checkpointing for model parallel Signed-off-by: ericharper <[email protected]> * fix bug in non model parallel Signed-off-by: ericharper <[email protected]> * convert cfg.trainer to dict Signed-off-by: ericharper <[email protected]> * make num_tokentypes configurable for nmt Signed-off-by: ericharper <[email protected]> * update checkpoint_file when using named megatron model in nemo Signed-off-by: ericharper <[email protected]> * make vocab_file configurable Signed-off-by: ericharper <[email protected]> * dataclass can't have mutable default Signed-off-by: ericharper <[email protected]> * style Signed-off-by: ericharper <[email protected]> * unused imports Signed-off-by: ericharper <[email protected]> * revert input example Signed-off-by: ericharper <[email protected]> * check that checkpoint version is not None Signed-off-by: ericharper <[email protected]> * add mp jenkins test Signed-off-by: ericharper <[email protected]> * update docstring Signed-off-by: ericharper <[email protected]> * add docs for pretrained encoders with nemo nmt Signed-off-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Add notebook with recommendations for 8 kHz speech (NVIDIA#2326) * Added a notebook with best practices for telephony speech * Added datasets detaiils * Added training recommendations * Emptied out cells with results * Added tutorial to docs Signed-off-by: jbalam <[email protected]> * Addressed 
review comments Signed-off-by: jbalam <[email protected]> * Added a line to note original sampling rate of an4 Signed-off-by: jbalam <[email protected]> * Made changes suggested in review Signed-off-by: jbalam <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Add FastEmit support for RNNT Losses (NVIDIA#2374) * Temp commit Signed-off-by: smajumdar <[email protected]> * Initial code for fastemit forward pass Signed-off-by: smajumdar <[email protected]> * Correct return reg value Signed-off-by: smajumdar <[email protected]> * Initial cpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Correct few impl Signed-off-by: smajumdar <[email protected]> * Update fastemit scaling Signed-off-by: smajumdar <[email protected]> * Cleanup fastemit Signed-off-by: smajumdar <[email protected]> * Finalize FastEmit regularization PR Signed-off-by: smajumdar <[email protected]> * Refactor code to support fastemit regularization Signed-off-by: smajumdar <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * byt5 unicode implementation, first cut Signed-off-by: Mike Chrzanowski <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * add bytelevel tokenizer Signed-off-by: Mike Chrzanowski <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update styling Signed-off-by: Mike Chrzanowski <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * avoid circular import Signed-off-by: Mike Chrzanowski <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * fix bugs in hifigan code (NVIDIA#2392) Signed-off-by: Oktai Tatanov <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update setup.py (NVIDIA#2394) Signed-off-by: Jason <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update 
bytelevel_tokenizer.py Signed-off-by: mchrzanowski <[email protected]> * Update bytelevel_tokenizer.py Signed-off-by: mchrzanowski <[email protected]> * typo Signed-off-by: mchrzanowski <[email protected]> * missed one Signed-off-by: mchrzanowski <[email protected]> * bug fixes Signed-off-by: mchrzanowski <[email protected]> * style fix Signed-off-by: mchrzanowski <[email protected]> * bytelevelprocessor is now generic. Signed-off-by: mchrzanowski <[email protected]> * style fix Signed-off-by: mchrzanowski <[email protected]> * update checkpointing (NVIDIA#2396) Signed-off-by: Jason <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * style Signed-off-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * woops, didnt merge jenkinsfile the right way * add newline Signed-off-by: mchrzanowski <[email protected]> * undo changes to enja processor Signed-off-by: mchrzanowski <[email protected]> * processor selection decision fix Signed-off-by: mchrzanowski <[email protected]> * newline fix Signed-off-by: mchrzanowski <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Jason <[email protected]> Co-authored-by: mchrzanowski <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Boris Fomitchev <[email protected]> Co-authored-by: Jagadeesh Balam <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Co-authored-by: Oktai Tatanov <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: root <[email protected]>
* Add notebook with recommendations for 8 kHz speech (NVIDIA#2326) * Added a notebook with best practices for telephony speech * Added datasets detaiils * Added training recommendations * Emptied out cells with results * Added tutorial to docs Signed-off-by: jbalam <[email protected]> * Addressed review comments Signed-off-by: jbalam <[email protected]> * Added a line to note original sampling rate of an4 Signed-off-by: jbalam <[email protected]> * Made changes suggested in review Signed-off-by: jbalam <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * Add FastEmit support for RNNT Losses (NVIDIA#2374) * Temp commit Signed-off-by: smajumdar <[email protected]> * Initial code for fastemit forward pass Signed-off-by: smajumdar <[email protected]> * Correct return reg value Signed-off-by: smajumdar <[email protected]> * Initial cpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Correct few impl Signed-off-by: smajumdar <[email protected]> * Update fastemit scaling Signed-off-by: smajumdar <[email protected]> * Cleanup fastemit Signed-off-by: smajumdar <[email protected]> * Finalize FastEmit regularization PR Signed-off-by: smajumdar <[email protected]> * Refactor code to support fastemit regularization Signed-off-by: smajumdar <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * Implement inference functions of TN models Signed-off-by: Tuan Lai <[email protected]> * Minor Fix Signed-off-by: Tuan Lai <[email protected]> * fix bugs in hifigan code (NVIDIA#2392) Signed-off-by: Oktai Tatanov <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * Update setup.py (NVIDIA#2394) Signed-off-by: Jason <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * update checkpointing (NVIDIA#2396) Signed-off-by: Jason <[email protected]> Signed-off-by: Tuan Lai 
<[email protected]> * byt5 unicode implementation (NVIDIA#2365) * Audio Norm (NVIDIA#2285) * add jenkins test, refactoring Signed-off-by: ekmb <[email protected]> * update test Signed-off-by: ekmb <[email protected]> * fix new test Signed-off-by: ekmb <[email protected]> * add serial to the default normalizer, add tests Signed-off-by: ekmb <[email protected]> * manifest test added Signed-off-by: ekmb <[email protected]> * expose more params, new test cases Signed-off-by: ekmb <[email protected]> * fix jenkins, serial clean, exclude range from cardinal Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * addressed review comments Signed-off-by: ekmb <[email protected]> * fix decimal in measure Signed-off-by: ekmb <[email protected]> * move serial in cardinal Signed-off-by: ekmb <[email protected]> * clean up Signed-off-by: ekmb <[email protected]> * update for SH zero -> oh Signed-off-by: ekmb <[email protected]> * change n_tagger default Signed-off-by: ekmb <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * bumping version to 1.0.1 Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Add check for numba regardless of device Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * upper bound for webdataset Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Correct Dockerfile Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update readmes Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update README (NVIDIA#2332) Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski 
<[email protected]> * ddp translate GPU allocation fix (NVIDIA#2312) * fixed branch in IR tutorial Signed-off-by: AlexGrinch <[email protected]> * ddp translate GPU allocation fix Signed-off-by: AlexGrinch <[email protected]> * map_location instead of set_device Signed-off-by: AlexGrinch <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Shallow fusion (NVIDIA#2315) * fixed branch in IR tutorial Signed-off-by: AlexGrinch <[email protected]> * shallow fusion init commit Signed-off-by: AlexGrinch <[email protected]> * debug info removed Signed-off-by: AlexGrinch <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * [BUGFIX] Add upper bound to hydra for 1.0.x (NVIDIA#2337) * upper bound hydra Signed-off-by: ericharper <[email protected]> * upper bound hydra Signed-off-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update version number Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update package version Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * sparrowhawk tests + punctuation post processing for pynini TN (NVIDIA#2320) * add jenkins test, refactoring Signed-off-by: ekmb <[email protected]> * update test Signed-off-by: ekmb <[email protected]> * fix new test Signed-off-by: ekmb <[email protected]> * add serial to the default normalizer, add tests Signed-off-by: ekmb <[email protected]> * manifest test added Signed-off-by: ekmb <[email protected]> * expose more params, new test cases Signed-off-by: ekmb <[email protected]> * fix jenkins, serial clean, exclude range from cardinal Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * 
jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * addressed review comments Signed-off-by: ekmb <[email protected]> * fix decimal in measure Signed-off-by: ekmb <[email protected]> * move serial in cardinal Signed-off-by: ekmb <[email protected]> * sh tests init Signed-off-by: ekmb <[email protected]> * sparrowhawk container tests support added Signed-off-by: ekmb <[email protected]> * add post process to normalize.py, update tests Signed-off-by: ekmb <[email protected]> * remove duplication Signed-off-by: ekmb <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update notebooks to 1.0.2 release (NVIDIA#2338) Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update ranges for omegaconf and hydra (NVIDIA#2336) * Update ranges Signed-off-by: smajumdar <[email protected]> * Updates for Hydra and OmegaConf updates Signed-off-by: smajumdar <[email protected]> * Style fixes Signed-off-by: smajumdar <[email protected]> * Correct tests and revert patch for model utils Signed-off-by: smajumdar <[email protected]> * Correct docstring Signed-off-by: smajumdar <[email protected]> * Revert unnecessary change Signed-off-by: smajumdar <[email protected]> * Revert unnecessary change Signed-off-by: smajumdar <[email protected]> * Guard scheduler for None Signed-off-by: smajumdar <[email protected]> * default to 0.0 if bpe_dropout is None Signed-off-by: ericharper <[email protected]> * Correctly log class that was restored Signed-off-by: smajumdar <[email protected]> * Root patch *bpe_dropout Signed-off-by: smajumdar <[email protected]> Co-authored-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update FastPitch Export (NVIDIA#2355) Signed-off-by: Jason <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * byt5 unicode 
implementation, first cut Signed-off-by: mchrzanowski <[email protected]> * add bytelevel tokenizer Signed-off-by: mchrzanowski <[email protected]> * update out_dir to not collide (NVIDIA#2358) Signed-off-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update container version to 21.05 (NVIDIA#2309) * Update container version Signed-off-by: smajumdar <[email protected]> * Temporarily change export format of waveglow Signed-off-by: smajumdar <[email protected]> * Add conda update for numba Signed-off-by: smajumdar <[email protected]> * Update numba compat via global flag for strictness level `--relax_numba_compat`, remove pytorchlightning.metrics, refactor out numba utils to core, update tests Signed-off-by: smajumdar <[email protected]> * Correct order of numba minimum verion, remove wrong flag from test Signed-off-by: smajumdar <[email protected]> * Double test of cuda numba Signed-off-by: smajumdar <[email protected]> * Double test of cuda numba Signed-off-by: smajumdar <[email protected]> * Enable RNNT tests Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Text Normalization Update (NVIDIA#2356) * upper cased date support Signed-off-by: ekmb <[email protected]> * update whitelist, change roman weights Signed-off-by: ekmb <[email protected]> * docstrings, space fix, init file Signed-off-by: ekmb <[email protected]> * lgtm Signed-off-by: ekmb <[email protected]> * fraction with measure class Signed-off-by: ekmb <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * address comment Signed-off-by: mchrzanowski <[email protected]> * Add ASR CTC tutorial on fine-tuning on another language (NVIDIA#2346) * Add ASR CTC Language finetuning notebook Signed-off-by: smajumdar <[email protected]> * Add to documentation Signed-off-by: smajumdar <[email protected]> * Improve documentation Signed-off-by: smajumdar <[email protected]> * Correct name of the dataset Signed-off-by: 
smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Correct colab link to notebook (NVIDIA#2366) Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * sgdqa update data directories for testing (NVIDIA#2323) * sgdqa update data directories for testing Signed-off-by: Yang Zhang <[email protected]> * fix syntax Signed-off-by: Yang Zhang <[email protected]> * check if data dir exists Signed-off-by: Yang Zhang <[email protected]> * fix Signed-off-by: Yang Zhang <[email protected]> * adding pretrained model Signed-off-by: Yang Zhang <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Added documentation for export() (NVIDIA#2330) * Added export document Signed-off-by: Boris Fomitchev <[email protected]> * Addressed review comments Signed-off-by: Boris Fomitchev <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update Citrinet model card info (NVIDIA#2369) * Update model card info Signed-off-by: smajumdar <[email protected]> * Cleanup Docs Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * [NMT] Model Parallel Megatron Encoders (NVIDIA#2238) * add megatron encoder Signed-off-by: ericharper <[email protected]> * added megatron to get_nmt_tokenizer Signed-off-by: ericharper <[email protected]> * add vocab_size and hidden_size to megatron bert Signed-off-by: ericharper <[email protected]> * add megatron encoder module Signed-off-by: ericharper <[email protected]> * fixed horrible typo Signed-off-by: ericharper <[email protected]> * fix typo and add default Signed-off-by: ericharper <[email protected]> * updating nlp overrides for mp nmt Signed-off-by: ericharper <[email protected]> * move some logic back to nlpmodel from overrides Signed-off-by: ericharper <[email protected]> * add checkpoint_file property Signed-off-by: ericharper <[email protected]> * fix property 
Signed-off-by: ericharper <[email protected]> * num_tokentypes=0 Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * find_unused_parameters=True Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * style Signed-off-by: ericharper <[email protected]> * get instead of pop Signed-off-by: ericharper <[email protected]> * remove token type ids from megatron input example Signed-off-by: ericharper <[email protected]> * pop vocab_size Signed-off-by: ericharper <[email protected]> * fix checkpointing for model parallel Signed-off-by: ericharper <[email protected]> * fix bug in non model parallel Signed-off-by: ericharper <[email protected]> * convert cfg.trainer to dict Signed-off-by: ericharper <[email protected]> * make num_tokentypes configurable for nmt Signed-off-by: ericharper <[email protected]> * update checkpoint_file when using named megatron model in nemo Signed-off-by: ericharper <[email protected]> * make vocab_file configurable Signed-off-by: ericharper <[email protected]> * dataclass can't have mutable default Signed-off-by: ericharper <[email protected]> * style Signed-off-by: ericharper <[email protected]> * unused imports Signed-off-by: ericharper <[email protected]> * revert input example Signed-off-by: ericharper <[email protected]> * check that checkpoint version is not None Signed-off-by: ericharper <[email protected]> * add mp jenkins test Signed-off-by: ericharper <[email protected]> * update docstring Signed-off-by: ericharper <[email protected]> * add docs for pretrained encoders with nemo nmt Signed-off-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Add notebook with recommendations for 8 kHz speech (NVIDIA#2326) * Added a notebook with best practices for telephony speech * Added datasets detaiils * Added training recommendations * Emptied out cells with results 
* Added tutorial to docs Signed-off-by: jbalam <[email protected]> * Addressed review comments Signed-off-by: jbalam <[email protected]> * Added a line to note original sampling rate of an4 Signed-off-by: jbalam <[email protected]> * Made changes suggested in review Signed-off-by: jbalam <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Add FastEmit support for RNNT Losses (NVIDIA#2374) * Temp commit Signed-off-by: smajumdar <[email protected]> * Initial code for fastemit forward pass Signed-off-by: smajumdar <[email protected]> * Correct return reg value Signed-off-by: smajumdar <[email protected]> * Initial cpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Correct few impl Signed-off-by: smajumdar <[email protected]> * Update fastemit scaling Signed-off-by: smajumdar <[email protected]> * Cleanup fastemit Signed-off-by: smajumdar <[email protected]> * Finalize FastEmit regularization PR Signed-off-by: smajumdar <[email protected]> * Refactor code to support fastemit regularization Signed-off-by: smajumdar <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * byt5 unicode implementation, first cut Signed-off-by: Mike Chrzanowski <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * add bytelevel tokenizer Signed-off-by: Mike Chrzanowski <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update styling Signed-off-by: Mike Chrzanowski <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * avoid circular import Signed-off-by: Mike Chrzanowski <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * fix bugs in hifigan code (NVIDIA#2392) Signed-off-by: Oktai Tatanov <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update setup.py (NVIDIA#2394) Signed-off-by: Jason <[email 
protected]> Signed-off-by: mchrzanowski <[email protected]> * Update bytelevel_tokenizer.py Signed-off-by: mchrzanowski <[email protected]> * Update bytelevel_tokenizer.py Signed-off-by: mchrzanowski <[email protected]> * typo Signed-off-by: mchrzanowski <[email protected]> * missed one Signed-off-by: mchrzanowski <[email protected]> * bug fixes Signed-off-by: mchrzanowski <[email protected]> * style fix Signed-off-by: mchrzanowski <[email protected]> * bytelevelprocessor is now generic. Signed-off-by: mchrzanowski <[email protected]> * style fix Signed-off-by: mchrzanowski <[email protected]> * update checkpointing (NVIDIA#2396) Signed-off-by: Jason <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * style Signed-off-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * woops, didnt merge jenkinsfile the right way * add newline Signed-off-by: mchrzanowski <[email protected]> * undo changes to enja processor Signed-off-by: mchrzanowski <[email protected]> * processor selection decision fix Signed-off-by: mchrzanowski <[email protected]> * newline fix Signed-off-by: mchrzanowski <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Jason <[email protected]> Co-authored-by: mchrzanowski <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Boris Fomitchev <[email protected]> Co-authored-by: Jagadeesh Balam <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Co-authored-by: Oktai Tatanov <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: root <[email protected]> Signed-off-by: Tuan Lai 
<[email protected]> * Minor Fix Signed-off-by: Tuan Lai <[email protected]> * Minor Fixes Signed-off-by: Tuan Lai <[email protected]> * Add TextNormalizationTestDataset and testing/evaluation code Signed-off-by: Tuan Lai <[email protected]> * Add TextNormalizationTaggerDataset and training code for tagger Signed-off-by: Tuan Lai <[email protected]> * Restore from local nemo ckpts Signed-off-by: Tuan Lai <[email protected]> * Add TextNormalizationDecoderDataset Signed-off-by: Tuan Lai <[email protected]> * Add interactive mode for neural_text_normalization_test.py Signed-off-by: Tuan Lai <[email protected]> * Add options to do training or not for tagger/decoder Signed-off-by: Tuan Lai <[email protected]> * Renamed Signed-off-by: Tuan Lai <[email protected]> * Implemented setup dataloader for decoder Signed-off-by: Tuan Lai <[email protected]> * Implemented training and validation for decoder Signed-off-by: Tuan Lai <[email protected]> * Data augmentation for decoder training Signed-off-by: Tuan Lai <[email protected]> * Config change Signed-off-by: Tuan Lai <[email protected]> * add blossom-ci.yml (NVIDIA#2401) Signed-off-by: ericharper <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * Merge r1.1 bugfixes into main (NVIDIA#2407) * Update notebook branch and Jenkinsfile for 1.1.0 testing (NVIDIA#2378) * update branch Signed-off-by: ericharper <[email protected]> * update jenkinsfile Signed-off-by: ericharper <[email protected]> * [BUGFIX] NMT Multi-node was incorrectly computing num_replicas (NVIDIA#2380) * fix property when not using model parallel Signed-off-by: ericharper <[email protected]> * fix property when not using model parallel Signed-off-by: ericharper <[email protected]> * add debug statement Signed-off-by: ericharper <[email protected]> * add debug statement Signed-off-by: ericharper <[email protected]> * instantiate with NLPDDPPlugin with num_nodes from trainer config Signed-off-by: ericharper <[email protected]> * Update ASR scripts 
for tokenizer building and tarred dataset building (NVIDIA#2381) * Update ASR scripts for tokenizer building and tarred dataset building Signed-off-by: smajumdar <[email protected]> * Update container Signed-off-by: smajumdar <[email protected]> * Add STT Zh Citrinet 1024 Gamma 0.25 model Signed-off-by: smajumdar <[email protected]> * Update notebook (NVIDIA#2391) Signed-off-by: smajumdar <[email protected]> * ASR Notebooks fix for 1.1.0 (NVIDIA#2395) * nb fix for spring clean Signed-off-by: fayejf <[email protected]> * remove outdated instruction Signed-off-by: fayejf <[email protected]> * Mean normalization (NVIDIA#2397) * norm embeddings Signed-off-by: nithinraok <[email protected]> * move to utils Signed-off-by: nithinraok <[email protected]> * Bugfix adaptive spec augment time masking (NVIDIA#2398) * bugfix adaptive spec augment Signed-off-by: smajumdar <[email protected]> * Revert freq mask guard Signed-off-by: smajumdar <[email protected]> * Revert freq mask guard Signed-off-by: smajumdar <[email protected]> * Remove static time width clamping Signed-off-by: smajumdar <[email protected]> * Correct typos and issues with notebooks (NVIDIA#2402) * Fix Primer notebook Signed-off-by: smajumdar <[email protected]> * Typo Signed-off-by: smajumdar <[email protected]> * remove accelerator=DDP in tutorial notebooks to avoid errors. 
(NVIDIA#2403) Signed-off-by: Hoo Chang Shin <[email protected]> Co-authored-by: Hoo Chang Shin <[email protected]> * style Signed-off-by: ericharper <[email protected]> * update jenkins branch Signed-off-by: ericharper <[email protected]> * update notebook branch to main Signed-off-by: ericharper <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: fayejf <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: khcs <[email protected]> Co-authored-by: Hoo Chang Shin <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * Remove unused imports Signed-off-by: Tuan Lai <[email protected]> * Add initial doc for text_normalization Signed-off-by: Tuan Lai <[email protected]> * Fixed imports warnings Signed-off-by: Tuan Lai <[email protected]> * Minor Fix Signed-off-by: Tuan Lai <[email protected]> * Renamed Signed-off-by: Tuan Lai <[email protected]> * Allowed duplex modes Signed-off-by: Tuan Lai <[email protected]> * Minor Fix Signed-off-by: Tuan Lai <[email protected]> * Add docs for duplex_text_normalization_train and duplex_text_normalization_test Signed-off-by: Tuan Lai <[email protected]> * docstrings for model codes + minor fix Signed-off-by: Tuan Lai <[email protected]> * Add more comments and doc strings Signed-off-by: Tuan Lai <[email protected]> * Add doc for datasets + Use time.perf_counter() Signed-off-by: Tuan Lai <[email protected]> * Add code for preprocessing Google TN data Signed-off-by: Tuan Lai <[email protected]> * Add more docs and comments + Minor Fixes Signed-off-by: Tuan Lai <[email protected]> * Add more licenses + Fixed comments + Minors Signed-off-by: Tuan Lai <[email protected]> * Moved evaluation logic to DuplexTextNormalizationModel Signed-off-by: Tuan Lai <[email protected]> * Add logging errors Signed-off-by: Tuan Lai <[email protected]> * Updated validation code of tagger + Minors Signed-off-by: Tuan Lai <[email protected]> * Also write tag preds to log file 
Signed-off-by: Tuan Lai <[email protected]> * Add data augmentation for tagger dataset Signed-off-by: Tuan Lai <[email protected]> * Added experimental decorators Signed-off-by: Tuan Lai <[email protected]> * Updated docs Signed-off-by: Tuan Lai <[email protected]> * Updated duplex_tn_config.yaml Signed-off-by: Tuan Lai <[email protected]> * Compute token precision of tagger using NeMo metrics Signed-off-by: Tuan Lai <[email protected]> * Fixed saving issue when using ddp accelerator Signed-off-by: Tuan Lai <[email protected]> * Refactoring Signed-off-by: Tuan Lai <[email protected]> * Add option to keep punctuations in TextNormalizationTestDataset Signed-off-by: Tuan Lai <[email protected]> * Changes to input preprocessing + decoder's postprocessing Signed-off-by: Tuan Lai <[email protected]> * Fixed styles + Add references Signed-off-by: Tuan Lai <[email protected]> * Renamed examples/nlp/duplex_text_normalization/utils.py to helpers.py Signed-off-by: Tuan Lai <[email protected]> Co-authored-by: Jagadeesh Balam <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Co-authored-by: Oktai Tatanov <[email protected]> Co-authored-by: Jason <[email protected]> Co-authored-by: Mike Chrzanowski <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: mchrzanowski <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Boris Fomitchev <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: fayejf <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: khcs <[email protected]> 
Co-authored-by: Hoo Chang Shin <[email protected]> Signed-off-by: Ghasem Pasandi <[email protected]>
* Add notebook with recommendations for 8 kHz speech (#2326) * Added a notebook with best practices for telephony speech * Added datasets detaiils * Added training recommendations * Emptied out cells with results * Added tutorial to docs Signed-off-by: jbalam <[email protected]> * Addressed review comments Signed-off-by: jbalam <[email protected]> * Added a line to note original sampling rate of an4 Signed-off-by: jbalam <[email protected]> * Made changes suggested in review Signed-off-by: jbalam <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * Add FastEmit support for RNNT Losses (#2374) * Temp commit Signed-off-by: smajumdar <[email protected]> * Initial code for fastemit forward pass Signed-off-by: smajumdar <[email protected]> * Correct return reg value Signed-off-by: smajumdar <[email protected]> * Initial cpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Correct few impl Signed-off-by: smajumdar <[email protected]> * Update fastemit scaling Signed-off-by: smajumdar <[email protected]> * Cleanup fastemit Signed-off-by: smajumdar <[email protected]> * Finalize FastEmit regularization PR Signed-off-by: smajumdar <[email protected]> * Refactor code to support fastemit regularization Signed-off-by: smajumdar <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * Implement inference functions of TN models Signed-off-by: Tuan Lai <[email protected]> * Minor Fix Signed-off-by: Tuan Lai <[email protected]> * fix bugs in hifigan code (#2392) Signed-off-by: Oktai Tatanov <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * Update setup.py (#2394) Signed-off-by: Jason <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * update checkpointing (#2396) Signed-off-by: Jason <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * byt5 
unicode implementation (#2365) * Audio Norm (#2285) * add jenkins test, refactoring Signed-off-by: ekmb <[email protected]> * update test Signed-off-by: ekmb <[email protected]> * fix new test Signed-off-by: ekmb <[email protected]> * add serial to the default normalizer, add tests Signed-off-by: ekmb <[email protected]> * manifest test added Signed-off-by: ekmb <[email protected]> * expose more params, new test cases Signed-off-by: ekmb <[email protected]> * fix jenkins, serial clean, exclude range from cardinal Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * addressed review comments Signed-off-by: ekmb <[email protected]> * fix decimal in measure Signed-off-by: ekmb <[email protected]> * move serial in cardinal Signed-off-by: ekmb <[email protected]> * clean up Signed-off-by: ekmb <[email protected]> * update for SH zero -> oh Signed-off-by: ekmb <[email protected]> * change n_tagger default Signed-off-by: ekmb <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * bumping version to 1.0.1 Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Add check for numba regardless of device Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * upper bound for webdataset Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Correct Dockerfile Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update readmes Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update README (#2332) Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * ddp translate GPU 
allocation fix (#2312) * fixed branch in IR tutorial Signed-off-by: AlexGrinch <[email protected]> * ddp translate GPU allocation fix Signed-off-by: AlexGrinch <[email protected]> * map_location instead of set_device Signed-off-by: AlexGrinch <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Shallow fusion (#2315) * fixed branch in IR tutorial Signed-off-by: AlexGrinch <[email protected]> * shallow fusion init commit Signed-off-by: AlexGrinch <[email protected]> * debug info removed Signed-off-by: AlexGrinch <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * [BUGFIX] Add upper bound to hydra for 1.0.x (#2337) * upper bound hydra Signed-off-by: ericharper <[email protected]> * upper bound hydra Signed-off-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update version number Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update package version Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * sparrowhawk tests + punctuation post processing for pynini TN (#2320) * add jenkins test, refactoring Signed-off-by: ekmb <[email protected]> * update test Signed-off-by: ekmb <[email protected]> * fix new test Signed-off-by: ekmb <[email protected]> * add serial to the default normalizer, add tests Signed-off-by: ekmb <[email protected]> * manifest test added Signed-off-by: ekmb <[email protected]> * expose more params, new test cases Signed-off-by: ekmb <[email protected]> * fix jenkins, serial clean, exclude range from cardinal Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email 
protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * addressed review comments Signed-off-by: ekmb <[email protected]> * fix decimal in measure Signed-off-by: ekmb <[email protected]> * move serial in cardinal Signed-off-by: ekmb <[email protected]> * sh tests init Signed-off-by: ekmb <[email protected]> * sparrowhawk container tests support added Signed-off-by: ekmb <[email protected]> * add post process to normalize.py, update tests Signed-off-by: ekmb <[email protected]> * remove duplication Signed-off-by: ekmb <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update notebooks to 1.0.2 release (#2338) Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update ranges for omegaconf and hydra (#2336) * Update ranges Signed-off-by: smajumdar <[email protected]> * Updates for Hydra and OmegaConf updates Signed-off-by: smajumdar <[email protected]> * Style fixes Signed-off-by: smajumdar <[email protected]> * Correct tests and revert patch for model utils Signed-off-by: smajumdar <[email protected]> * Correct docstring Signed-off-by: smajumdar <[email protected]> * Revert unnecessary change Signed-off-by: smajumdar <[email protected]> * Revert unnecessary change Signed-off-by: smajumdar <[email protected]> * Guard scheduler for None Signed-off-by: smajumdar <[email protected]> * default to 0.0 if bpe_dropout is None Signed-off-by: ericharper <[email protected]> * Correctly log class that was restored Signed-off-by: smajumdar <[email protected]> * Root patch *bpe_dropout Signed-off-by: smajumdar <[email protected]> Co-authored-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update FastPitch Export (#2355) Signed-off-by: Jason <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * byt5 unicode implementation, first cut Signed-off-by: mchrzanowski <[email protected]> * add 
bytelevel tokenizer Signed-off-by: mchrzanowski <[email protected]> * update out_dir to not collide (#2358) Signed-off-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update container version to 21.05 (#2309) * Update container version Signed-off-by: smajumdar <[email protected]> * Temporarily change export format of waveglow Signed-off-by: smajumdar <[email protected]> * Add conda update for numba Signed-off-by: smajumdar <[email protected]> * Update numba compat via global flag for strictness level `--relax_numba_compat`, remove pytorchlightning.metrics, refactor out numba utils to core, update tests Signed-off-by: smajumdar <[email protected]> * Correct order of numba minimum verion, remove wrong flag from test Signed-off-by: smajumdar <[email protected]> * Double test of cuda numba Signed-off-by: smajumdar <[email protected]> * Double test of cuda numba Signed-off-by: smajumdar <[email protected]> * Enable RNNT tests Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Text Normalization Update (#2356) * upper cased date support Signed-off-by: ekmb <[email protected]> * update whitelist, change roman weights Signed-off-by: ekmb <[email protected]> * docstrings, space fix, init file Signed-off-by: ekmb <[email protected]> * lgtm Signed-off-by: ekmb <[email protected]> * fraction with measure class Signed-off-by: ekmb <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * address comment Signed-off-by: mchrzanowski <[email protected]> * Add ASR CTC tutorial on fine-tuning on another language (#2346) * Add ASR CTC Language finetuning notebook Signed-off-by: smajumdar <[email protected]> * Add to documentation Signed-off-by: smajumdar <[email protected]> * Improve documentation Signed-off-by: smajumdar <[email protected]> * Correct name of the dataset Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Correct colab link to 
notebook (#2366) Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * sgdqa update data directories for testing (#2323) * sgdqa update data directories for testing Signed-off-by: Yang Zhang <[email protected]> * fix syntax Signed-off-by: Yang Zhang <[email protected]> * check if data dir exists Signed-off-by: Yang Zhang <[email protected]> * fix Signed-off-by: Yang Zhang <[email protected]> * adding pretrained model Signed-off-by: Yang Zhang <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Added documentation for export() (#2330) * Added export document Signed-off-by: Boris Fomitchev <[email protected]> * Addressed review comments Signed-off-by: Boris Fomitchev <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update Citrinet model card info (#2369) * Update model card info Signed-off-by: smajumdar <[email protected]> * Cleanup Docs Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * [NMT] Model Parallel Megatron Encoders (#2238) * add megatron encoder Signed-off-by: ericharper <[email protected]> * added megatron to get_nmt_tokenizer Signed-off-by: ericharper <[email protected]> * add vocab_size and hidden_size to megatron bert Signed-off-by: ericharper <[email protected]> * add megatron encoder module Signed-off-by: ericharper <[email protected]> * fixed horrible typo Signed-off-by: ericharper <[email protected]> * fix typo and add default Signed-off-by: ericharper <[email protected]> * updating nlp overrides for mp nmt Signed-off-by: ericharper <[email protected]> * move some logic back to nlpmodel from overrides Signed-off-by: ericharper <[email protected]> * add checkpoint_file property Signed-off-by: ericharper <[email protected]> * fix property Signed-off-by: ericharper <[email protected]> * num_tokentypes=0 Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper 
<[email protected]> * typo Signed-off-by: ericharper <[email protected]> * find_unused_parameters=True Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * style Signed-off-by: ericharper <[email protected]> * get instead of pop Signed-off-by: ericharper <[email protected]> * remove token type ids from megatron input example Signed-off-by: ericharper <[email protected]> * pop vocab_size Signed-off-by: ericharper <[email protected]> * fix checkpointing for model parallel Signed-off-by: ericharper <[email protected]> * fix bug in non model parallel Signed-off-by: ericharper <[email protected]> * convert cfg.trainer to dict Signed-off-by: ericharper <[email protected]> * make num_tokentypes configurable for nmt Signed-off-by: ericharper <[email protected]> * update checkpoint_file when using named megatron model in nemo Signed-off-by: ericharper <[email protected]> * make vocab_file configurable Signed-off-by: ericharper <[email protected]> * dataclass can't have mutable default Signed-off-by: ericharper <[email protected]> * style Signed-off-by: ericharper <[email protected]> * unused imports Signed-off-by: ericharper <[email protected]> * revert input example Signed-off-by: ericharper <[email protected]> * check that checkpoint version is not None Signed-off-by: ericharper <[email protected]> * add mp jenkins test Signed-off-by: ericharper <[email protected]> * update docstring Signed-off-by: ericharper <[email protected]> * add docs for pretrained encoders with nemo nmt Signed-off-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Add notebook with recommendations for 8 kHz speech (#2326) * Added a notebook with best practices for telephony speech * Added datasets detaiils * Added training recommendations * Emptied out cells with results * Added tutorial to docs Signed-off-by: jbalam <[email protected]> * Addressed review comments Signed-off-by: jbalam <[email protected]> * Added a 
line to note original sampling rate of an4 Signed-off-by: jbalam <[email protected]> * Made changes suggested in review Signed-off-by: jbalam <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Add FastEmit support for RNNT Losses (#2374) * Temp commit Signed-off-by: smajumdar <[email protected]> * Initial code for fastemit forward pass Signed-off-by: smajumdar <[email protected]> * Correct return reg value Signed-off-by: smajumdar <[email protected]> * Initial cpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Correct few impl Signed-off-by: smajumdar <[email protected]> * Update fastemit scaling Signed-off-by: smajumdar <[email protected]> * Cleanup fastemit Signed-off-by: smajumdar <[email protected]> * Finalize FastEmit regularization PR Signed-off-by: smajumdar <[email protected]> * Refactor code to support fastemit regularization Signed-off-by: smajumdar <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * byt5 unicode implementation, first cut Signed-off-by: Mike Chrzanowski <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * add bytelevel tokenizer Signed-off-by: Mike Chrzanowski <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update styling Signed-off-by: Mike Chrzanowski <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * avoid circular import Signed-off-by: Mike Chrzanowski <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * fix bugs in hifigan code (#2392) Signed-off-by: Oktai Tatanov <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update setup.py (#2394) Signed-off-by: Jason <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update bytelevel_tokenizer.py Signed-off-by: mchrzanowski <[email protected]> * Update 
bytelevel_tokenizer.py Signed-off-by: mchrzanowski <[email protected]> * typo Signed-off-by: mchrzanowski <[email protected]> * missed one Signed-off-by: mchrzanowski <[email protected]> * bug fixes Signed-off-by: mchrzanowski <[email protected]> * style fix Signed-off-by: mchrzanowski <[email protected]> * bytelevelprocessor is now generic. Signed-off-by: mchrzanowski <[email protected]> * style fix Signed-off-by: mchrzanowski <[email protected]> * update checkpointing (#2396) Signed-off-by: Jason <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * style Signed-off-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * woops, didnt merge jenkinsfile the right way * add newline Signed-off-by: mchrzanowski <[email protected]> * undo changes to enja processor Signed-off-by: mchrzanowski <[email protected]> * processor selection decision fix Signed-off-by: mchrzanowski <[email protected]> * newline fix Signed-off-by: mchrzanowski <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Jason <[email protected]> Co-authored-by: mchrzanowski <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Boris Fomitchev <[email protected]> Co-authored-by: Jagadeesh Balam <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Co-authored-by: Oktai Tatanov <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: root <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * Minor Fix Signed-off-by: Tuan Lai <[email protected]> * Minor Fixes Signed-off-by: Tuan Lai <[email protected]> * Add 
TextNormalizationTestDataset and testing/evaluation code Signed-off-by: Tuan Lai <[email protected]> * Add TextNormalizationTaggerDataset and training code for tagger Signed-off-by: Tuan Lai <[email protected]> * Restore from local nemo ckpts Signed-off-by: Tuan Lai <[email protected]> * Add TextNormalizationDecoderDataset Signed-off-by: Tuan Lai <[email protected]> * Add interactive mode for neural_text_normalization_test.py Signed-off-by: Tuan Lai <[email protected]> * Add options to do training or not for tagger/decoder Signed-off-by: Tuan Lai <[email protected]> * Renamed Signed-off-by: Tuan Lai <[email protected]> * Implemented setup dataloader for decoder Signed-off-by: Tuan Lai <[email protected]> * Implemented training and validation for decoder Signed-off-by: Tuan Lai <[email protected]> * Data augmentation for decoder training Signed-off-by: Tuan Lai <[email protected]> * Config change Signed-off-by: Tuan Lai <[email protected]> * add blossom-ci.yml (#2401) Signed-off-by: ericharper <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * Merge r1.1 bugfixes into main (#2407) * Update notebook branch and Jenkinsfile for 1.1.0 testing (#2378) * update branch Signed-off-by: ericharper <[email protected]> * update jenkinsfile Signed-off-by: ericharper <[email protected]> * [BUGFIX] NMT Multi-node was incorrectly computing num_replicas (#2380) * fix property when not using model parallel Signed-off-by: ericharper <[email protected]> * fix property when not using model parallel Signed-off-by: ericharper <[email protected]> * add debug statement Signed-off-by: ericharper <[email protected]> * add debug statement Signed-off-by: ericharper <[email protected]> * instantiate with NLPDDPPlugin with num_nodes from trainer config Signed-off-by: ericharper <[email protected]> * Update ASR scripts for tokenizer building and tarred dataset building (#2381) * Update ASR scripts for tokenizer building and tarred dataset building Signed-off-by: smajumdar <[email 
protected]> * Update container Signed-off-by: smajumdar <[email protected]> * Add STT Zh Citrinet 1024 Gamma 0.25 model Signed-off-by: smajumdar <[email protected]> * Update notebook (#2391) Signed-off-by: smajumdar <[email protected]> * ASR Notebooks fix for 1.1.0 (#2395) * nb fix for spring clean Signed-off-by: fayejf <[email protected]> * remove outdated instruction Signed-off-by: fayejf <[email protected]> * Mean normalization (#2397) * norm embeddings Signed-off-by: nithinraok <[email protected]> * move to utils Signed-off-by: nithinraok <[email protected]> * Bugfix adaptive spec augment time masking (#2398) * bugfix adaptive spec augment Signed-off-by: smajumdar <[email protected]> * Revert freq mask guard Signed-off-by: smajumdar <[email protected]> * Revert freq mask guard Signed-off-by: smajumdar <[email protected]> * Remove static time width clamping Signed-off-by: smajumdar <[email protected]> * Correct typos and issues with notebooks (#2402) * Fix Primer notebook Signed-off-by: smajumdar <[email protected]> * Typo Signed-off-by: smajumdar <[email protected]> * remove accelerator=DDP in tutorial notebooks to avoid errors. 
(#2403) Signed-off-by: Hoo Chang Shin <[email protected]> Co-authored-by: Hoo Chang Shin <[email protected]> * style Signed-off-by: ericharper <[email protected]> * update jenkins branch Signed-off-by: ericharper <[email protected]> * update notebook branch to main Signed-off-by: ericharper <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: fayejf <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: khcs <[email protected]> Co-authored-by: Hoo Chang Shin <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * Remove unused imports Signed-off-by: Tuan Lai <[email protected]> * Add initial doc for text_normalization Signed-off-by: Tuan Lai <[email protected]> * Fixed imports warnings Signed-off-by: Tuan Lai <[email protected]> * Minor Fix Signed-off-by: Tuan Lai <[email protected]> * Renamed Signed-off-by: Tuan Lai <[email protected]> * Allowed duplex modes Signed-off-by: Tuan Lai <[email protected]> * Minor Fix Signed-off-by: Tuan Lai <[email protected]> * Add docs for duplex_text_normalization_train and duplex_text_normalization_test Signed-off-by: Tuan Lai <[email protected]> * docstrings for model codes + minor fix Signed-off-by: Tuan Lai <[email protected]> * Add more comments and doc strings Signed-off-by: Tuan Lai <[email protected]> * Add doc for datasets + Use time.perf_counter() Signed-off-by: Tuan Lai <[email protected]> * Add code for preprocessing Google TN data Signed-off-by: Tuan Lai <[email protected]> * Add more docs and comments + Minor Fixes Signed-off-by: Tuan Lai <[email protected]> * Add more licenses + Fixed comments + Minors Signed-off-by: Tuan Lai <[email protected]> * Moved evaluation logic to DuplexTextNormalizationModel Signed-off-by: Tuan Lai <[email protected]> * Add logging errors Signed-off-by: Tuan Lai <[email protected]> * Updated validation code of tagger + Minors Signed-off-by: Tuan Lai <[email protected]> * Also write tag preds to log file 
Signed-off-by: Tuan Lai <[email protected]> * Add data augmentation for tagger dataset Signed-off-by: Tuan Lai <[email protected]> * Added experimental decorators Signed-off-by: Tuan Lai <[email protected]> * Updated docs Signed-off-by: Tuan Lai <[email protected]> * Updated duplex_tn_config.yaml Signed-off-by: Tuan Lai <[email protected]> * Compute token precision of tagger using NeMo metrics Signed-off-by: Tuan Lai <[email protected]> * Fixed saving issue when using ddp accelerator Signed-off-by: Tuan Lai <[email protected]> * Refactoring Signed-off-by: Tuan Lai <[email protected]> * Add option to keep punctuations in TextNormalizationTestDataset Signed-off-by: Tuan Lai <[email protected]> * Changes to input preprocessing + decoder's postprocessing Signed-off-by: Tuan Lai <[email protected]> * Fixed styles + Add references Signed-off-by: Tuan Lai <[email protected]> * Renamed examples/nlp/duplex_text_normalization/utils.py to helpers.py Signed-off-by: Tuan Lai <[email protected]> Co-authored-by: Jagadeesh Balam <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Co-authored-by: Oktai Tatanov <[email protected]> Co-authored-by: Jason <[email protected]> Co-authored-by: Mike Chrzanowski <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: mchrzanowski <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Boris Fomitchev <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: fayejf <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: khcs <[email protected]> 
Co-authored-by: Hoo Chang Shin <[email protected]>
Co-authored-by: Hoo Chang Shin <[email protected]>
* upper bound for webdataset Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Correct Dockerfile Signed-off-by: smajumdar <[email protected]> Signed-off-by: Micha Livne <[email protected]> * update readmes Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Micha Livne <[email protected]> * update README (#2332) Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Micha Livne <[email protected]> * ddp translate GPU allocation fix (#2312) * fixed branch in IR tutorial Signed-off-by: AlexGrinch <[email protected]> * ddp translate GPU allocation fix Signed-off-by: AlexGrinch <[email protected]> * map_location instead of set_device Signed-off-by: AlexGrinch <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Shallow fusion (#2315) * fixed branch in IR tutorial Signed-off-by: AlexGrinch <[email protected]> * shallow fusion init commit Signed-off-by: AlexGrinch <[email protected]> * debug info removed Signed-off-by: AlexGrinch <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: Micha Livne <[email protected]> * [BUGFIX] Add upper bound to hydra for 1.0.x (#2337) * upper bound hydra Signed-off-by: ericharper <[email protected]> * upper bound hydra Signed-off-by: ericharper <[email protected]> Signed-off-by: Micha Livne <[email protected]> * update version number Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Micha Livne <[email protected]> * update package version Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Micha Livne <[email protected]> * sparrowhawk tests + punctuation post processing for pynini TN (#2320) * add jenkins test, refactoring Signed-off-by: ekmb <[email protected]> * update test Signed-off-by: ekmb <[email protected]> * 
fix new test Signed-off-by: ekmb <[email protected]> * add serial to the default normalizer, add tests Signed-off-by: ekmb <[email protected]> * manifest test added Signed-off-by: ekmb <[email protected]> * expose more params, new test cases Signed-off-by: ekmb <[email protected]> * fix jenkins, serial clean, exclude range from cardinal Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * addressed review comments Signed-off-by: ekmb <[email protected]> * fix decimal in measure Signed-off-by: ekmb <[email protected]> * move serial in cardinal Signed-off-by: ekmb <[email protected]> * sh tests init Signed-off-by: ekmb <[email protected]> * sparrowhawk container tests support added Signed-off-by: ekmb <[email protected]> * add post process to normalize.py, update tests Signed-off-by: ekmb <[email protected]> * remove duplication Signed-off-by: ekmb <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Update notebooks to 1.0.2 release (#2338) Signed-off-by: smajumdar <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Update ranges for omegaconf and hydra (#2336) * Update ranges Signed-off-by: smajumdar <[email protected]> * Updates for Hydra and OmegaConf updates Signed-off-by: smajumdar <[email protected]> * Style fixes Signed-off-by: smajumdar <[email protected]> * Correct tests and revert patch for model utils Signed-off-by: smajumdar <[email protected]> * Correct docstring Signed-off-by: smajumdar <[email protected]> * Revert unnecessary change Signed-off-by: smajumdar <[email protected]> * Revert unnecessary change Signed-off-by: smajumdar <[email protected]> * Guard scheduler for None Signed-off-by: smajumdar <[email protected]> * default to 0.0 if bpe_dropout is None Signed-off-by: ericharper <[email 
protected]> * Correctly log class that was restored Signed-off-by: smajumdar <[email protected]> * Root patch *bpe_dropout Signed-off-by: smajumdar <[email protected]> Co-authored-by: ericharper <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Update FastPitch Export (#2355) Signed-off-by: Jason <[email protected]> Signed-off-by: Micha Livne <[email protected]> * update out_dir to not collide (#2358) Signed-off-by: ericharper <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Update container version to 21.05 (#2309) * Update container version Signed-off-by: smajumdar <[email protected]> * Temporarily change export format of waveglow Signed-off-by: smajumdar <[email protected]> * Add conda update for numba Signed-off-by: smajumdar <[email protected]> * Update numba compat via global flag for strictness level `--relax_numba_compat`, remove pytorchlightning.metrics, refactor out numba utils to core, update tests Signed-off-by: smajumdar <[email protected]> * Correct order of numba minimum verion, remove wrong flag from test Signed-off-by: smajumdar <[email protected]> * Double test of cuda numba Signed-off-by: smajumdar <[email protected]> * Double test of cuda numba Signed-off-by: smajumdar <[email protected]> * Enable RNNT tests Signed-off-by: smajumdar <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Text Normalization Update (#2356) * upper cased date support Signed-off-by: ekmb <[email protected]> * update whitelist, change roman weights Signed-off-by: ekmb <[email protected]> * docstrings, space fix, init file Signed-off-by: ekmb <[email protected]> * lgtm Signed-off-by: ekmb <[email protected]> * fraction with measure class Signed-off-by: ekmb <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Add ASR CTC tutorial on fine-tuning on another language (#2346) * Add ASR CTC Language finetuning notebook Signed-off-by: smajumdar <[email protected]> * Add to documentation Signed-off-by: 
smajumdar <[email protected]> * Improve documentation Signed-off-by: smajumdar <[email protected]> * Correct name of the dataset Signed-off-by: smajumdar <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Correct colab link to notebook (#2366) Signed-off-by: smajumdar <[email protected]> Signed-off-by: Micha Livne <[email protected]> * sgdqa update data directories for testing (#2323) * sgdqa update data directories for testing Signed-off-by: Yang Zhang <[email protected]> * fix syntax Signed-off-by: Yang Zhang <[email protected]> * check if data dir exists Signed-off-by: Yang Zhang <[email protected]> * fix Signed-off-by: Yang Zhang <[email protected]> * adding pretrained model Signed-off-by: Yang Zhang <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Added documentation for export() (#2330) * Added export document Signed-off-by: Boris Fomitchev <[email protected]> * Addressed review comments Signed-off-by: Boris Fomitchev <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Update Citrinet model card info (#2369) * Update model card info Signed-off-by: smajumdar <[email protected]> * Cleanup Docs Signed-off-by: smajumdar <[email protected]> Signed-off-by: Micha Livne <[email protected]> * [NMT] Model Parallel Megatron Encoders (#2238) * add megatron encoder Signed-off-by: ericharper <[email protected]> * added megatron to get_nmt_tokenizer Signed-off-by: ericharper <[email protected]> * add vocab_size and hidden_size to megatron bert Signed-off-by: ericharper <[email protected]> * add megatron encoder module Signed-off-by: ericharper <[email protected]> * fixed horrible typo Signed-off-by: ericharper <[email protected]> * fix typo and add default Signed-off-by: ericharper <[email protected]> * updating nlp overrides for mp nmt Signed-off-by: ericharper <[email protected]> * move some logic back to nlpmodel from overrides Signed-off-by: ericharper <[email 
protected]> * add checkpoint_file property Signed-off-by: ericharper <[email protected]> * fix property Signed-off-by: ericharper <[email protected]> * num_tokentypes=0 Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * find_unused_parameters=True Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * style Signed-off-by: ericharper <[email protected]> * get instead of pop Signed-off-by: ericharper <[email protected]> * remove token type ids from megatron input example Signed-off-by: ericharper <[email protected]> * pop vocab_size Signed-off-by: ericharper <[email protected]> * fix checkpointing for model parallel Signed-off-by: ericharper <[email protected]> * fix bug in non model parallel Signed-off-by: ericharper <[email protected]> * convert cfg.trainer to dict Signed-off-by: ericharper <[email protected]> * make num_tokentypes configurable for nmt Signed-off-by: ericharper <[email protected]> * update checkpoint_file when using named megatron model in nemo Signed-off-by: ericharper <[email protected]> * make vocab_file configurable Signed-off-by: ericharper <[email protected]> * dataclass can't have mutable default Signed-off-by: ericharper <[email protected]> * style Signed-off-by: ericharper <[email protected]> * unused imports Signed-off-by: ericharper <[email protected]> * revert input example Signed-off-by: ericharper <[email protected]> * check that checkpoint version is not None Signed-off-by: ericharper <[email protected]> * add mp jenkins test Signed-off-by: ericharper <[email protected]> * update docstring Signed-off-by: ericharper <[email protected]> * add docs for pretrained encoders with nemo nmt Signed-off-by: ericharper <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Add notebook with recommendations for 8 kHz speech (#2326) * Added a notebook with best practices for telephony 
speech * Added datasets detaiils * Added training recommendations * Emptied out cells with results * Added tutorial to docs Signed-off-by: jbalam <[email protected]> * Addressed review comments Signed-off-by: jbalam <[email protected]> * Added a line to note original sampling rate of an4 Signed-off-by: jbalam <[email protected]> * Made changes suggested in review Signed-off-by: jbalam <[email protected]> Signed-off-by: Micha Livne <[email protected]> * 1. Working on bottleneck transformers. Signed-off-by: Micha Livne <[email protected]> * 1. Working on bottleneck transformers. * 1. Done cleaning code of bottleneck transformers. 2. Ready to test. Signed-off-by: Micha Livne <[email protected]> * 1. Done cleaning code of bottleneck transformers. 2. Ready to test. * 1. Working on training script. Signed-off-by: Micha Livne <[email protected]> * 1. Working on training script. * 1. Updated config class name. Signed-off-by: Micha Livne <[email protected]> * 1. Updated config class name. * 1. Training script is ready to be tested. Signed-off-by: Micha Livne <[email protected]> * 1. Training script is ready to be tested. * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. 
* Add FastEmit support for RNNT Losses (#2374) * Temp commit Signed-off-by: smajumdar <[email protected]> * Initial code for fastemit forward pass Signed-off-by: smajumdar <[email protected]> * Correct return reg value Signed-off-by: smajumdar <[email protected]> * Initial cpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Correct few impl Signed-off-by: smajumdar <[email protected]> * Update fastemit scaling Signed-off-by: smajumdar <[email protected]> * Cleanup fastemit Signed-off-by: smajumdar <[email protected]> * Finalize FastEmit regularization PR Signed-off-by: smajumdar <[email protected]> * Refactor code to support fastemit regularization Signed-off-by: smajumdar <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. * 1. Fixed bugs. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed bugs. * 1. Fixed missing import. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed missing import. * 1. Fixed support in seq2seq-br. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed support in seq2seq-br. * 1. Added NLPDDPPlugin. Signed-off-by: Micha Livne <[email protected]> * 1. Added NLPDDPPlugin. * fix bugs in hifigan code (#2392) Signed-off-by: Oktai Tatanov <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Update setup.py (#2394) Signed-off-by: Jason <[email protected]> Signed-off-by: Micha Livne <[email protected]> * 1. Updated to support multi-node training. Signed-off-by: Micha Livne <[email protected]> * 1. Added comments. Signed-off-by: Micha Livne <[email protected]> * 1. MTBottleneckModel is in its own file mt_enc_dec_bottleneck_model. 
Signed-off-by: Micha Livne <[email protected]> * 1. Switched loss annealing to rely on self.trainer.global_step Signed-off-by: Micha Livne <[email protected]> * 1. Added comments regrding the use of return_ortho_loss. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Added detailed logging of loss during training (still need to do the same for eval). Signed-off-by: Micha Livne <[email protected]> * 1. Testing a fix to import bug. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging wrong import issue. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Added logging of results to validation step (no tested yet). Signed-off-by: Micha Livne <[email protected]> * 1. Fixed missing import. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Testing failing immports. Signed-off-by: Micha Livne <[email protected]> * 1. Disabling changes. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Enabled bottleneck architecture. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed identation. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed import statement. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed typo. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed logging of arbitrary values. 
Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed torch lightining logging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Added a missing import. Signed-off-by: Micha Livne <[email protected]> * 1. Added NLPDDPPlugin. Signed-off-by: Micha Livne <[email protected]> * 1. Cleaned style. Signed-off-by: Micha Livne <[email protected]> * 1. Updated sign of computed loss. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed double import. Signed-off-by: Micha Livne <[email protected]> * 1. Moved logging of additional loss terms into MTBottleneckModel class. Signed-off-by: Micha Livne <[email protected]> * 1. Updated permissions. Signed-off-by: Micha Livne <[email protected]> * 1. Added initial perceiver package. Signed-off-by: Micha Livne <[email protected]> * 1. Working on encoder. Signed-off-by: Micha Livne <[email protected]> * 1. Testing perceiver. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. FInished implementing Perceiver. Signed-off-by: Micha Livne <[email protected]> * 1. Updated default arch. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. 
Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Ignoring independant perceiver implementation. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Added latent transformer to perceiver Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Added TransformerBottleneckDecoderNM. Signed-off-by: Micha Livne <[email protected]> * 1. Added TransformerBottleneckEncoderNM. Signed-off-by: Micha Livne <[email protected]> * 1. Updated bottleneck perceiver. Signed-off-by: Micha Livne <[email protected]> * 1. Updated MTBottleneckModel. Signed-off-by: Micha Livne <[email protected]> * 1. Added BridgeEncoder. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Cleaned code. 
Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Updated architecture name. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Added support in bridge encoder. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Added support in hidden_init_method to BridgeEncoder. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. 
Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Removed unneeded imports. Signed-off-by: Micha Livne <[email protected]> * 1. Updated comment in YAML Signed-off-by: Micha Livne <[email protected]> * 1. Fixed style. Signed-off-by: Micha Livne <[email protected]> * 1. Updated YAML comments. 2. hidden_blocks in bridge relates to post-processing after bridge1. Updated YAML comments. 2. hidden_blocks in bridge relates to post-processing after bridge (instead of hidden_blocks-1). Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Initial cross attention in Perceiver with params init has independant parameters. Signed-off-by: Micha Livne <[email protected]> * 1. Updated Perciver forward. Signed-off-by: Micha Livne <[email protected]> * 1. Updated TransformerEncoder to be a component as opposed to a parent class. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Updated example command. Signed-off-by: Micha Livne <[email protected]> * 1. forward nethod in MTBottleneckModel does not compute loss. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed style. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. 
Signed-off-by: Micha Livne <[email protected]> * 1. Added label smoothing for per-sample loss. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed style. Signed-off-by: Micha Livne <[email protected]> * 1. Updated recon_only loss to nll. Signed-off-by: Micha Livne <[email protected]> * 1. Update yaml doc. Signed-off-by: Micha Livne <[email protected]> * 1. Updated default config to have 32 hidden steps. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Updated doc. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed type. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed unreachable code bug. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed wrong sign for reconstruction per sample (instead of per token). Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed style. Signed-off-by: Micha Livne <[email protected]> * 1. Updated comments. 
Signed-off-by: Micha Livne <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Jason <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Boris Fomitchev <[email protected]> Co-authored-by: Jagadeesh Balam <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Co-authored-by: Oktai Tatanov <[email protected]>
<[email protected]> * move some logic back to nlpmodel from overrides Signed-off-by: ericharper <[email protected]> * add checkpoint_file property Signed-off-by: ericharper <[email protected]> * fix property Signed-off-by: ericharper <[email protected]> * num_tokentypes=0 Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * find_unused_parameters=True Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * style Signed-off-by: ericharper <[email protected]> * get instead of pop Signed-off-by: ericharper <[email protected]> * remove token type ids from megatron input example Signed-off-by: ericharper <[email protected]> * pop vocab_size Signed-off-by: ericharper <[email protected]> * fix checkpointing for model parallel Signed-off-by: ericharper <[email protected]> * fix bug in non model parallel Signed-off-by: ericharper <[email protected]> * convert cfg.trainer to dict Signed-off-by: ericharper <[email protected]> * make num_tokentypes configurable for nmt Signed-off-by: ericharper <[email protected]> * update checkpoint_file when using named megatron model in nemo Signed-off-by: ericharper <[email protected]> * make vocab_file configurable Signed-off-by: ericharper <[email protected]> * dataclass can't have mutable default Signed-off-by: ericharper <[email protected]> * style Signed-off-by: ericharper <[email protected]> * unused imports Signed-off-by: ericharper <[email protected]> * revert input example Signed-off-by: ericharper <[email protected]> * check that checkpoint version is not None Signed-off-by: ericharper <[email protected]> * add mp jenkins test Signed-off-by: ericharper <[email protected]> * update docstring Signed-off-by: ericharper <[email protected]> * add docs for pretrained encoders with nemo nmt Signed-off-by: ericharper <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Add 
notebook with recommendations for 8 kHz speech (NVIDIA#2326) * Added a notebook with best practices for telephony speech * Added datasets detaiils * Added training recommendations * Emptied out cells with results * Added tutorial to docs Signed-off-by: jbalam <[email protected]> * Addressed review comments Signed-off-by: jbalam <[email protected]> * Added a line to note original sampling rate of an4 Signed-off-by: jbalam <[email protected]> * Made changes suggested in review Signed-off-by: jbalam <[email protected]> Signed-off-by: Micha Livne <[email protected]> * 1. Working on bottleneck transformers. Signed-off-by: Micha Livne <[email protected]> * 1. Working on bottleneck transformers. * 1. Done cleaning code of bottleneck transformers. 2. Ready to test. Signed-off-by: Micha Livne <[email protected]> * 1. Done cleaning code of bottleneck transformers. 2. Ready to test. * 1. Working on training script. Signed-off-by: Micha Livne <[email protected]> * 1. Working on training script. * 1. Updated config class name. Signed-off-by: Micha Livne <[email protected]> * 1. Updated config class name. * 1. Training script is ready to be tested. Signed-off-by: Micha Livne <[email protected]> * 1. Training script is ready to be tested. * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. 
* Add FastEmit support for RNNT Losses (NVIDIA#2374) * Temp commit Signed-off-by: smajumdar <[email protected]> * Initial code for fastemit forward pass Signed-off-by: smajumdar <[email protected]> * Correct return reg value Signed-off-by: smajumdar <[email protected]> * Initial cpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Correct few impl Signed-off-by: smajumdar <[email protected]> * Update fastemit scaling Signed-off-by: smajumdar <[email protected]> * Cleanup fastemit Signed-off-by: smajumdar <[email protected]> * Finalize FastEmit regularization PR Signed-off-by: smajumdar <[email protected]> * Refactor code to support fastemit regularization Signed-off-by: smajumdar <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. * 1. Fixed bugs. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed bugs. * 1. Fixed missing import. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed missing import. * 1. Fixed support in seq2seq-br. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed support in seq2seq-br. * 1. Added NLPDDPPlugin. Signed-off-by: Micha Livne <[email protected]> * 1. Added NLPDDPPlugin. * fix bugs in hifigan code (NVIDIA#2392) Signed-off-by: Oktai Tatanov <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Update setup.py (NVIDIA#2394) Signed-off-by: Jason <[email protected]> Signed-off-by: Micha Livne <[email protected]> * 1. Updated to support multi-node training. Signed-off-by: Micha Livne <[email protected]> * 1. Added comments. Signed-off-by: Micha Livne <[email protected]> * 1. MTBottleneckModel is in its own file mt_enc_dec_bottleneck_model. 
Signed-off-by: Micha Livne <[email protected]> * 1. Switched loss annealing to rely on self.trainer.global_step Signed-off-by: Micha Livne <[email protected]> * 1. Added comments regrding the use of return_ortho_loss. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Added detailed logging of loss during training (still need to do the same for eval). Signed-off-by: Micha Livne <[email protected]> * 1. Testing a fix to import bug. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging wrong import issue. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Added logging of results to validation step (no tested yet). Signed-off-by: Micha Livne <[email protected]> * 1. Fixed missing import. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Testing failing immports. Signed-off-by: Micha Livne <[email protected]> * 1. Disabling changes. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Enabled bottleneck architecture. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed identation. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed import statement. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed typo. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed logging of arbitrary values. 
Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed torch lightining logging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Added a missing import. Signed-off-by: Micha Livne <[email protected]> * 1. Added NLPDDPPlugin. Signed-off-by: Micha Livne <[email protected]> * 1. Cleaned style. Signed-off-by: Micha Livne <[email protected]> * 1. Updated sign of computed loss. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed double import. Signed-off-by: Micha Livne <[email protected]> * 1. Moved logging of additional loss terms into MTBottleneckModel class. Signed-off-by: Micha Livne <[email protected]> * 1. Updated permissions. Signed-off-by: Micha Livne <[email protected]> * 1. Added initial perceiver package. Signed-off-by: Micha Livne <[email protected]> * 1. Working on encoder. Signed-off-by: Micha Livne <[email protected]> * 1. Testing perceiver. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. FInished implementing Perceiver. Signed-off-by: Micha Livne <[email protected]> * 1. Updated default arch. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. 
Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Ignoring independant perceiver implementation. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Added latent transformer to perceiver Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Added TransformerBottleneckDecoderNM. Signed-off-by: Micha Livne <[email protected]> * 1. Added TransformerBottleneckEncoderNM. Signed-off-by: Micha Livne <[email protected]> * 1. Updated bottleneck perceiver. Signed-off-by: Micha Livne <[email protected]> * 1. Updated MTBottleneckModel. Signed-off-by: Micha Livne <[email protected]> * 1. Added BridgeEncoder. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Cleaned code. 
Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Updated architecture name. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Added support in bridge encoder. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Added support in hidden_init_method to BridgeEncoder. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. 
Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Removed unneeded imports. Signed-off-by: Micha Livne <[email protected]> * 1. Updated comment in YAML Signed-off-by: Micha Livne <[email protected]> * 1. Fixed style. Signed-off-by: Micha Livne <[email protected]> * 1. Updated YAML comments. 2. hidden_blocks in bridge relates to post-processing after bridge (instead of hidden_blocks-1). Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Initial cross attention in Perceiver with params init has independent parameters. Signed-off-by: Micha Livne <[email protected]> * 1. Updated Perceiver forward. Signed-off-by: Micha Livne <[email protected]> * 1. Updated TransformerEncoder to be a component as opposed to a parent class. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Updated example command. Signed-off-by: Micha Livne <[email protected]> * 1. forward method in MTBottleneckModel does not compute loss. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed style. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. 
Signed-off-by: Micha Livne <[email protected]> * 1. Added label smoothing for per-sample loss. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed style. Signed-off-by: Micha Livne <[email protected]> * 1. Updated recon_only loss to nll. Signed-off-by: Micha Livne <[email protected]> * 1. Update yaml doc. Signed-off-by: Micha Livne <[email protected]> * 1. Updated default config to have 32 hidden steps. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Updated doc. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed type. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed unreachable code bug. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed wrong sign for reconstruction per sample (instead of per token). Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed style. Signed-off-by: Micha Livne <[email protected]> * 1. Updated comments. 
Signed-off-by: Micha Livne <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Jason <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Boris Fomitchev <[email protected]> Co-authored-by: Jagadeesh Balam <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Co-authored-by: Oktai Tatanov <[email protected]> Signed-off-by: Jason <[email protected]>
* Add notebook with recommendations for 8 kHz speech (NVIDIA#2326) * Added a notebook with best practices for telephony speech * Added datasets detaiils * Added training recommendations * Emptied out cells with results * Added tutorial to docs Signed-off-by: jbalam <[email protected]> * Addressed review comments Signed-off-by: jbalam <[email protected]> * Added a line to note original sampling rate of an4 Signed-off-by: jbalam <[email protected]> * Made changes suggested in review Signed-off-by: jbalam <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * Add FastEmit support for RNNT Losses (NVIDIA#2374) * Temp commit Signed-off-by: smajumdar <[email protected]> * Initial code for fastemit forward pass Signed-off-by: smajumdar <[email protected]> * Correct return reg value Signed-off-by: smajumdar <[email protected]> * Initial cpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Correct few impl Signed-off-by: smajumdar <[email protected]> * Update fastemit scaling Signed-off-by: smajumdar <[email protected]> * Cleanup fastemit Signed-off-by: smajumdar <[email protected]> * Finalize FastEmit regularization PR Signed-off-by: smajumdar <[email protected]> * Refactor code to support fastemit regularization Signed-off-by: smajumdar <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * Implement inference functions of TN models Signed-off-by: Tuan Lai <[email protected]> * Minor Fix Signed-off-by: Tuan Lai <[email protected]> * fix bugs in hifigan code (NVIDIA#2392) Signed-off-by: Oktai Tatanov <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * Update setup.py (NVIDIA#2394) Signed-off-by: Jason <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * update checkpointing (NVIDIA#2396) Signed-off-by: Jason <[email protected]> Signed-off-by: Tuan Lai 
<[email protected]> * byt5 unicode implementation (NVIDIA#2365) * Audio Norm (NVIDIA#2285) * add jenkins test, refactoring Signed-off-by: ekmb <[email protected]> * update test Signed-off-by: ekmb <[email protected]> * fix new test Signed-off-by: ekmb <[email protected]> * add serial to the default normalizer, add tests Signed-off-by: ekmb <[email protected]> * manifest test added Signed-off-by: ekmb <[email protected]> * expose more params, new test cases Signed-off-by: ekmb <[email protected]> * fix jenkins, serial clean, exclude range from cardinal Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * addressed review comments Signed-off-by: ekmb <[email protected]> * fix decimal in measure Signed-off-by: ekmb <[email protected]> * move serial in cardinal Signed-off-by: ekmb <[email protected]> * clean up Signed-off-by: ekmb <[email protected]> * update for SH zero -> oh Signed-off-by: ekmb <[email protected]> * change n_tagger default Signed-off-by: ekmb <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * bumping version to 1.0.1 Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Add check for numba regardless of device Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * upper bound for webdataset Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Correct Dockerfile Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update readmes Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update README (NVIDIA#2332) Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski 
<[email protected]> * ddp translate GPU allocation fix (NVIDIA#2312) * fixed branch in IR tutorial Signed-off-by: AlexGrinch <[email protected]> * ddp translate GPU allocation fix Signed-off-by: AlexGrinch <[email protected]> * map_location instead of set_device Signed-off-by: AlexGrinch <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Shallow fusion (NVIDIA#2315) * fixed branch in IR tutorial Signed-off-by: AlexGrinch <[email protected]> * shallow fusion init commit Signed-off-by: AlexGrinch <[email protected]> * debug info removed Signed-off-by: AlexGrinch <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * [BUGFIX] Add upper bound to hydra for 1.0.x (NVIDIA#2337) * upper bound hydra Signed-off-by: ericharper <[email protected]> * upper bound hydra Signed-off-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update version number Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update package version Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * sparrowhawk tests + punctuation post processing for pynini TN (NVIDIA#2320) * add jenkins test, refactoring Signed-off-by: ekmb <[email protected]> * update test Signed-off-by: ekmb <[email protected]> * fix new test Signed-off-by: ekmb <[email protected]> * add serial to the default normalizer, add tests Signed-off-by: ekmb <[email protected]> * manifest test added Signed-off-by: ekmb <[email protected]> * expose more params, new test cases Signed-off-by: ekmb <[email protected]> * fix jenkins, serial clean, exclude range from cardinal Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * 
jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * addressed review comments Signed-off-by: ekmb <[email protected]> * fix decimal in measure Signed-off-by: ekmb <[email protected]> * move serial in cardinal Signed-off-by: ekmb <[email protected]> * sh tests init Signed-off-by: ekmb <[email protected]> * sparrowhawk container tests support added Signed-off-by: ekmb <[email protected]> * add post process to normalize.py, update tests Signed-off-by: ekmb <[email protected]> * remove duplication Signed-off-by: ekmb <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update notebooks to 1.0.2 release (NVIDIA#2338) Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update ranges for omegaconf and hydra (NVIDIA#2336) * Update ranges Signed-off-by: smajumdar <[email protected]> * Updates for Hydra and OmegaConf updates Signed-off-by: smajumdar <[email protected]> * Style fixes Signed-off-by: smajumdar <[email protected]> * Correct tests and revert patch for model utils Signed-off-by: smajumdar <[email protected]> * Correct docstring Signed-off-by: smajumdar <[email protected]> * Revert unnecessary change Signed-off-by: smajumdar <[email protected]> * Revert unnecessary change Signed-off-by: smajumdar <[email protected]> * Guard scheduler for None Signed-off-by: smajumdar <[email protected]> * default to 0.0 if bpe_dropout is None Signed-off-by: ericharper <[email protected]> * Correctly log class that was restored Signed-off-by: smajumdar <[email protected]> * Root patch *bpe_dropout Signed-off-by: smajumdar <[email protected]> Co-authored-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update FastPitch Export (NVIDIA#2355) Signed-off-by: Jason <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * byt5 unicode 
implementation, first cut Signed-off-by: mchrzanowski <[email protected]> * add bytelevel tokenizer Signed-off-by: mchrzanowski <[email protected]> * update out_dir to not collide (NVIDIA#2358) Signed-off-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update container version to 21.05 (NVIDIA#2309) * Update container version Signed-off-by: smajumdar <[email protected]> * Temporarily change export format of waveglow Signed-off-by: smajumdar <[email protected]> * Add conda update for numba Signed-off-by: smajumdar <[email protected]> * Update numba compat via global flag for strictness level `--relax_numba_compat`, remove pytorchlightning.metrics, refactor out numba utils to core, update tests Signed-off-by: smajumdar <[email protected]> * Correct order of numba minimum verion, remove wrong flag from test Signed-off-by: smajumdar <[email protected]> * Double test of cuda numba Signed-off-by: smajumdar <[email protected]> * Double test of cuda numba Signed-off-by: smajumdar <[email protected]> * Enable RNNT tests Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Text Normalization Update (NVIDIA#2356) * upper cased date support Signed-off-by: ekmb <[email protected]> * update whitelist, change roman weights Signed-off-by: ekmb <[email protected]> * docstrings, space fix, init file Signed-off-by: ekmb <[email protected]> * lgtm Signed-off-by: ekmb <[email protected]> * fraction with measure class Signed-off-by: ekmb <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * address comment Signed-off-by: mchrzanowski <[email protected]> * Add ASR CTC tutorial on fine-tuning on another language (NVIDIA#2346) * Add ASR CTC Language finetuning notebook Signed-off-by: smajumdar <[email protected]> * Add to documentation Signed-off-by: smajumdar <[email protected]> * Improve documentation Signed-off-by: smajumdar <[email protected]> * Correct name of the dataset Signed-off-by: 
smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Correct colab link to notebook (NVIDIA#2366) Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * sgdqa update data directories for testing (NVIDIA#2323) * sgdqa update data directories for testing Signed-off-by: Yang Zhang <[email protected]> * fix syntax Signed-off-by: Yang Zhang <[email protected]> * check if data dir exists Signed-off-by: Yang Zhang <[email protected]> * fix Signed-off-by: Yang Zhang <[email protected]> * adding pretrained model Signed-off-by: Yang Zhang <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Added documentation for export() (NVIDIA#2330) * Added export document Signed-off-by: Boris Fomitchev <[email protected]> * Addressed review comments Signed-off-by: Boris Fomitchev <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update Citrinet model card info (NVIDIA#2369) * Update model card info Signed-off-by: smajumdar <[email protected]> * Cleanup Docs Signed-off-by: smajumdar <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * [NMT] Model Parallel Megatron Encoders (NVIDIA#2238) * add megatron encoder Signed-off-by: ericharper <[email protected]> * added megatron to get_nmt_tokenizer Signed-off-by: ericharper <[email protected]> * add vocab_size and hidden_size to megatron bert Signed-off-by: ericharper <[email protected]> * add megatron encoder module Signed-off-by: ericharper <[email protected]> * fixed horrible typo Signed-off-by: ericharper <[email protected]> * fix typo and add default Signed-off-by: ericharper <[email protected]> * updating nlp overrides for mp nmt Signed-off-by: ericharper <[email protected]> * move some logic back to nlpmodel from overrides Signed-off-by: ericharper <[email protected]> * add checkpoint_file property Signed-off-by: ericharper <[email protected]> * fix property 
Signed-off-by: ericharper <[email protected]> * num_tokentypes=0 Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * find_unused_parameters=True Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * style Signed-off-by: ericharper <[email protected]> * get instead of pop Signed-off-by: ericharper <[email protected]> * remove token type ids from megatron input example Signed-off-by: ericharper <[email protected]> * pop vocab_size Signed-off-by: ericharper <[email protected]> * fix checkpointing for model parallel Signed-off-by: ericharper <[email protected]> * fix bug in non model parallel Signed-off-by: ericharper <[email protected]> * convert cfg.trainer to dict Signed-off-by: ericharper <[email protected]> * make num_tokentypes configurable for nmt Signed-off-by: ericharper <[email protected]> * update checkpoint_file when using named megatron model in nemo Signed-off-by: ericharper <[email protected]> * make vocab_file configurable Signed-off-by: ericharper <[email protected]> * dataclass can't have mutable default Signed-off-by: ericharper <[email protected]> * style Signed-off-by: ericharper <[email protected]> * unused imports Signed-off-by: ericharper <[email protected]> * revert input example Signed-off-by: ericharper <[email protected]> * check that checkpoint version is not None Signed-off-by: ericharper <[email protected]> * add mp jenkins test Signed-off-by: ericharper <[email protected]> * update docstring Signed-off-by: ericharper <[email protected]> * add docs for pretrained encoders with nemo nmt Signed-off-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Add notebook with recommendations for 8 kHz speech (NVIDIA#2326) * Added a notebook with best practices for telephony speech * Added datasets detaiils * Added training recommendations * Emptied out cells with results 
* Added tutorial to docs Signed-off-by: jbalam <[email protected]> * Addressed review comments Signed-off-by: jbalam <[email protected]> * Added a line to note original sampling rate of an4 Signed-off-by: jbalam <[email protected]> * Made changes suggested in review Signed-off-by: jbalam <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Add FastEmit support for RNNT Losses (NVIDIA#2374) * Temp commit Signed-off-by: smajumdar <[email protected]> * Initial code for fastemit forward pass Signed-off-by: smajumdar <[email protected]> * Correct return reg value Signed-off-by: smajumdar <[email protected]> * Initial cpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Correct few impl Signed-off-by: smajumdar <[email protected]> * Update fastemit scaling Signed-off-by: smajumdar <[email protected]> * Cleanup fastemit Signed-off-by: smajumdar <[email protected]> * Finalize FastEmit regularization PR Signed-off-by: smajumdar <[email protected]> * Refactor code to support fastemit regularization Signed-off-by: smajumdar <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * byt5 unicode implementation, first cut Signed-off-by: Mike Chrzanowski <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * add bytelevel tokenizer Signed-off-by: Mike Chrzanowski <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * update styling Signed-off-by: Mike Chrzanowski <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * avoid circular import Signed-off-by: Mike Chrzanowski <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * fix bugs in hifigan code (NVIDIA#2392) Signed-off-by: Oktai Tatanov <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * Update setup.py (NVIDIA#2394) Signed-off-by: Jason <[email 
protected]> Signed-off-by: mchrzanowski <[email protected]> * Update bytelevel_tokenizer.py Signed-off-by: mchrzanowski <[email protected]> * Update bytelevel_tokenizer.py Signed-off-by: mchrzanowski <[email protected]> * typo Signed-off-by: mchrzanowski <[email protected]> * missed one Signed-off-by: mchrzanowski <[email protected]> * bug fixes Signed-off-by: mchrzanowski <[email protected]> * style fix Signed-off-by: mchrzanowski <[email protected]> * bytelevelprocessor is now generic. Signed-off-by: mchrzanowski <[email protected]> * style fix Signed-off-by: mchrzanowski <[email protected]> * update checkpointing (NVIDIA#2396) Signed-off-by: Jason <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * style Signed-off-by: ericharper <[email protected]> Signed-off-by: mchrzanowski <[email protected]> * woops, didnt merge jenkinsfile the right way * add newline Signed-off-by: mchrzanowski <[email protected]> * undo changes to enja processor Signed-off-by: mchrzanowski <[email protected]> * processor selection decision fix Signed-off-by: mchrzanowski <[email protected]> * newline fix Signed-off-by: mchrzanowski <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Jason <[email protected]> Co-authored-by: mchrzanowski <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Boris Fomitchev <[email protected]> Co-authored-by: Jagadeesh Balam <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Co-authored-by: Oktai Tatanov <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: root <[email protected]> Signed-off-by: Tuan Lai 
<[email protected]> * Minor Fix Signed-off-by: Tuan Lai <[email protected]> * Minor Fixes Signed-off-by: Tuan Lai <[email protected]> * Add TextNormalizationTestDataset and testing/evaluation code Signed-off-by: Tuan Lai <[email protected]> * Add TextNormalizationTaggerDataset and training code for tagger Signed-off-by: Tuan Lai <[email protected]> * Restore from local nemo ckpts Signed-off-by: Tuan Lai <[email protected]> * Add TextNormalizationDecoderDataset Signed-off-by: Tuan Lai <[email protected]> * Add interactive mode for neural_text_normalization_test.py Signed-off-by: Tuan Lai <[email protected]> * Add options to do training or not for tagger/decoder Signed-off-by: Tuan Lai <[email protected]> * Renamed Signed-off-by: Tuan Lai <[email protected]> * Implemented setup dataloader for decoder Signed-off-by: Tuan Lai <[email protected]> * Implemented training and validation for decoder Signed-off-by: Tuan Lai <[email protected]> * Data augmentation for decoder training Signed-off-by: Tuan Lai <[email protected]> * Config change Signed-off-by: Tuan Lai <[email protected]> * add blossom-ci.yml (NVIDIA#2401) Signed-off-by: ericharper <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * Merge r1.1 bugfixes into main (NVIDIA#2407) * Update notebook branch and Jenkinsfile for 1.1.0 testing (NVIDIA#2378) * update branch Signed-off-by: ericharper <[email protected]> * update jenkinsfile Signed-off-by: ericharper <[email protected]> * [BUGFIX] NMT Multi-node was incorrectly computing num_replicas (NVIDIA#2380) * fix property when not using model parallel Signed-off-by: ericharper <[email protected]> * fix property when not using model parallel Signed-off-by: ericharper <[email protected]> * add debug statement Signed-off-by: ericharper <[email protected]> * add debug statement Signed-off-by: ericharper <[email protected]> * instantiate with NLPDDPPlugin with num_nodes from trainer config Signed-off-by: ericharper <[email protected]> * Update ASR scripts 
for tokenizer building and tarred dataset building (NVIDIA#2381) * Update ASR scripts for tokenizer building and tarred dataset building Signed-off-by: smajumdar <[email protected]> * Update container Signed-off-by: smajumdar <[email protected]> * Add STT Zh Citrinet 1024 Gamma 0.25 model Signed-off-by: smajumdar <[email protected]> * Update notebook (NVIDIA#2391) Signed-off-by: smajumdar <[email protected]> * ASR Notebooks fix for 1.1.0 (NVIDIA#2395) * nb fix for spring clean Signed-off-by: fayejf <[email protected]> * remove outdated instruction Signed-off-by: fayejf <[email protected]> * Mean normalization (NVIDIA#2397) * norm embeddings Signed-off-by: nithinraok <[email protected]> * move to utils Signed-off-by: nithinraok <[email protected]> * Bugfix adaptive spec augment time masking (NVIDIA#2398) * bugfix adaptive spec augment Signed-off-by: smajumdar <[email protected]> * Revert freq mask guard Signed-off-by: smajumdar <[email protected]> * Revert freq mask guard Signed-off-by: smajumdar <[email protected]> * Remove static time width clamping Signed-off-by: smajumdar <[email protected]> * Correct typos and issues with notebooks (NVIDIA#2402) * Fix Primer notebook Signed-off-by: smajumdar <[email protected]> * Typo Signed-off-by: smajumdar <[email protected]> * remove accelerator=DDP in tutorial notebooks to avoid errors. 
(NVIDIA#2403) Signed-off-by: Hoo Chang Shin <[email protected]> Co-authored-by: Hoo Chang Shin <[email protected]> * style Signed-off-by: ericharper <[email protected]> * update jenkins branch Signed-off-by: ericharper <[email protected]> * update notebook branch to main Signed-off-by: ericharper <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: fayejf <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: khcs <[email protected]> Co-authored-by: Hoo Chang Shin <[email protected]> Signed-off-by: Tuan Lai <[email protected]> * Remove unused imports Signed-off-by: Tuan Lai <[email protected]> * Add initial doc for text_normalization Signed-off-by: Tuan Lai <[email protected]> * Fixed imports warnings Signed-off-by: Tuan Lai <[email protected]> * Minor Fix Signed-off-by: Tuan Lai <[email protected]> * Renamed Signed-off-by: Tuan Lai <[email protected]> * Allowed duplex modes Signed-off-by: Tuan Lai <[email protected]> * Minor Fix Signed-off-by: Tuan Lai <[email protected]> * Add docs for duplex_text_normalization_train and duplex_text_normalization_test Signed-off-by: Tuan Lai <[email protected]> * docstrings for model codes + minor fix Signed-off-by: Tuan Lai <[email protected]> * Add more comments and doc strings Signed-off-by: Tuan Lai <[email protected]> * Add doc for datasets + Use time.perf_counter() Signed-off-by: Tuan Lai <[email protected]> * Add code for preprocessing Google TN data Signed-off-by: Tuan Lai <[email protected]> * Add more docs and comments + Minor Fixes Signed-off-by: Tuan Lai <[email protected]> * Add more licenses + Fixed comments + Minors Signed-off-by: Tuan Lai <[email protected]> * Moved evaluation logic to DuplexTextNormalizationModel Signed-off-by: Tuan Lai <[email protected]> * Add logging errors Signed-off-by: Tuan Lai <[email protected]> * Updated validation code of tagger + Minors Signed-off-by: Tuan Lai <[email protected]> * Also write tag preds to log file 
Signed-off-by: Tuan Lai <[email protected]> * Add data augmentation for tagger dataset Signed-off-by: Tuan Lai <[email protected]> * Added experimental decorators Signed-off-by: Tuan Lai <[email protected]> * Updated docs Signed-off-by: Tuan Lai <[email protected]> * Updated duplex_tn_config.yaml Signed-off-by: Tuan Lai <[email protected]> * Compute token precision of tagger using NeMo metrics Signed-off-by: Tuan Lai <[email protected]> * Fixed saving issue when using ddp accelerator Signed-off-by: Tuan Lai <[email protected]> * Refactoring Signed-off-by: Tuan Lai <[email protected]> * Add option to keep punctuations in TextNormalizationTestDataset Signed-off-by: Tuan Lai <[email protected]> * Changes to input preprocessing + decoder's postprocessing Signed-off-by: Tuan Lai <[email protected]> * Fixed styles + Add references Signed-off-by: Tuan Lai <[email protected]> * Renamed examples/nlp/duplex_text_normalization/utils.py to helpers.py Signed-off-by: Tuan Lai <[email protected]> Co-authored-by: Jagadeesh Balam <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Co-authored-by: Oktai Tatanov <[email protected]> Co-authored-by: Jason <[email protected]> Co-authored-by: Mike Chrzanowski <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: mchrzanowski <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Boris Fomitchev <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: root <[email protected]> Co-authored-by: fayejf <[email protected]> Co-authored-by: Nithin Rao <[email protected]> Co-authored-by: khcs <[email protected]> 
Co-authored-by: Hoo Chang Shin <[email protected]> Signed-off-by: Paarth Neekhara <[email protected]>
* upper bound for webdataset Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Correct Dockerfile Signed-off-by: smajumdar <[email protected]> Signed-off-by: Micha Livne <[email protected]> * update readmes Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Micha Livne <[email protected]> * update README (NVIDIA#2332) Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Micha Livne <[email protected]> * ddp translate GPU allocation fix (NVIDIA#2312) * fixed branch in IR tutorial Signed-off-by: AlexGrinch <[email protected]> * ddp translate GPU allocation fix Signed-off-by: AlexGrinch <[email protected]> * map_location instead of set_device Signed-off-by: AlexGrinch <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Shallow fusion (NVIDIA#2315) * fixed branch in IR tutorial Signed-off-by: AlexGrinch <[email protected]> * shallow fusion init commit Signed-off-by: AlexGrinch <[email protected]> * debug info removed Signed-off-by: AlexGrinch <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: Micha Livne <[email protected]> * [BUGFIX] Add upper bound to hydra for 1.0.x (NVIDIA#2337) * upper bound hydra Signed-off-by: ericharper <[email protected]> * upper bound hydra Signed-off-by: ericharper <[email protected]> Signed-off-by: Micha Livne <[email protected]> * update version number Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Micha Livne <[email protected]> * update package version Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Micha Livne <[email protected]> * sparrowhawk tests + punctuation post processing for pynini TN (NVIDIA#2320) * add jenkins test, refactoring Signed-off-by: ekmb <[email protected]> * update test 
Signed-off-by: ekmb <[email protected]> * fix new test Signed-off-by: ekmb <[email protected]> * add serial to the default normalizer, add tests Signed-off-by: ekmb <[email protected]> * manifest test added Signed-off-by: ekmb <[email protected]> * expose more params, new test cases Signed-off-by: ekmb <[email protected]> * fix jenkins, serial clean, exclude range from cardinal Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * addressed review comments Signed-off-by: ekmb <[email protected]> * fix decimal in measure Signed-off-by: ekmb <[email protected]> * move serial in cardinal Signed-off-by: ekmb <[email protected]> * sh tests init Signed-off-by: ekmb <[email protected]> * sparrowhawk container tests support added Signed-off-by: ekmb <[email protected]> * add post process to normalize.py, update tests Signed-off-by: ekmb <[email protected]> * remove duplication Signed-off-by: ekmb <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Update notebooks to 1.0.2 release (NVIDIA#2338) Signed-off-by: smajumdar <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Update ranges for omegaconf and hydra (NVIDIA#2336) * Update ranges Signed-off-by: smajumdar <[email protected]> * Updates for Hydra and OmegaConf updates Signed-off-by: smajumdar <[email protected]> * Style fixes Signed-off-by: smajumdar <[email protected]> * Correct tests and revert patch for model utils Signed-off-by: smajumdar <[email protected]> * Correct docstring Signed-off-by: smajumdar <[email protected]> * Revert unnecessary change Signed-off-by: smajumdar <[email protected]> * Revert unnecessary change Signed-off-by: smajumdar <[email protected]> * Guard scheduler for None Signed-off-by: smajumdar <[email protected]> * default to 0.0 if 
bpe_dropout is None Signed-off-by: ericharper <[email protected]> * Correctly log class that was restored Signed-off-by: smajumdar <[email protected]> * Root patch *bpe_dropout Signed-off-by: smajumdar <[email protected]> Co-authored-by: ericharper <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Update FastPitch Export (NVIDIA#2355) Signed-off-by: Jason <[email protected]> Signed-off-by: Micha Livne <[email protected]> * update out_dir to not collide (NVIDIA#2358) Signed-off-by: ericharper <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Update container version to 21.05 (NVIDIA#2309) * Update container version Signed-off-by: smajumdar <[email protected]> * Temporarily change export format of waveglow Signed-off-by: smajumdar <[email protected]> * Add conda update for numba Signed-off-by: smajumdar <[email protected]> * Update numba compat via global flag for strictness level `--relax_numba_compat`, remove pytorchlightning.metrics, refactor out numba utils to core, update tests Signed-off-by: smajumdar <[email protected]> * Correct order of numba minimum verion, remove wrong flag from test Signed-off-by: smajumdar <[email protected]> * Double test of cuda numba Signed-off-by: smajumdar <[email protected]> * Double test of cuda numba Signed-off-by: smajumdar <[email protected]> * Enable RNNT tests Signed-off-by: smajumdar <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Text Normalization Update (NVIDIA#2356) * upper cased date support Signed-off-by: ekmb <[email protected]> * update whitelist, change roman weights Signed-off-by: ekmb <[email protected]> * docstrings, space fix, init file Signed-off-by: ekmb <[email protected]> * lgtm Signed-off-by: ekmb <[email protected]> * fraction with measure class Signed-off-by: ekmb <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Add ASR CTC tutorial on fine-tuning on another language (NVIDIA#2346) * Add ASR CTC Language finetuning notebook 
Signed-off-by: smajumdar <[email protected]> * Add to documentation Signed-off-by: smajumdar <[email protected]> * Improve documentation Signed-off-by: smajumdar <[email protected]> * Correct name of the dataset Signed-off-by: smajumdar <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Correct colab link to notebook (NVIDIA#2366) Signed-off-by: smajumdar <[email protected]> Signed-off-by: Micha Livne <[email protected]> * sgdqa update data directories for testing (NVIDIA#2323) * sgdqa update data directories for testing Signed-off-by: Yang Zhang <[email protected]> * fix syntax Signed-off-by: Yang Zhang <[email protected]> * check if data dir exists Signed-off-by: Yang Zhang <[email protected]> * fix Signed-off-by: Yang Zhang <[email protected]> * adding pretrained model Signed-off-by: Yang Zhang <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Added documentation for export() (NVIDIA#2330) * Added export document Signed-off-by: Boris Fomitchev <[email protected]> * Addressed review comments Signed-off-by: Boris Fomitchev <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Update Citrinet model card info (NVIDIA#2369) * Update model card info Signed-off-by: smajumdar <[email protected]> * Cleanup Docs Signed-off-by: smajumdar <[email protected]> Signed-off-by: Micha Livne <[email protected]> * [NMT] Model Parallel Megatron Encoders (NVIDIA#2238) * add megatron encoder Signed-off-by: ericharper <[email protected]> * added megatron to get_nmt_tokenizer Signed-off-by: ericharper <[email protected]> * add vocab_size and hidden_size to megatron bert Signed-off-by: ericharper <[email protected]> * add megatron encoder module Signed-off-by: ericharper <[email protected]> * fixed horrible typo Signed-off-by: ericharper <[email protected]> * fix typo and add default Signed-off-by: ericharper <[email protected]> * updating nlp overrides for mp nmt Signed-off-by: ericharper 
<[email protected]> * move some logic back to nlpmodel from overrides Signed-off-by: ericharper <[email protected]> * add checkpoint_file property Signed-off-by: ericharper <[email protected]> * fix property Signed-off-by: ericharper <[email protected]> * num_tokentypes=0 Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * find_unused_parameters=True Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * style Signed-off-by: ericharper <[email protected]> * get instead of pop Signed-off-by: ericharper <[email protected]> * remove token type ids from megatron input example Signed-off-by: ericharper <[email protected]> * pop vocab_size Signed-off-by: ericharper <[email protected]> * fix checkpointing for model parallel Signed-off-by: ericharper <[email protected]> * fix bug in non model parallel Signed-off-by: ericharper <[email protected]> * convert cfg.trainer to dict Signed-off-by: ericharper <[email protected]> * make num_tokentypes configurable for nmt Signed-off-by: ericharper <[email protected]> * update checkpoint_file when using named megatron model in nemo Signed-off-by: ericharper <[email protected]> * make vocab_file configurable Signed-off-by: ericharper <[email protected]> * dataclass can't have mutable default Signed-off-by: ericharper <[email protected]> * style Signed-off-by: ericharper <[email protected]> * unused imports Signed-off-by: ericharper <[email protected]> * revert input example Signed-off-by: ericharper <[email protected]> * check that checkpoint version is not None Signed-off-by: ericharper <[email protected]> * add mp jenkins test Signed-off-by: ericharper <[email protected]> * update docstring Signed-off-by: ericharper <[email protected]> * add docs for pretrained encoders with nemo nmt Signed-off-by: ericharper <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Add 
notebook with recommendations for 8 kHz speech (NVIDIA#2326)
* Added a notebook with best practices for telephony speech
* Added datasets details
* Added training recommendations
* Emptied out cells with results
* Added tutorial to docs Signed-off-by: jbalam <[email protected]>
* Addressed review comments Signed-off-by: jbalam <[email protected]>
* Added a line to note original sampling rate of an4 Signed-off-by: jbalam <[email protected]>
* Made changes suggested in review Signed-off-by: jbalam <[email protected]> Signed-off-by: Micha Livne <[email protected]>
* 1. Working on bottleneck transformers. Signed-off-by: Micha Livne <[email protected]>
* 1. Working on bottleneck transformers.
* 1. Done cleaning code of bottleneck transformers. 2. Ready to test. Signed-off-by: Micha Livne <[email protected]>
* 1. Done cleaning code of bottleneck transformers. 2. Ready to test.
* 1. Working on training script. Signed-off-by: Micha Livne <[email protected]>
* 1. Working on training script.
* 1. Updated config class name. Signed-off-by: Micha Livne <[email protected]>
* 1. Updated config class name.
* 1. Training script is ready to be tested. Signed-off-by: Micha Livne <[email protected]>
* 1. Training script is ready to be tested.
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging.
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging.
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging.
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging.
* Add FastEmit support for RNNT Losses (NVIDIA#2374)
* Temp commit Signed-off-by: smajumdar <[email protected]>
* Initial code for fastemit forward pass Signed-off-by: smajumdar <[email protected]>
* Correct return reg value Signed-off-by: smajumdar <[email protected]>
* Initial cpu impl Signed-off-by: smajumdar <[email protected]>
* Try gpu impl Signed-off-by: smajumdar <[email protected]>
* Try gpu impl Signed-off-by: smajumdar <[email protected]>
* Correct few impl Signed-off-by: smajumdar <[email protected]>
* Update fastemit scaling Signed-off-by: smajumdar <[email protected]>
* Cleanup fastemit Signed-off-by: smajumdar <[email protected]>
* Finalize FastEmit regularization PR Signed-off-by: smajumdar <[email protected]>
* Refactor code to support fastemit regularization Signed-off-by: smajumdar <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging.
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging.
* 1. Fixed bugs. Signed-off-by: Micha Livne <[email protected]>
* 1. Fixed bugs.
* 1. Fixed missing import. Signed-off-by: Micha Livne <[email protected]>
* 1. Fixed missing import.
* 1. Fixed support in seq2seq-br. Signed-off-by: Micha Livne <[email protected]>
* 1. Fixed support in seq2seq-br.
* 1. Added NLPDDPPlugin. Signed-off-by: Micha Livne <[email protected]>
* 1. Added NLPDDPPlugin.
* fix bugs in hifigan code (NVIDIA#2392) Signed-off-by: Oktai Tatanov <[email protected]> Signed-off-by: Micha Livne <[email protected]>
* Update setup.py (NVIDIA#2394) Signed-off-by: Jason <[email protected]> Signed-off-by: Micha Livne <[email protected]>
* 1. Updated to support multi-node training. Signed-off-by: Micha Livne <[email protected]>
* 1. Added comments. Signed-off-by: Micha Livne <[email protected]>
* 1. MTBottleneckModel is in its own file mt_enc_dec_bottleneck_model.
Signed-off-by: Micha Livne <[email protected]>
* 1. Switched loss annealing to rely on self.trainer.global_step Signed-off-by: Micha Livne <[email protected]>
* 1. Added comments regarding the use of return_ortho_loss. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Added detailed logging of loss during training (still need to do the same for eval). Signed-off-by: Micha Livne <[email protected]>
* 1. Testing a fix to import bug. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging wrong import issue. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Added logging of results to validation step (not tested yet). Signed-off-by: Micha Livne <[email protected]>
* 1. Fixed missing import. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Testing failing imports. Signed-off-by: Micha Livne <[email protected]>
* 1. Disabling changes. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Enabled bottleneck architecture. Signed-off-by: Micha Livne <[email protected]>
* 1. Fixed indentation. Signed-off-by: Micha Livne <[email protected]>
* 1. Fixed import statement. Signed-off-by: Micha Livne <[email protected]>
* 1. Fixed typo. Signed-off-by: Micha Livne <[email protected]>
* 1. Fixed logging of arbitrary values.
Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Fixed torch lightning logging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Added a missing import. Signed-off-by: Micha Livne <[email protected]>
* 1. Added NLPDDPPlugin. Signed-off-by: Micha Livne <[email protected]>
* 1. Cleaned style. Signed-off-by: Micha Livne <[email protected]>
* 1. Updated sign of computed loss. Signed-off-by: Micha Livne <[email protected]>
* 1. Fixed double import. Signed-off-by: Micha Livne <[email protected]>
* 1. Moved logging of additional loss terms into MTBottleneckModel class. Signed-off-by: Micha Livne <[email protected]>
* 1. Updated permissions. Signed-off-by: Micha Livne <[email protected]>
* 1. Added initial perceiver package. Signed-off-by: Micha Livne <[email protected]>
* 1. Working on encoder. Signed-off-by: Micha Livne <[email protected]>
* 1. Testing perceiver. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Finished implementing Perceiver. Signed-off-by: Micha Livne <[email protected]>
* 1. Updated default arch. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging. Signed-off-by: Micha Livne <[email protected]>
* 1. Debugging.
Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Ignoring independant perceiver implementation. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Added latent transformer to perceiver Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Added TransformerBottleneckDecoderNM. Signed-off-by: Micha Livne <[email protected]> * 1. Added TransformerBottleneckEncoderNM. Signed-off-by: Micha Livne <[email protected]> * 1. Updated bottleneck perceiver. Signed-off-by: Micha Livne <[email protected]> * 1. Updated MTBottleneckModel. Signed-off-by: Micha Livne <[email protected]> * 1. Added BridgeEncoder. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Cleaned code. 
Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Updated architecture name. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Added support in bridge encoder. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Added support in hidden_init_method to BridgeEncoder. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. 
Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Removed unneeded imports. Signed-off-by: Micha Livne <[email protected]> * 1. Updated comment in YAML Signed-off-by: Micha Livne <[email protected]> * 1. Fixed style. Signed-off-by: Micha Livne <[email protected]> * 1. Updated YAML comments. 2. hidden_blocks in bridge relates to post-processing after bridge (instead of hidden_blocks-1). Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Initial cross attention in Perceiver with params init has independent parameters. Signed-off-by: Micha Livne <[email protected]> * 1. Updated Perceiver forward. Signed-off-by: Micha Livne <[email protected]> * 1. Updated TransformerEncoder to be a component as opposed to a parent class. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Updated example command. Signed-off-by: Micha Livne <[email protected]> * 1. forward method in MTBottleneckModel does not compute loss. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed style. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging.
Signed-off-by: Micha Livne <[email protected]> * 1. Added label smoothing for per-sample loss. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed style. Signed-off-by: Micha Livne <[email protected]> * 1. Updated recon_only loss to nll. Signed-off-by: Micha Livne <[email protected]> * 1. Update yaml doc. Signed-off-by: Micha Livne <[email protected]> * 1. Updated default config to have 32 hidden steps. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Updated doc. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed type. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed unreachable code bug. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed wrong sign for reconstruction per sample (instead of per token). Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed style. Signed-off-by: Micha Livne <[email protected]> * 1. Updated comments. 
Signed-off-by: Micha Livne <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Jason <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Boris Fomitchev <[email protected]> Co-authored-by: Jagadeesh Balam <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Co-authored-by: Oktai Tatanov <[email protected]> Signed-off-by: Paarth Neekhara <[email protected]>
* upper bound for webdataset Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Correct Dockerfile Signed-off-by: smajumdar <[email protected]> Signed-off-by: Micha Livne <[email protected]> * update readmes Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Micha Livne <[email protected]> * update README (NVIDIA#2332) Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Micha Livne <[email protected]> * ddp translate GPU allocation fix (NVIDIA#2312) * fixed branch in IR tutorial Signed-off-by: AlexGrinch <[email protected]> * ddp translate GPU allocation fix Signed-off-by: AlexGrinch <[email protected]> * map_location instead of set_device Signed-off-by: AlexGrinch <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Shallow fusion (NVIDIA#2315) * fixed branch in IR tutorial Signed-off-by: AlexGrinch <[email protected]> * shallow fusion init commit Signed-off-by: AlexGrinch <[email protected]> * debug info removed Signed-off-by: AlexGrinch <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Signed-off-by: Micha Livne <[email protected]> * [BUGFIX] Add upper bound to hydra for 1.0.x (NVIDIA#2337) * upper bound hydra Signed-off-by: ericharper <[email protected]> * upper bound hydra Signed-off-by: ericharper <[email protected]> Signed-off-by: Micha Livne <[email protected]> * update version number Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Micha Livne <[email protected]> * update package version Signed-off-by: Oleksii Kuchaiev <[email protected]> Signed-off-by: Micha Livne <[email protected]> * sparrowhawk tests + punctuation post processing for pynini TN (NVIDIA#2320) * add jenkins test, refactoring Signed-off-by: ekmb <[email protected]> * update test 
Signed-off-by: ekmb <[email protected]> * fix new test Signed-off-by: ekmb <[email protected]> * add serial to the default normalizer, add tests Signed-off-by: ekmb <[email protected]> * manifest test added Signed-off-by: ekmb <[email protected]> * expose more params, new test cases Signed-off-by: ekmb <[email protected]> * fix jenkins, serial clean, exclude range from cardinal Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * jenkins Signed-off-by: ekmb <[email protected]> * jenkins dollar sign format Signed-off-by: ekmb <[email protected]> * addressed review comments Signed-off-by: ekmb <[email protected]> * fix decimal in measure Signed-off-by: ekmb <[email protected]> * move serial in cardinal Signed-off-by: ekmb <[email protected]> * sh tests init Signed-off-by: ekmb <[email protected]> * sparrowhawk container tests support added Signed-off-by: ekmb <[email protected]> * add post process to normalize.py, update tests Signed-off-by: ekmb <[email protected]> * remove duplication Signed-off-by: ekmb <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Update notebooks to 1.0.2 release (NVIDIA#2338) Signed-off-by: smajumdar <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Update ranges for omegaconf and hydra (NVIDIA#2336) * Update ranges Signed-off-by: smajumdar <[email protected]> * Updates for Hydra and OmegaConf updates Signed-off-by: smajumdar <[email protected]> * Style fixes Signed-off-by: smajumdar <[email protected]> * Correct tests and revert patch for model utils Signed-off-by: smajumdar <[email protected]> * Correct docstring Signed-off-by: smajumdar <[email protected]> * Revert unnecessary change Signed-off-by: smajumdar <[email protected]> * Revert unnecessary change Signed-off-by: smajumdar <[email protected]> * Guard scheduler for None Signed-off-by: smajumdar <[email protected]> * default to 0.0 if 
bpe_dropout is None Signed-off-by: ericharper <[email protected]> * Correctly log class that was restored Signed-off-by: smajumdar <[email protected]> * Root patch *bpe_dropout Signed-off-by: smajumdar <[email protected]> Co-authored-by: ericharper <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Update FastPitch Export (NVIDIA#2355) Signed-off-by: Jason <[email protected]> Signed-off-by: Micha Livne <[email protected]> * update out_dir to not collide (NVIDIA#2358) Signed-off-by: ericharper <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Update container version to 21.05 (NVIDIA#2309) * Update container version Signed-off-by: smajumdar <[email protected]> * Temporarily change export format of waveglow Signed-off-by: smajumdar <[email protected]> * Add conda update for numba Signed-off-by: smajumdar <[email protected]> * Update numba compat via global flag for strictness level `--relax_numba_compat`, remove pytorchlightning.metrics, refactor out numba utils to core, update tests Signed-off-by: smajumdar <[email protected]> * Correct order of numba minimum verion, remove wrong flag from test Signed-off-by: smajumdar <[email protected]> * Double test of cuda numba Signed-off-by: smajumdar <[email protected]> * Double test of cuda numba Signed-off-by: smajumdar <[email protected]> * Enable RNNT tests Signed-off-by: smajumdar <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Text Normalization Update (NVIDIA#2356) * upper cased date support Signed-off-by: ekmb <[email protected]> * update whitelist, change roman weights Signed-off-by: ekmb <[email protected]> * docstrings, space fix, init file Signed-off-by: ekmb <[email protected]> * lgtm Signed-off-by: ekmb <[email protected]> * fraction with measure class Signed-off-by: ekmb <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Add ASR CTC tutorial on fine-tuning on another language (NVIDIA#2346) * Add ASR CTC Language finetuning notebook 
Signed-off-by: smajumdar <[email protected]> * Add to documentation Signed-off-by: smajumdar <[email protected]> * Improve documentation Signed-off-by: smajumdar <[email protected]> * Correct name of the dataset Signed-off-by: smajumdar <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Correct colab link to notebook (NVIDIA#2366) Signed-off-by: smajumdar <[email protected]> Signed-off-by: Micha Livne <[email protected]> * sgdqa update data directories for testing (NVIDIA#2323) * sgdqa update data directories for testing Signed-off-by: Yang Zhang <[email protected]> * fix syntax Signed-off-by: Yang Zhang <[email protected]> * check if data dir exists Signed-off-by: Yang Zhang <[email protected]> * fix Signed-off-by: Yang Zhang <[email protected]> * adding pretrained model Signed-off-by: Yang Zhang <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Added documentation for export() (NVIDIA#2330) * Added export document Signed-off-by: Boris Fomitchev <[email protected]> * Addressed review comments Signed-off-by: Boris Fomitchev <[email protected]> Co-authored-by: Eric Harper <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Update Citrinet model card info (NVIDIA#2369) * Update model card info Signed-off-by: smajumdar <[email protected]> * Cleanup Docs Signed-off-by: smajumdar <[email protected]> Signed-off-by: Micha Livne <[email protected]> * [NMT] Model Parallel Megatron Encoders (NVIDIA#2238) * add megatron encoder Signed-off-by: ericharper <[email protected]> * added megatron to get_nmt_tokenizer Signed-off-by: ericharper <[email protected]> * add vocab_size and hidden_size to megatron bert Signed-off-by: ericharper <[email protected]> * add megatron encoder module Signed-off-by: ericharper <[email protected]> * fixed horrible typo Signed-off-by: ericharper <[email protected]> * fix typo and add default Signed-off-by: ericharper <[email protected]> * updating nlp overrides for mp nmt Signed-off-by: ericharper 
<[email protected]> * move some logic back to nlpmodel from overrides Signed-off-by: ericharper <[email protected]> * add checkpoint_file property Signed-off-by: ericharper <[email protected]> * fix property Signed-off-by: ericharper <[email protected]> * num_tokentypes=0 Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * find_unused_parameters=True Signed-off-by: ericharper <[email protected]> * typo Signed-off-by: ericharper <[email protected]> * style Signed-off-by: ericharper <[email protected]> * get instead of pop Signed-off-by: ericharper <[email protected]> * remove token type ids from megatron input example Signed-off-by: ericharper <[email protected]> * pop vocab_size Signed-off-by: ericharper <[email protected]> * fix checkpointing for model parallel Signed-off-by: ericharper <[email protected]> * fix bug in non model parallel Signed-off-by: ericharper <[email protected]> * convert cfg.trainer to dict Signed-off-by: ericharper <[email protected]> * make num_tokentypes configurable for nmt Signed-off-by: ericharper <[email protected]> * update checkpoint_file when using named megatron model in nemo Signed-off-by: ericharper <[email protected]> * make vocab_file configurable Signed-off-by: ericharper <[email protected]> * dataclass can't have mutable default Signed-off-by: ericharper <[email protected]> * style Signed-off-by: ericharper <[email protected]> * unused imports Signed-off-by: ericharper <[email protected]> * revert input example Signed-off-by: ericharper <[email protected]> * check that checkpoint version is not None Signed-off-by: ericharper <[email protected]> * add mp jenkins test Signed-off-by: ericharper <[email protected]> * update docstring Signed-off-by: ericharper <[email protected]> * add docs for pretrained encoders with nemo nmt Signed-off-by: ericharper <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Add 
notebook with recommendations for 8 kHz speech (NVIDIA#2326) * Added a notebook with best practices for telephony speech * Added datasets detaiils * Added training recommendations * Emptied out cells with results * Added tutorial to docs Signed-off-by: jbalam <[email protected]> * Addressed review comments Signed-off-by: jbalam <[email protected]> * Added a line to note original sampling rate of an4 Signed-off-by: jbalam <[email protected]> * Made changes suggested in review Signed-off-by: jbalam <[email protected]> Signed-off-by: Micha Livne <[email protected]> * 1. Working on bottleneck transformers. Signed-off-by: Micha Livne <[email protected]> * 1. Working on bottleneck transformers. * 1. Done cleaning code of bottleneck transformers. 2. Ready to test. Signed-off-by: Micha Livne <[email protected]> * 1. Done cleaning code of bottleneck transformers. 2. Ready to test. * 1. Working on training script. Signed-off-by: Micha Livne <[email protected]> * 1. Working on training script. * 1. Updated config class name. Signed-off-by: Micha Livne <[email protected]> * 1. Updated config class name. * 1. Training script is ready to be tested. Signed-off-by: Micha Livne <[email protected]> * 1. Training script is ready to be tested. * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. 
* Add FastEmit support for RNNT Losses (NVIDIA#2374) * Temp commit Signed-off-by: smajumdar <[email protected]> * Initial code for fastemit forward pass Signed-off-by: smajumdar <[email protected]> * Correct return reg value Signed-off-by: smajumdar <[email protected]> * Initial cpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Try gpu impl Signed-off-by: smajumdar <[email protected]> * Correct few impl Signed-off-by: smajumdar <[email protected]> * Update fastemit scaling Signed-off-by: smajumdar <[email protected]> * Cleanup fastemit Signed-off-by: smajumdar <[email protected]> * Finalize FastEmit regularization PR Signed-off-by: smajumdar <[email protected]> * Refactor code to support fastemit regularization Signed-off-by: smajumdar <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. * 1. Debugging. Signed-off-by: Micha Livne <[email protected]> * 1. Debugging. * 1. Fixed bugs. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed bugs. * 1. Fixed missing import. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed missing import. * 1. Fixed support in seq2seq-br. Signed-off-by: Micha Livne <[email protected]> * 1. Fixed support in seq2seq-br. * 1. Added NLPDDPPlugin. Signed-off-by: Micha Livne <[email protected]> * 1. Added NLPDDPPlugin. * fix bugs in hifigan code (NVIDIA#2392) Signed-off-by: Oktai Tatanov <[email protected]> Signed-off-by: Micha Livne <[email protected]> * Update setup.py (NVIDIA#2394) Signed-off-by: Jason <[email protected]> Signed-off-by: Micha Livne <[email protected]> * 1. Updated to support multi-node training. Signed-off-by: Micha Livne <[email protected]> * 1. Added comments. Signed-off-by: Micha Livne <[email protected]> * 1. MTBottleneckModel is in its own file mt_enc_dec_bottleneck_model. 
Signed-off-by: Micha Livne <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Somshubra Majumdar <[email protected]> Co-authored-by: Oleksii Kuchaiev <[email protected]> Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <[email protected]> Co-authored-by: Sandeep Subramanian <[email protected]> Co-authored-by: Eric Harper <[email protected]> Co-authored-by: Evelina <[email protected]> Co-authored-by: Jason <[email protected]> Co-authored-by: Yang Zhang <[email protected]> Co-authored-by: Boris Fomitchev <[email protected]> Co-authored-by: Jagadeesh Balam <[email protected]> Co-authored-by: Micha Livne <[email protected]> Co-authored-by: Samuel Kriman <[email protected]> Co-authored-by: Oktai Tatanov <[email protected]>