[Fix] Hanging for Fully Randomized Bucketing (NVIDIA#4348)
* Update container to 22.05 (NVIDIA#4329)
* update container to 22.05
Signed-off-by: ericharper <[email protected]>
* try adding safe directory
Signed-off-by: ericharper <[email protected]>
* try env var
Signed-off-by: ericharper <[email protected]>
* printenv
Signed-off-by: ericharper <[email protected]>
* try GIT_BRANCH
Signed-off-by: ericharper <[email protected]>
* typo
Signed-off-by: ericharper <[email protected]>
* remove dbug statements
Signed-off-by: ericharper <[email protected]>
Signed-off-by: stevehuang52 <[email protected]>
* Merge r1.9.0 main (NVIDIA#4331)
* update branch
Signed-off-by: ericharper <[email protected]>
* update package info
Signed-off-by: ericharper <[email protected]>
* cleaned up TN/ ITN doc (NVIDIA#4119)
* cleaned up TN/ ITN doc
Signed-off-by: Yang Zhang <[email protected]>
* fix typo
Signed-off-by: Yang Zhang <[email protected]>
* fix image
Signed-off-by: Yang Zhang <[email protected]>
* fix image
Signed-off-by: Yang Zhang <[email protected]>
* Draft: Fix restoring from checkpoint for case when `model.common_dataset_parameters.label_vocab_dir` is provided (NVIDIA#4136)
* Fix restoring from checkpoint with label vocab dir
Signed-off-by: PeganovAnton <[email protected]>
* Add tests for various ways to pass label ids to model
Signed-off-by: PeganovAnton <[email protected]>
* Fix typo
Signed-off-by: PeganovAnton <[email protected]>
* Fix typo
Signed-off-by: PeganovAnton <[email protected]>
* Do not create tmp directory
Signed-off-by: PeganovAnton <[email protected]>
* Fix parameter name
Signed-off-by: PeganovAnton <[email protected]>
* finish cherry-pick op
Signed-off-by: PeganovAnton <[email protected]>
* Fix labels errors
Signed-off-by: PeganovAnton <[email protected]>
* Remove duplicate stage
Signed-off-by: PeganovAnton <[email protected]>
* Change target branch
Signed-off-by: PeganovAnton <[email protected]>
* fix doc (NVIDIA#4146)
Signed-off-by: Yang Zhang <[email protected]>
* Tacotron2 retrain (NVIDIA#4103)
* fix yaml
Signed-off-by: treacker <[email protected]>
* Fix for new TTSDataset class
Signed-off-by: treacker <[email protected]>
* added wandb logging
Signed-off-by: treacker <[email protected]>
* added wandb logging
Signed-off-by: treacker <[email protected]>
* fix numpy version
Signed-off-by: treacker <[email protected]>
* fix numpy version
Signed-off-by: treacker <[email protected]>
* inference fix
Signed-off-by: treacker <[email protected]>
* removed old code
Signed-off-by: treacker <[email protected]>
* updated parser logic
Signed-off-by: treacker <[email protected]>
* reverted version update
Signed-off-by: treacker <[email protected]>
* refactored parser logic
Signed-off-by: treacker <[email protected]>
* Updated Jenkinsfile
Signed-off-by: treacker <[email protected]>
* Refactored tutorial for Tacotron2
Signed-off-by: treacker <[email protected]>
* Made backward compatibility
Signed-off-by: treacker <[email protected]>
* Made backward compatibility
Signed-off-by: treacker <[email protected]>
* Update Jenkinsfile
Signed-off-by: treacker <[email protected]>
* Update tacotron.yaml
Signed-off-by: treacker <[email protected]>
* Refactoring
Signed-off-by: treacker <[email protected]>
* cleaned up TN/ ITN doc (NVIDIA#4119)
* cleaned up TN/ ITN doc
Signed-off-by: Yang Zhang <[email protected]>
* fix typo
Signed-off-by: Yang Zhang <[email protected]>
* fix image
Signed-off-by: Yang Zhang <[email protected]>
* fix image
Signed-off-by: Yang Zhang <[email protected]>
Signed-off-by: treacker <[email protected]>
* Check implicit grad acc in GLUE dataset building (NVIDIA#4123)
* Check implicit grad acc in GLUE dataset building
Signed-off-by: MaximumEntropy <[email protected]>
* Fix jenkins test for GLUE/XNLI
Signed-off-by: MaximumEntropy <[email protected]>
Signed-off-by: treacker <[email protected]>
* Refactoring
Signed-off-by: treacker <[email protected]>
* Refactoring
Signed-off-by: treacker <[email protected]>
* Fixed jenkins
Signed-off-by: treacker <[email protected]>
* Refactoring
Signed-off-by: treacker <[email protected]>
* Refactoring
Signed-off-by: treacker <[email protected]>
* Refactoring
Signed-off-by: treacker <[email protected]>
Co-authored-by: Yang Zhang <[email protected]>
Co-authored-by: Sandeep Subramanian <[email protected]>
* Multiprocess improvements (NVIDIA#4127)
* initial commit
Signed-off-by: nithinraok <[email protected]>
* start fix
Signed-off-by: nithinraok <[email protected]>
* improve multiprocessing speed while creating speaker dataset
Signed-off-by: nithinraok <[email protected]>
* updated scp to filelist
Signed-off-by: nithinraok <[email protected]>
* notebooks' link, typo and import fix (NVIDIA#4158)
* redo missing pr 4007
Signed-off-by: fayejf <[email protected]>
* remove extremely unreliable links
Signed-off-by: fayejf <[email protected]>
* update speaker docs (NVIDIA#4164)
* update speaker docs
Signed-off-by: nithinraok <[email protected]>
* chunks -> segments
Signed-off-by: nithinraok <[email protected]>
* Khz -> kHz
Signed-off-by: nithinraok <[email protected]>
* small fix (NVIDIA#4180)
Signed-off-by: fayejf <[email protected]>
* fix the server key value problem (NVIDIA#4196)
Signed-off-by: Yi Dong <[email protected]>
* Fix/punctuation/trainer required for setting test data (NVIDIA#4199)
* Draft of fix
Signed-off-by: PeganovAnton <[email protected]>
* Add warnings and replace globa_step with current_epoch
Signed-off-by: PeganovAnton <[email protected]>
* Small improvements to warnings
Signed-off-by: PeganovAnton <[email protected]>
* Error and warning messages improvements
Signed-off-by: PeganovAnton <[email protected]>
* Replace self.trainer with self._trainer
Signed-off-by: PeganovAnton <[email protected]>
* Update ContextNet version (NVIDIA#4207)
Signed-off-by: smajumdar <[email protected]>
* fix bugs for dialogue tutorial (NVIDIA#4211)
Signed-off-by: Zhilin Wang <[email protected]>
* Dialogue tutorial fix (NVIDIA#4214)
* fix bugs for dialogue tutorial
Signed-off-by: Zhilin Wang <[email protected]>
* update path for convert_datasets.py due to conflict PR
Signed-off-by: Zhilin Wang <[email protected]>
* Add docs for Thutmose Tagger (NVIDIA#4173)
* Add docs for Thutmose Tagger
Signed-off-by: Alexandra Antonova <[email protected]>
* add level in docs
Signed-off-by: Alexandra Antonova <[email protected]>
* delete folder to avoid error with running when folder exists from previous run
Signed-off-by: Alexandra Antonova <[email protected]>
Co-authored-by: Alexandra Antonova <[email protected]>
Co-authored-by: ekmb <[email protected]>
* Dialogue tutorial fix (NVIDIA#4218)
* fix bugs for dialogue tutorial
Signed-off-by: Zhilin Wang <[email protected]>
* update path for convert_datasets.py due to conflict PR
Signed-off-by: Zhilin Wang <[email protected]>
* restore previously deleted files
Signed-off-by: Zhilin Wang <[email protected]>
* style fix
Signed-off-by: Zhilin Wang <[email protected]>
* Dialogue tutorial fix (NVIDIA#4221)
* fix bugs for dialogue tutorial
Signed-off-by: Zhilin Wang <[email protected]>
* update path for convert_datasets.py due to conflict PR
Signed-off-by: Zhilin Wang <[email protected]>
* restore previously deleted files
Signed-off-by: Zhilin Wang <[email protected]>
* style fix
Signed-off-by: Zhilin Wang <[email protected]>
* update tutorial
Signed-off-by: Zhilin Wang <[email protected]>
* fix syntax error in ipynb-file (NVIDIA#4228)
Signed-off-by: Alexandra Antonova <[email protected]>
Co-authored-by: Alexandra Antonova <[email protected]>
* fix json serialize (NVIDIA#4235)
Signed-off-by: Yi Dong <[email protected]>
* Prompt Learning Typo Fixes (NVIDIA#4238)
* Prompt tuning notebook typo fixes
Signed-off-by: Virginia Adams <[email protected]>
* Update tutorials.rst
* Update prompt_learning.rst
* Update prompt_learning.rst
* fixing bug 3642622 (NVIDIA#4250)
* fixing bug 3642622
Signed-off-by: Ghasem Pasandi <[email protected]>
* fixing bug 3642622
Signed-off-by: Ghasem Pasandi <[email protected]>
Co-authored-by: Ghasem Pasandi <[email protected]>
* fix broken link in the tutorial (NVIDIA#4257)
Signed-off-by: Alexandra Antonova <[email protected]>
Co-authored-by: Alexandra Antonova <[email protected]>
* Typo fix, branch change, better download messagae (NVIDIA#4262)
Signed-off-by: Virginia Adams <[email protected]>
* Raise error if bicleaner is not installed in NMT Data preprocesing notebook (NVIDIA#4264)
* Raise error if bicleaner is not installed
Signed-off-by: MaximumEntropy <[email protected]>
* Clear cells
Signed-off-by: MaximumEntropy <[email protected]>
* Fix missing validation dataset, whitelist certain keywords for datasets (NVIDIA#4269)
* Fix missing validation dataset, whitelist certain keywords for datasets
Signed-off-by: smajumdar <[email protected]>
* Fix missing validation dataset, whitelist certain keywords for datasets
Signed-off-by: smajumdar <[email protected]>
* Update asr configs with num_workers and pin_memory (NVIDIA#4270)
Signed-off-by: smajumdar <[email protected]>
* Fix epoch end (NVIDIA#4265)
Signed-off-by: MaximumEntropy <[email protected]>
Co-authored-by: Eric Harper <[email protected]>
* Set Save on train end to false (NVIDIA#4274)
* Set Save on train end to false
Signed-off-by: Virginia Adams <[email protected]>
* Update prompt_learning.rst
* Update prompt_learning.rst
* Update YAML (NVIDIA#4261)
Signed-off-by: MaximumEntropy <[email protected]>
* Updated config to fix CI test OOM error (NVIDIA#4279)
* Updated config to fix CI test issue
Signed-off-by: Virginia Adams <[email protected]>
* Increased num workers
Signed-off-by: Virginia Adams <[email protected]>
* verbose k2 install, skip if failed (NVIDIA#4289)
Signed-off-by: Aleksandr Laptev <[email protected]>
Co-authored-by: Aleksandr Laptev <[email protected]>
* Changed total virtual prompt tokens (NVIDIA#4295)
* Changed total virtual prompt tokens
Signed-off-by: Virginia Adams <[email protected]>
* put number of workers back
Signed-off-by: Virginia Adams <[email protected]>
* upper bound lightning
Signed-off-by: ericharper <[email protected]>
* update branch
Signed-off-by: ericharper <[email protected]>
* update config
Signed-off-by: ericharper <[email protected]>
* remove duplicate test
Signed-off-by: ericharper <[email protected]>
* fix tn test cases
Signed-off-by: ericharper <[email protected]>
* add another safe.directory
Signed-off-by: ericharper <[email protected]>
* typo
Signed-off-by: ericharper <[email protected]>
Co-authored-by: Yang Zhang <[email protected]>
Co-authored-by: PeganovAnton <[email protected]>
Co-authored-by: treacker <[email protected]>
Co-authored-by: Sandeep Subramanian <[email protected]>
Co-authored-by: Nithin Rao <[email protected]>
Co-authored-by: fayejf <[email protected]>
Co-authored-by: Yi Dong <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Co-authored-by: Zhilin Wang <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Alexandra Antonova <[email protected]>
Co-authored-by: ekmb <[email protected]>
Co-authored-by: Virginia Adams <[email protected]>
Co-authored-by: Ghasem <[email protected]>
Co-authored-by: Ghasem Pasandi <[email protected]>
Co-authored-by: Aleksandr Laptev <[email protected]>
Co-authored-by: Aleksandr Laptev <[email protected]>
Signed-off-by: stevehuang52 <[email protected]>
* fix full_randn bucket hang
Signed-off-by: stevehuang52 <[email protected]>
* remove unused variables
Signed-off-by: stevehuang52 <[email protected]>
Co-authored-by: Eric Harper <[email protected]>
Co-authored-by: Yang Zhang <[email protected]>
Co-authored-by: PeganovAnton <[email protected]>
Co-authored-by: treacker <[email protected]>
Co-authored-by: Sandeep Subramanian <[email protected]>
Co-authored-by: Nithin Rao <[email protected]>
Co-authored-by: fayejf <[email protected]>
Co-authored-by: Yi Dong <[email protected]>
Co-authored-by: Somshubra Majumdar <[email protected]>
Co-authored-by: Zhilin Wang <[email protected]>
Co-authored-by: bene-ges <[email protected]>
Co-authored-by: Alexandra Antonova <[email protected]>
Co-authored-by: ekmb <[email protected]>
Co-authored-by: Virginia Adams <[email protected]>
Co-authored-by: Ghasem <[email protected]>
Co-authored-by: Ghasem Pasandi <[email protected]>
Co-authored-by: Aleksandr Laptev <[email protected]>
Co-authored-by: Aleksandr Laptev <[email protected]>