forked from NVIDIA/NeMo
COG Datalayer #2
Open: aasseman wants to merge 70 commits into tkornuta-nvidia:master from aasseman:feat/cog
update repo
…VIDIA#693)
* update sgd numbers after fix in seen services and slot request loss Signed-off-by: Yang Zhang <[email protected]>
* fix table Signed-off-by: Yang Zhang <[email protected]>
* add more information to documentation Signed-off-by: Yang Zhang <[email protected]>
* fix doc Signed-off-by: Yang Zhang <[email protected]>
* fix doc Signed-off-by: Yang Zhang <[email protected]>
* fix doc Signed-off-by: Yang Zhang <[email protected]>
* Added user sys tag to TRADE. Signed-off-by: Vahid Noroozi <[email protected]>
…#695)
* megatron glue numbers added, default amp level reverted to O0 Signed-off-by: Evelina Bakhturina <[email protected]>
* table reformatted Signed-off-by: Evelina Bakhturina <[email protected]>
Signed-off-by: Yang Zhang <[email protected]>
editdistance package for fast WER calculation
Signed-off-by: Oleksii Kuchaiev <[email protected]>
…IA#673)
* add VAD Signed-off-by: fayejf <[email protected]>
* update with PR comments Signed-off-by: fayejf <[email protected]>
* revert change on asr notebook 4&5 Signed-off-by: fayejf <[email protected]>
* update with PR comments Signed-off-by: fayejf <[email protected]>
* fix typos Signed-off-by: fayejf <[email protected]>
* upload docs and resolve parts of PR comments Signed-off-by: fayejf <[email protected]>
* fix doc bib issue Signed-off-by: fayejf <[email protected]>
* fix jenksin doc issue Signed-off-by: fayejf <[email protected]>
* fix some warning/typo Signed-off-by: fayejf <[email protected]>
* update notebook#6, improve data process scripts, and some fix Signed-off-by: fayejf <[email protected]>
* some minor changes Signed-off-by: fayejf <[email protected]>
* fix bib issue Signed-off-by: fayejf <[email protected]>
* little fix to avoid misunderstanding Signed-off-by: fayejf <[email protected]>
* update an4 notebook Signed-off-by: Jason <[email protected]>
* colab bugfix Signed-off-by: Jason <[email protected]>
* update script Signed-off-by: Jason <[email protected]>
* fix notebooks Signed-off-by: Jason <[email protected]>
* fix notebooks Signed-off-by: Jason <[email protected]>
* Durations extraction with script draft. Signed-off-by: Stanislav Beliaev <[email protected]>
* Durations extraction notebooks. Signed-off-by: Stanislav Beliaev <[email protected]>
* Finished bulk part of durations predictor. Signed-off-by: Stanislav Beliaev <[email protected]>
* Add tensorboard logging. Signed-off-by: Stanislav Beliaev <[email protected]>
* Add general-style train logger. Signed-off-by: Stanislav Beliaev <[email protected]>
* Change LibriSpeech parts order and move train logger callback to core. Signed-off-by: Stanislav Beliaev <[email protected]>
* Add one big file durs saving. Signed-off-by: Stanislav Beliaev <[email protected]>
* Big batch params change. Signed-off-by: Stanislav Beliaev <[email protected]>
* Add full pad option to data loader as default. Signed-off-by: Stanislav Beliaev <[email protected]>
* Complete durs pipeline with evaluation. Signed-off-by: Stanislav Beliaev <[email protected]>
* Rename durs ngc script. Signed-off-by: Stanislav Beliaev <[email protected]>
* Adjust duration main script default for ngc run. Signed-off-by: Stanislav Beliaev <[email protected]>
* Fix problem with torch.bool dist eval. Signed-off-by: Stanislav Beliaev <[email protected]>
* Add LibriTTS processing. Signed-off-by: Stanislav Beliaev <[email protected]>
* Add FasterSpeech full pipeline reaching about 0.4 MSE for LibriTTS. Signed-off-by: Stanislav Beliaev <[email protected]>
* Add QN retrain NGC pipeline, new dur XE steps loss and mel Griffin-Lim sampling. Signed-off-by: Stanislav Beliaev <[email protected]>
* Add train logging for mel with audio sampling and super sampler. Signed-off-by: Stanislav Beliaev <[email protected]>
* Add length sampler. Signed-off-by: Stanislav Beliaev <[email protected]>
* Set SSS as default and introduce local shuffling. Signed-off-by: Stanislav Beliaev <[email protected]>
* New defaults. Signed-off-by: Stanislav Beliaev <[email protected]>
* W&B Support, new speaker system and some refactoring Signed-off-by: Stanislav Beliaev <[email protected]>
* Add simple durs aug. Signed-off-by: Stanislav Beliaev <[email protected]>
* New baseline (1) Signed-off-by: Stanislav Beliaev <[email protected]>
* Fix trim bug and make default O2. Signed-off-by: Stanislav Beliaev <[email protected]>
* Add pad16. Signed-off-by: Stanislav Beliaev <[email protected]>
* Fix dist eval error and add variable steps. Signed-off-by: Stanislav Beliaev <[email protected]>
* Add WaveGlow inference and fix pad16 bug. Signed-off-by: Stanislav Beliaev <[email protected]>
* Generalize mel loss. Signed-off-by: Stanislav Beliaev <[email protected]>
* Generalize pad op. Signed-off-by: Stanislav Beliaev <[email protected]>
* Move pad16 logic to loss. Signed-off-by: Stanislav Beliaev <[email protected]>
* Add fmin/fmax to griffin-lim vocoding. Signed-off-by: Stanislav Beliaev <[email protected]>
* Move model params to config. Signed-off-by: Stanislav Beliaev <[email protected]>
* Add denoiser argument to WaveGlow inference. Signed-off-by: Stanislav Beliaev <[email protected]>
* Add new durs with all 1s by default. Signed-off-by: Stanislav Beliaev <[email protected]>
* New baseline (3) Signed-off-by: Stanislav Beliaev <[email protected]>
* New baseline (4) Signed-off-by: Stanislav Beliaev <[email protected]>
* New baseline (5) Signed-off-by: Stanislav Beliaev <[email protected]>
* Refactor durs predictor script. Signed-off-by: Stanislav Beliaev <[email protected]>
* Durs predictor baseline Signed-off-by: Stanislav Beliaev <[email protected]>
* Adjusted durs scirpt for NGC. Signed-off-by: Stanislav Beliaev <[email protected]>
* New durs lj baseline Signed-off-by: Stanislav Beliaev <[email protected]>
* Update durs baseline params. Signed-off-by: Stanislav Beliaev <[email protected]>
* Add durs/blanks acc metrics. Signed-off-by: Stanislav Beliaev <[email protected]>
* Fix durs baseline. Signed-off-by: Stanislav Beliaev <[email protected]>
* Current state Signed-off-by: Stanislav Beliaev <[email protected]>
* Update NGC scripts and implement shake_all aug. Signed-off-by: Stanislav Beliaev <[email protected]>
* Update augmentations implementations. Signed-off-by: Stanislav Beliaev <[email protected]>
* Bunch of things Signed-off-by: Stanislav Beliaev <[email protected]>
* Change name to TalkNet. Signed-off-by: Stanislav Beliaev <[email protected]>
* Latest notebooks changes Signed-off-by: Stanislav Beliaev <[email protected]>
* Working scripts with latest master changes Signed-off-by: Stanislav Beliaev <[email protected]>
* Finished trimming durs predictor code. Signed-off-by: Stanislav Beliaev <[email protected]>
* Trimmed mels part. Signed-off-by: Stanislav Beliaev <[email protected]>
* Delete dev folder. Signed-off-by: Stanislav Beliaev <[email protected]>
* Fix style errors. Signed-off-by: Stanislav Beliaev <[email protected]>
* Fix LGTM errors. Signed-off-by: Stanislav Beliaev <[email protected]>
* Revert simple logging changes. Signed-off-by: Stanislav Beliaev <[email protected]>
* Fix problems. Signed-off-by: Stanislav Beliaev <[email protected]>
* Fix problems. Signed-off-by: Stanislav Beliaev <[email protected]>
* Remove WG inference and add type hints for data layer. Signed-off-by: Stanislav Beliaev <[email protected]>
Signed-off-by: Oleksii Kuchaiev <[email protected]>
…s) (NVIDIA#675)
* git history clean up Signed-off-by: Evelina Bakhturina <[email protected]>
* nlp references to the tutotials Signed-off-by: Evelina Bakhturina <[email protected]>
* sphinx fix Signed-off-by: Evelina Bakhturina <[email protected]>
* review feedback Signed-off-by: Evelina Bakhturina <[email protected]>
Signed-off-by: Jocelyn Huang <[email protected]>
Signed-off-by: Oleksii Kuchaiev <[email protected]>
Signed-off-by: Oleksii Kuchaiev <[email protected]>
Signed-off-by: Jocelyn Huang <[email protected]>
* initial commit of callback documentation Signed-off-by: Jason <[email protected]>
* some syntax fixes Signed-off-by: Jason <[email protected]>
* add old callbacks file Signed-off-by: Jason <[email protected]>
* finalize docs; change train to action Signed-off-by: Jason <[email protected]>
* style Signed-off-by: Jason <[email protected]>
* update sphinx style Signed-off-by: Jason <[email protected]>
* update sphinx warnings Signed-off-by: Jason <[email protected]>
* train->action rename bug Signed-off-by: Jason <[email protected]>
* address comments Signed-off-by: Jason <[email protected]>
* comments Signed-off-by: Jason <[email protected]>
Update README (pretrained ASR model information)
Bugfix to output ports of Kaldi data layer
Signed-off-by: Oleksii Kuchaiev <[email protected]>
fixes to the asr model
Signed-off-by: Oleksii Kuchaiev <[email protected]>
Signed-off-by: Evelina Bakhturina <[email protected]>
Signed-off-by: Yang Zhang <[email protected]>
Signed-off-by: Jason <[email protected]>
* pm+nlg for multiwoz init Signed-off-by: Evelina Bakhturina <[email protected]>
* pipeline is working, init clean up Signed-off-by: Evelina Bakhturina <[email protected]>
* headers added Signed-off-by: Evelina Bakhturina <[email protected]>
* fixed invalid .json file, added db files to multiwoz preprocessing Signed-off-by: Evelina Bakhturina <[email protected]>
* code clean up Signed-off-by: Evelina Bakhturina <[email protected]>
* lgtm fixes Signed-off-by: Evelina Bakhturina <[email protected]>
* docs for TRADE update, jenkins for ruled_based example Signed-off-by: Evelina Bakhturina <[email protected]>
* jenkins fix Signed-off-by: Evelina Bakhturina <[email protected]>
* ports refactor wip Signed-off-by: Evelina Bakhturina <[email protected]>
* ports refactor wip Signed-off-by: Evelina Bakhturina <[email protected]>
* wip works Signed-off-by: Evelina Bakhturina <[email protected]>
* neural types refactored Signed-off-by: Evelina Bakhturina <[email protected]>
* remove unused Signed-off-by: Evelina Bakhturina <[email protected]>
* lgtm fixes Signed-off-by: Evelina Bakhturina <[email protected]>
* typo Signed-off-by: Evelina Bakhturina <[email protected]>
* state dict splited Signed-off-by: Evelina Bakhturina <[email protected]>
* lgtm fixes Signed-off-by: Evelina Bakhturina <[email protected]>
* fixing the process script, moved multiwoz_mapping.pair to multiwoz, enabled utilization of relative paths Signed-off-by: Tomasz Kornuta <[email protected]>
* formatting fix Signed-off-by: Tomasz Kornuta <[email protected]>
* reformatted the code, ready for definition of NG by connecting the modules - and fixing the definitions Signed-off-by: Tomasz Kornuta <[email protected]>
* work in progress-ess, not working, internet issues Signed-off-by: Tomasz Kornuta <[email protected]>
* UtteranceEncoder neural types wip Signed-off-by: nvidia <[email protected]>
* utterance encoder neural types Signed-off-by: nvidia <[email protected]>
* updating trade outputs Signed-off-by: nvidia <[email protected]>
* updating trade outputs Signed-off-by: nvidia <[email protected]>
* fightihg with belief state Signed-off-by: nvidia <[email protected]>
* Cannot make second named tuple work Signed-off-by: nvidia <[email protected]>
* reorganized files, whole pipeline handshaking works Signed-off-by: nvidia <[email protected]>
* reorganized files, whole pipeline handshaking works Signed-off-by: nvidia <[email protected]>
* polish Signed-off-by: nvidia <[email protected]>
* Fix of my dummy error Signed-off-by: nvidia <[email protected]>
* new examples Signed-off-by: nvidia <[email protected]>
* style fix Signed-off-by: Evelina Bakhturina <[email protected]>
* fixed TRADE training Signed-off-by: Evelina Bakhturina <[email protected]>
* Added module responsible for sys uttr dialog history update Signed-off-by: nvidia <[email protected]>
* LGTM fix Signed-off-by: nvidia <[email protected]>
* moved dialog specific axesc andctypes to nlp/neural_types.py, refactored the modules Signed-off-by: nvidia <[email protected]>
* style fix Signed-off-by: nvidia <[email protected]>
Co-authored-by: Tomasz Kornuta <[email protected]>
Signed-off-by: Jason <[email protected]>
Callchain fix and update logging
* make test better Signed-off-by: Jason <[email protected]>
* fix rename error during topological sort Signed-off-by: Jason <[email protected]>
* test fix Signed-off-by: Jason <[email protected]>
FYI, I just rebased to a more recent upstream master.
Fixed 2_Online_ASR_Microphone_Demo notebook to support new config
Signed-off-by: Oleksii Kuchaiev <[email protected]>
…VIDIA#724)
* Added ability to write audio to tensorboard during Tacotron training Signed-off-by: Polezhaev Sergej <[email protected]>
* Removed unused import Signed-off-by: Polezhaev Sergej <[email protected]>
Co-authored-by: Sergey Polezhaev <[email protected]>
Signed-off-by: Alexis Asseman <[email protected]>
…rent subdirs, as they are not datalayers. Signed-off-by: Alexis Asseman <[email protected]>
tkornuta-nvidia pushed a commit that referenced this pull request on Aug 25, 2020
* Integrated Megatron-LM Signed-off-by: Boris Fomitchev <[email protected]>
* Addressing PR comments, trying NER example Signed-off-by: Boris Fomitchev <[email protected]>
* manual style fix Signed-off-by: Boris Fomitchev <[email protected]>
* manual style fix #2 Signed-off-by: Boris Fomitchev <[email protected]>
* Resolving circular import Signed-off-by: Boris Fomitchev <[email protected]>
* Static analysys warnings addressed Signed-off-by: Boris Fomitchev <[email protected]>
* Addressed code review; Jenkins test added Signed-off-by: Boris Fomitchev <[email protected]>
* Removing parallel feom Megatron Signed-off-by: Boris Fomitchev <[email protected]>
* Added more info to tokenizer printout, made megatron bert derivative explicit Signed-off-by: Boris Fomitchev <[email protected]>
* Bumping Megatron-LM version to get APEX fix Signed-off-by: Boris Fomitchev <[email protected]>
Ported the COG dataset from https://github.com/IBM/mi-prometheus as a NeMo datalayer.
Also cleaned up a lot of the original code. I tried to organize the commits in a clean and logical way, so reading them one by one should help with tracking the modifications.
To run the test, from the root of the NeMo Python module directory:
If an X display is available, the test will show some samples using matplotlib.
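
For reviewers wondering how a test can plot samples only when a display is present, here is a minimal sketch of one way to do it. It is not the code from this PR: `x_available` and `maybe_show_samples` are hypothetical names, and checking the `DISPLAY` environment variable is an assumption about how "X is available" could be detected on Linux.

```python
import os
from typing import Mapping, Optional, Sequence


def x_available(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if an X display appears to be configured.

    Assumption: a non-empty DISPLAY variable means X is available.
    """
    env = os.environ if env is None else env
    return bool(env.get("DISPLAY"))


def maybe_show_samples(images: Sequence, env: Optional[Mapping[str, str]] = None) -> bool:
    """Plot the given images side by side, but only when a display exists.

    Returns True if a figure was shown, False if plotting was skipped.
    """
    if len(images) == 0 or not x_available(env):
        return False

    # Imported lazily so that headless runs never need matplotlib.
    import matplotlib.pyplot as plt

    # squeeze=False keeps `axes` two-dimensional even for a single image.
    fig, axes = plt.subplots(1, len(images), squeeze=False)
    for ax, image in zip(axes[0], images):
        ax.imshow(image)
        ax.axis("off")
    plt.show()
    return True
```

With this shape, the same test can run on a headless CI machine (where it skips the plot) and on a workstation (where it opens a matplotlib window).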