Megatron-LM for Downstream Tasks
================================

Megatron :cite:`nlp-megatron-lm-shoeybi2020megatron` is a large, powerful transformer developed by the Applied Deep Learning Research team at NVIDIA.
More details can be found in the `Megatron-LM GitHub repo <https://github.com/NVIDIA/Megatron-LM>`_.

To finetune a pretrained Megatron BERT language model on NLP downstream tasks from `examples/nlp <https://github.com/NVIDIA/NeMo/tree/master/examples/nlp>`_, specify the pretrained model name like this:

.. code-block:: bash

    --pretrained_model_name megatron-bert-345m-uncased

For example, to finetune SQuAD v1.1 with Megatron-LM, run:

.. code-block:: bash

    python question_answering_squad.py \
        --train_file PATH_TO_DATA_DIR/squad/v1.1/train-v1.1.json \
        --eval_file PATH_TO_DATA_DIR/squad/v1.1/dev-v1.1.json \
        --pretrained_model_name megatron-bert-345m-uncased

If you have a different checkpoint or model configuration, use ``--pretrained_model_name megatron-bert-uncased`` or ``--pretrained_model_name megatron-bert-cased`` and specify ``--bert_config`` and ``--bert_checkpoint`` for your model.
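As a hedged sketch only (the ``PATH_TO_MODEL_DIR`` paths and file names below are placeholders, not files shipped with NeMo), a run with a custom Megatron checkpoint might look like:

```bash
# Placeholder paths: substitute your own Megatron config and checkpoint files.
python question_answering_squad.py \
    --train_file PATH_TO_DATA_DIR/squad/v1.1/train-v1.1.json \
    --eval_file PATH_TO_DATA_DIR/squad/v1.1/dev-v1.1.json \
    --pretrained_model_name megatron-bert-uncased \
    --bert_config PATH_TO_MODEL_DIR/config.json \
    --bert_checkpoint PATH_TO_MODEL_DIR/model.ckpt
```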

.. note::
    Megatron-LM has its own set of training arguments (including the tokenizer) that are ignored during finetuning in NeMo. Please use the downstream task training scripts for all NeMo-supported arguments.

References
----------

.. bibliography:: nlp_all_refs.bib
    :style: plain
    :labelprefix: NLP-MEGATRON-LM
    :keyprefix: nlp-megatron-lm-
Datasets
========

HI-MIA
------

Run the script to download and process the HI-MIA dataset and generate files in the format supported by `nemo_asr`. Set the HI-MIA data folder with `--data_root`. The script is located in ``<nemo_root>/scripts``:

.. code-block:: bash

    python get_hi-mia_data.py --data_root=<data directory>

After download and conversion, your `data` folder should contain directories with the following files:

* `data/<set>/train.json`
* `data/<set>/dev.json`
* `data/<set>/{set}_all.json`
* `data/<set>/utt2spk`
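The ``train.json`` and ``dev.json`` files above are JSON-lines manifests, one JSON object per line. As a minimal sketch of inspecting one, assuming the common `nemo_asr` manifest convention of ``audio_filepath``, ``duration``, and ``label`` fields (field names may differ between releases):

```python
import json

# Illustrative manifest lines in JSON-lines form; the field names are
# an assumption based on the usual nemo_asr manifest layout, and the
# file paths and speaker IDs are made up for this sketch.
manifest_lines = [
    '{"audio_filepath": "data/train/spk001_utt01.wav", "duration": 3.2, "label": "spk001"}',
    '{"audio_filepath": "data/train/spk002_utt01.wav", "duration": 2.7, "label": "spk002"}',
]

# Parse each line independently and collect the unique speaker labels.
entries = [json.loads(line) for line in manifest_lines]
speakers = sorted({entry["label"] for entry in entries})
print(speakers)  # → ['spk001', 'spk002']
```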
.. include:: ../asr/installation.rst
.. _speaker-recognition-docs:

Speaker Recognition
===================

.. toctree::
    :maxdepth: 8

    installation_link
    tutorial
    datasets
    models
Models
======

.. toctree::
    :maxdepth: 8

    quartznet

References
----------

.. bibliography:: speaker.bib
    :style: plain
    :labelprefix: SPEAKER-TUT
    :keyprefix: speaker-tut-