Merge r1.1 bugfixes to main. Update dep versions. #2437

Merged
merged 23 commits into from
Jul 2, 2021
Commits
295dbd7
Update notebook branch and Jenkinsfile for 1.1.0 testing (#2378)
ericharper Jun 21, 2021
7c287e3
[BUGFIX] NMT Multi-node was incorrectly computing num_replicas (#2380)
ericharper Jun 21, 2021
01997d3
Update ASR scripts for tokenizer building and tarred dataset building…
titu1994 Jun 22, 2021
c146867
Update notebook (#2391)
titu1994 Jun 23, 2021
4880f7d
ASR Notebooks fix for 1.1.0 (#2395)
fayejf Jun 24, 2021
5525f69
Mean normalization (#2397)
nithinraok Jun 24, 2021
c01740f
Bugfix adaptive spec augment time masking (#2398)
titu1994 Jun 24, 2021
7220778
Correct typos and issues with notebooks (#2402)
titu1994 Jun 24, 2021
c0c78d7
remove accelerator=DDP in tutorial notebooks to avoid errors. (#2403)
khcs Jun 25, 2021
0282844
[BUGFIX] Megatron in NMT was setting vocab_file to None (#2417)
ericharper Jun 29, 2021
41723c8
Link updates in docs and notebooks and typo fix (#2416)
fayejf Jun 29, 2021
76e8a8f
Update onnx (#2420)
titu1994 Jun 29, 2021
60c2d04
Correct version of onnxruntime (#2422)
titu1994 Jun 30, 2021
0dc7bc1
update deployment instructions (#2430)
ericharper Jul 1, 2021
50e7bb1
Bumping version to 1.1.0
okuchaiev Jul 2, 2021
4f08ce1
update jenksinfile
ericharper Jul 2, 2021
02287e1
Merge branch 'r1.1.0' of github.com:NVIDIA/NeMo into r1.1.0
ericharper Jul 2, 2021
8aee9f2
add upper bounds
ericharper Jul 2, 2021
e722f33
update readme
ericharper Jul 2, 2021
31b4de5
update branch
ericharper Jul 2, 2021
38e00f9
update requirements
ericharper Jul 2, 2021
cceb251
update jenkinsfile
ericharper Jul 2, 2021
b22fa12
update version
ericharper Jul 2, 2021
6 changes: 0 additions & 6 deletions Jenkinsfile
@@ -17,12 +17,6 @@ pipeline {
}
}

stage('Uninstall torchtext') {
steps {
sh 'pip uninstall -y torchtext'
}
}

stage('Install test requirements') {
steps {
sh 'apt-get update && apt-get install -y bc && pip install -r requirements/requirements_test.txt'
34 changes: 21 additions & 13 deletions README.rst
@@ -93,19 +93,17 @@ Documentation
:scale: 100%
:target: https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/

+---------+-------------+----------------------------------------------------------------------------------------------------------------------------------+
| Version | Status | Description |
+=========+=============+==================================================================================================================================+
| Latest | |main| | `Documentation of the latest (i.e. main) branch. <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/>`_ |
+---------+-------------+----------------------------------------------------------------------------------------------------------------------------------+
| Next | |v1.0.2| | `Documentation of the most recent release: v1.0.2 <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/v1.0.2/>`_ |
+---------+-------------+----------------------------------------------------------------------------------------------------------------------------------+
| Stable | |stable| | `Documentation of the stable (i.e. stable) branch. <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/>`_ |
+---------+-------------+----------------------------------------------------------------------------------------------------------------------------------+
+---------+-------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Version | Status | Description |
+=========+=============+==========================================================================================================================================+
| Latest | |main| | `Documentation of the latest (i.e. main) branch. <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/>`_ |
+---------+-------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Stable | |stable| | `Documentation of the stable (i.e. most recent release) branch. <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/>`_ |
+---------+-------------+------------------------------------------------------------------------------------------------------------------------------------------+

Tutorials
---------
A great way to start with NeMo is by checking `one of our tutorials <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/v1.0.2/starthere/tutorials.html>`_.
A great way to start with NeMo is by checking `one of our tutorials <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/starthere/tutorials.html>`_.

Getting help with NeMo
----------------------
@@ -147,6 +145,16 @@ Use this installation mode if you are contributing to NeMo.
cd NeMo
./reinstall.sh

RNNT
~~~~
Note that RNNT requires numba to be installed from conda.

.. code-block:: bash

conda remove numba
pip uninstall numba
conda install -c numba numba
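
After reinstalling, it can be useful to confirm that numba is visible to the Python interpreter you intend to use. The following is a minimal sketch using only the standard library; the helper name ``numba_status`` is hypothetical and not part of NeMo. It reports the installed version and location, but does not detect whether the package came from conda or pip.

```python
import importlib.util
from importlib import metadata


def numba_status():
    """Describe whether numba is importable from the current interpreter."""
    # find_spec returns None when the package cannot be found on sys.path.
    spec = importlib.util.find_spec("numba")
    if spec is None:
        return "numba is not installed"
    try:
        # Read the version from package metadata without importing numba.
        version = metadata.version("numba")
    except metadata.PackageNotFoundError:
        version = "unknown"
    return f"numba {version} found at {spec.origin}"


if __name__ == "__main__":
    print(numba_status())
```

If the reported path lies outside the active conda environment, the interpreter is probably picking up a different installation than the one just created.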

Docker containers:
~~~~~~~~~~~~~~~~~~

@@ -161,14 +169,14 @@ If you chose to work with main branch, we recommend using NVIDIA's PyTorch conta
Examples
--------

Many example can be found under `"Examples" <https://github.com/NVIDIA/NeMo/tree/main/examples>`_ folder.
Many example can be found under `"Examples" <https://github.com/NVIDIA/NeMo/tree/stable/examples>`_ folder.


Contributing
------------

We welcome community contributions! Please refer to the `CONTRIBUTING.md <https://github.com/NVIDIA/NeMo/blob/main/CONTRIBUTING.md>`_ CONTRIBUTING.md for the process.
We welcome community contributions! Please refer to the `CONTRIBUTING.md <https://github.com/NVIDIA/NeMo/blob/stable/CONTRIBUTING.md>`_ CONTRIBUTING.md for the process.

License
-------
NeMo is under `Apache 2.0 license <https://github.com/NVIDIA/NeMo/blob/main/LICENSE>`_.
NeMo is under `Apache 2.0 license <https://github.com/NVIDIA/NeMo/blob/stable/LICENSE>`_.
16 changes: 8 additions & 8 deletions docs/source/asr/asr_language_modeling.rst
@@ -42,7 +42,7 @@ Train N-gram LM
===============

The script to train an N-gram language model with KenLM can be found at
`scripts/asr_language_modeling/ngram_lm/train_kenlm.py <https://github.com/NVIDIA/NeMo/blob/main/scripts/asr_language_modeling/ngram_lm/train_kenlm.py>`__.
`scripts/asr_language_modeling/ngram_lm/train_kenlm.py <https://github.com/NVIDIA/NeMo/blob/stable/scripts/asr_language_modeling/ngram_lm/train_kenlm.py>`__.

This script would train an N-gram language model with KenLM library which can be used with the beam search decoders
on top of the ASR models. This script supports both character level and BPE level encodings and models which is
@@ -95,7 +95,7 @@ Evaluate by Beam Search Decoding and N-gram LM

NeMo's beam search decoders are capable of using the KenLM's N-gram models to find the best candidates.
The script to evaluate an ASR model with beam search decoding and N-gram models can be found at
`scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py <https://github.com/NVIDIA/NeMo/blob/main/scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py>`__.
`scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py <https://github.com/NVIDIA/NeMo/blob/stable/scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py>`__.

You may evaluate an ASR model as the following:

@@ -169,7 +169,7 @@ Width of the beam search (`--beam_width`) specifies the number of top candidates
would search for. Larger beams result in more accurate but slower predictions.

There is also a tutorial to learn more about evaluating the ASR models with N-gram LM here:
`Offline ASR Inference with Beam Search and External Language Model Rescoring <https://colab.research.google.com/github/NVIDIA/NeMo/blob/v1.0.2/tutorials/asr/Offline_ASR.ipynb>`_
`Offline ASR Inference with Beam Search and External Language Model Rescoring <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Offline_ASR.ipynb>`_

Hyperparameter Grid Search
--------------------------
@@ -202,19 +202,19 @@ This score is usually combined with the scores from the beam search decoding to
Train Neural Rescorer
=====================

An example script to train such a language model with Transformer can be found at `examples/nlp/language_modeling/transformer_lm.py <https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/language_modeling/transformer_lm.py>`__.
An example script to train such a language model with Transformer can be found at `examples/nlp/language_modeling/transformer_lm.py <https://github.com/NVIDIA/NeMo/blob/stable/examples/nlp/language_modeling/transformer_lm.py>`__.
It trains a TransformerLMModel which can be used as a neural rescorer for an ASR system.


Evaluation
==========

Given a trained TransformerLMModel `.nemo` file, the script available at
`scripts/asr_language_modeling/neural_rescorer/eval_neural_rescorer.py <https://github.com/NVIDIA/NeMo/blob/main/scripts/asr_language_modeling/neural_rescorer/eval_neural_rescorer.py>`__
`scripts/asr_language_modeling/neural_rescorer/eval_neural_rescorer.py <https://github.com/NVIDIA/NeMo/blob/stable/scripts/asr_language_modeling/neural_rescorer/eval_neural_rescorer.py>`__
can be used to re-score beams obtained with ASR model. You need the `.tsv` file containing the candidates produced
by the acoustic model and the beam search decoding to use this script. The candidates can be the result of just the beam
search decoding or the result of fusion with an N-gram LM. You may generate this file by specifying `--preds_output_folder' for
`scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py <https://github.com/NVIDIA/NeMo/blob/main/scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py>`__.
`scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py <https://github.com/NVIDIA/NeMo/blob/stable/scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py>`__.

The neural rescorer would rescore the beams/candidates by using two parameters of `rescorer_alpha` and `rescorer_beta` as the following:

@@ -231,9 +231,9 @@ You may follow the following steps to evaluate a neural LM:
#. Obtain `.tsv` file with beams and their corresponding scores. Scores can be from a regular beam search decoder or
in fusion with an N-gram LM scores. For a given beam size `beam_size` and a number of examples
for evaluation `num_eval_examples`, it should contain (`num_eval_examples` x `beam_size`) lines of
form `beam_candidate_text \t score`. This file can be generated by `scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py <https://github.com/NVIDIA/NeMo/blob/main/scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py>`__
form `beam_candidate_text \t score`. This file can be generated by `scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py <https://github.com/NVIDIA/NeMo/blob/stable/scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py>`__

#. Rescore the candidates by `scripts/asr_language_modeling/neural_rescorer/eval_neural_rescorer.py <https://github.com/NVIDIA/NeMo/blob/main/scripts/asr_language_modeling/neural_rescorer/eval_neural_rescorer.py>`__.
#. Rescore the candidates by `scripts/asr_language_modeling/neural_rescorer/eval_neural_rescorer.py <https://github.com/NVIDIA/NeMo/blob/stable/scripts/asr_language_modeling/neural_rescorer/eval_neural_rescorer.py>`__.

.. code::
python eval_neural_rescorer.py
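
The `.tsv` layout described in the steps above (`num_eval_examples` x `beam_size` lines of the form `beam_candidate_text \t score`) can be read back with a few lines of Python. This is a minimal sketch, not the loader used by `eval_neural_rescorer.py`: the function name `read_beams` is hypothetical, and it assumes that the candidates of one example occupy consecutive lines.

```python
import io


def read_beams(tsv_file, beam_size):
    """Group (text, score) rows into one list of candidates per example."""
    rows = []
    for line in tsv_file:
        line = line.rstrip("\n")
        if not line:
            continue  # tolerate blank lines
        # Split on the last tab so the candidate text may itself contain tabs.
        text, score = line.rsplit("\t", 1)
        rows.append((text, float(score)))
    if len(rows) % beam_size != 0:
        raise ValueError(
            f"{len(rows)} lines is not a multiple of beam_size={beam_size}"
        )
    # Assumption: each example's candidates are stored consecutively.
    return [rows[i:i + beam_size] for i in range(0, len(rows), beam_size)]


if __name__ == "__main__":
    sample = io.StringIO("hello world\t-1.0\nhello word\t-2.0\n")
    for beams in read_beams(sample, beam_size=2):
        print(max(beams, key=lambda pair: pair[1]))
```

The length check catches a file that was generated with a different beam size than the one passed to the rescoring step.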
2 changes: 1 addition & 1 deletion docs/source/nemo_text_processing/intro.rst
@@ -5,7 +5,7 @@ Text Processing

See :doc:`NeMo Introduction <../starthere/intro>` for installation details.

Additional requirements can be found in `setup.sh <https://github.com/NVIDIA/NeMo/blob/main/nemo_text_processing/setup.sh>`_.
Additional requirements can be found in `setup.sh <https://github.com/NVIDIA/NeMo/blob/stable/nemo_text_processing/setup.sh>`_.

.. toctree::
:maxdepth: 1
@@ -13,7 +13,7 @@ See :doc:`Text Procesing Deployment <../tools/text_processing_deployment>` for d

.. note::

For more details, see the tutorial `NeMo/tutorials/text_processing/Inverse_Text_Normalization.ipynb <https://github.com/NVIDIA/NeMo/blob/main/tutorials/text_processing/Inverse_Text_Normalization.ipynb>`__ in `Google's Colab <https://colab.research.google.com/github/NVIDIA/NeMo/blob/main/tutorials/text_processing/Inverse_Text_Normalization.ipynb>`_.
For more details, see the tutorial `NeMo/tutorials/text_processing/Inverse_Text_Normalization.ipynb <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/text_processing/Inverse_Text_Normalization.ipynb>`__ in `Google's Colab <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/text_processing/Inverse_Text_Normalization.ipynb>`_.



2 changes: 1 addition & 1 deletion docs/source/nemo_text_processing/text_normalization.rst
@@ -15,7 +15,7 @@ See :doc:`Text Procesing Deployment <../tools/text_processing_deployment>` for d

.. note::

For more details, see the tutorial `NeMo/tutorials/text_processing/Text_Normalization.ipynb <https://github.com/NVIDIA/NeMo/blob/main/tutorials/text_processing/Text_Normalization.ipynb>`__ in `Google's Colab <https://colab.research.google.com/github/NVIDIA/NeMo/blob/main/tutorials/text_processing/Text_Normalization.ipynb>`_.
For more details, see the tutorial `NeMo/tutorials/text_processing/Text_Normalization.ipynb <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/text_processing/Text_Normalization.ipynb>`__ in `Google's Colab <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/text_processing/Text_Normalization.ipynb>`_.



4 changes: 2 additions & 2 deletions docs/source/nlp/bert_pretraining.rst
@@ -61,8 +61,8 @@ and specify the path to the created hd5f files.
Training the BERT model
-----------------------

Example of model configuration for on-the-fly data preprocessing: `NeMo/examples/nlp/language_modeling/conf/bert_pretraining_from_text_config.yaml <https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/language_modeling/conf/bert_pretraining_from_text_config.yaml>`__.
Example of model configuration for offline data preprocessing: `NeMo/examples/nlp/language_modeling/conf/bert_pretraining_from_preprocessed_config.yaml <https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/language_modeling/conf/bert_pretraining_from_preprocessed_config.yaml>`__.
Example of model configuration for on-the-fly data preprocessing: `NeMo/examples/nlp/language_modeling/conf/bert_pretraining_from_text_config.yaml <https://github.com/NVIDIA/NeMo/blob/stable/examples/nlp/language_modeling/conf/bert_pretraining_from_text_config.yaml>`__.
Example of model configuration for offline data preprocessing: `NeMo/examples/nlp/language_modeling/conf/bert_pretraining_from_preprocessed_config.yaml <https://github.com/NVIDIA/NeMo/blob/stable/examples/nlp/language_modeling/conf/bert_pretraining_from_preprocessed_config.yaml>`__.

The specification can be grouped into three categories:

4 changes: 2 additions & 2 deletions docs/source/nlp/glue_benchmark.rst
@@ -3,8 +3,8 @@
GLUE Benchmark
==============

We recommend you try the GLUE Benchmark model in a Jupyter notebook (can run on `Google's Colab <https://colab.research.google.com/notebooks/intro.ipynb>`_): `NeMo/tutorials/nlp/GLUE_Benchmark.ipynb <https://github.com/NVIDIA/NeMo/blob/main/tutorials/nlp/GLUE_Benchmark.ipynb>`__.
We recommend you try the GLUE Benchmark model in a Jupyter notebook (can run on `Google's Colab <https://colab.research.google.com/notebooks/intro.ipynb>`_): `NeMo/tutorials/nlp/GLUE_Benchmark.ipynb <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/nlp/GLUE_Benchmark.ipynb>`__.

Connect to an instance with a GPU (**Runtime** -> **Change runtime type** -> select **GPU** for the hardware accelerator).

An example script on how to train the model can be found here: `NeMo/examples/nlp/glue_benchmark/glue_benchmark.py <https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/glue_benchmark/glue_benchmark.py>`__.
An example script on how to train the model can be found here: `NeMo/examples/nlp/glue_benchmark/glue_benchmark.py <https://github.com/NVIDIA/NeMo/blob/stable/examples/nlp/glue_benchmark/glue_benchmark.py>`__.
2 changes: 1 addition & 1 deletion docs/source/nlp/information_retrieval.rst
@@ -3,7 +3,7 @@
Information Retrieval
=====================

We recommend you try the Information Retrieval model in a Jupyter notebook (can run on `Google's Colab <https://colab.research.google.com/notebooks/intro.ipynb>`_): `NeMo/tutorials/nlp/Information_Retrieval_MSMARCO.ipynb <https://github.com/NVIDIA/NeMo/blob/main/tutorials/nlp/Information_Retrieval_MSMARCO.ipynb>`__.
We recommend you try the Information Retrieval model in a Jupyter notebook (can run on `Google's Colab <https://colab.research.google.com/notebooks/intro.ipynb>`_): `NeMo/tutorials/nlp/Information_Retrieval_MSMARCO.ipynb <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/nlp/Information_Retrieval_MSMARCO.ipynb>`__.

Connect to an instance with a GPU (**Runtime** -> **Change runtime type** -> select **GPU** for hardware the accelerator),

6 changes: 3 additions & 3 deletions docs/source/nlp/joint_intent_slot.rst
@@ -14,7 +14,7 @@ Our BERT-based model implementation allows you to train and detect both of these

.. note::

We recommend you try the Joint Intent and Slot Classification model in a Jupyter notebook (can run on `Google's Colab <https://colab.research.google.com/notebooks/intro.ipynb>`_.): `NeMo/tutorials/nlp/Joint_Intent_and_Slot_Classification.ipynb <https://github.com/NVIDIA/NeMo/blob/main/tutorials/nlp/Joint_Intent_and_Slot_Classification.ipynb>`__.
We recommend you try the Joint Intent and Slot Classification model in a Jupyter notebook (can run on `Google's Colab <https://colab.research.google.com/notebooks/intro.ipynb>`_.): `NeMo/tutorials/nlp/Joint_Intent_and_Slot_Classification.ipynb <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/nlp/Joint_Intent_and_Slot_Classification.ipynb>`__.

Connect to an instance with a GPU (**Runtime** -> **Change runtime type** -> select **GPU** for the hardware accelerator).

@@ -115,7 +115,7 @@ For each query, the model classifies it as one the intents from the intent dicti
it as one of the slots from the slot dictionary, including out of scope slot for all the remaining words in the query which does not
fall in another slot category. Out of scope slot (``O``) is a part of slot dictionary that the model is trained on.

Example of model configuration file for training the model can be found at: `NeMo/examples/nlp/intent_slot_classification/conf/intent_slot_classification.yaml <https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/intent_slot_classification/conf/intent_slot_classification_config.yaml>`__.
Example of model configuration file for training the model can be found at: `NeMo/examples/nlp/intent_slot_classification/conf/intent_slot_classification.yaml <https://github.com/NVIDIA/NeMo/blob/stable/examples/nlp/intent_slot_classification/conf/intent_slot_classification_config.yaml>`__.
In the configuration file, define the parameters of the training and the model, although most of the default values will work well.

The specification can be roughly grouped into three categories:
@@ -152,7 +152,7 @@ More details about parameters in the spec file can be found below:
| **test_ds.prefix** | string | ``test`` | A prefix for the test file names. |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+

For additional config parameters common to all NLP models, refer to the `nlp_model doc <https://github.com/NVIDIA/NeMo/blob/main/docs/source/nlp/nlp_model.rst#model-nlp>`__.
For additional config parameters common to all NLP models, refer to the `nlp_model doc <https://github.com/NVIDIA/NeMo/blob/stable/docs/source/nlp/nlp_model.rst#model-nlp>`__.

The following is an example of the command for training the model:

15 changes: 13 additions & 2 deletions docs/source/nlp/machine_translation.rst
@@ -478,7 +478,7 @@ custom configuration under the ``encoder`` configuration.
HuggingFace
^^^^^^^^^^^

We have provided a `HuggingFace config file <https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/machine_translation/conf/huggingface.yaml>`__
We have provided a `HuggingFace config file <https://github.com/NVIDIA/NeMo/blob/stable/examples/nlp/machine_translation/conf/huggingface.yaml>`__
to use with HuggingFace encoders.

To use the config file from CLI:
@@ -508,7 +508,7 @@ Note the ``+`` symbol is needed if we're not adding the arguments to the YAML co
Megatron
^^^^^^^^

We have provided a `Megatron config file <https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/machine_translation/conf/megatron.yaml>`__
We have provided a `Megatron config file <https://github.com/NVIDIA/NeMo/blob/stable/examples/nlp/machine_translation/conf/megatron.yaml>`__
to use with Megatron encoders.

To use the config file from CLI:
@@ -561,6 +561,17 @@ To train a Megatron 345M BERT, we would use
model.encoder.num_layers=24 \
model.encoder.max_position_embeddings=512 \

If the pretrained megatron model used a custom vocab file, then set:

.. code::

model.encoder_tokenizer.vocab_file=/path/to/your/megatron/vocab_file.txt
model.encoder.vocab_file=/path/to/your/megatron/vocab_file.txt


Use ``encoder.model_name=megatron_bert_uncased`` for uncased models with custom vocabularies and
use ``encoder.model_name=megatron_bert_cased`` for cased models with custom vocabularies.
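
When launching training from a script, the overrides above can be assembled programmatically. The following is a minimal sketch: the helper ``megatron_vocab_overrides`` is hypothetical and not part of NeMo, and writing the model name key with a ``model.`` prefix (``model.encoder.model_name``) is an assumption based on the other overrides shown here.

```python
def megatron_vocab_overrides(vocab_file, cased):
    """Build command-line overrides for a Megatron encoder with a custom vocab."""
    # Model names are taken from the documentation text above.
    model_name = "megatron_bert_cased" if cased else "megatron_bert_uncased"
    return [
        f"model.encoder_tokenizer.vocab_file={vocab_file}",
        f"model.encoder.vocab_file={vocab_file}",
        # Assumption: the key carries the same "model." prefix as the others.
        f"model.encoder.model_name={model_name}",
    ]


if __name__ == "__main__":
    for override in megatron_vocab_overrides("/path/to/vocab_file.txt", cased=False):
        print(override)
```

Setting both ``vocab_file`` entries to the same path keeps the tokenizer and the encoder in agreement about the vocabulary.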


References
----------