Merge r1.1 bugfixes to main. Update dep versions. (NVIDIA#2437)
* Update notebook branch and Jenkinsfile for 1.1.0 testing (NVIDIA#2378)

* update branch

Signed-off-by: ericharper <[email protected]>

* update jenkinsfile

Signed-off-by: ericharper <[email protected]>

* [BUGFIX] NMT Multi-node was incorrectly computing num_replicas (NVIDIA#2380)

* fix property when not using model parallel

Signed-off-by: ericharper <[email protected]>

* fix property when not using model parallel

Signed-off-by: ericharper <[email protected]>

* add debug statement

Signed-off-by: ericharper <[email protected]>

* add debug statement

Signed-off-by: ericharper <[email protected]>

* instantiate with NLPDDPPlugin with num_nodes from trainer config

Signed-off-by: ericharper <[email protected]>

* Update ASR scripts for tokenizer building and tarred dataset building (NVIDIA#2381)

* Update ASR scripts for tokenizer building and tarred dataset building

Signed-off-by: smajumdar <[email protected]>

* Update container

Signed-off-by: smajumdar <[email protected]>

* Add STT Zh Citrinet 1024 Gamma 0.25 model

Signed-off-by: smajumdar <[email protected]>

* Update notebook (NVIDIA#2391)

Signed-off-by: smajumdar <[email protected]>

* ASR Notebooks fix for 1.1.0 (NVIDIA#2395)

* nb fix for spring clean

Signed-off-by: fayejf <[email protected]>

* remove outdated instruction

Signed-off-by: fayejf <[email protected]>

* Mean normalization (NVIDIA#2397)

* norm embeddings

Signed-off-by: nithinraok <[email protected]>

* move to utils

Signed-off-by: nithinraok <[email protected]>

* Bugfix adaptive spec augment time masking (NVIDIA#2398)

* bugfix adaptive spec augment

Signed-off-by: smajumdar <[email protected]>

* Revert freq mask guard

Signed-off-by: smajumdar <[email protected]>

* Revert freq mask guard

Signed-off-by: smajumdar <[email protected]>

* Remove static time width clamping

Signed-off-by: smajumdar <[email protected]>

* Correct typos and issues with notebooks (NVIDIA#2402)

* Fix Primer notebook

Signed-off-by: smajumdar <[email protected]>

* Typo

Signed-off-by: smajumdar <[email protected]>

* remove accelerator=DDP in tutorial notebooks to avoid errors. (NVIDIA#2403)

Signed-off-by: Hoo Chang Shin <[email protected]>

Co-authored-by: Hoo Chang Shin <[email protected]>

* [BUGFIX] Megatron in NMT was setting vocab_file to None (NVIDIA#2417)

* make vocab_file configurable for megatron in nmt

Signed-off-by: ericharper <[email protected]>

* update docs

Signed-off-by: ericharper <[email protected]>

* update docs

Signed-off-by: ericharper <[email protected]>

* Link updates in docs and notebooks and typo fix (NVIDIA#2416)

* typo fix for notebooks

Signed-off-by: fayejf <[email protected]>

* tiny typo fix in docs

Signed-off-by: fayejf <[email protected]>

* docs branch->stable

Signed-off-by: fayejf <[email protected]>

* more docs branch -> stable

Signed-off-by: fayejf <[email protected]>

* tutorial links branch -> stable

Signed-off-by: fayejf <[email protected]>

* small fix

Signed-off-by: fayejf <[email protected]>

* add renamed 06

Signed-off-by: fayejf <[email protected]>

* more fixes

Signed-off-by: fayejf <[email protected]>

* Update onnx (NVIDIA#2420)

Signed-off-by: smajumdar <[email protected]>

* Correct version of onnxruntime (NVIDIA#2422)

Signed-off-by: smajumdar <[email protected]>

* update deployment instructions (NVIDIA#2430)

Signed-off-by: ericharper <[email protected]>

* Bumping version to 1.1.0

Signed-off-by: Oleksii Kuchaiev <[email protected]>

* update jenkinsfile

Signed-off-by: ericharper <[email protected]>

* add upper bounds

Signed-off-by: ericharper <[email protected]>

* update readme

Signed-off-by: ericharper <[email protected]>

* update requirements

Signed-off-by: ericharper <[email protected]>

* update jenkinsfile

Signed-off-by: ericharper <[email protected]>

* update version

Signed-off-by: ericharper <[email protected]>

Co-authored-by: Somshubra Majumdar <[email protected]>
Co-authored-by: fayejf <[email protected]>
Co-authored-by: Nithin Rao <[email protected]>
Co-authored-by: khcs <[email protected]>
Co-authored-by: Hoo Chang Shin <[email protected]>
Co-authored-by: Oleksii Kuchaiev <[email protected]>
Signed-off-by: Paarth Neekhara <[email protected]>
7 people authored and paarthneekhara committed Sep 17, 2021
1 parent 014672f commit c131b57
Showing 51 changed files with 220 additions and 200 deletions.
6 changes: 0 additions & 6 deletions Jenkinsfile
@@ -17,12 +17,6 @@ pipeline {
}
}

stage('Uninstall torchtext') {
steps {
sh 'pip uninstall -y torchtext'
}
}

stage('Install test requirements') {
steps {
sh 'apt-get update && apt-get install -y bc && pip install -r requirements/requirements_test.txt'
34 changes: 21 additions & 13 deletions README.rst
@@ -93,19 +93,17 @@ Documentation
:scale: 100%
:target: https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/

+---------+-------------+----------------------------------------------------------------------------------------------------------------------------------+
| Version | Status | Description |
+=========+=============+==================================================================================================================================+
| Latest | |main| | `Documentation of the latest (i.e. main) branch. <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/>`_ |
+---------+-------------+----------------------------------------------------------------------------------------------------------------------------------+
| Next | |v1.0.2| | `Documentation of the most recent release: v1.0.2 <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/v1.0.2/>`_ |
+---------+-------------+----------------------------------------------------------------------------------------------------------------------------------+
| Stable | |stable| | `Documentation of the stable (i.e. stable) branch. <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/>`_ |
+---------+-------------+----------------------------------------------------------------------------------------------------------------------------------+
+---------+-------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Version | Status | Description |
+=========+=============+==========================================================================================================================================+
| Latest | |main| | `Documentation of the latest (i.e. main) branch. <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/>`_ |
+---------+-------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Stable | |stable| | `Documentation of the stable (i.e. most recent release) branch. <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/>`_ |
+---------+-------------+------------------------------------------------------------------------------------------------------------------------------------------+

Tutorials
---------
A great way to start with NeMo is by checking `one of our tutorials <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/v1.0.2/starthere/tutorials.html>`_.
A great way to start with NeMo is by checking `one of our tutorials <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/starthere/tutorials.html>`_.

Getting help with NeMo
----------------------
@@ -147,6 +145,16 @@ Use this installation mode if you are contributing to NeMo.
cd NeMo
./reinstall.sh
RNNT
~~~~
Note that RNNT requires numba to be installed from conda.

.. code-block:: bash
conda remove numba
pip uninstall numba
conda install -c numba numba
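After reinstalling, it can help to confirm that the package is importable from the active environment. The following is a minimal sketch, not part of NeMo; `module_available` is a hypothetical helper name introduced here for illustration.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be imported in the current environment."""
    # find_spec returns None when no top-level module of that name is found.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Hypothetical check: report whether numba is visible to this interpreter.
    print("numba importable:", module_available("numba"))
```

Running the script inside the conda environment used for NeMo shows whether that interpreter can see the package; it does not verify which channel the package came from.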
Docker containers:
~~~~~~~~~~~~~~~~~~

@@ -161,14 +169,14 @@ If you chose to work with main branch, we recommend using NVIDIA's PyTorch conta
Examples
--------

Many examples can be found in the `"Examples" <https://github.com/NVIDIA/NeMo/tree/main/examples>`_ folder.
Many examples can be found in the `"Examples" <https://github.com/NVIDIA/NeMo/tree/stable/examples>`_ folder.


Contributing
------------

We welcome community contributions! Please refer to `CONTRIBUTING.md <https://github.com/NVIDIA/NeMo/blob/main/CONTRIBUTING.md>`_ for the process.
We welcome community contributions! Please refer to `CONTRIBUTING.md <https://github.com/NVIDIA/NeMo/blob/stable/CONTRIBUTING.md>`_ for the process.

License
-------
NeMo is under `Apache 2.0 license <https://github.com/NVIDIA/NeMo/blob/main/LICENSE>`_.
NeMo is under `Apache 2.0 license <https://github.com/NVIDIA/NeMo/blob/stable/LICENSE>`_.
16 changes: 8 additions & 8 deletions docs/source/asr/asr_language_modeling.rst
@@ -42,7 +42,7 @@ Train N-gram LM
===============

The script to train an N-gram language model with KenLM can be found at
`scripts/asr_language_modeling/ngram_lm/train_kenlm.py <https://github.com/NVIDIA/NeMo/blob/main/scripts/asr_language_modeling/ngram_lm/train_kenlm.py>`__.
`scripts/asr_language_modeling/ngram_lm/train_kenlm.py <https://github.com/NVIDIA/NeMo/blob/stable/scripts/asr_language_modeling/ngram_lm/train_kenlm.py>`__.

This script would train an N-gram language model with KenLM library which can be used with the beam search decoders
on top of the ASR models. This script supports both character level and BPE level encodings and models which is
@@ -95,7 +95,7 @@ Evaluate by Beam Search Decoding and N-gram LM

NeMo's beam search decoders are capable of using the KenLM's N-gram models to find the best candidates.
The script to evaluate an ASR model with beam search decoding and N-gram models can be found at
`scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py <https://github.com/NVIDIA/NeMo/blob/main/scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py>`__.
`scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py <https://github.com/NVIDIA/NeMo/blob/stable/scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py>`__.

You may evaluate an ASR model as the following:

@@ -169,7 +169,7 @@ Width of the beam search (`--beam_width`) specifies the number of top candidates
would search for. Larger beams result in more accurate but slower predictions.
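To illustrate why a wider beam can find better candidates at a higher cost, here is a toy beam search. It is a generic sketch, not NeMo's decoder: `beam_search`, `next_log_probs`, and the probability table are all hypothetical names and values made up for this example.

```python
import math


def beam_search(next_log_probs, num_steps, beam_width):
    """Keep the `beam_width` best partial sequences at every step.

    next_log_probs: callable mapping a prefix (tuple of tokens) to a dict
        of token -> log-probability for the next token.
    Returns a list of (sequence, total_log_prob), best first.
    """
    beams = [((), 0.0)]
    for _ in range(num_steps):
        candidates = [
            (prefix + (token,), score + log_prob)
            for prefix, score in beams
            for token, log_prob in next_log_probs(prefix).items()
        ]
        # Sort by score (stable for ties) and prune to the beam width.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


# Made-up distribution in which the greedy first choice ("a") is a trap.
TABLE = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.5, "y": 0.5},
    ("b",): {"x": 0.9, "y": 0.1},
}


def toy_model(prefix):
    return {tok: math.log(p) for tok, p in TABLE[prefix].items()}


narrow = beam_search(toy_model, num_steps=2, beam_width=1)
wide = beam_search(toy_model, num_steps=2, beam_width=2)
print(narrow[0][0], wide[0][0])
```

With a width of 1 the search commits to `a` and ends with probability 0.30, while a width of 2 keeps `b` alive and finds `b x` with probability 0.36, at the price of scoring twice as many candidates per step.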

There is also a tutorial to learn more about evaluating the ASR models with N-gram LM here:
`Offline ASR Inference with Beam Search and External Language Model Rescoring <https://colab.research.google.com/github/NVIDIA/NeMo/blob/v1.0.2/tutorials/asr/Offline_ASR.ipynb>`_
`Offline ASR Inference with Beam Search and External Language Model Rescoring <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/asr/Offline_ASR.ipynb>`_

Hyperparameter Grid Search
--------------------------
@@ -202,19 +202,19 @@ This score is usually combined with the scores from the beam search decoding to
Train Neural Rescorer
=====================

An example script to train such a language model with Transformer can be found at `examples/nlp/language_modeling/transformer_lm.py <https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/language_modeling/transformer_lm.py>`__.
An example script to train such a language model with Transformer can be found at `examples/nlp/language_modeling/transformer_lm.py <https://github.com/NVIDIA/NeMo/blob/stable/examples/nlp/language_modeling/transformer_lm.py>`__.
It trains a TransformerLMModel which can be used as a neural rescorer for an ASR system.


Evaluation
==========

Given a trained TransformerLMModel `.nemo` file, the script available at
`scripts/asr_language_modeling/neural_rescorer/eval_neural_rescorer.py <https://github.com/NVIDIA/NeMo/blob/main/scripts/asr_language_modeling/neural_rescorer/eval_neural_rescorer.py>`__
`scripts/asr_language_modeling/neural_rescorer/eval_neural_rescorer.py <https://github.com/NVIDIA/NeMo/blob/stable/scripts/asr_language_modeling/neural_rescorer/eval_neural_rescorer.py>`__
can be used to re-score beams obtained with ASR model. You need the `.tsv` file containing the candidates produced
by the acoustic model and the beam search decoding to use this script. The candidates can be the result of just the beam
search decoding or the result of fusion with an N-gram LM. You may generate this file by specifying `--preds_output_folder` for
`scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py <https://github.com/NVIDIA/NeMo/blob/main/scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py>`__.
`scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py <https://github.com/NVIDIA/NeMo/blob/stable/scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py>`__.

The neural rescorer would rescore the beams/candidates by using two parameters of `rescorer_alpha` and `rescorer_beta` as the following:

@@ -231,9 +231,9 @@ You may follow the following steps to evaluate a neural LM:
#. Obtain `.tsv` file with beams and their corresponding scores. Scores can be from a regular beam search decoder or
in fusion with an N-gram LM scores. For a given beam size `beam_size` and a number of examples
for evaluation `num_eval_examples`, it should contain (`num_eval_examples` x `beam_size`) lines of
form `beam_candidate_text \t score`. This file can be generated by `scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py <https://github.com/NVIDIA/NeMo/blob/main/scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py>`__
form `beam_candidate_text \t score`. This file can be generated by `scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py <https://github.com/NVIDIA/NeMo/blob/stable/scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram.py>`__

#. Rescore the candidates by `scripts/asr_language_modeling/neural_rescorer/eval_neural_rescorer.py <https://github.com/NVIDIA/NeMo/blob/main/scripts/asr_language_modeling/neural_rescorer/eval_neural_rescorer.py>`__.
#. Rescore the candidates by `scripts/asr_language_modeling/neural_rescorer/eval_neural_rescorer.py <https://github.com/NVIDIA/NeMo/blob/stable/scripts/asr_language_modeling/neural_rescorer/eval_neural_rescorer.py>`__.

.. code::
python eval_neural_rescorer.py
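The `.tsv` layout described in the steps above (`num_eval_examples` x `beam_size` lines of the form `beam_candidate_text \t score`) can be read back with a few lines of Python. This is a sketch under one assumption that the text does not state: the `beam_size` candidates of each example are stored on consecutive lines. `read_beams` is a hypothetical helper, not a NeMo function.

```python
def read_beams(lines, beam_size):
    """Group `text<TAB>score` lines into per-example beams.

    Assumes (not confirmed by the docs) that each example's candidates
    occupy `beam_size` consecutive lines.
    """
    beams, current = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # tolerate blank lines
        # Split on the last tab so the text itself may contain tabs.
        text, score = line.rsplit("\t", 1)
        current.append((text, float(score)))
        if len(current) == beam_size:
            beams.append(current)
            current = []
    if current:
        raise ValueError("line count is not a multiple of beam_size")
    return beams


# Made-up sample: two examples with a beam size of 2.
sample = [
    "hello world\t-1.5\n",
    "hello word\t-2.25\n",
    "good morning\t-0.5\n",
    "good mourning\t-3.0\n",
]
beams = read_beams(sample, beam_size=2)
best = [max(beam, key=lambda cand: cand[1])[0] for beam in beams]
print(best)
```

In practice `lines` would be an open file handle on the `.tsv`; the final check catches files whose line count does not match the expected `num_eval_examples` x `beam_size` shape.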
2 changes: 1 addition & 1 deletion docs/source/nemo_text_processing/intro.rst
@@ -5,7 +5,7 @@ Text Processing

See :doc:`NeMo Introduction <../starthere/intro>` for installation details.

Additional requirements can be found in `setup.sh <https://github.com/NVIDIA/NeMo/blob/main/nemo_text_processing/setup.sh>`_.
Additional requirements can be found in `setup.sh <https://github.com/NVIDIA/NeMo/blob/stable/nemo_text_processing/setup.sh>`_.

.. toctree::
:maxdepth: 1
@@ -13,7 +13,7 @@ See :doc:`Text Procesing Deployment <../tools/text_processing_deployment>` for d

.. note::

For more details, see the tutorial `NeMo/tutorials/text_processing/Inverse_Text_Normalization.ipynb <https://github.com/NVIDIA/NeMo/blob/main/tutorials/text_processing/Inverse_Text_Normalization.ipynb>`__ in `Google's Colab <https://colab.research.google.com/github/NVIDIA/NeMo/blob/main/tutorials/text_processing/Inverse_Text_Normalization.ipynb>`_.
For more details, see the tutorial `NeMo/tutorials/text_processing/Inverse_Text_Normalization.ipynb <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/text_processing/Inverse_Text_Normalization.ipynb>`__ in `Google's Colab <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/text_processing/Inverse_Text_Normalization.ipynb>`_.



2 changes: 1 addition & 1 deletion docs/source/nemo_text_processing/text_normalization.rst
@@ -15,7 +15,7 @@ See :doc:`Text Procesing Deployment <../tools/text_processing_deployment>` for d

.. note::

For more details, see the tutorial `NeMo/tutorials/text_processing/Text_Normalization.ipynb <https://github.com/NVIDIA/NeMo/blob/main/tutorials/text_processing/Text_Normalization.ipynb>`__ in `Google's Colab <https://colab.research.google.com/github/NVIDIA/NeMo/blob/main/tutorials/text_processing/Text_Normalization.ipynb>`_.
For more details, see the tutorial `NeMo/tutorials/text_processing/Text_Normalization.ipynb <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/text_processing/Text_Normalization.ipynb>`__ in `Google's Colab <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/text_processing/Text_Normalization.ipynb>`_.



4 changes: 2 additions & 2 deletions docs/source/nlp/bert_pretraining.rst
@@ -61,8 +61,8 @@ and specify the path to the created hd5f files.
Training the BERT model
-----------------------

Example of model configuration for on-the-fly data preprocessing: `NeMo/examples/nlp/language_modeling/conf/bert_pretraining_from_text_config.yaml <https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/language_modeling/conf/bert_pretraining_from_text_config.yaml>`__.
Example of model configuration for offline data preprocessing: `NeMo/examples/nlp/language_modeling/conf/bert_pretraining_from_preprocessed_config.yaml <https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/language_modeling/conf/bert_pretraining_from_preprocessed_config.yaml>`__.
Example of model configuration for on-the-fly data preprocessing: `NeMo/examples/nlp/language_modeling/conf/bert_pretraining_from_text_config.yaml <https://github.com/NVIDIA/NeMo/blob/stable/examples/nlp/language_modeling/conf/bert_pretraining_from_text_config.yaml>`__.
Example of model configuration for offline data preprocessing: `NeMo/examples/nlp/language_modeling/conf/bert_pretraining_from_preprocessed_config.yaml <https://github.com/NVIDIA/NeMo/blob/stable/examples/nlp/language_modeling/conf/bert_pretraining_from_preprocessed_config.yaml>`__.

The specification can be grouped into three categories:

4 changes: 2 additions & 2 deletions docs/source/nlp/glue_benchmark.rst
@@ -3,8 +3,8 @@
GLUE Benchmark
==============

We recommend you try the GLUE Benchmark model in a Jupyter notebook (can run on `Google's Colab <https://colab.research.google.com/notebooks/intro.ipynb>`_): `NeMo/tutorials/nlp/GLUE_Benchmark.ipynb <https://github.com/NVIDIA/NeMo/blob/main/tutorials/nlp/GLUE_Benchmark.ipynb>`__.
We recommend you try the GLUE Benchmark model in a Jupyter notebook (can run on `Google's Colab <https://colab.research.google.com/notebooks/intro.ipynb>`_): `NeMo/tutorials/nlp/GLUE_Benchmark.ipynb <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/nlp/GLUE_Benchmark.ipynb>`__.

Connect to an instance with a GPU (**Runtime** -> **Change runtime type** -> select **GPU** for the hardware accelerator).

An example script on how to train the model can be found here: `NeMo/examples/nlp/glue_benchmark/glue_benchmark.py <https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/glue_benchmark/glue_benchmark.py>`__.
An example script on how to train the model can be found here: `NeMo/examples/nlp/glue_benchmark/glue_benchmark.py <https://github.com/NVIDIA/NeMo/blob/stable/examples/nlp/glue_benchmark/glue_benchmark.py>`__.
2 changes: 1 addition & 1 deletion docs/source/nlp/information_retrieval.rst
@@ -3,7 +3,7 @@
Information Retrieval
=====================

We recommend you try the Information Retrieval model in a Jupyter notebook (can run on `Google's Colab <https://colab.research.google.com/notebooks/intro.ipynb>`_): `NeMo/tutorials/nlp/Information_Retrieval_MSMARCO.ipynb <https://github.com/NVIDIA/NeMo/blob/main/tutorials/nlp/Information_Retrieval_MSMARCO.ipynb>`__.
We recommend you try the Information Retrieval model in a Jupyter notebook (can run on `Google's Colab <https://colab.research.google.com/notebooks/intro.ipynb>`_): `NeMo/tutorials/nlp/Information_Retrieval_MSMARCO.ipynb <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/nlp/Information_Retrieval_MSMARCO.ipynb>`__.

Connect to an instance with a GPU (**Runtime** -> **Change runtime type** -> select **GPU** for the hardware accelerator),

6 changes: 3 additions & 3 deletions docs/source/nlp/joint_intent_slot.rst
@@ -14,7 +14,7 @@ Our BERT-based model implementation allows you to train and detect both of these

.. note::

We recommend you try the Joint Intent and Slot Classification model in a Jupyter notebook (can run on `Google's Colab <https://colab.research.google.com/notebooks/intro.ipynb>`_.): `NeMo/tutorials/nlp/Joint_Intent_and_Slot_Classification.ipynb <https://github.com/NVIDIA/NeMo/blob/main/tutorials/nlp/Joint_Intent_and_Slot_Classification.ipynb>`__.
We recommend you try the Joint Intent and Slot Classification model in a Jupyter notebook (can run on `Google's Colab <https://colab.research.google.com/notebooks/intro.ipynb>`_.): `NeMo/tutorials/nlp/Joint_Intent_and_Slot_Classification.ipynb <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/nlp/Joint_Intent_and_Slot_Classification.ipynb>`__.

Connect to an instance with a GPU (**Runtime** -> **Change runtime type** -> select **GPU** for the hardware accelerator).

@@ -115,7 +115,7 @@ For each query, the model classifies it as one the intents from the intent dicti
it as one of the slots from the slot dictionary, including out of scope slot for all the remaining words in the query which does not
fall in another slot category. Out of scope slot (``O``) is a part of slot dictionary that the model is trained on.

Example of model configuration file for training the model can be found at: `NeMo/examples/nlp/intent_slot_classification/conf/intent_slot_classification.yaml <https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/intent_slot_classification/conf/intent_slot_classification_config.yaml>`__.
Example of model configuration file for training the model can be found at: `NeMo/examples/nlp/intent_slot_classification/conf/intent_slot_classification.yaml <https://github.com/NVIDIA/NeMo/blob/stable/examples/nlp/intent_slot_classification/conf/intent_slot_classification_config.yaml>`__.
In the configuration file, define the parameters of the training and the model, although most of the default values will work well.

The specification can be roughly grouped into three categories:
@@ -152,7 +152,7 @@ More details about parameters in the spec file can be found below:
| **test_ds.prefix** | string | ``test`` | A prefix for the test file names. |
+-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+

For additional config parameters common to all NLP models, refer to the `nlp_model doc <https://github.com/NVIDIA/NeMo/blob/main/docs/source/nlp/nlp_model.rst#model-nlp>`__.
For additional config parameters common to all NLP models, refer to the `nlp_model doc <https://github.com/NVIDIA/NeMo/blob/stable/docs/source/nlp/nlp_model.rst#model-nlp>`__.

The following is an example of the command for training the model:

15 changes: 13 additions & 2 deletions docs/source/nlp/machine_translation.rst
@@ -478,7 +478,7 @@ custom configuration under the ``encoder`` configuration.
HuggingFace
^^^^^^^^^^^

We have provided a `HuggingFace config file <https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/machine_translation/conf/huggingface.yaml>`__
We have provided a `HuggingFace config file <https://github.com/NVIDIA/NeMo/blob/stable/examples/nlp/machine_translation/conf/huggingface.yaml>`__
to use with HuggingFace encoders.

To use the config file from CLI:
@@ -508,7 +508,7 @@ Note the ``+`` symbol is needed if we're not adding the arguments to the YAML co
Megatron
^^^^^^^^

We have provided a `Megatron config file <https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/machine_translation/conf/megatron.yaml>`__
We have provided a `Megatron config file <https://github.com/NVIDIA/NeMo/blob/stable/examples/nlp/machine_translation/conf/megatron.yaml>`__
to use with Megatron encoders.

To use the config file from CLI:
@@ -561,6 +561,17 @@ To train a Megatron 345M BERT, we would use
model.encoder.num_layers=24 \
model.encoder.max_position_embeddings=512 \
If the pretrained megatron model used a custom vocab file, then set:

.. code::
model.encoder_tokenizer.vocab_file=/path/to/your/megatron/vocab_file.txt
model.encoder.vocab_file=/path/to/your/megatron/vocab_file.txt
Use ``encoder.model_name=megatron_bert_uncased`` for uncased models with custom vocabularies and
use ``encoder.model_name=megatron_bert_cased`` for cased models with custom vocabularies.


References
----------