Commit
Fixing Docs for Text Classification task (including sentiment analysis) (#675)

* git history clean up

Signed-off-by: Evelina Bakhturina <[email protected]>

* nlp references to the tutorials

Signed-off-by: Evelina Bakhturina <[email protected]>

* sphinx fix

Signed-off-by: Evelina Bakhturina <[email protected]>

* review feedback

Signed-off-by: Evelina Bakhturina <[email protected]>
ekmb authored Jun 4, 2020
1 parent 7efa348 commit 1273aff
Showing 21 changed files with 2,157 additions and 938 deletions.
2 changes: 2 additions & 0 deletions docs/sources/source/nlp/asr-improvement.rst
@@ -1,3 +1,5 @@
.. _asr_improvement:

Tutorial
========

6 changes: 3 additions & 3 deletions docs/sources/source/nlp/bert_pretraining.rst
@@ -1,7 +1,7 @@
.. _bert_pretraining:


Tutorial
========
BERT Pre-training Tutorial
==========================

In this tutorial, we will build and train a masked language model, either from scratch or from a pretrained BERT model, using the BERT architecture :cite:`nlp-bert-devlin2018bert`.
Make sure you have ``nemo`` and ``nemo_nlp`` installed before starting this tutorial. See the :ref:`installation` section for more details.
3 changes: 3 additions & 0 deletions docs/sources/source/nlp/dialogue_state_tracking.rst
@@ -1,3 +1,5 @@
.. _trade_tutorial:

TRADE Tutorial
==============

@@ -265,6 +267,7 @@ References
:keyprefix: nlp-dst-


.. _sgd_tutorial:

SGD Tutorial
============
3 changes: 2 additions & 1 deletion docs/sources/source/nlp/glue.rst
@@ -1,3 +1,4 @@
.. _glue:

Tutorial
========
@@ -86,7 +87,7 @@ To use multi-gpu training on MNLI task, run:
export NUM_GPUS=4
python -m torch.distributed.launch --nproc_per_node=$NUM_GPUS glue_benchmark_with_bert.py \
--data_dir=/path_to_data/MNLI \
--data_dir=/path_to_data_dir/MNLI \
--task_name mnli \
--work_dir /path_to_output_folder \
--num_gpus=$NUM_GPUS \
66 changes: 42 additions & 24 deletions docs/sources/source/nlp/intro.rst
@@ -5,17 +5,32 @@ Natural Language Processing

Supported Tasks and Models:

* Neural Machine Translation
* :ref:`nmt`
* Language Modelling:
* :ref:`bert_pretraining`
* :ref:`transformer_lm`
* :ref:`megatron_finetuning`
* GLUE Benchmark
* :ref:`glue`
* Intent Detection and Slot Filling
* :ref:`joint_intent_slot_filling`
* Text Classification
* State Tracking for Task-oriented Dialogue Systems
* Language Modelling
* Neural Machine Translation
* Question Answering
* :ref:`text_classification`
* :ref:`sentiment_analysis`
* Named Entity Recognition (NER)
* :ref:`ner`
* Punctuation and Capitalization
* GLUE Benchmark
* :ref:`punctuation`
* Question Answering
* :ref:`squad_model_links`
* State Tracking for Goal-oriented Dialogue Systems:
* :ref:`trade_tutorial`
* :ref:`sgd_tutorial`
* ASR Postprocessing with BERT
* :ref:`asr_improvement`


All examples from NLP collection can be found `here <https://github.com/NVIDIA/NeMo/tree/master/examples/nlp>`__.

Neural Machine Translation (NMT)
@@ -32,19 +47,19 @@ Pretraining BERT

bert_pretraining

Megatron-LM for Downstream tasks
--------------------------------
Transformer Language Model
--------------------------
.. toctree::
:maxdepth: 8

megatron_finetuning
transformer_language_model

Transformer Language Model
--------------------------
Megatron-LM for Downstream tasks
--------------------------------
.. toctree::
:maxdepth: 8

transformer_language_model
megatron_finetuning

GLUE Benchmark
--------------------------
@@ -53,14 +68,19 @@ GLUE Benchmark

glue

Dialogue State Tracking
Intent and Slot filling
-----------------------

.. toctree::
:maxdepth: 8

dialogue_state_tracking.rst
joint_intent_slot_filling

Text Classification
-------------------
.. toctree::
:maxdepth: 8

text_classification

Named Entity Recognition
------------------------
@@ -78,25 +98,23 @@ Punctuation and Word Capitalization

punctuation


Intent and Slot filling
-----------------------
Question Answering
------------------
.. toctree::
:maxdepth: 8

joint_intent_slot_filling

question_answering

Dialogue State Tracking
-----------------------

Question Answering
------------------
.. toctree::
:maxdepth: 8

question_answering
dialogue_state_tracking

Improving Speech Recognition with BERTx2 Post-processing Model
--------------------------------------------------------------
ASR Postprocessing with BERT
----------------------------
.. toctree::
:maxdepth: 8

2 changes: 2 additions & 0 deletions docs/sources/source/nlp/joint_intent_slot_filling.rst
@@ -1,3 +1,5 @@
.. _joint_intent_slot_filling:

Tutorial
========

2 changes: 2 additions & 0 deletions docs/sources/source/nlp/megatron_finetuning.rst
@@ -1,3 +1,5 @@
.. _megatron_finetuning:

Megatron-LM for Downstream Tasks
================================

2 changes: 2 additions & 0 deletions docs/sources/source/nlp/ner.rst
@@ -1,3 +1,5 @@
.. _ner:

Tutorial
========

2 changes: 2 additions & 0 deletions docs/sources/source/nlp/neural_machine_translation.rst
@@ -1,3 +1,5 @@
.. _nmt:

Tutorial
========

2 changes: 2 additions & 0 deletions docs/sources/source/nlp/punctuation.rst
@@ -1,3 +1,5 @@
.. _punctuation:

Tutorial
========

112 changes: 112 additions & 0 deletions docs/sources/source/nlp/text_classification.rst
@@ -0,0 +1,112 @@
.. _text_classification:

Tutorial
========

In this tutorial, we describe how to finetune a BERT-like model,
based on `BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding <https://arxiv.org/abs/1810.04805>`_ :cite:`nlp-tc-devlin2018bert`,
on a text classification task.

Task Description
----------------

Text classification is the task of assigning a predefined label to a given text based on its content.
The text classification task applies to a broad range of problems: sentiment analysis, spam detection, intent detection, and many others.


Data Format
-----------

For the text classification task, NeMo requires the following format:

- the first line of each data file should contain a header with the columns ``sentence`` and ``label``
- all subsequent lines in the file should contain some text in the first column and a numerical label in the second column
- the columns are separated by a tab

.. code-block::

    sentence [TAB] label
    text [TAB] label_id
    text [TAB] label_id
    text [TAB] label_id

For example, the final data file could look like this:

.. code-block::

    sentence label
    the first sentence 0
    the second sentence 1
    the third sentence 2

By default, the training script assumes that the training data is located under the specified
``--data_dir PATH_TO_DATA`` directory in a file named ``train.tsv``, and the evaluation data in ``dev.tsv``.
Use ``--train_file_prefix`` and ``--eval_file_prefix`` to change these default names.

NeMo provides a conversion script from the original data format to the NeMo format
for some well-known datasets, including SST-2 and IMDB; see
`examples/nlp/text_classification/data/import_datasets.py <https://github.com/NVIDIA/NeMo/blob/master/examples/nlp/text_classification/data/import_datasets.py>`_ for details.
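
If your dataset is not covered by that script, the following is a minimal sketch (not part of NeMo; the ``write_nemo_tsv`` helper and the toy data are hypothetical) of how files in the ``train.tsv``/``dev.tsv`` format described above could be produced:

.. code-block:: python

    import os

    def write_nemo_tsv(examples, output_file):
        """Write (text, label_id) pairs in the tab-separated format described above."""
        with open(output_file, "w", encoding="utf-8") as f:
            f.write("sentence\tlabel\n")  # required header line
            for text, label_id in examples:
                # collapse stray tabs/newlines so each example stays on a single line
                clean_text = " ".join(str(text).split())
                f.write(f"{clean_text}\t{label_id}\n")

    # Hypothetical toy data; replace with your own loading logic.
    train_examples = [("the first sentence", 0), ("the second sentence", 1)]
    dev_examples = [("the third sentence", 2)]

    data_dir = "/path_to_data_dir"
    os.makedirs(data_dir, exist_ok=True)
    write_nemo_tsv(train_examples, os.path.join(data_dir, "train.tsv"))
    write_nemo_tsv(dev_examples, os.path.join(data_dir, "dev.tsv"))

The resulting files can then be passed to the training script below via ``--data_dir /path_to_data_dir``.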

Model training
--------------

The code used in this tutorial is based on `examples/nlp/text_classification/text_classification_with_bert.py <https://github.com/NVIDIA/NeMo/blob/master/examples/nlp/text_classification/text_classification_with_bert.py>`_.

.. note::

    The script supports multi-class tasks.

To run the script on a single GPU, run:

.. code-block:: bash

    python text_classification_with_bert.py \
        --data_dir /path_to_data_dir \
        --work_dir /path_to_output_folder

To use multi-gpu training on this task, run:

.. code-block:: bash

    export NUM_GPUS=4
    python -m torch.distributed.launch --nproc_per_node=$NUM_GPUS text_classification_with_bert.py \
        --data_dir=/path_to_data_dir \
        --work_dir /path_to_output_folder \
        --num_gpus=$NUM_GPUS

More details about multi-gpu training can be found in the `Fast Training <https://nvidia.github.io/NeMo/training.html>`_ section.

For additional model training parameters, please see ``examples/nlp/text_classification/text_classification_with_bert.py``.

Evaluating Checkpoints
----------------------

During training, the model is evaluated after every epoch, and by default a folder named ``checkpoints`` is created under the working folder specified by ``--work_dir``;
checkpoints are stored there. To evaluate a pre-trained checkpoint on a dev set,
run the same training script, passing ``--checkpoint_dir`` and setting ``--num_epochs`` to zero to skip training.

.. code-block:: bash

    python text_classification_with_bert.py \
        --data_dir /path_to_data_dir/ \
        --work_dir /path_to_output_folder \
        --checkpoint_dir /path_to_output_folder/checkpoints \
        --num_epochs 0

.. _sentiment_analysis:

Sentiment Analysis with BERT
============================

A tutorial on how to finetune a BERT model on a sentiment analysis task can be found at
`examples/nlp/text_classification/sentiment_analysis_with_bert.ipynb <https://github.com/NVIDIA/NeMo/blob/master/examples/nlp/text_classification/sentiment_analysis_with_bert.ipynb>`_.


References
----------

.. bibliography:: nlp_all_refs.bib
:style: plain
:labelprefix: NLP-TC
:keyprefix: nlp-tc-
6 changes: 4 additions & 2 deletions docs/sources/source/nlp/transformer_language_model.rst
@@ -1,5 +1,7 @@
Tutorial
========
.. _transformer_lm:

Transformer Language Model Tutorial
===================================

In this tutorial, we will build and train a language model using the Transformer architecture :cite:`nlp-lm-vaswani2017attention`.
Make sure you have ``nemo`` and ``nemo_nlp`` installed before starting this tutorial. See the :ref:`installation` section for more details.
8 changes: 4 additions & 4 deletions examples/nlp/glue_benchmark/glue_benchmark_with_bert.py
@@ -120,7 +120,7 @@
choices=["nemobert", "sentencepiece"],
help="tokenizer to use, only relevant when using custom pretrained checkpoint.",
)
parser.add_argument("--vocab_file", default=None, help="Path to the vocab file.")
parser.add_argument("--vocab_file", default=None, type=str, help="Path to the vocab file.")
parser.add_argument(
"--do_lower_case",
action='store_true',
@@ -136,10 +136,10 @@
truncated, sequences shorter will be padded.",
)
parser.add_argument("--optimizer_kind", default="adam", type=str, help="Optimizer kind")
parser.add_argument("--lr_policy", default="WarmupAnnealing", type=str)
parser.add_argument("--lr_policy", default="WarmupAnnealing", type=str, help="Learning rate policy")
parser.add_argument("--lr", default=5e-5, type=float, help="The initial learning rate.")
parser.add_argument("--lr_warmup_proportion", default=0.1, type=float)
parser.add_argument("--weight_decay", default=0.0, type=float, help="Weight deay if we apply some.")
parser.add_argument("--lr_warmup_proportion", default=0.1, type=float, help="Learning rate warm up proportion")
parser.add_argument("--weight_decay", default=0.0, type=float, help="Weight decay if we apply some.")
parser.add_argument("--num_epochs", default=3, type=int, help="Total number of training epochs to perform.")
parser.add_argument("--batch_size", default=8, type=int, help="Batch size per GPU/CPU for training/evaluation.")
parser.add_argument("--num_gpus", default=1, type=int, help="Number of GPUs")
@@ -140,7 +140,7 @@ def parse_args():
help="tokenizer to use, only relevant when using custom pretrained checkpoint.",
)
parser.add_argument("--optimizer", default="adam_w", type=str, help="Optimizer kind")
parser.add_argument("--vocab_file", default=None, help="Path to the vocab file.")
parser.add_argument("--vocab_file", default=None, type=str, help="Path to the vocab file.")
parser.add_argument("--lr_policy", default="WarmupAnnealing", type=str)
parser.add_argument("--lr", default=3e-5, type=float, help="The initial learning rate.")
parser.add_argument("--lr_warmup_proportion", default=0.0, type=float)