Added P-Tuning method #3488

Merged: ericharper merged 31 commits into main from feature_ptune on Jan 27, 2022

Conversation

yidong72 (Collaborator)

Added the P-Tuning method for adapting large Megatron GPT models to downstream NLP tasks.
The PTune_Sentiment_Analysis.ipynb tutorial notebook shows how to use P-Tuning for financial sentiment analysis; it achieves 92% accuracy with a 344M-parameter GPT model.
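
For readers new to the method, below is a minimal conceptual sketch of the P-Tuning prompt encoder (a BiLSTM + MLP over trainable virtual-token embeddings, per Liu et al., 2021) in plain PyTorch. The class and parameter names are illustrative assumptions, not NeMo's actual implementation.

```python
import torch
import torch.nn as nn


class PromptEncoder(nn.Module):
    """Illustrative P-Tuning prompt encoder; the GPT weights stay frozen
    and only this module is trained. Not NeMo's actual implementation."""

    def __init__(self, num_virtual_tokens: int, hidden_size: int):
        super().__init__()
        # Trainable embeddings for the virtual prompt tokens.
        self.embedding = nn.Embedding(num_virtual_tokens, hidden_size)
        # P-Tuning reparameterizes the prompt with a BiLSTM + MLP head
        # (hidden_size must be even for the bidirectional split below).
        self.lstm = nn.LSTM(
            hidden_size, hidden_size // 2, bidirectional=True, batch_first=True
        )
        self.mlp = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, hidden_size),
        )
        self.register_buffer("token_ids", torch.arange(num_virtual_tokens))

    def forward(self, batch_size: int) -> torch.Tensor:
        # Returns (batch, num_virtual_tokens, hidden_size) continuous prompts,
        # which are inserted into the frozen GPT's input embedding sequence.
        prompts = self.embedding(self.token_ids)
        prompts = prompts.unsqueeze(0).repeat(batch_size, 1, 1)
        out, _ = self.lstm(prompts)
        return self.mlp(out)
```

Because only the encoder's parameters receive gradients, the downstream task trains orders of magnitude fewer weights than full fine-tuning of the 344M-parameter model.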

lgtm-com bot commented Jan 21, 2022

This pull request introduces 2 alerts when merging 0f8444f into 0bae758 - view on LGTM.com

new alerts:

  • 1 for Superclass attribute shadows subclass method
  • 1 for Redundant assignment
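
For context, this alert fires when a superclass assigns an instance attribute whose name collides with a method defined in a subclass. A minimal repro (illustrative, not the actual code from this PR):

```python
class Base:
    def __init__(self):
        # The superclass stores an instance attribute named `forward` ...
        self.forward = None


class Child(Base):
    # ... which shadows this method: instance attributes take precedence
    # over class attributes (methods) during lookup.
    def forward(self, x):
        return x


Child().forward(1)  # TypeError: 'NoneType' object is not callable
```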

okuchaiev (Member)

/blossom-ci

lgtm-com bot commented Jan 21, 2022

This pull request introduces 2 alerts when merging 0552a40 into 0bae758 - view on LGTM.com

new alerts:

  • 1 for Superclass attribute shadows subclass method
  • 1 for Redundant assignment

lgtm-com bot commented Jan 24, 2022

This pull request introduces 1 alert when merging b1005fe into 7c97e33 - view on LGTM.com

new alerts:

  • 1 for Superclass attribute shadows subclass method

lgtm-com bot commented Jan 25, 2022

This pull request introduces 1 alert when merging 593125e into 7c97e33 - view on LGTM.com

new alerts:

  • 1 for Superclass attribute shadows subclass method

lgtm-com bot commented Jan 26, 2022

This pull request introduces 1 alert when merging d4e2cdd into 3146fca - view on LGTM.com

new alerts:

  • 1 for Superclass attribute shadows subclass method

ericharper (Collaborator) previously approved these changes Jan 26, 2022

LGTM. Thanks!

ericharper (Collaborator)

Please add a CI test as well.

yidong72 (Collaborator, Author)

> Please add a CI test as well.

I added a CI test for the whole p-tuning workflow.

lgtm-com bot commented Jan 26, 2022

This pull request introduces 1 alert when merging b3db907 into 360fa7c - view on LGTM.com

new alerts:

  • 1 for Superclass attribute shadows subclass method

lgtm-com bot commented Jan 26, 2022

This pull request introduces 1 alert when merging df55257 into 9dc612e - view on LGTM.com

new alerts:

  • 1 for Superclass attribute shadows subclass method

lgtm-com bot commented Jan 26, 2022

This pull request introduces 1 alert when merging f70542c into 9dc612e - view on LGTM.com

new alerts:

  • 1 for Superclass attribute shadows subclass method

@yidong72 yidong72 requested a review from titu1994 January 26, 2022 20:50
@@ -471,6 +473,9 @@ def setup_optimization(self, optim_config: Optional[Union[DictConfig, Dict]] = None
             optim_config['sched']['t_num_workers'] = self._trainer.num_processes * self._trainer.num_nodes
         elif self._trainer.accelerator == "ddp":
             optim_config['sched']['t_num_workers'] = self._trainer.num_gpus * self._trainer.num_nodes
+        elif isinstance(self._trainer.accelerator.training_type_plugin, NLPDDPPlugin):
+            app = AppState()
+            optim_config['sched']['t_num_workers'] = app.data_parallel_size
yidong72 (Collaborator, Author)

t_num_workers should be the data-parallel size when model-parallel workers are present; otherwise, the max_steps calculation for the optimizer scheduler will be off.
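
To make the failure mode concrete, here is a hypothetical sketch (illustrative names, not NeMo's actual code) of how the scheduler's step budget depends on t_num_workers. With tensor or pipeline model parallelism, only world_size // (tensor_parallel_size * pipeline_parallel_size) ranks consume distinct batches, so using the raw GPU count inflates the effective global batch and shrinks max_steps:

```python
def compute_max_steps(num_samples, micro_batch, grad_accum, t_num_workers, epochs):
    # The effective global batch grows with the number of data-parallel workers.
    global_batch = micro_batch * grad_accum * t_num_workers
    return (num_samples // global_batch) * epochs


# 8 GPUs with tensor_parallel_size=4 -> data_parallel_size = 8 // 4 = 2.
print(compute_max_steps(10000, 4, 1, 2, 3))  # 3750 steps (correct)
print(compute_max_steps(10000, 4, 1, 8, 3))  # 936 steps: scheduler decays too fast
```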

Collaborator

This is fine from my side.

lgtm-com bot commented Jan 26, 2022

This pull request introduces 1 alert when merging 2856e84 into 9dc612e - view on LGTM.com

new alerts:

  • 1 for Superclass attribute shadows subclass method

ericharper (Collaborator) previously approved these changes Jan 26, 2022

LGTM. Thanks!

lgtm-com bot commented Jan 26, 2022

This pull request introduces 1 alert when merging 105f6db into 6b51350 - view on LGTM.com

new alerts:

  • 1 for Superclass attribute shadows subclass method

okuchaiev (Member)

/blossom-ci

lgtm-com bot commented Jan 27, 2022

This pull request introduces 1 alert when merging 11cb31b into d8354a2 - view on LGTM.com

new alerts:

  • 1 for Superclass attribute shadows subclass method

@ericharper ericharper merged commit cdb409b into main Jan 27, 2022
@ericharper ericharper deleted the feature_ptune branch January 27, 2022 18:47
nithinraok pushed a commit that referenced this pull request Feb 2, 2022
* init checking of p-tune method

Signed-off-by: Yi Dong <[email protected]>

* training is working

Signed-off-by: Yi Dong <[email protected]>

* refactor to separate prediction and loss computation

Signed-off-by: Yi Dong <[email protected]>

* updated the notebook

Signed-off-by: Yi Dong <[email protected]>

* match the original hyperparameters

Signed-off-by: Yi Dong <[email protected]>

* fixed the loss bug

Signed-off-by: Yi Dong <[email protected]>

* better scheduler

Signed-off-by: Yi Dong <[email protected]>

* notebook runs

Signed-off-by: Yi Dong <[email protected]>

* added neural types

Signed-off-by: Yi Dong <[email protected]>

* updated the doc

Signed-off-by: Yi Dong <[email protected]>

* fixed the notebook

Signed-off-by: Yi Dong <[email protected]>

* updated expected result

Signed-off-by: Yi Dong <[email protected]>

* added accuracy

Signed-off-by: Yi Dong <[email protected]>

* style fix

Signed-off-by: Yi Dong <[email protected]>

* fix reassignment

Signed-off-by: Yi Dong <[email protected]>

* log accuracy

Signed-off-by: Yi Dong <[email protected]>

* load the best checkpoint

Signed-off-by: Yi Dong <[email protected]>

* address PR comments

Signed-off-by: Yi Dong <[email protected]>

* added ci test

Signed-off-by: Yi Dong <[email protected]>

* fixed max_step calculation error due to wrong number of workers

Signed-off-by: Yi Dong <[email protected]>

* add import guard for nlp plugin

Signed-off-by: Yi Dong <[email protected]>

* fixed the metric report issue when using tensor parallel

Signed-off-by: Yi Dong <[email protected]>

Co-authored-by: Oleksii Kuchaiev <[email protected]>
Co-authored-by: Eric Harper <[email protected]>
fayejf pushed a commit that referenced this pull request Mar 2, 2022 (same commit message as above).