Added P-Tuning method #3488
Conversation
This pull request introduces 2 alerts when merging 0f8444f into 0bae758 - view on LGTM.com.
/blossom-ci
This pull request introduces 2 alerts when merging 0552a40 into 0bae758 - view on LGTM.com.
This pull request introduces 1 alert when merging b1005fe into 7c97e33 - view on LGTM.com.
This pull request introduces 1 alert when merging 593125e into 7c97e33 - view on LGTM.com.
examples/nlp/text_classification/conf/ptune_text_classification_config.yaml (outdated review thread, resolved)
This pull request introduces 1 alert when merging d4e2cdd into 3146fca - view on LGTM.com.
LGTM. Thanks!
Please add a CI test as well.
I added a CI test for the whole p-tuning workflow.
This pull request introduces 1 alert when merging b3db907 into 360fa7c - view on LGTM.com.
This pull request introduces 1 alert when merging df55257 into 9dc612e - view on LGTM.com.
This pull request introduces 1 alert when merging f70542c into 9dc612e - view on LGTM.com.
```diff
@@ -471,6 +473,9 @@ def setup_optimization(self, optim_config: Optional[Union[DictConfig, Dict]] = None):
             optim_config['sched']['t_num_workers'] = self._trainer.num_processes * self._trainer.num_nodes
         elif self._trainer.accelerator == "ddp":
             optim_config['sched']['t_num_workers'] = self._trainer.num_gpus * self._trainer.num_nodes
+        elif isinstance(self._trainer.accelerator.training_type_plugin, NLPDDPPlugin):
+            app = AppState()
+            optim_config['sched']['t_num_workers'] = app.data_parallel_size
```
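Because NLPDDPPlugin lives in the NLP collection, the commit history mentions adding an import guard for the NLP plugin so this check does not break installs without the NLP dependencies. A minimal sketch of such a guard, assuming a module path and flag name that may differ from the PR's actual code:

```python
# Sketch of an import guard for the NLP plugin; the module path and the
# HAVE_NLP_PLUGIN flag are illustrative assumptions, not the PR's exact code.
try:
    from nemo.collections.nlp.parts.nlp_overrides import NLPDDPPlugin

    HAVE_NLP_PLUGIN = True
except (ImportError, ModuleNotFoundError):
    NLPDDPPlugin = None
    HAVE_NLP_PLUGIN = False

# The isinstance check above would then be gated on HAVE_NLP_PLUGIN so that
# installs without the NLP extras never touch NLPDDPPlugin.
```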
The t_num_workers should be set to the data parallel size when model-parallel workers are present; otherwise the max_steps calculation for the optimizer scheduler will be off.
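To illustrate the point, here is a small sketch with made-up numbers (variable names are generic and not taken from the PR): with tensor/model parallelism, several GPUs jointly hold one model replica, so only the data-parallel replicas consume distinct batches, and max_steps must be derived from that replica count rather than from the raw GPU count.

```python
# Illustrative arithmetic only; all numbers and names are made up for this sketch.
world_size = 8                    # total GPUs
tensor_model_parallel_size = 2    # GPUs that jointly hold one model replica
data_parallel_size = world_size // tensor_model_parallel_size  # 4 replicas consume data

num_samples = 100_000
micro_batch_size = 16
max_epochs = 3

# Each optimizer step consumes one micro-batch per data-parallel replica,
# so steps per epoch shrink with data_parallel_size, not with world_size.
steps_per_epoch = num_samples // (micro_batch_size * data_parallel_size)
max_steps = steps_per_epoch * max_epochs

# Using the raw GPU count would undercount max_steps by the model-parallel factor.
wrong_steps_per_epoch = num_samples // (micro_batch_size * world_size)
print(max_steps, wrong_steps_per_epoch * max_epochs)
```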
This is fine from my side.
This pull request introduces 1 alert when merging 2856e84 into 9dc612e - view on LGTM.com.
LGTM. Thanks!
This pull request introduces 1 alert when merging 105f6db into 6b51350 - view on LGTM.com.
/blossom-ci
This pull request introduces 1 alert when merging 11cb31b into d8354a2 - view on LGTM.com.
Squashed commit history (all commits signed off by Yi Dong <[email protected]>):

* init checking of p-tune method
* training is working
* refactor to separate prediction and loss computation
* updated the notebook
* match the original hyper parameters
* fixed the loss bug
* better scheduler
* notebook runs
* added neural types
* updated the doc
* fixed the notebook
* updated expected result
* added accuracy
* style fix
* fix reassign
* log accuracy
* load the best checkpoint
* address PR comments
* added CI test
* fixed max_step calculation error due to wrong number of workers
* add import guard for NLP plugin
* fixed the metric report issue when using tensor parallel

Co-authored-by: Oleksii Kuchaiev <[email protected]>
Co-authored-by: Eric Harper <[email protected]>
Added P-Tuning method to use large Megatron GPT models for downstream NLP tasks.
The PTune_Sentiment_Analysis.ipynb tutorial notebook shows how to use P-Tuning for financial sentiment analysis; it achieves 92% accuracy with the 344M-parameter GPT model.
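For readers new to the technique, here is a minimal, illustrative sketch of the p-tuning idea in plain PyTorch. It is not the NeMo implementation, and every name in it is hypothetical: a small trainable prompt encoder produces embeddings for a few virtual tokens, these are prepended to the frozen GPT's input embeddings, and only the prompt encoder is updated for the downstream task.

```python
import torch
import torch.nn as nn


class PromptEncoder(nn.Module):
    """Maps a fixed set of virtual-token ids to continuous prompt embeddings.

    Illustrative only: the real NeMo module and its hyperparameters may differ
    (the P-Tuning paper uses an LSTM plus an MLP over the virtual tokens).
    """

    def __init__(self, num_virtual_tokens: int, hidden_size: int):
        super().__init__()
        self.embedding = nn.Embedding(num_virtual_tokens, hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size, batch_first=True, bidirectional=True)
        self.mlp = nn.Sequential(
            nn.Linear(2 * hidden_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, hidden_size),
        )

    def forward(self, batch_size: int) -> torch.Tensor:
        ids = torch.arange(self.embedding.num_embeddings)
        prompts = self.embedding(ids).unsqueeze(0)   # (1, num_virtual_tokens, hidden)
        prompts, _ = self.lstm(prompts)              # (1, num_virtual_tokens, 2 * hidden)
        prompts = self.mlp(prompts)                  # (1, num_virtual_tokens, hidden)
        return prompts.expand(batch_size, -1, -1)    # (batch, num_virtual_tokens, hidden)


# Usage sketch: prepend the virtual prompt embeddings to the (frozen) GPT's
# input embeddings; only the prompt encoder's parameters are trained.
batch_size, seq_len, hidden_size = 4, 32, 1024
prompt_encoder = PromptEncoder(num_virtual_tokens=10, hidden_size=hidden_size)
token_embeddings = torch.randn(batch_size, seq_len, hidden_size)  # stand-in for GPT input embeddings
model_inputs = torch.cat([prompt_encoder(batch_size), token_embeddings], dim=1)
print(model_inputs.shape)  # torch.Size([4, 42, 1024])
```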