
docs: add doc for multitask fine-tuning #3717

Merged
merged 3 commits into deepmodeling:devel from doc_multi_finetune on May 6, 2024

Conversation

@iProzd (Collaborator) commented Apr 29, 2024

Add docs for multitask fine-tuning.

Summary by CodeRabbit

  • Documentation
    • Updated the fine-tuning guide with new sections on TensorFlow and PyTorch implementations.
    • Added detailed instructions for fine-tuning methods in PyTorch, including specific commands and configurations.
    • Modified the multi-task training guide to redirect users to the fine-tuning section for more comprehensive instructions.
    • Corrected a typo in the multi-task training TensorFlow documentation for improved clarity.

@coderabbitai bot (Contributor) commented Apr 29, 2024

Walkthrough

The recent modifications enhance the documentation on fine-tuning models in TensorFlow and PyTorch. The guide now includes detailed sections on implementation strategies and fine-tuning methods, offering specific commands and configurations for PyTorch. Users can explore both single-task and multi-task fine-tuning approaches through this updated guide.

Changes

  • doc/train/finetuning.md: Introduces sections on TensorFlow and PyTorch implementations, providing insights into fine-tuning methods and configurations (a hedged command sketch follows this list).
  • doc/train/multi-task-training-pt.md: Redirects users to a dedicated fine-tuning section for fine-tuning from a multi-task pre-trained model, altering the document's approach.
  • doc/train/multi-task-training-tf.md: Corrects a typo by changing "pretrained" to "pre-trained" for consistency and clarity.
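
For orientation, here is a minimal sketch of the kind of PyTorch fine-tuning commands the updated doc/train/finetuning.md describes. The file names (pretrained.pt, multitask_pretrained.pt) and the branch placeholder CHOOSEN_BRANCH come from snippets quoted later in this review; the --model-branch flag is an assumption about the CLI and may differ in the merged doc.

```bash
# Hedged sketch only; verify the flags against the merged finetuning.md.
# Single-task fine-tuning from a pre-trained PyTorch checkpoint:
dp --pt train input.json --finetune pretrained.pt

# Multi-task case: select one branch (e.g. CHOOSEN_BRANCH) from the
# multi-task pre-trained checkpoint to initialize the fine-tuned model:
dp --pt train input.json --finetune multitask_pretrained.pt --model-branch CHOOSEN_BRANCH
```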

Recent Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits: files that changed from the base of the PR and between 0500713 and 3cdef47.
Files selected for processing (3)
  • doc/train/finetuning.md (4 hunks)
  • doc/train/multi-task-training-pt.md (1 hunks)
  • doc/train/multi-task-training-tf.md (1 hunks)
Additional Context Used
LanguageTool (48)
doc/train/finetuning.md (14)

Near line 11: Consider adding a comma after ‘Recently’ for more clarity.
Context: ...or other diversities of training data. Recently the emerging of methods such as [DPA-1]...


Near line 19: Do not mix variants of the same word (‘pretrain’ and ‘pre-train’) within a single text.
Context: ...con }} If you have a pre-trained model pretrained.pb (here we support models using [`se_...


Near line 28: Do not mix variants of the same word (‘pretrain’ and ‘pre-train’) within a single text.
Context: ...in the last layer of the fitting net in pretrained.pb, according to the training dataset ...


Near line 35: Do not mix variants of the same word (‘pretrain’ and ‘pre-train’) within a single text.
Context: ...re will inherit the model structures in pretrained.pb, and thus it will ignore the model ...


Near line 54: Did you mean “and”?
Context: ... In PyTorch version, we have introduced an updated, more adaptable approach to fin...


Near line 62: Do not mix variants of the same word (‘pretrain’ and ‘pre-train’) within a single text.
Context: ...ilizes a single-task pre-trained model (pretrained.pt) and modifies the energy bias withi...


Near line 88: This word is normally spelled as one.
Context: ...e version. ::: #### Fine-tuning from a multi-task pre-trained model Additionally, within...


Near line 90: This word is normally spelled as one.
Context: ...bility offered by the framework and the multi-task training capabilities provided by DPA2,...


Near line 94: It appears that a comma is missing.
Context: .../abs/2312.15492). For fine-tuning using this multitask pre-trained model (`multitask...


Near line 94: The verb ‘multitask’ is plural. Did you mean: “multitasks”? Did you use a verb instead of a noun?
Context: ...312.15492). For fine-tuning using this multitask pre-trained model (`multitask_pretraine...


Near line 94: Do not mix variants of the same word (‘pretrain’ and ‘pre-train’) within a single text.
Context: ...using this multitask pre-trained model (multitask_pretrained.pt), one can select a specific branch ...


Near line 95: Do not mix variants of the same word (‘pretrain’ and ‘pre-train’) within a single text.
Context: ...ch (e.g., CHOOSEN_BRANCH) included in multitask_pretrained.pt for fine-tuning with the following ...


Near line 112: This word is normally spelled as one.
Context: ...tialized fitting net will be used. ### Multi-task fine-tuning In typical scenarios, rely...


Near line 131: Unpaired symbol: ‘"’ seems to be missing
Context: ...put.json` should appear as follows ("..." means copied from input script of pre-t...

doc/train/multi-task-training-pt.md (10)

Near line 11: This word is normally spelled as one.
Context: ...ith the PyTorch one --> ## Theory The multi-task training process can simultaneously han...


Near line 19: This word is normally spelled as one.
Context: ... the Pytorch implementation, during the multi-task training process, all tasks can share a...


Near line 25: This word is normally spelled as one.
Context: ...t their tasks. In the DPA-2 model, this multi-task training framework is adopted.[^1] [^1...


Near line 31: This word is normally spelled as one.
Context: ... enabling larger-scale and more general multi-task training to obtain more general pre-tra...


Near line 33: This word is normally spelled as one.
Context: ...ral pre-trained models. ## Perform the multi-task training using PyTorch Training on mul...


Near line 35: This word is normally spelled as one.
Context: ...veral data systems) can be performed in multi-task mode, typically with one common descrip...


Near line 39: This word is normally spelled as one.
Context: ...dict>`, and then expand other parts for multi-model settings. Specifically, there are sever...


Near line 70: This word is normally spelled as one.
Context: ...th equal weights. An example input for multi-task training two models in water system is ...


Near line 77: This word is normally spelled as one.
Context: ...: ``` ## Finetune from the pre-trained multi-task model To finetune based on the checkpo...


Near line 79: This word is normally spelled as one.
Context: ... on the checkpoint model.pt after the multi-task pre-training is completed, users can re...

doc/train/multi-task-training-tf.md (24)

Near line 11: This word is normally spelled as one.
Context: ...ith the PyTorch one --> ## Theory The multi-task training process can simultaneously han...


Near line 19: This word is normally spelled as one.
Context: ... \quad t=1, \dots, n_t. ``` During the multi-task training process, all tasks share one d...


Near line 27: This word is normally spelled as one.
Context: ....org/licenses/by/4.0/). ## Perform the multi-task training Training on multiple data set...


Near line 29: This word is normally spelled as one.
Context: ...veral data systems) can be performed in multi-task mode, with one common descriptor and mu...


Near line 31: This word is normally spelled as one.
Context: ...ers in training input script to perform multi-task mode: - {ref}`fitting_net <model/fitti...


Near line 46: This word is normally spelled as one.
Context: ...ill automatically choose single-task or multi-task mode, based on the above parameters. No...


Near line 47: This word is normally spelled as one.
Context: ...that parameters of single-task mode and multi-task mode can not be mixed. An example inpu...


Near line 47: Unless you want to emphasize “not”, use “cannot” which is more common.
Context: ...of single-task mode and multi-task mode can not be mixed. An example input for trainin...


Near line 49: This word is normally spelled as one.
Context: ...ole in water system can be found here: [multi-task input on water](../../examples/water_mu...


Near line 51: This word is normally spelled as one.
Context: ...t.json). The supported descriptors for multi-task mode are listed: - {ref}`se_a (se_e2_a...


Near line 60: This word is normally spelled as one.
Context: ...brid]> The supported fitting nets for multi-task mode are listed: - {ref}`ener <model/f...


Near line 66: This word is normally spelled as one.
Context: ... The output of `dp freeze` command in multi-task mode can be seen in [freeze command](..


Near line 68: This word is normally spelled as one.
Context: ...d). ## Initialization from pre-trained multi-task model For advance training in multi-ta...


Near line 70: This word is normally spelled as one.
Context: ...lti-task model For advance training in multi-task mode, one can first train the descripto...


Near line 70: The preposition ‘to’ seems more likely in this position.
Context: ... upstream datasets and then transfer it on new downstream ones with newly added fi...


Near line 74: This word is normally spelled as one.
Context: ...ight <training/fitting_weight>`. Take [multi-task input on water](../../examples/water_mu...


Near line 75: This word is normally spelled as one.
Context: ...gain for example. You can first train a multi-task model using input script with the follo...


Near line 101: This word is normally spelled as one.
Context: ...`` After training, you can freeze this multi-task model into one unit graph: ```bash $ d...


Near line 106: Consider adding a comma here.
Context: ... freeze -o graph.pb --united-model ``` Then if you want to transfer the trained des...


Near line 126: The preposition ‘of’ seems more likely in this position.
Context: ...wly added fitting net keys, other parts in the input script, including {ref}`data_...


Near line 129: This word is normally spelled as one.
Context: ... Finally, you can perform the modified multi-task training from the frozen model with com...


Near line 137: This word is normally spelled as one.
Context: ...yers among energy fitting networks The multi-task training can be used to train multiple ...


Near line 138: After ‘some of’, you should use ‘the’ (“some of the layers”) or simply say “some layers”.
Context: ...fitting_net[ener]/layer_name>` to share some of layers among fitting networks. The architectur...


Near line 141: Possible subject-verb agreement error.
Context: ...hould be the same. For example, if one want to share the first and the third layers...



@github-actions bot added the Docs label Apr 29, 2024
@iProzd changed the title from Update finetuning.md to Add docs for multitask fine-tuning Apr 29, 2024
@iProzd changed the title from Add docs for multitask fine-tuning to docs: add doc for multitask fine-tuning Apr 29, 2024
@coderabbitai bot (Contributor) left a comment


Actionable comments posted: 1

Out of diff range and nitpick comments (1)
doc/train/finetuning.md (1)

Line range hint 7-40: Consider revising the section to enhance clarity and correct minor grammatical issues.

- Pretraining-and-finetuning is a widely used approach in other fields such as Computer Vision (CV) or Natural Language Processing (NLP)
+ Pretraining-and-finetuning is a widely used approach in fields such as Computer Vision (CV) and Natural Language Processing (NLP),
- to vastly reduce the training cost, while it's not trivial in potential models.
+ which significantly reduces training costs, although it's not trivial in potential models.
- Recently the emerging of methods such as [DPA-1](https://arxiv.org/abs/2208.08236) has brought us to a new stage where we can
+ Recently, the emergence of methods such as [DPA-1](https://arxiv.org/abs/2208.08236) has brought us to a new stage where we can
- perform similar pretraining-finetuning approaches.
+ perform similar pretraining-and-finetuning approaches.
- DPA-1 can hopefully learn the common knowledge in the pretrained dataset (especially the `force` information)
+ DPA-1 aims to learn the common knowledge in the pretrained dataset (especially the `force` information)
- and thus reduce the computational cost in downstream training tasks.
+ thereby reducing the computational costs in downstream training tasks.

@codecov bot commented Apr 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.19%. Comparing base (981ce44) to head (3cdef47).
Report is 12 commits behind head on devel.

Additional details and impacted files
@@            Coverage Diff             @@
##            devel    #3717      +/-   ##
==========================================
+ Coverage   82.12%   82.19%   +0.06%     
==========================================
  Files         511      513       +2     
  Lines       47363    47642     +279     
  Branches     2952     2980      +28     
==========================================
+ Hits        38898    39159     +261     
  Misses       7572     7572              
- Partials      893      911      +18     

☔ View full report in Codecov by Sentry.

@iProzd requested review from wanghan-iapcm and njzjz May 6, 2024 07:32
@coderabbitai bot (Contributor) left a comment


Actionable comments posted: 14

Out of diff range and nitpick comments (13)
doc/train/multi-task-training-pt.md (10)

Line range hint 11-11: Consider merging "multi-task" into "multitask" for consistency throughout the document.


Line range hint 19-19: Consider merging "multi-task" into "multitask" for consistency throughout the document.


Line range hint 25-25: Consider merging "multi-task" into "multitask" for consistency throughout the document.


Line range hint 31-31: Consider merging "multi-task" into "multitask" for consistency throughout the document.


Line range hint 33-33: Consider merging "multi-task" into "multitask" for consistency throughout the document.


Line range hint 35-35: Consider merging "multi-task" into "multitask" for consistency throughout the document.


Line range hint 39-39: Consider merging "multi-task" into "multitask" for consistency throughout the document.


Line range hint 70-70: Consider merging "multi-task" into "multitask" for consistency throughout the document.


77-77: Consider merging "multi-task" into "multitask" for consistency throughout the document.


79-79: Consider merging "multi-task" into "multitask" for consistency throughout the document.

doc/train/finetuning.md (3)

88-88: Consider merging "multi-task" into "multitask" for consistency throughout the document.


90-90: Consider merging "multi-task" into "multitask" for consistency throughout the document.


112-112: Consider merging "multi-task" into "multitask" for consistency throughout the document.

@coderabbitai bot (Contributor) left a comment


Actionable comments posted: 18

Out of diff range and nitpick comments (28)
doc/train/multi-task-training-pt.md (8)

Line range hint 11-11: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 19-19: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 25-25: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 31-31: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 33-33: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 35-35: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 39-39: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 70-70: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask
doc/train/multi-task-training-tf.md (19)

Line range hint 11-11: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 19-19: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 27-27: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 29-29: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 31-31: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 46-46: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 47-47: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 49-49: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 51-51: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 60-60: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 74-74: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 75-75: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 101-101: Ensure consistency in terminology: use either "multi-task" or "multitask" throughout the document.

- multi-task
+ multitask

Line range hint 106-106: Consider adding a comma here for clarity.

- Then if you want to transfer the trained descriptor and some fitting nets (take `water_ener` for example) to newly added datasets with new fitting net `water_ener_2`,
+ Then, if you want to transfer the trained descriptor and some fitting nets (take `water_ener` for example) to newly added datasets with new fitting net `water_ener_2`,

Line range hint 126-126: The preposition ‘of’ seems more likely in this position.

- other parts in the input script, including {ref}`data_dict <training/data_dict>` and {ref}`loss_dict <loss_dict>` (optionally {ref}`fitting_weight <training/fitting_weight>`),
+ other parts of the input script, including {ref}`data_dict <training/data_dict>` and {ref}`loss_dict <loss_dict>` (optionally {ref}`fitting_weight <training/fitting_weight>`),

Line range hint 129-129: This word is normally spelled as one.

- multi-task
+ multitask

Line range hint 137-137: This word is normally spelled as one.

- multi-task
+ multitask

Line range hint 138-138: After ‘some of’, you should use ‘the’ (“some of the layers”) or simply say “some layers”.

- if one want to share some of layers among fitting networks,
+ if one wants to share some of the layers among fitting networks,

Line range hint 141-141: Possible subject-verb agreement error.

- For example, if one want to share the first and the third layers for two three-hidden-layer fitting networks, the following parameters should be set.
+ For example, if one wants to share the first and the third layers for two three-hidden-layer fitting networks, the following parameters should be set.
doc/train/finetuning.md (1)

Line range hint 28-28: Do not mix variants of the same word (‘pretrain’ and ‘pre-train’) within a single text.

- pre-trained
+ pretrained
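
Editorial aside: among the nitpicks above for doc/train/multi-task-training-tf.md, the last two concern sharing layers among energy fitting networks via the layer_name parameter. The snippet below is a hypothetical sketch of how sharing the first and third layers of two three-hidden-layer fitting nets might look; the surrounding key names (fitting_net_dict, the water_ener and water_ener_2 branches) and the convention that entries with the same label are shared while null entries stay independent are assumptions pieced together from the quoted doc text, not verified against the merged file.

```json
{
  "fitting_net_dict": {
    "water_ener": {
      "neuron": [240, 240, 240],
      "layer_name": ["shared_0", null, "shared_2", null]
    },
    "water_ener_2": {
      "neuron": [240, 240, 240],
      "layer_name": ["shared_0", null, "shared_2", null]
    }
  }
}
```

Under these assumptions, the two "shared_0" entries point at one set of parameters and the two "shared_2" entries at another, so the first and third hidden layers are shared between the branches, while the null slots (including the trailing output-layer slot) remain branch-specific.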

@iProzd added this pull request to the merge queue May 6, 2024
@github-merge-queue bot removed this pull request from the merge queue due to failed status checks May 6, 2024
@iProzd added this pull request to the merge queue May 6, 2024
Merged via the queue into deepmodeling:devel with commit 0ec6719 May 6, 2024
60 checks passed
@iProzd deleted the doc_multi_finetune branch May 6, 2024 15:32
mtaillefumier pushed a commit to mtaillefumier/deepmd-kit that referenced this pull request Sep 18, 2024
Add docs for multitask fine-tuning.
