feat(pt): consistent fine-tuning with init-model #3803

Merged
merged 38 commits into deepmodeling:devel on Jun 13, 2024

Conversation

iProzd
Collaborator

@iProzd iProzd commented May 22, 2024

Fix #3747. Fix #3455.

  • Consistent fine-tuning with init-model: in PT, fine-tuning now consists of three steps (see the sketch below):
  1. Change model params (for multitask fine-tuning, random fitting, and type-related params),
  2. Init-model,
  3. Change bias.
  • By default, fine-tuning uses the user's input instead of overwriting it with the script stored in the pre-trained model. When `--use-pretrain-script` is added, the model parameters from the pre-trained model are used instead.

  • `type_map` now follows the user input instead of being overwritten by the one in the pre-trained model.

Note:

  1. After discussion with @wanghan-iapcm, the fine-tuning behavior in TF is kept as before. If needed in the future, it can be implemented then.
  2. Fine-tuning with DOSModel in PT still needs to be fixed (an issue will be opened and may be addressed in another PR, cc @anyangml).
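
For illustration, a minimal, runnable sketch of the three-step flow described above. The helper names are hypothetical and are not the actual deepmd-kit API; they only show the order of operations and where `--use-pretrain-script` and the user `type_map` come in.

```python
# Hypothetical sketch of the three-step fine-tuning flow; none of these
# helper names are the real deepmd-kit API.

def change_model_params(model_params, user_type_map):
    # Step 1: adapt type-related params to the user's type_map
    # (multitask branch selection, random fitting, etc. would also happen here).
    return {**model_params, "type_map": list(user_type_map)}

def init_model(model_params, pretrained_weights):
    # Step 2: build the model from the (by default user-supplied) params
    # and load the pre-trained weights, as init-model does.
    return {"params": model_params, "weights": pretrained_weights}

def change_out_bias(model, energies):
    # Step 3: re-fit the output bias statistics on the fine-tuning data.
    model["out_bias"] = sum(energies) / len(energies)
    return model

# Default: model params come from the user's input.json; with
# --use-pretrain-script they would instead be read from the pre-trained model.
user_params = {"descriptor": "se_e2_a", "type_map": ["O", "H"]}
model = change_model_params(user_params, user_params["type_map"])
model = init_model(model, pretrained_weights={"w": [0.1, 0.2]})
model = change_out_bias(model, energies=[-1.0, -2.0, -3.0])
print(model["out_bias"])  # -2.0
```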

Summary by CodeRabbit

  • New Features

    • Added support for using model parameters from a pretrained model script.
    • Introduced new methods to handle type-related parameters and fine-tuning configurations.
  • Documentation

    • Updated documentation to clarify the model section requirements and the new --use-pretrain-script option for fine-tuning.
  • Refactor

    • Simplified and improved the readability of key functions related to model training and fine-tuning.
  • Tests

    • Added new test methods and utility functions to ensure consistency of type mapping and parameter updates.

Contributor

coderabbitai bot commented May 22, 2024

Walkthrough

The updates primarily enhance the finetuning process in the DeePMD-kit by allowing users to use model parameters from a pretrained model script instead of manually inputting them. Additionally, the changes address issues related to type mapping during finetuning, ensuring the type_map in the pretrained model is correctly handled and updated, providing a more consistent user experience.

Changes

File(s) Change Summary
deepmd/main.py Added support for using model parameters from a pretrained model script via the --use-pretrain-script argument.
deepmd/pt/entrypoints/main.py Refactored get_trainer and prepare_trainer_input_single functions to handle fine-tuning configurations and simplify logic.
doc/train/finetuning.md Updated documentation to clarify requirements and introduce the --use-pretrain-script option.
deepmd/dpmodel/atomic_model/... Added methods slim_type_map and update_type_params to handle type-related parameters in pretrained models.
deepmd/dpmodel/descriptor/... Added methods for handling type maps, including get_type_map, slim_type_map, and change_type_map across various descriptor files.
deepmd/utils/finetune.py Introduced FinetuneRuleItem class to manage fine-tuning rules, including type map handling.
source/tests/universal/common/... Added abstract methods for converting to and from numpy arrays in Backend class.
source/tests/universal/common/cases/descriptor/utils.py Added utility functions and a test method test_change_type_map for updating input dictionaries related to type mapping.
source/tests/universal/dpmodel/backend.py Added methods for numpy conversion in Backend class.
source/tests/universal/pt/backend.py Added methods for conversion between torch.Tensor and np.ndarray.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant MainParser
    participant Trainer
    participant FinetuneRuleItem
    participant Descriptor

    User->>MainParser: Run with --use-pretrain-script
    MainParser->>Trainer: Initialize with pretrained model parameters
    Trainer->>FinetuneRuleItem: Apply fine-tuning rules
    FinetuneRuleItem->>Descriptor: Update type maps and statistics
    Descriptor-->>FinetuneRuleItem: Confirm updates
    FinetuneRuleItem-->>Trainer: Return updated model
    Trainer-->>User: Provide finetuned model

Assessment against linked issues

Objective Addressed Explanation
Consistent user experience for finetuning and init-model (#3747)
Correct handling of type_map during finetuning (#3455)

Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between af6c8b2 and fd64ee5.

Files selected for processing (19)
  • deepmd/dpmodel/descriptor/dpa1.py (5 hunks)
  • deepmd/dpmodel/descriptor/dpa2.py (6 hunks)
  • deepmd/dpmodel/descriptor/hybrid.py (2 hunks)
  • deepmd/dpmodel/descriptor/make_base_descriptor.py (2 hunks)
  • deepmd/dpmodel/descriptor/se_e2_a.py (6 hunks)
  • deepmd/dpmodel/descriptor/se_r.py (6 hunks)
  • deepmd/dpmodel/descriptor/se_t.py (7 hunks)
  • deepmd/main.py (1 hunks)
  • deepmd/pt/model/descriptor/dpa1.py (5 hunks)
  • deepmd/pt/model/descriptor/dpa2.py (6 hunks)
  • deepmd/pt/model/descriptor/hybrid.py (2 hunks)
  • deepmd/pt/model/descriptor/se_a.py (6 hunks)
  • deepmd/pt/model/descriptor/se_r.py (7 hunks)
  • deepmd/pt/model/descriptor/se_t.py (8 hunks)
  • deepmd/utils/finetune.py (1 hunks)
  • source/tests/universal/common/backend.py (1 hunks)
  • source/tests/universal/common/cases/descriptor/utils.py (4 hunks)
  • source/tests/universal/dpmodel/backend.py (2 hunks)
  • source/tests/universal/pt/backend.py (2 hunks)
Additional context used
Ruff
source/tests/universal/pt/backend.py

34-34: Found useless expression. Either assign it to a variable or remove it. (B018)

deepmd/dpmodel/descriptor/hybrid.py

204-204: Loop control variable ii not used within loop body (B007)

Rename unused ii to _ii

deepmd/pt/model/descriptor/hybrid.py

174-174: Loop control variable des not used within loop body (B007)

Rename unused des to _des


218-218: Loop control variable ii not used within loop body (B007)

Rename unused ii to _ii

deepmd/dpmodel/descriptor/se_r.py

105-105: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


109-109: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


380-380: Local variable env_mat is assigned to but never used (F841)

Remove assignment to unused variable env_mat

deepmd/dpmodel/descriptor/se_t.py

93-93: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


98-98: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


252-252: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


381-381: Local variable env_mat is assigned to but never used (F841)

Remove assignment to unused variable env_mat

source/tests/universal/common/cases/descriptor/utils.py

49-49: Loop control variable vv not used within loop body (B007)

Rename unused vv to _vv

deepmd/pt/model/descriptor/se_r.py

64-64: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


69-69: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


257-261: Use ternary operator sampled = merged() if callable(merged) else merged instead of if-else-block (SIM108)


297-297: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


435-435: Local variable env_mat is assigned to but never used (F841)

Remove assignment to unused variable env_mat

deepmd/dpmodel/descriptor/se_e2_a.py

147-147: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


152-152: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


325-325: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


455-455: Local variable env_mat is assigned to but never used (F841)

Remove assignment to unused variable env_mat

deepmd/pt/model/descriptor/se_a.py

78-78: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


84-84: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


224-224: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


323-323: Local variable env_mat is assigned to but never used (F841)

Remove assignment to unused variable env_mat


376-376: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


382-382: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


565-569: Use ternary operator sampled = merged() if callable(merged) else merged instead of if-else-block (SIM108)


589-589: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


639-639: Loop control variable ii not used within loop body (B007)

deepmd/pt/model/descriptor/dpa1.py

215-215: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


227-227: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


517-517: Local variable env_mat is assigned to but never used (F841)

Remove assignment to unused variable env_mat


585-585: Local variable nall is assigned to but never used (F841)

Remove assignment to unused variable nall

deepmd/pt/model/descriptor/se_t.py

113-113: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


118-118: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


253-253: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


348-348: Local variable env_mat is assigned to but never used (F841)

Remove assignment to unused variable env_mat


401-401: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


406-406: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


592-596: Use ternary operator sampled = merged() if callable(merged) else merged instead of if-else-block (SIM108)


616-616: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function

deepmd/main.py

83-83: No explicit stacklevel keyword argument found (B028)


114-114: Use key not in dict instead of key not in dict.keys() (SIM118)

Remove .keys()

deepmd/pt/model/descriptor/dpa2.py

84-84: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


429-429: Loop control variable ii not used within loop body (B007)

Rename unused ii to _ii


549-549: Local variable env_mat is assigned to but never used (F841)

Remove assignment to unused variable env_mat

deepmd/dpmodel/descriptor/dpa2.py

67-67: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


325-325: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


804-804: Local variable env_mat is assigned to but never used (F841)

Remove assignment to unused variable env_mat

deepmd/dpmodel/descriptor/dpa1.py

226-226: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


237-237: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


465-465: Local variable nall is assigned to but never used (F841)

Remove assignment to unused variable nall


545-545: Local variable env_mat is assigned to but never used (F841)

Remove assignment to unused variable env_mat


607-607: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


617-617: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function


807-807: Do not use mutable data structures for argument defaults (B006)

Replace with None; initialize within function

GitHub Check: codecov/patch
deepmd/dpmodel/descriptor/make_base_descriptor.py

[warning] 84-84: deepmd/dpmodel/descriptor/make_base_descriptor.py#L84
Added line #L84 was not covered by tests


[warning] 128-128: deepmd/dpmodel/descriptor/make_base_descriptor.py#L128
Added line #L128 was not covered by tests


[warning] 133-133: deepmd/dpmodel/descriptor/make_base_descriptor.py#L133
Added line #L133 was not covered by tests


[warning] 138-138: deepmd/dpmodel/descriptor/make_base_descriptor.py#L138
Added line #L138 was not covered by tests

deepmd/dpmodel/descriptor/hybrid.py

[warning] 129-129: deepmd/dpmodel/descriptor/hybrid.py#L129
Added line #L129 was not covered by tests

deepmd/dpmodel/descriptor/se_r.py

[warning] 239-239: deepmd/dpmodel/descriptor/se_r.py#L239
Added line #L239 was not covered by tests


[warning] 251-251: deepmd/dpmodel/descriptor/se_r.py#L251
Added line #L251 was not covered by tests


[warning] 263-264: deepmd/dpmodel/descriptor/se_r.py#L263-L264
Added lines #L263 - L264 were not covered by tests


[warning] 268-268: deepmd/dpmodel/descriptor/se_r.py#L268
Added line #L268 was not covered by tests

deepmd/dpmodel/descriptor/se_t.py

[warning] 177-177: deepmd/dpmodel/descriptor/se_t.py#L177
Added line #L177 was not covered by tests


[warning] 231-231: deepmd/dpmodel/descriptor/se_t.py#L231
Added line #L231 was not covered by tests


[warning] 243-244: deepmd/dpmodel/descriptor/se_t.py#L243-L244
Added lines #L243 - L244 were not covered by tests


[warning] 248-248: deepmd/dpmodel/descriptor/se_t.py#L248
Added line #L248 was not covered by tests

deepmd/dpmodel/descriptor/se_e2_a.py

[warning] 281-281: deepmd/dpmodel/descriptor/se_e2_a.py#L281
Added line #L281 was not covered by tests


[warning] 293-293: deepmd/dpmodel/descriptor/se_e2_a.py#L293
Added line #L293 was not covered by tests


[warning] 305-306: deepmd/dpmodel/descriptor/se_e2_a.py#L305-L306
Added lines #L305 - L306 were not covered by tests


[warning] 310-310: deepmd/dpmodel/descriptor/se_e2_a.py#L310
Added line #L310 was not covered by tests

deepmd/dpmodel/descriptor/dpa2.py

[warning] 512-512: deepmd/dpmodel/descriptor/dpa2.py#L512
Added line #L512 was not covered by tests

deepmd/dpmodel/descriptor/dpa1.py

[warning] 335-335: deepmd/dpmodel/descriptor/dpa1.py#L335
Added line #L335 was not covered by tests

Additional comments not posted (17)
source/tests/universal/common/backend.py (2)

27-28: LGTM! The abstract method convert_to_numpy is well-defined and encourages consistent implementation across subclasses.


32-33: LGTM! The abstract method convert_from_numpy is correctly defined to ensure consistent behavior across subclasses.

source/tests/universal/dpmodel/backend.py (2)

23-24: LGTM! The method convert_to_numpy correctly implements the abstract method by returning the input numpy array.


27-28: LGTM! The method convert_from_numpy provides a correct and straightforward implementation of the abstract method.

source/tests/universal/pt/backend.py (2)

37-38: LGTM! The method convert_to_numpy correctly utilizes the utility function to_numpy_array to convert PyTorch tensors to numpy arrays.


41-42: LGTM! The method convert_from_numpy effectively uses the utility function to_torch_tensor for converting numpy arrays to PyTorch tensors.
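
For reference, such backend conversions typically reduce to the standard torch idioms below; this is a generic sketch, not the actual to_numpy_array / to_torch_tensor utilities:

```python
import numpy as np
import torch

def convert_to_numpy(xx: torch.Tensor) -> np.ndarray:
    # Detach from the autograd graph and move to CPU before converting.
    return xx.detach().cpu().numpy()

def convert_from_numpy(xx: np.ndarray) -> torch.Tensor:
    return torch.from_numpy(xx)

arr = convert_to_numpy(torch.ones(2, 3))
assert convert_from_numpy(arr).shape == (2, 3)
```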

deepmd/utils/finetune.py (4)

11-65: LGTM! The FinetuneRuleItem class is well-structured and provides clear methods for accessing fine-tuning rules and properties. The documentation is clear and the methods are well-defined.


76-111: LGTM! The function get_index_between_two_maps correctly calculates the mapping index and handles new types appropriately, including logging a warning when new types are detected.
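
A minimal sketch of what such an index mapping could look like; the actual signature and return values in deepmd/utils/finetune.py may differ:

```python
import logging

log = logging.getLogger(__name__)

def get_index_between_two_maps(old_map, new_map):
    """For each type in new_map, return its index in old_map,
    or -1 (placeholder) for types the pre-trained model has never seen."""
    index_map = []
    for tt in new_map:
        if tt in old_map:
            index_map.append(old_map.index(tt))
        else:
            log.warning("New type '%s' is not in the pre-trained type_map.", tt)
            index_map.append(-1)
    return index_map

print(get_index_between_two_maps(["O", "H", "C"], ["H", "O", "S"]))  # [1, 0, -1]
```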


114-136: LGTM! The function map_atom_exclude_types correctly remaps atom exclude types based on the provided index map. The implementation is straightforward and effective.


139-164: LGTM! The function map_pair_exclude_types correctly remaps pair exclude types based on the provided index map. The implementation is straightforward and effective.
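
And a rough sketch of the remapping idea for exclude types, reusing the index map from the previous sketch; the real map_atom_exclude_types / map_pair_exclude_types may handle new types differently:

```python
def map_pair_exclude_types(pair_exclude_types, index_map):
    # index_map[new_idx] = old_idx (or -1 for a brand-new type), as above.
    old_to_new = {old: new for new, old in enumerate(index_map) if old >= 0}
    remapped = []
    for ii, jj in pair_exclude_types:
        if ii in old_to_new and jj in old_to_new:
            remapped.append((old_to_new[ii], old_to_new[jj]))
    return remapped

# Old pairs (0, 1) and (1, 2) under the index map [1, 0, -1]:
print(map_pair_exclude_types([(0, 1), (1, 2)], [1, 0, -1]))  # [(1, 0)]
```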

deepmd/dpmodel/descriptor/se_t.py (1)

Line range hint 351-368: The serialization method is updated to include type_map and trainable in the serialized data, aligning with the changes in the descriptor's properties. This is crucial for ensuring consistency in serialized and deserialized objects.

deepmd/pt/model/descriptor/se_r.py (1)

392-399: Serialization methods look well-implemented.

The methods serialize, deserialize, and related statistical methods are correctly implemented and align with the class's functionality.

Also applies to: 404-422, 432-432

deepmd/dpmodel/descriptor/se_e2_a.py (1)

299-311: Serialization methods look well-implemented.

The methods serialize, deserialize, and related statistical methods are correctly implemented and align with the class's functionality.

Also applies to: 421-443, 450-450

Tools
GitHub Check: codecov/patch

[warning] 305-306: deepmd/dpmodel/descriptor/se_e2_a.py#L305-L306
Added lines #L305 - L306 were not covered by tests


[warning] 310-310: deepmd/dpmodel/descriptor/se_e2_a.py#L310
Added line #L310 was not covered by tests

deepmd/pt/model/descriptor/se_a.py (3)

138-141: The implementation of the get_type_map method is straightforward and aligns with the PR's objective of handling type maps correctly.


276-276: The methods set_stat_mean_and_stddev and get_stat_mean_and_stddev are well-implemented and provide clear functionality for managing statistics mean and standard deviation.

Also applies to: 280-283


308-308: The serialization of type_map within the serialize method ensures that the type map is preserved, which is crucial for maintaining consistency across different model states.

deepmd/pt/model/descriptor/dpa1.py (1)

430-456: Review the implementation of change_type_map.

The method change_type_map is critical for handling type map changes. It appears to correctly update the type map and related statistics. However, ensure that the method handles all edge cases, especially when model_with_new_type_stat is None and new types are present.


codecov bot commented May 22, 2024

Codecov Report

Attention: Patch coverage is 87.50000% with 76 lines in your changes missing coverage. Please review.

Project coverage is 82.70%. Comparing base (a7ab1af) to head (fd64ee5).
Report is 119 commits behind head on devel.

Files with missing lines Patch % Lines
deepmd/tf/fit/ener.py 87.27% 7 Missing ⚠️
deepmd/dpmodel/atomic_model/base_atomic_model.py 25.00% 6 Missing ⚠️
deepmd/dpmodel/descriptor/se_e2_a.py 54.54% 5 Missing ⚠️
deepmd/dpmodel/descriptor/se_r.py 54.54% 5 Missing ⚠️
deepmd/dpmodel/descriptor/se_t.py 54.54% 5 Missing ⚠️
deepmd/dpmodel/atomic_model/dp_atomic_model.py 20.00% 4 Missing ⚠️
deepmd/dpmodel/descriptor/make_base_descriptor.py 66.66% 4 Missing ⚠️
deepmd/dpmodel/atomic_model/linear_atomic_model.py 25.00% 3 Missing ⚠️
...eepmd/pt/model/atomic_model/linear_atomic_model.py 25.00% 3 Missing ⚠️
deepmd/pt/model/descriptor/se_a.py 62.50% 3 Missing ⚠️
... and 21 more
Additional details and impacted files
@@            Coverage Diff             @@
##            devel    #3803      +/-   ##
==========================================
+ Coverage   82.66%   82.70%   +0.03%     
==========================================
  Files         517      517              
  Lines       49724    50141     +417     
  Branches     2984     2984              
==========================================
+ Hits        41105    41467     +362     
- Misses       7709     7764      +55     
  Partials      910      910              


Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 19

Outside diff range and nitpick comments (9)
deepmd/pt/infer/inference.py (1)

Line range hint 14-14: Consider adding error handling for file not found or corrupted model checkpoint scenarios.

doc/train/finetuning.md (1)

Line range hint 11-11: Standardize the terminology and fix grammatical issues.

  • Standardize the use of "pretrain" vs. "pre-train" to maintain consistency throughout the document.
  • Add a comma after "Recently" in line 11 for clarity.
  • Correct the word "multitask" to "multi-task" where applicable.
  • Fix the missing comma before "which" in line 96.

Also applies to: 19-19, 28-28, 35-35, 62-62, 90-90, 92-92, 96-96, 97-97, 114-114

deepmd/dpmodel/descriptor/se_e2_a.py (1)

267-274: Clarify the purpose of the update_type_params method.

Consider adding a comment or expanding the docstring to explain that this method is a placeholder and outline any plans for its future implementation. This will help maintainers and other developers understand the current state and future expectations.

source/tests/pt/test_training.py (1)

38-47: Enhance robustness of file cleanup in tearDown.

Consider adding error handling in the tearDown method to gracefully manage exceptions that may occur during file deletion. This can prevent the test suite from failing due to issues unrelated to the test logic.

Also applies to: 51-67

deepmd/pt/model/descriptor/se_a.py (1)

168-193: Consider adding more inline comments in the forward method of DescrptBlockSeA to explain the tensor operations, which will enhance readability and maintainability.

deepmd/main.py (1)

258-261: Add documentation for the new --use-pretrain-script argument.

It would be beneficial to include a brief explanation in the documentation about when and why to use the --use-pretrain-script option, especially since it's specific to the PyTorch backend.
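
Registering such a boolean flag with argparse is typically a one-liner; a sketch of the shape it might take (the actual help text in deepmd/main.py may differ):

```python
import argparse

parser = argparse.ArgumentParser(prog="dp")
parser.add_argument(
    "--use-pretrain-script",
    action="store_true",
    help="Use model parameters from the pre-trained model script instead of "
    "the user input (PyTorch backend only).",
)
args = parser.parse_args(["--use-pretrain-script"])
print(args.use_pretrain_script)  # True
```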

deepmd/dpmodel/descriptor/dpa1.py (1)

368-375: Ensure proper documentation for the update_type_params method.

It would be beneficial to provide a more detailed docstring for the update_type_params method, explaining the parameters and the expected behavior, especially since it raises NotImplementedError.

deepmd/pt/train/training.py (2)

Line range hint 34-465: Consider refactoring the Trainer constructor to improve readability and maintainability.

The constructor of the Trainer class is quite lengthy and handles multiple aspects of the training setup. It would be beneficial to break down this method into smaller, more manageable functions. This can improve readability and make the code easier to maintain and test.


Line range hint 561-1196: Optimize the training loop for performance and correctness.

The training loop method is critical for the performance of the training process. Consider optimizing the data handling and computation steps to improve efficiency. Additionally, ensure that all tensor operations are correctly managed to avoid memory leaks and ensure computational correctness.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 21

Outside diff range and nitpick comments (7)
doc/train/finetuning.md (6)

Line range hint 11-11: Add a comma after "Recently" for clarity.

Consider revising the sentence to: "Recently, the emerging of methods such as [DPA-1]..."


Line range hint 19-35: Ensure consistent use of the term "pre-trained".

The document inconsistently uses "pretrain" and "pre-trained". It's important to maintain consistency to avoid confusion. Consider using "pre-trained" throughout the document.


Line range hint 54-54: Clarify the conjunction in the sentence.

The sentence "In PyTorch version, we have introduced an updated, more adaptable approach to fine-tuning." might be clearer with "and" instead of "an". Consider revising to: "In the PyTorch version, we have introduced an updated and more adaptable approach to fine-tuning."


Line range hint 90-114: Use "multitask" as one word.

The document inconsistently uses "multi-task" and "multitask". For consistency, consider using "multitask" as one word throughout the document.


Line range hint 96-96: Add a comma after "multitask".

Consider revising the sentence to: "For fine-tuning using this multitask, pre-trained model..."


Line range hint 133-133: Correct the unpaired quotation mark.

There appears to be an unpaired quotation mark in the sentence. Consider revising to ensure proper pairing of quotation marks.

source/tests/pt/test_finetune.py (1)

Line range hint 95-154: Consider adding more detailed comments explaining the steps in the test_finetune_change_out_bias method, especially for complex tensor manipulations and assertions.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

@Chengqian-Zhang
Collaborator

Chengqian-Zhang commented May 23, 2024

(I realize that the following behavior is all expected, and you can ignore this commit....)

There is an obvious problem with this design: when I don't use --use-pretrain-script, if the model parameters in the input.json I provide are not the same as the model parameters of the pre-trained model, loading the parameters fails during self.wrapper.load_state_dict(state_dict), because the network parameters of self.wrapper are initialized from the model parameters in input.json, while state_dict holds the network parameters of the pre-trained model.

For example, if sel = 120 in the pre-trained model while sel = 80 in input.json:
Singletask finetuning from a single model:
dp --pt train finetune_single.json --finetune single.pt
RuntimeError: Error(s) in loading state_dict for ModelWrapper: size mismatch for model.Default.atomic_model.descriptor.repinit.mean: copying a param with shape torch.Size([3, 120, 4]) from checkpoint, the shape in current model is torch.Size([3, 80, 4]).

Singletask finetuning from a multitask model:
dp --pt train finetune_single.json --finetune multi.pt
RuntimeError: Error(s) in loading state_dict for ModelWrapper: size mismatch for model.Default.atomic_model.descriptor.repinit.mean: copying a param with shape torch.Size([3, 120, 4]) from checkpoint, the shape in current model is torch.Size([3, 80, 4]).
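
The failure mode above is PyTorch's standard strict load_state_dict shape check; a minimal, self-contained reproduction with plain torch modules (unrelated to the deepmd classes):

```python
import torch

src = torch.nn.Linear(120, 4)  # stands in for the pre-trained shape (sel = 120)
dst = torch.nn.Linear(80, 4)   # stands in for the model built from input.json (sel = 80)

try:
    dst.load_state_dict(src.state_dict())  # strict=True by default
except RuntimeError as err:
    print(err)  # size mismatch for weight: copying a param with shape ...
```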

@wanghan-iapcm
Collaborator

I will review the PR after we discuss it online @iProzd @njzjz

@njzjz njzjz marked this pull request as draft May 29, 2024 20:49
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Outside diff range and nitpick comments (3)
doc/train/finetuning.md (3)

Line range hint 11-11: Consider adding a comma after "Recently" for clarity.

- Recently the emerging of methods such as [DPA-1]...
+ Recently, the emerging of methods such as [DPA-1]...

Line range hint 54-54: Correct the grammatical error in "an updated".

- In PyTorch version, we have introduced an updated, more adaptable approach...
+ In the PyTorch version, we have introduced an updated, more adaptable approach...

Line range hint 134-134: Correct the typographical error in the documentation.

- ...put.json` should appear as follows ("..." means copied from input script of pre-t...
+ ...put.json` should appear as follows ("..." means copied from the input script of the pre-t...

github-merge-queue bot pushed a commit that referenced this pull request Jun 12, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jun 12, 2024
@njzjz njzjz added this pull request to the merge queue Jun 13, 2024
Merged via the queue into deepmodeling:devel with commit a1a3840 Jun 13, 2024
60 checks passed
mtaillefumier pushed a commit to mtaillefumier/deepmd-kit that referenced this pull request Sep 18, 2024
@coderabbitai coderabbitai bot mentioned this pull request Nov 13, 2024