add option to select backends TF/PT #1545

Merged
merged 52 commits into deepmodeling:devel
May 11, 2024

Conversation

thangckt
Contributor

@thangckt thangckt commented May 9, 2024

Reopens PR #1541, whose branch was deleted.

Adds a new key to the param.json file:

"train_backend": "pytorch"/"tensorflow",

Related to issue #1462.
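
As a rough illustration of how the new key might be consumed, the sketch below parses a minimal param.json fragment and reads train_backend. The fallback to "tensorflow" when the key is absent is an assumption made for this example, not something the PR description states, and read_train_backend is a hypothetical helper name.

```python
import json

# Minimal param.json fragment; a real file contains many more keys.
PARAM_JSON = '{"train_backend": "pytorch"}'


def read_train_backend(jdata: dict) -> str:
    """Return the requested training backend (hypothetical helper).

    Falling back to "tensorflow" is an assumption made for this sketch.
    """
    return jdata.get("train_backend", "tensorflow")


jdata = json.loads(PARAM_JSON)
backend = read_train_backend(jdata)
print(backend)
```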

Summary by CodeRabbit

  • New Features

    • Improved model management by dynamically generating model suffixes based on the selected backend, enhancing compatibility.
  • Enhancements

    • Updated model-related functions to incorporate backend-specific model suffixes for accurate file handling during training processes.

thangckt and others added 30 commits March 17, 2024 00:01
add Option to choose backend TF/PT
@thangckt thangckt changed the base branch from master to devel May 9, 2024 08:35
Contributor

coderabbitai bot commented May 9, 2024

Walkthrough

The changes entail introducing a new function _get_model_suffix in dpgen/generator/run.py to determine model suffixes based on the backend. This addition is then reflected across various functions for managing model files, ensuring consistency in file naming conventions across different backend options.

Changes

| Files | Changes |
| --- | --- |
| dpgen/generator/run.py | Added _get_model_suffix(jdata) -> str function. Modified copy_model to include suffix=".pb" parameter. Updated functions to use the determined model suffix for file paths. |
| dpgen/generator/run.py | Added suffix = _get_model_suffix(jdata) within make_train, run_train, post_train functions. Adjusted functions to utilize the determined suffix for file operations. |
| dpgen/generator/run.py | Integrated suffix = _get_model_suffix(jdata) in functions related to model deviation. Updated file paths to include the determined model suffix. |

Possibly related issues


Recent Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits Files that changed from the base of the PR and between 4023ccd and cf68032.
Files selected for processing (1)
  • dpgen/generator/run.py (16 hunks)
Additional Context Used
GitHub Check Runs (1)
codecov/patch success (12)

dpgen/generator/run.py: [warning] 135-135: dpgen/generator/run.py#L135
Added line #L135 was not covered by tests


dpgen/generator/run.py: [warning] 336-336: dpgen/generator/run.py#L336
Added line #L336 was not covered by tests


dpgen/generator/run.py: [warning] 344-344: dpgen/generator/run.py#L344
Added line #L344 was not covered by tests


dpgen/generator/run.py: [warning] 754-754: dpgen/generator/run.py#L754
Added line #L754 was not covered by tests


dpgen/generator/run.py: [warning] 817-818: dpgen/generator/run.py#L817-L818
Added lines #L817 - L818 were not covered by tests


dpgen/generator/run.py: [warning] 823-824: dpgen/generator/run.py#L823-L824
Added lines #L823 - L824 were not covered by tests


dpgen/generator/run.py: [warning] 835-835: dpgen/generator/run.py#L835
Added line #L835 was not covered by tests


dpgen/generator/run.py: [warning] 843-844: dpgen/generator/run.py#L843-L844
Added lines #L843 - L844 were not covered by tests


dpgen/generator/run.py: [warning] 916-916: dpgen/generator/run.py#L916
Added line #L916 was not covered by tests


dpgen/generator/run.py: [warning] 918-920: dpgen/generator/run.py#L918-L920
Added lines #L918 - L920 were not covered by tests


dpgen/generator/run.py: [warning] 922-922: dpgen/generator/run.py#L922
Added line #L922 was not covered by tests

Additional comments not posted (1)
dpgen/generator/run.py (1)

128-138: Ensure the function _get_model_suffix handles all potential backend values appropriately.

Verification successful

The function _get_model_suffix in dpgen/generator/run.py is designed to handle two specific backends ('tensorflow' and 'pytorch') and raises an exception for any unsupported backends. This implementation is consistent across its usage in the codebase, and the error handling is adequate for notifying about unsupported backends. Therefore, the function appears to handle all potential backend values appropriately as per its design.

  • File and usage points:
    • dpgen/generator/run.py: Multiple usages within various parts of the file.
    • dpgen/simplify/simplify.py: Used to determine the model suffix based on backend.
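
The behaviour described above can be sketched roughly as follows. This is not the merged implementation: the names model_suffix_sketch and model_filename, the ".pth" suffix for PyTorch, the "tensorflow" default, and the choice of ValueError are all assumptions for illustration. Only the ".pb" suffix and the two supported backend names come from the review.

```python
# Mapping from backend name to model file suffix.
# ".pb" appears in the review; ".pth" is an assumption for this sketch.
SUFFIX_BY_BACKEND = {"tensorflow": ".pb", "pytorch": ".pth"}


def model_suffix_sketch(jdata: dict) -> str:
    """Return the model suffix for the configured backend (hypothetical)."""
    backend = jdata.get("train_backend", "tensorflow")  # default is assumed
    if backend not in SUFFIX_BY_BACKEND:
        supported = ", ".join(sorted(SUFFIX_BY_BACKEND))
        raise ValueError(
            f"Unsupported backend {backend!r}; supported backends: {supported}"
        )
    return SUFFIX_BY_BACKEND[backend]


def model_filename(stem: str, jdata: dict) -> str:
    """Build a model file name from a stem and the backend-specific suffix."""
    return stem + model_suffix_sketch(jdata)


print(model_filename("model", {"train_backend": "pytorch"}))
```

Keeping the mapping in one place means every caller that builds a model path gets the same suffix, and an unknown backend fails early with a clear message.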
Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify that all potential backend values are handled in `_get_model_suffix`.

# Test: Search for the function usage. Expect: Only occurances of the new signature.
rg --type python $'_get_model_suffix'

Length of output: 67



Script:

#!/bin/bash
# Description: Verify that all potential backend values are handled in `_get_model_suffix`.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type py $'_get_model_suffix'

Length of output: 764



codecov bot commented May 9, 2024

Codecov Report

Attention: Patch coverage is 63.26531%, with 18 lines in your changes missing coverage. Please review.

Project coverage is 49.51%. Comparing base (e13c186) to head (cf68032).

| Files | Patch % | Lines |
| --- | --- | --- |
| dpgen/generator/run.py | 65.21% | 16 Missing ⚠️ |
| dpgen/simplify/simplify.py | 0.00% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##            devel    #1545      +/-   ##
==========================================
+ Coverage   49.48%   49.51%   +0.02%     
==========================================
  Files          83       83              
  Lines       14818    14844      +26     
==========================================
+ Hits         7333     7350      +17     
- Misses       7485     7494       +9     
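
The project percentages in the diff follow from the hit and line counts. The short check below recomputes them. Truncating to two decimals (rather than rounding) is an assumption that happens to reproduce the figures shown, and coverage_percent is a hypothetical helper.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage, truncated (not rounded) to two decimals."""
    # Integer arithmetic in hundredths of a percent avoids float rounding.
    return (hits * 10000 // lines) / 100


# Figures taken from the coverage diff: hits plus misses equals lines.
assert 7333 + 7485 == 14818  # base (devel)
assert 7350 + 7494 == 14844  # head (#1545)

base = coverage_percent(7333, 14818)
head = coverage_percent(7350, 14844)
print(base, head)
```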


Member

@njzjz njzjz left a comment

It looks good to me. Two minor changes should be made.

@thangckt thangckt requested a review from njzjz May 10, 2024 01:40
Signed-off-by: Jinzhe Zeng <[email protected]>
Member

@njzjz njzjz left a comment

It looks good to me. I expect someone to test it (I don't have time to do so)

Signed-off-by: Jinzhe Zeng <[email protected]>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Out of diff range and nitpick comments (2)
dpgen/generator/run.py (2)

[!TIP]
Codebase Verification

Several issues were identified in the copy_model function regarding error handling and path management:

  • The function changes the current working directory within a loop, which can lead to unexpected behavior if not managed correctly.
  • There is no explicit error handling for operations like os.symlink(), which can fail due to reasons such as lack of permissions or existing links.
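
One way to address both points is sketched below, purely as an illustration and not as the code in copy_model: it links files using explicit paths instead of changing the working directory, and it wraps os.symlink() so that failures surface with context. The helper name link_model and the file names used in the demo are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def link_model(src: Path, dst: Path) -> None:
    """Point dst at src via a symlink, without calling os.chdir()."""
    src = src.resolve()
    if dst.is_symlink():
        dst.unlink()  # replace a stale link left by an earlier run
    try:
        os.symlink(src, dst)
    except OSError as err:
        # Covers permission problems and an existing non-link file at dst.
        raise RuntimeError(f"cannot link {dst} -> {src}: {err}") from err


# Small demo in a temporary directory (file names are made up).
cwd_before = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "model.pb"
    source.write_text("placeholder")
    target = Path(tmp) / "graph.pb"
    link_model(source, target)
    link_model(source, target)  # calling twice refreshes the link
    linked_ok = target.is_symlink() and target.read_text() == "placeholder"
cwd_after = os.getcwd()
```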
Analysis chain

Line range hint 188-201: Check the logic in copy_model for potential file path issues and ensure robust error handling.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify file operations in `copy_model` function.

# Test: Search for file operations in the function. Expect: Proper error handling and path management.
rg --type python $'copy_model'

Length of output: 60



Script:

#!/bin/bash
# Description: Re-verify file operations in `copy_model` function.

# Test: Search for the `copy_model` function in the codebase to verify file operations. Expect: Proper error handling and path management.
rg 'def copy_model' -C 10

Length of output: 1416


Line range hint 719-755: Ensure that run_train securely constructs and executes system commands to prevent potential command injection vulnerabilities.

-    train_command += " --pt"
+    # Ensure that train_command is safely appended
+    train_command = shlex.join([train_command, "--pt"])
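
For reference when reading the suggestion above, shlex.join quotes each list element as a single shell token. The sketch below assumes a hypothetical command string (the real value of train_command is not shown in this conversation) and compares joining the whole string with joining its individual tokens.

```python
import shlex

# Hypothetical value; the actual train_command is not shown in this thread.
train_command = "dp train input.json"

# Passing the whole string as one element quotes it as a single token.
joined_whole = shlex.join([train_command, "--pt"])

# Splitting first keeps each word as its own token.
joined_tokens = shlex.join(shlex.split(train_command) + ["--pt"])

print(joined_whole)   # 'dp train input.json' --pt
print(joined_tokens)  # dp train input.json --pt
```

If the command string already contains several words, only the second form yields a command line that a shell would parse as separate arguments.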

@njzjz njzjz linked an issue May 10, 2024 that may be closed by this pull request
@Yi-FanLi
Collaborator

I have just tested it on my system. This PR works fine itself, but the PyTorch model cannot be correctly trained due to the inconsistency with the TensorFlow backend. See deepmodeling/deepmd-kit#3770 and deepmodeling/deepmd-kit#3771.

@wanghan-iapcm wanghan-iapcm merged commit 9d29459 into deepmodeling:devel May 11, 2024
7 checks passed
Development

Successfully merging this pull request may close these issues.

[Feature Request] Support different backends for DeePMD-kit
4 participants