Build images for mainnet full nodes #2032

Merged
merged 1 commit into from
Aug 5, 2024

Conversation

roy-dydx
Contributor

@roy-dydx roy-dydx commented Aug 5, 2024

Changelist

Build these images here instead of in a separate repo.

Test Plan

Cherry-picked this PR onto release branch and ran workflows to test images.

Author/Reviewer Checklist

  • If this PR has changes that result in a different app state given the same prior state and transaction list, manually add the state-breaking label.
  • If the PR has breaking postgres changes to the indexer add the indexer-postgres-breaking label.
  • If this PR isn't state-breaking but has changes that modify behavior in PrepareProposal or ProcessProposal, manually add the label proposal-breaking.
  • If this PR is one of many that implement a specific feature, manually label them all feature:[feature-name].
  • If you want mergify-bot to automatically create a PR to backport your change to a release branch, manually add the label backport/[branch-name].
  • Manually add any of the following labels: refactor, chore, bug.

Summary by CodeRabbit

  • New Features
    • Introduced a GitHub Actions workflow for automated Docker image building and deployment to AWS ECR.
    • Added scripts for setting up and managing full nodes, including mainnet.sh, snapshot.sh, start.sh, and vars.sh, enhancing operational efficiency for the dYdX mainnet.
  • Bug Fixes
    • Modified .gitignore to now track the bin directory, allowing for better management of compiled binaries.

Contributor

coderabbitai bot commented Aug 5, 2024

Walkthrough

This PR adds a GitHub Actions workflow that builds Docker images for the dYdX protocol and pushes them to AWS ECR. It also adds a Dockerfile and scripts for setting up and running full nodes on mainnet, covering snapshot creation and environment variable configuration.

Changes

| Files | Change Summary |
| --- | --- |
| .github/workflows/protocol-build-and-push-mainnet.yml | New workflow to automate Docker image build and push to AWS ECR on certain events. |
| protocol/.gitignore | Removed line ignoring bin directory, allowing tracking of compiled binaries. |
| protocol/testing/mainnet/Dockerfile | New Dockerfile for mainnet testing, setting up the environment with necessary tools and dependencies. |
| protocol/testing/mainnet/mainnet.sh | New script to initialize a non-validating full node, install prerequisites, and set up binaries. |
| protocol/testing/mainnet/snapshot.sh | New script for managing blockchain snapshots, including uploads to S3. |
| protocol/testing/mainnet/start.sh | New startup script for full nodes that configures and runs cosmovisor. |
| protocol/testing/mainnet/vars.sh | New script to set environment variables for full node configuration, specifying version and indices. |

Sequence Diagram(s)

sequenceDiagram
    participant PR as Pull Request
    participant GitHub as GitHub Actions
    participant ECR as AWS ECR
    participant Node as Full Node

    PR->>GitHub: Trigger on push or PR
    GitHub->>ECR: Build and push Docker image
    Node->>Node: Initialize full node
    Node->>Node: Create snapshots periodically
    Node->>ECR: Upload snapshots to S3

🐰 In a meadow where the code does flow,
New scripts and workflows help us grow!
With Docker images ready to fly,
Full nodes running, oh my, oh my!
Snapshots uploaded, all neat and bright,
Hooray for the changes that bring such delight! 🌼


Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 10d9f1e and b2af0e5.

Files selected for processing (7)
  • .github/workflows/protocol-build-and-push-mainnet.yml (1 hunks)
  • protocol/.gitignore (1 hunks)
  • protocol/testing/mainnet/Dockerfile (1 hunks)
  • protocol/testing/mainnet/mainnet.sh (1 hunks)
  • protocol/testing/mainnet/snapshot.sh (1 hunks)
  • protocol/testing/mainnet/start.sh (1 hunks)
  • protocol/testing/mainnet/vars.sh (1 hunks)
Files skipped from review due to trivial changes (1)
  • protocol/.gitignore
Additional context used
Shellcheck
protocol/testing/mainnet/vars.sh

[warning] 4-4: CURRENT_VERSION_DIR appears unused. Verify use (or export if used externally).

(SC2034)


[warning] 7-7: LAST_FULL_NODE_INDEX appears unused. Verify use (or export if used externally).

(SC2034)

protocol/testing/mainnet/snapshot.sh

[warning] 54-54: RPC_ADDRESS appears unused. Verify use (or export if used externally).

(SC2034)


[warning] 85-85: p2p_seeds is referenced but not assigned.

(SC2154)


[warning] 85-85: dd_agent_host is referenced but not assigned.

(SC2154)


[warning] 87-87: upload_period is referenced but not assigned.

(SC2154)


[warning] 88-88: Quote this to prevent word splitting.

(SC2046)


[warning] 92-92: s3_snapshot_bucket is referenced but not assigned.

(SC2154)


[warning] 93-93: Use "${var:?}" to ensure this never expands to / .

(SC2115)

Additional comments not posted (28)
protocol/testing/mainnet/vars.sh (4)

1-1: Shebang line is correct.

The shebang line is correctly set for a bash script.


2-2: Error handling is correctly set.

The set -eo pipefail command ensures the script exits on error and handles pipeline errors correctly.
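
As a rough illustration of what these two options change, the sketch below runs three tiny one-liners in separate bash processes. The one-liners are made up for the demo and are not taken from vars.sh; each case runs in its own process so that a failure cannot end the demo itself.

```shell
# Default behaviour: a pipeline reports the status of its last command (cat),
# so the failure of `false` is hidden.
plain_status=$(bash -c 'false | cat; echo $?')

# With pipefail, the pipeline reports the failing command's status instead.
pipefail_status=$(bash -c 'set -o pipefail; false | cat; echo $?')

# With -e, the script stops at the first failing command; the echo never runs.
errexit_output=$(bash -c 'set -e; false; echo "still running"' || true)

echo "plain=${plain_status} pipefail=${pipefail_status} errexit_output='${errexit_output}'"
# prints: plain=0 pipefail=1 errexit_output=''
```

Together, the two options make a script stop when any stage of a pipeline fails, not only when the last stage does.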


4-4: Verify the usage of CURRENT_VERSION_DIR.

The CURRENT_VERSION_DIR variable is declared but not used within the script. Verify if it is intended for external use or future use within the script.



7-7: Verify the usage of LAST_FULL_NODE_INDEX.

The LAST_FULL_NODE_INDEX variable is declared but not used within the script. Verify if it is intended for external use or future use within the script.


protocol/testing/mainnet/Dockerfile (7)

1-1: Verify the base image dydxprotocol-base.

Ensure that the base image dydxprotocol-base is appropriate and available.


3-3: Dependencies are correctly installed.

The apk add command installs bash, jq, and aws-cli, which are necessary dependencies.


4-4: cosmovisor is correctly installed.

The go install command installs cosmovisor, which is necessary for the Docker image.


6-6: Verify the source directory.

Ensure that the ./testing/mainnet/ directory exists and contains the necessary files to be copied to /dydxprotocol/.


8-9: Environment variable and working directory are correctly set.

The ENV and WORKDIR commands set the environment variable HOME and the working directory to /dydxprotocol/.


11-11: Verify the mainnet.sh script.

Ensure that the /dydxprotocol/mainnet.sh script exists and is executable.


13-13: Verify the start.sh script.

Ensure that the /dydxprotocol/start.sh script exists and is executable.

protocol/testing/mainnet/start.sh (6)

1-1: Shebang line is correct.

The shebang line is correctly set for a bash script.


2-2: Error handling is correctly set.

The set -eo pipefail command ensures the script exits on error and handles pipeline errors correctly.


4-7: Comments are clear and accurate.

The comments explain the purpose and usage of the script clearly and accurately.


9-9: Verify the vars.sh script.

Ensure that the vars.sh script exists and is correctly sourced.


11-17: Setup process for cosmovisor is correct.

The setup process for cosmovisor is correct, copying binaries to the home directories of full nodes.


19-19: Command execution is correct.

The cosmovisor command is executed with any passed arguments, ensuring the node starts correctly.
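
For context on why the way arguments are forwarded matters, here is a small sketch. It assumes the script passes its arguments on with "$@"; start.sh itself is not shown in this conversation, and the helper functions and the sample path are hypothetical.

```shell
# Hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

# "$@" passes each original argument through as one word.
forward_quoted() { count_args "$@"; }
# Unquoted $@ is split again on whitespace, so one argument can become several.
forward_unquoted() { count_args $@; }

quoted_count=$(forward_quoted "--home" "/path with spaces")
unquoted_count=$(forward_unquoted "--home" "/path with spaces")
echo "quoted=${quoted_count} unquoted=${unquoted_count}"   # prints quoted=2 unquoted=4
```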

.github/workflows/protocol-build-and-push-mainnet.yml (3)

3-11: LGTM!

The workflow triggers are correctly specified for the relevant branches.


13-18: LGTM!

The job configuration is standard and correctly sets the working directory.


20-48: LGTM! But verify the presence of required secrets.

The steps are well-defined and follow best practices.

However, ensure that the secrets AWS_ACCESS_KEY_ID_VALIDATOR_MAINNET and AWS_SECRET_ACCESS_KEY_VALIDATOR_MAINNET are correctly set up in the repository.

protocol/testing/mainnet/mainnet.sh (4)

1-8: LGTM!

The initial setup is standard and follows best practices.


20-45: LGTM!

The function is correctly defined and follows best practices.


47-78: LGTM!

The function is well-defined and follows best practices.


16-18: LGTM! But verify the environment.

The function is correctly defined to install dasel and jq using apk.

However, ensure that the environment uses Alpine Linux.

protocol/testing/mainnet/snapshot.sh (4)

1-28: LGTM!

The initial setup is standard and follows best practices.


30-39: LGTM!

The environment setup is well-defined and follows best practices.


48-80: LGTM!

The snapshot configuration is well-defined and follows best practices.



41-43: LGTM! But verify the environment.

The function is correctly defined to install dasel, jq, and curl using apk.

However, ensure that the environment uses Alpine Linux.

# Do not index tx_index.db
sed -i 's/indexer = "kv"/indexer = "null"/' /dydxprotocol/chain/.full-node-2/config/config.toml

# TODO: add metrics around snapshot upload latency/frequency/success rate
Contributor

Add metrics around snapshot upload latency/frequency/success rate.

The TODO comment indicates that metrics are missing.

Do you want me to generate the code for adding these metrics or open a GitHub issue to track this task?
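
One possible starting point, sketched under assumptions: the wrapper below times an arbitrary command and prints one log line with the outcome and duration. The function name and the log format are hypothetical, and nothing here sends data to a metrics backend; it only produces a line that a log-based metric could be built from.

```shell
# Hypothetical wrapper: run the given command, then log its result and duration.
timed_upload() {
  start=$(date +%s)
  if "$@"; then result=success; else result=failure; fi
  end=$(date +%s)
  echo "snapshot_upload result=${result} duration_seconds=$((end - start))"
  # Propagate the outcome so the caller can still react to a failed upload.
  [ "$result" = success ]
}

# Stand-ins for a succeeding and a failing upload command.
timed_upload true
timed_upload false || echo "upload failed, continuing"
```

In snapshot.sh the wrapped command would be the existing aws s3 cp call, for example `timed_upload aws s3 cp "${SNAP_PATH}/${SNAP_NAME}" "s3://${s3_snapshot_bucket}/" --region ap-northeast-1`. Counting the emitted lines over time gives frequency and success rate.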

SNAP_NAME=$(echo "${CHAIN_ID}_$(date '+%Y-%m-%d-%H-%M').tar.gz")
tar cvzf ${SNAP_PATH}/${SNAP_NAME} ${DATA_PATH}
aws s3 cp ${SNAP_PATH}/${SNAP_NAME} s3://${s3_snapshot_bucket}/ --region ap-northeast-1 --debug || true
rm -rf ${SNAP_PATH}/${SNAP_NAME}
Contributor

Ensure the variable never expands to /.

Use ${var:?} so that the command fails when the variable is unset or empty, instead of expanding to a path directly under /.

-  rm -rf ${SNAP_PATH}/${SNAP_NAME}
+  rm -rf ${SNAP_PATH:?}/${SNAP_NAME}
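
To see the guard in action, here is a minimal sketch with a hypothetical helper, remove_snapshot, that checks both parts of the path. It uses rm -f rather than rm -rf on the assumption that the snapshot is a single archive file.

```shell
# Hypothetical helper: delete a snapshot archive only if both path parts are set.
remove_snapshot() {
  # Run in a subshell so a failed ${var:?} check ends only the subshell.
  (
    snap_path=$1
    snap_name=$2
    rm -f -- "${snap_path:?snap_path is empty}/${snap_name:?snap_name is empty}"
  )
}

# An empty path is rejected before rm is ever invoked.
if ! remove_snapshot "" "example.tar.gz" 2>/dev/null; then
  echo "refused to delete: empty snapshot path"
fi
```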

cosmovisor run start --log_level info --home /dydxprotocol/chain/.full-node-2 --p2p.seeds "${p2p_seeds}" --non-validating-full-node=true --dd-agent-host=${dd_agent_host} &

sleep ${upload_period}
kill -TERM $(pidof cosmovisor)
Contributor

Quote the command substitution to prevent word splitting.

The command substitution $(pidof cosmovisor) should be quoted to prevent word splitting.

-  kill -TERM $(pidof cosmovisor)
+  kill -TERM "$(pidof cosmovisor)"
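
One caveat with the quoted form: pidof can print several PIDs separated by spaces, and quoted they would reach kill as a single argument. An alternative, sketched below, is to record the PID of the background job with $! and signal exactly that process. In this demo `sleep 30` stands in for the cosmovisor process and `sleep 1` for the upload period; both are assumptions for illustration only.

```shell
# Stand-in for the long-running node process started in the background.
sleep 30 &
node_pid=$!            # PID of the most recently started background job

sleep 1                # stand-in for waiting out the upload period
kill -TERM "$node_pid"

# wait returns the job's exit status; 143 is 128 + 15 (SIGTERM).
status=0
wait "$node_pid" || status=$?
echo "node exited with status ${status}"   # prints: node exited with status 143
```

Waiting on the PID also ensures the process has actually stopped before the data directory is archived.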


SNAP_NAME=$(echo "${CHAIN_ID}_$(date '+%Y-%m-%d-%H-%M').tar.gz")
tar cvzf ${SNAP_PATH}/${SNAP_NAME} ${DATA_PATH}
aws s3 cp ${SNAP_PATH}/${SNAP_NAME} s3://${s3_snapshot_bucket}/ --region ap-northeast-1 --debug || true
Contributor

Assign value to s3_snapshot_bucket.

The variable s3_snapshot_bucket is referenced but not assigned.

-  aws s3 cp ${SNAP_PATH}/${SNAP_NAME} s3://${s3_snapshot_bucket}/ --region ap-northeast-1 --debug || true
+  aws s3 cp ${SNAP_PATH}/${SNAP_NAME} s3://${s3_snapshot_bucket:-default_bucket}/ --region ap-northeast-1 --debug || true
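
It may help to compare what the suggested `:-` fallback does with the stricter `:?` form. The sketch below only builds the target string and never calls aws. The bucket name default_bucket comes from the suggestion above; the error message text is an assumption.

```shell
unset s3_snapshot_bucket

# ":-" silently substitutes the fallback when the variable is unset or empty,
# so the upload target becomes a bucket literally named "default_bucket".
fallback_target="s3://${s3_snapshot_bucket:-default_bucket}/"
echo "$fallback_target"    # prints s3://default_bucket/

# ":?" stops with an error instead. It runs in a subshell here so that the
# demo itself keeps going after the check fails.
strict_result="accepted"
if ! ( : "${s3_snapshot_bucket:?s3_snapshot_bucket must be set}" ) 2>/dev/null; then
  strict_result="refused"
fi
echo "$strict_result"      # prints refused
```

If the bucket is expected to come from the container environment, failing fast at startup is arguably safer than uploading to a placeholder bucket.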

# p2p.seeds taken from --p2p.persistent_peers flag of full node
cosmovisor run start --log_level info --home /dydxprotocol/chain/.full-node-2 --p2p.seeds "${p2p_seeds}" --non-validating-full-node=true --dd-agent-host=${dd_agent_host} &

sleep ${upload_period}
Contributor

Quote the variable to prevent word splitting.

The variable upload_period should be quoted to prevent word splitting.

-  sleep ${upload_period}
+  sleep "${upload_period}"

# TODO: add metrics around snapshot upload latency/frequency/success rate
while true; do
# p2p.seeds taken from --p2p.persistent_peers flag of full node
cosmovisor run start --log_level info --home /dydxprotocol/chain/.full-node-2 --p2p.seeds "${p2p_seeds}" --non-validating-full-node=true --dd-agent-host=${dd_agent_host} &
Contributor

Assign values to p2p_seeds and dd_agent_host.

The variables p2p_seeds and dd_agent_host are referenced but not assigned.

-  cosmovisor run start --log_level info --home /dydxprotocol/chain/.full-node-2 --p2p.seeds "${p2p_seeds}" --non-validating-full-node=true --dd-agent-host=${dd_agent_host} &
+  cosmovisor run start --log_level info --home /dydxprotocol/chain/.full-node-2 --p2p.seeds "${p2p_seeds:-default_seed}" --non-validating-full-node=true --dd-agent-host=${dd_agent_host:-default_host} &


# Define the mapping from version to URL
declare -A version_to_url
# version_to_url["v5.1.0"]="https://github.com/dydxprotocol/v4-chain/releases/download/protocol%2Fv5.1.0-dev4/dydxprotocold-v5.1.0-dev4-linux-amd64.tar.gz"
Contributor

Doesn't this still need to be defined?

Contributor Author

No. Since mainnet is on v5.1.x, we are using the current commit for v5.1.0 in this PR.

@roy-dydx roy-dydx merged commit 340ea7c into main Aug 5, 2024
15 of 17 checks passed
@roy-dydx roy-dydx deleted the roy/mainnettest branch August 5, 2024 18:45
Contributor Author

roy-dydx commented Aug 5, 2024

@Mergifyio backport release/protocol/v5.1.x

Contributor

mergify bot commented Aug 5, 2024

backport release/protocol/v5.1.x

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Aug 5, 2024
Contributor Author

roy-dydx commented Aug 5, 2024

@Mergifyio backport release/protocol/v5.2.x

roy-dydx added a commit that referenced this pull request Aug 5, 2024
Contributor

mergify bot commented Aug 5, 2024

backport release/protocol/v5.2.x

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Aug 5, 2024
roy-dydx added a commit that referenced this pull request Aug 6, 2024
Contributor Author

roy-dydx commented Aug 7, 2024

@Mergifyio backport release/protocol/v6.x

Contributor

mergify bot commented Aug 7, 2024

backport release/protocol/v6.x

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Aug 7, 2024
roy-dydx added a commit that referenced this pull request Aug 7, 2024