
feat:finish backup and restore #2968

Merged
merged 4 commits into OpenAtomFoundation:unstable
Dec 9, 2024

Conversation

@buzhimingyonghu (Contributor) commented Dec 6, 2024

#2970

Provide BackupRepo

Follow the KubeBlocks docs: https://www.kubeblocks.io/docs/preview/user_docs/maintenance/backup-and-restore/backup/backup-repo
Follow [tools/kubeblocks_helm/BackupRepo_config]
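For reference, the command in tools/kubeblocks_helm/BackupRepo_config (quoted in full in the review below; the stray trailing backslash after --default is dropped here) looks like this, with the <...> placeholders filled in for your S3-compatible storage:

```bash
kbcli backuprepo create my-repo \
  --provider s3 \
  --region <...> \
  --bucket <...> \
  --endpoint <...> \
  --access-key-id <...> \
  --secret-access-key <...> \
  --access-method <...> \
  --default
```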

Create a Backup

kbcli cluster backup pika-master-slave-cluster --method datafile


View Backups

kbcli cluster list-backups


Restore a Cluster from a Backup
Select a backup and create a cluster:

kbcli cluster restore test-pika --backup backup-default-pika-master-slave-cluster-20241203170414



View the Backup Policy of a Cluster

kbcli cluster list-backup-policy mycluster


Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced a command for creating a backup repository using kbcli.
    • Added new scripts for backing up and restoring Redis databases.
    • New YAML configurations for managing backup actions and policies within the Kubeblocks framework.
    • Added commands for installing Helm charts for pika-master-slave and pika-master-slave-cluster.
  • Documentation

    • Updated README to include instructions for backing up and restoring a cluster.
  • Bug Fixes

    • Improved error handling in backup and restore scripts.
  • Chores

    • Added uninstall scripts for removing Helm releases.


coderabbitai bot commented Dec 6, 2024

Walkthrough

A new command for the kbcli tool has been introduced to create a backup repository, allowing users to specify various configuration parameters. Additionally, the documentation has been updated to include a section on backing up and restoring clusters. Several installation scripts have been modified or added, including the removal of uninstallation commands and the introduction of new Helm installation commands. New YAML configuration files for backup actions and policies have been created, and enhancements have been made to existing scripts for better error handling and control flow.

Changes

File Change Summary
tools/kubeblocks_helm/BackupRepo_config Added command kbcli backuprepo create my-repo with parameters for backup repository configuration.
tools/kubeblocks_helm/README.md Added section on backing up/restoring cluster; included commands for backup and restore operations.
tools/kubeblocks_helm/install.sh Removed uninstallation commands and sleep command; streamlined installation process for Helm charts.
tools/kubeblocks_helm/install_ms.sh Added commands to install pika-master-slave and pika-master-slave-cluster Helm charts.
tools/kubeblocks_helm/installdebug.sh File deleted; contained commented-out Helm commands and helm template command.
tools/kubeblocks_helm/pika-master-slave-cluster/templates/role.yaml Updated Kubernetes Role with new rules for clusters and pods resources.
tools/kubeblocks_helm/pika-master-slave/dataprotection/backup.sh New script for backing up Redis database with error handling and backup process management.
tools/kubeblocks_helm/pika-master-slave/dataprotection/restore.sh Enhanced restore script with error handling and control flow improvements.
tools/kubeblocks_helm/pika-master-slave/templates/backupactionset.yaml New YAML file for managing backup actions within Kubeblocks framework.
tools/kubeblocks_helm/pika-master-slave/templates/backuppolicytemplate.yaml New YAML file for defining a backup policy for the Pika cluster.
tools/kubeblocks_helm/pika-master-slave/templates/componentdefinition-pika.yaml Modified ComponentDefinition to include new roles and lifecycle actions, and added a new container.
tools/kubeblocks_helm/uninstall.sh New script to uninstall pika-cluster and pika Helm releases.
tools/kubeblocks_helm/uninstall_ms.sh New script to uninstall pika-master-slave-cluster and pika-master-slave.

Possibly related PRs

  • feat:Update kubeblocks version to 0.9 #2860: The changes in this PR involve updates to the README.md file, which includes commands related to backup and restore operations using the kbcli tool, directly connecting to the new backup repository command introduced in the main PR.

Suggested labels

✏️ Feature, 🧹 Updates

Suggested reviewers

  • chejinge
  • machinly

🐰 In the world of code, a new command takes flight,
To back up our treasures, making everything right.
With scripts and YAML, we dance and we play,
Protecting our data in a seamless ballet.
So hop along, friends, let’s cheer and rejoice,
For backups are here, let’s all raise our voice! 🎉


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between a1cdd24 and daf5bfc.

📒 Files selected for processing (1)
  • tools/kubeblocks_helm/BackupRepo_config (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tools/kubeblocks_helm/BackupRepo_config


@github-actions bot added the Invalid PR Title and 📒 Documentation (Improvements or additions to documentation) labels Dec 6, 2024
@coderabbitai bot left a comment

Actionable comments posted: 12

🧹 Outside diff range and nitpick comments (6)
tools/kubeblocks_helm/BackupRepo_config (1)

3-8: Improve command documentation with parameter descriptions

The placeholder values (<...>) should be documented with:

  1. Expected format and constraints for each parameter
  2. Whether parameters are optional or required
  3. Example values for common scenarios (a documented sketch follows below)
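For instance, the config could carry the descriptions as leading comments. This is an illustrative sketch, not the file from the PR; the example values are placeholders, the env-var style echoes the secure-usage example later in this review, and the Tool/Mount access methods are an assumption based on the kbcli docs:

```bash
# Parameters for `kbcli backuprepo create`:
#   --provider           required; storage provider, e.g. s3, minio, oss
#   --region             required for s3, e.g. us-east-1
#   --bucket             required; name of an existing bucket
#   --endpoint           required for S3-compatible stores
#   --access-key-id      credential; prefer an environment variable over plain text
#   --secret-access-key  credential; never commit real values
#   --access-method      Tool or Mount (assumption; see the kbcli docs)
#   --default            mark this repo as the default BackupRepo
kbcli backuprepo create my-repo \
  --provider s3 \
  --region us-east-1 \
  --bucket my-backup-bucket \
  --endpoint https://s3.us-east-1.amazonaws.com \
  --access-key-id "${S3_ACCESS_KEY}" \
  --secret-access-key "${S3_SECRET_KEY}" \
  --access-method Tool \
  --default
```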
tools/kubeblocks_helm/pika-master-slave/templates/backuppolicytemplate.yaml (2)

22-26: Review backup schedule and retention policy

Consider the following for the backup schedule:

  1. Weekly backups might not be sufficient for production workloads
  2. 3-month retention period (3m) - ensure this meets compliance requirements
  3. Document the rationale for these choices in comments (see the sketch after this list)
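As a sketch of point 3, the rationale could sit next to the schedule fields. The field names here mirror common KubeBlocks BackupPolicyTemplate snippets and are assumptions, as are the cron expression and the compliance placeholders:

```yaml
schedules:
  - backupMethod: datafile
    enabled: true
    # Weekly backup: acceptable for <workload>; production clusters with
    # frequent writes may want a daily cron such as "0 2 * * *".
    cronExpression: "0 2 * * 0"
    # 3-month retention (3m): required by <compliance policy>; revisit as needed.
    retentionPeriod: 3m
```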

10-11: Fix YAML formatting issues

Remove trailing spaces in line 11 as flagged by yamllint.

-  - componentDefs: 
-      - pika 
+  - componentDefs:
+      - pika

tools/kubeblocks_helm/pika-master-slave/dataprotection/restore.sh (1)

6-11: Enhance directory validation error message

The error message could be more descriptive to help users understand and resolve the issue.

Consider updating the error message:

-  echo "${DATA_DIR} is not empty! Please make sure that the directory is empty before restoring the backup."
+  echo "Error: ${DATA_DIR} contains existing data and no .kb-data-protection file found."
+  echo "This could indicate:"
+  echo "1. An incomplete previous restore"
+  echo "2. Existing data that needs to be backed up first"
+  echo "Please backup or clear the directory before proceeding with restore."
tools/kubeblocks_helm/pika-master-slave/templates/backupactionset.yaml (2)

11-14: Consider adding validation for environment variables

The environment variables are critical for backup/restore operations but lack validation.

Consider adding a validation mechanism in the scripts to ensure these environment variables are set and contain valid values. Additionally, document the expected format and constraints for each variable in comments.
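A minimal sketch of such a guard, assuming DP_BACKUP_NAME and DATA_DIR (the variables used by the backup/restore scripts in this PR) are the critical ones:

```bash
#!/bin/bash
set -euo pipefail

# Fail fast if a required environment variable is unset or empty.
for var in DP_BACKUP_NAME DATA_DIR; do
  if [ -z "${!var:-}" ]; then
    echo "Error: required environment variable ${var} is not set" >&2
    exit 1
  fi
done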


21-23: Review progress sync interval

The 5-second progress sync interval might be too frequent for large backups.

Consider making the interval configurable through a template value:

       syncProgress:
         enabled: true
-        intervalSeconds: 5
+        intervalSeconds: {{ .Values.backup.syncIntervalSeconds | default 5 }}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 937ebde and a1cdd24.

📒 Files selected for processing (13)
  • tools/kubeblocks_helm/BackupRepo_config (1 hunks)
  • tools/kubeblocks_helm/README.md (1 hunks)
  • tools/kubeblocks_helm/install.sh (0 hunks)
  • tools/kubeblocks_helm/install_ms.sh (1 hunks)
  • tools/kubeblocks_helm/installdebug.sh (0 hunks)
  • tools/kubeblocks_helm/pika-master-slave-cluster/templates/role.yaml (1 hunks)
  • tools/kubeblocks_helm/pika-master-slave/dataprotection/backup.sh (1 hunks)
  • tools/kubeblocks_helm/pika-master-slave/dataprotection/restore.sh (1 hunks)
  • tools/kubeblocks_helm/pika-master-slave/templates/backupactionset.yaml (1 hunks)
  • tools/kubeblocks_helm/pika-master-slave/templates/backuppolicytemplate.yaml (1 hunks)
  • tools/kubeblocks_helm/pika-master-slave/templates/componentdefinition-pika.yaml (2 hunks)
  • tools/kubeblocks_helm/uninstall.sh (1 hunks)
  • tools/kubeblocks_helm/uninstall_ms.sh (1 hunks)
💤 Files with no reviewable changes (2)
  • tools/kubeblocks_helm/installdebug.sh
  • tools/kubeblocks_helm/install.sh
🧰 Additional context used
🪛 Shellcheck (0.10.0)
tools/kubeblocks_helm/install_ms.sh

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)

tools/kubeblocks_helm/pika-master-slave/dataprotection/backup.sh

[warning] 29-29: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)

tools/kubeblocks_helm/pika-master-slave/dataprotection/restore.sh

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)

tools/kubeblocks_helm/uninstall.sh

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)

tools/kubeblocks_helm/uninstall_ms.sh

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)

🪛 yamllint (1.35.1)
tools/kubeblocks_helm/pika-master-slave/templates/backuppolicytemplate.yaml

[error] 8-8: syntax error: could not find expected ':'

(syntax)


[error] 11-11: trailing spaces

(trailing-spaces)

tools/kubeblocks_helm/pika-master-slave/templates/backupactionset.yaml

[error] 8-8: syntax error: could not find expected ':'

(syntax)

tools/kubeblocks_helm/pika-master-slave/templates/componentdefinition-pika.yaml

[error] 51-51: trailing spaces

(trailing-spaces)


[error] 113-113: trailing spaces

(trailing-spaces)

🔇 Additional comments (8)
tools/kubeblocks_helm/install_ms.sh (1)

1-1: Verify chart existence and dependencies

Let's verify if the referenced Helm charts exist in the repository.

✅ Verification successful

Both Helm charts exist and have valid structure

The verification confirms that both Helm charts referenced in the install script exist and have the required structure:

  • ./tools/kubeblocks_helm/pika-master-slave/ contains Chart.yaml, values.yaml, and templates directory
  • ./tools/kubeblocks_helm/pika-master-slave-cluster/ contains Chart.yaml, values.yaml, and templates directory

Both charts appear to be properly structured Helm charts with all the necessary components, making the installation command valid.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if the Helm charts exist and have the required structure

# Check for chart directories
echo "Checking chart directories..."
fd -t d "pika-master-slave$" -x echo "Found chart directory: {}"

# Check for Chart.yaml files
echo -e "\nChecking Chart.yaml files..."
fd "Chart.yaml" -p "pika-master-slave" -x echo "Found Chart.yaml: {}"

# Check for values.yaml files
echo -e "\nChecking values.yaml files..."
fd "values.yaml" -p "pika-master-slave" -x echo "Found values.yaml: {}"

Length of output: 696


Script:

#!/bin/bash
# Let's check the content of the found directory and its parent
echo "Checking contents of the helm directory..."
ls -la ./tools/kubeblocks_helm/

echo -e "\nChecking if charts exist..."
ls -la ./tools/kubeblocks_helm/pika-master-slave/ 2>/dev/null || echo "pika-master-slave directory not accessible"
ls -la ./tools/kubeblocks_helm/pika-master-slave-cluster/ 2>/dev/null || echo "pika-master-slave-cluster directory not accessible"

# Let's also check if these might be helm chart archives (.tgz)
echo -e "\nChecking for helm chart archives..."
fd -e tgz . ./tools/kubeblocks_helm/ -x echo "Found chart archive: {}"

Length of output: 2044


tools/kubeblocks_helm/uninstall.sh (1)

1-1: 🛠️ Refactor suggestion

Essential improvements needed for robust uninstallation

The current implementation needs several improvements for reliability and maintainability:

Apply these changes to make the script more robust:

+#!/bin/bash
+set -eo pipefail
+
+# This script uninstalls the Pika cluster and its components
+
+echo "Uninstalling Pika cluster components..."
+
+# Function to safely uninstall a helm release
+uninstall_release() {
+    local release=$1
+    if helm status "$release" >/dev/null 2>&1; then
+        echo "Uninstalling $release..."
+        helm uninstall "$release" || echo "Failed to uninstall $release"
+    else
+        echo "Release $release not found, skipping..."
+    fi
+}
+
+# Uninstall releases
+uninstall_release "pika-cluster"
+uninstall_release "pika"
+
+echo "Uninstallation completed"

The improvements include:

  • Added shebang line for proper shell specification
  • Added error handling with set -eo pipefail
  • Added release existence validation
  • Added informative output messages
  • Modularized uninstallation logic into a function
  • Separated operations for better error handling

Let's verify if there are any additional cleanup steps needed:

✅ Verification successful

Let me gather more information about potential cleanup requirements by checking the Helm chart and related files.


Let me check the cluster definition and component templates to understand what resources need cleanup.


Enhanced uninstallation script with comprehensive resource cleanup

+#!/bin/bash
+set -eo pipefail
+
+# This script uninstalls Pika cluster and its components, including cleanup of associated resources
+
+echo "Starting Pika cluster uninstallation..."
+
+# Function to safely uninstall a helm release
+uninstall_release() {
+    local release=$1
+    if helm status "$release" >/dev/null 2>&1; then
+        echo "Uninstalling $release..."
+        helm uninstall "$release" || echo "Failed to uninstall $release"
+    else
+        echo "Release $release not found, skipping..."
+    fi
+}
+
+# Function to cleanup resources
+cleanup_resources() {
+    local namespace=$1
+    local component=$2
+    
+    echo "Cleaning up resources for $component..."
+    
+    # Clean up PVCs
+    kubectl delete pvc -l app.kubernetes.io/instance="$component" -n "$namespace" --ignore-not-found=true
+    
+    # Clean up ConfigMaps
+    kubectl delete cm -l app.kubernetes.io/instance="$component" -n "$namespace" --ignore-not-found=true
+    
+    # Clean up Services
+    kubectl delete svc -l app.kubernetes.io/instance="$component" -n "$namespace" --ignore-not-found=true
+}
+
+NAMESPACE=${NAMESPACE:-default}
+
+# Uninstall Helm releases
+uninstall_release "pika-cluster"
+uninstall_release "pika"
+
+# Cleanup resources
+cleanup_resources "$NAMESPACE" "pika-cluster"
+cleanup_resources "$NAMESPACE" "pika"
+
+echo "Uninstallation and cleanup completed"

The improved script includes:

  • Proper shell environment setup
  • Comprehensive cleanup of PVCs, ConfigMaps, and Services
  • Namespace-aware resource cleanup
  • Error handling and status messages
  • Resource cleanup for both pika and pika-cluster components
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for any potential leftover resources
echo "Checking for potential leftover resources..."

# Check for PVCs
kubectl get pvc -l app.kubernetes.io/instance=pika-cluster 2>/dev/null || true
kubectl get pvc -l app.kubernetes.io/instance=pika 2>/dev/null || true

# Check for ConfigMaps
kubectl get cm -l app.kubernetes.io/instance=pika-cluster 2>/dev/null || true
kubectl get cm -l app.kubernetes.io/instance=pika 2>/dev/null || true

Length of output: 352


Script:

#!/bin/bash
# Check for Helm chart files and resources
echo "Checking Helm chart resources..."

# Look for helm chart files
fd -e yaml -e tpl . tools/kubeblocks_helm

# Check for any cleanup or post-delete hooks in values files
rg -A 5 "helm.sh/hook.*delete" tools/kubeblocks_helm

# Check for any CRDs or other resources that might need cleanup
rg -l "kind: CustomResourceDefinition" tools/kubeblocks_helm

Length of output: 3215


Script:

#!/bin/bash
# Check cluster and component definitions for resource types
echo "Analyzing cluster resources..."

# Check cluster template for resource types
cat tools/kubeblocks_helm/pika-cluster/templates/cluster.yaml

# Check component definitions for resource types
cat tools/kubeblocks_helm/pika/templates/componentdefinition-pika-group.yaml
cat tools/kubeblocks_helm/pika/templates/componentdefinition-codis-proxy.yaml
cat tools/kubeblocks_helm/pika/templates/componentdefinition-pika-etcd.yaml

# Look for any service or statefulset definitions
rg -l "kind: (Service|StatefulSet)" tools/kubeblocks_helm

Length of output: 15075


tools/kubeblocks_helm/uninstall_ms.sh (1)

1-1: 🛠️ Refactor suggestion

Reduce code duplication with uninstall.sh

This script has identical issues to uninstall.sh and performs similar operations. Consider consolidating both scripts to reduce maintenance overhead.

Create a single unified script:

+#!/bin/bash
+set -eo pipefail
+
+# This script uninstalls Pika cluster components
+# Usage: ./uninstall.sh [--all|--master-slave]
+
+# Function to safely uninstall a helm release
+uninstall_release() {
+    local release=$1
+    if helm status "$release" >/dev/null 2>&1; then
+        echo "Uninstalling $release..."
+        helm uninstall "$release" || echo "Failed to uninstall $release"
+    else
+        echo "Release $release not found, skipping..."
+    fi
+}
+
+case "${1:-}" in
+    --all)
+        echo "Uninstalling all Pika components..."
+        uninstall_release "pika-cluster"
+        uninstall_release "pika"
+        uninstall_release "pika-master-slave-cluster"
+        uninstall_release "pika-master-slave"
+        ;;
+    --master-slave)
+        echo "Uninstalling Pika master-slave components..."
+        uninstall_release "pika-master-slave-cluster"
+        uninstall_release "pika-master-slave"
+        ;;
+    *)
+        echo "Usage: $0 [--all|--master-slave]"
+        exit 1
+        ;;
+esac
+
+echo "Uninstallation completed"

Consider these architectural improvements:

  1. Move all Helm-related scripts to a common directory
  2. Create a shared library for common functions (see the sketch after this list)
  3. Add documentation about the relationship between these components and the backup/restore feature
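A sketch of the shared-library idea from point 2; the file name common.sh is hypothetical, and uninstall_release() is the function already proposed above:

```bash
# common.sh -- shared helpers sourced by the install/uninstall scripts
uninstall_release() {
    local release=$1
    if helm status "$release" >/dev/null 2>&1; then
        echo "Uninstalling $release..."
        helm uninstall "$release" || echo "Failed to uninstall $release"
    else
        echo "Release $release not found, skipping..."
    fi
}
```

Each uninstall script would then reduce to `source "$(dirname "$0")/common.sh"` followed by the relevant uninstall_release calls.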

Let's verify the integration with backup functionality:

✅ Verification successful

Let me gather more information about the codebase structure and related scripts.


Based on the gathered information, I can now provide the final response:

Consolidate uninstall scripts and add backup cleanup

The review suggestion is valid but needs enhancement. The master-slave variant has additional backup-related components that need proper cleanup as shown in pika-master-slave/templates/backupactionset.yaml and backuppolicytemplate.yaml. Here's a refined suggestion:

+#!/bin/bash
+set -eo pipefail
+
+# This script uninstalls Pika cluster components
+# Usage: ./uninstall.sh [--all|--master-slave]
+
+# Function to safely uninstall a helm release
+uninstall_release() {
+    local release=$1
+    if helm status "$release" >/dev/null 2>&1; then
+        echo "Uninstalling $release..."
+        helm uninstall "$release" || echo "Failed to uninstall $release"
+    else
+        echo "Release $release not found, skipping..."
+    fi
+}
+
+# Function to cleanup backup resources
+cleanup_backups() {
+    echo "Cleaning up backup resources..."
+    kubectl delete backups.dataprotection.kubeblocks.io --all 2>/dev/null || true
+    kubectl delete backuppolicies --all 2>/dev/null || true
+}
+
+case "${1:-}" in
+    --all)
+        echo "Uninstalling all Pika components..."
+        uninstall_release "pika-cluster"
+        uninstall_release "pika"
+        uninstall_release "pika-master-slave-cluster"
+        uninstall_release "pika-master-slave"
+        cleanup_backups
+        ;;
+    --master-slave)
+        echo "Uninstalling Pika master-slave components..."
+        uninstall_release "pika-master-slave-cluster"
+        uninstall_release "pika-master-slave"
+        cleanup_backups
+        ;;
+    *)
+        echo "Usage: $0 [--all|--master-slave]"
+        exit 1
+        ;;
+esac
+
+echo "Uninstallation completed"

Key improvements:

  • Added backup resource cleanup for master-slave variant
  • Maintained alignment with documented uninstall procedures in README.md
  • Preserved the order of uninstallation (cluster first, then definition)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if there are any backup resources that need cleanup
echo "Checking for backup resources..."

# Check for backup CRDs and resources
kubectl get backups.dataprotection.kubeblocks.io 2>/dev/null || true
kubectl get backuprepositories.dataprotection.kubeblocks.io 2>/dev/null || true

Length of output: 201


Script:

#!/bin/bash
# Check for related helm scripts and their structure
fd -t f . tools/kubeblocks_helm/

# Look for any backup related code
rg -l "backup" tools/kubeblocks_helm/

# Check content of uninstall.sh for comparison
cat tools/kubeblocks_helm/uninstall.sh

# Look for any documentation about these scripts
cat tools/kubeblocks_helm/README.md 2>/dev/null || echo "No README found"

Length of output: 6398


tools/kubeblocks_helm/pika-master-slave-cluster/templates/role.yaml (2)

21-27: Validate pod access requirements

The role grants broad read access to all pods. Consider:

  1. Limiting pod access to specific labels/names
  2. Documenting why pod-level access is needed for backup operations
✅ Verification successful

Pod access permissions are appropriately scoped

The pod access permissions in the role are justified and appropriately scoped:

  1. The values.yaml explicitly documents that RBAC permissions are "used by cluster component pod" including event creation
  2. The role is tightly bound to a specific service account via RoleBinding in the same namespace
  3. The permissions are read-only (get, list, watch) without any modification rights
  4. The role is namespace-scoped rather than cluster-wide
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for pod interactions in backup/restore logic
rg -l "pods" | grep -i "backup\|restore"

Length of output: 40


Script:

#!/bin/bash
# Let's first check the broader context of the role.yaml file
cat tools/kubeblocks_helm/pika-master-slave-cluster/templates/role.yaml

# Then search for any pod-related operations in the codebase
rg -A 5 -B 5 "pods.*get|pods.*list|pods.*watch" 

# Also check if there are any comments or documentation about pod access
rg -l "pod.*access|access.*pod" --type md

Length of output: 1398


Script:

#!/bin/bash
# Let's check what this role is used for by examining the service account bindings
cat tools/kubeblocks_helm/pika-master-slave-cluster/templates/rolebinding.yaml

# Let's also check if there are any other related RBAC files
fd "role.*\.yaml|rolebinding.*\.yaml" tools/kubeblocks_helm/pika-master-slave-cluster/templates/

# Check for any pod-related operations in the chart's values or other template files
rg -A 3 "pod" tools/kubeblocks_helm/pika-master-slave-cluster/

Length of output: 1442


14-20: Review cluster-level permissions scope

The role grants broad read access to all clusters. Consider:

  1. Limiting scope to specific cluster names if possible
  2. Adding resource names restriction for better granularity
✅ Verification successful

Let me gather more context about the cluster-related permissions in the codebase.


Let me check the rolebinding to understand how this role is being used and its scope.


Let me check the helpers template to understand how the cluster name is scoped.


The current RBAC permissions are appropriately scoped

The role and rolebinding are already well-scoped for the following reasons:

  • The role is namespace-scoped (not cluster-wide)
  • The role name is templated with specific cluster name (kb-{{ include "clustername" . }})
  • The permissions are minimal and read-only for clusters (get/list/watch)
  • The role is bound to a specific service account that's unique to the cluster instance
  • The implementation follows the principle of least privilege by only granting necessary permissions
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for other roles/bindings that might grant similar permissions
rg -l "apps.kubeblocks.io" | grep -i "role\|binding"

Length of output: 120


Script:

#!/bin/bash
# Check the complete role definition and other RBAC-related configurations
cat tools/kubeblocks_helm/pika-master-slave-cluster/templates/role.yaml

# Look for any resource name restrictions or selectors
rg "resourceNames:" tools/kubeblocks_helm/pika-master-slave-cluster/templates/role.yaml

# Check if there are any rolebindings to understand the scope of access
fd rolebinding -t f tools/kubeblocks_helm/pika-master-slave-cluster/templates/

Length of output: 823


Script:

#!/bin/bash
# Check the rolebinding configuration
cat tools/kubeblocks_helm/pika-master-slave-cluster/templates/rolebinding.yaml

# Check if there are any cluster-specific templates or values that might restrict scope
fd "values.yaml|_helpers.tpl" -t f tools/kubeblocks_helm/pika-master-slave-cluster/

Length of output: 745


Script:

#!/bin/bash
# Check the helpers template for cluster name scoping
cat tools/kubeblocks_helm/pika-master-slave-cluster/templates/_helpers.tpl

# Check values.yaml for any default configurations
cat tools/kubeblocks_helm/pika-master-slave-cluster/values.yaml

Length of output: 3077

tools/kubeblocks_helm/pika-master-slave/templates/backuppolicytemplate.yaml (1)

14-21: Review backup method configuration

The backup configuration has potential concerns:

  1. snapshotVolumes: false means no volume snapshots - ensure this aligns with data consistency requirements
  2. Verify that the /data mountPath contains all necessary data for a complete backup
✅ Verification successful

Let's search for volume configurations more broadly to understand the data persistence and backup setup in the Pika templates.


Backup configuration is correctly aligned with data persistence setup

The backup configuration is properly configured based on the following findings:

  • The /data mountPath in backup policy matches the primary data volume mount in the Pika component definition
  • The volume mount is consistently used across containers (init, main pika container) for data persistence
  • The backup action set (pika-backup) is configured to use the same /data directory and includes proper backup/restore scripts
  • snapshotVolumes: false is appropriate since Pika's data files are being backed up directly from the mounted volume
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for volume mount configurations
rg -l "mountPath.*data" | grep -i "volume\|mount"

Length of output: 51


Script:

#!/bin/bash
# Search for volume mount configurations with broader pattern
rg -l "/data" --type yaml

# Search for volume configurations in pika templates
fd . "tools/kubeblocks_helm/pika-master-slave/templates" -e yaml -x cat {} 

Length of output: 6294

tools/kubeblocks_helm/pika-master-slave/templates/backupactionset.yaml (1)

19-20: Verify Redis image security implications

Running backup operations with the Redis image on target pod nodes requires careful security consideration.

Let's verify the image configuration and permissions:

✅ Verification successful

Redis backup operations appear safe with minimal privileges

The backup process performs standard Redis operations (BGSAVE) and file operations within a confined data directory. The script uses basic shell commands and doesn't require elevated privileges. Key observations:

  • Uses redis-cli for database operations
  • File operations limited to specific directories (log, db)
  • No privileged operations or system-level modifications detected
  • Runs with standard container permissions

While runOnTargetPodNode is set to true, the operations performed are safe and contained within the expected scope of a backup process.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check Redis image references in the codebase
rg -l 'image:.*redis' --type yaml

# Look for any security context configurations
rg -l 'securityContext' --type yaml

Length of output: 336


Script:

#!/bin/bash
# Check the content of files with Redis image references to understand the security context
cat tools/kubeblocks_helm/pika-master-slave/templates/backupactionset.yaml tools/kubeblocks_helm/pika-master-slave/templates/componentdefinition-pika.yaml

# Check for privilege-related configurations in the backup action
rg -A 5 'runOnTargetPodNode' --type yaml

Length of output: 4952


Script:

#!/bin/bash
# Check the backup script content to understand what operations are performed
cat tools/kubeblocks_helm/pika-master-slave/dataprotection/backup.sh

# Check if there are any specific security configurations for the backup process
rg -A 10 'securityContext|capabilities|privileged' tools/kubeblocks_helm/pika-master-slave/templates/

Length of output: 1153

tools/kubeblocks_helm/pika-master-slave/templates/componentdefinition-pika.yaml (1)

28-34: LGTM! Well-defined role configuration.

The roles are properly defined with clear attributes for master and slave nodes, following Kubernetes best practices.

@@ -0,0 +1 @@
helm install pika-master-slave ./pika-master-slave && helm install pika-master-slave-cluster ./pika-master-slave-cluster

🛠️ Refactor suggestion

Consider adding pre-installation checks and cleanup options

The script should verify the environment and provide cleanup options:

  1. Check if Helm is installed
  2. Check if charts are already installed
  3. Provide options to clean up existing installations

Would you like me to provide an enhanced implementation with these features?
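A minimal sketch of those checks; the release names follow the install command above, and the cleanup hint is illustrative:

```bash
#!/bin/bash
set -eo pipefail

# 1. Verify Helm is installed
command -v helm >/dev/null 2>&1 || { echo "Error: helm is not installed" >&2; exit 1; }

# 2. Refuse to proceed if the releases already exist
for release in pika-master-slave pika-master-slave-cluster; do
  if helm status "$release" >/dev/null 2>&1; then
    echo "Error: release '$release' already installed; run 'helm uninstall $release' first" >&2
    exit 1
  fi
done

# 3. Install the charts
helm install pika-master-slave ./pika-master-slave
helm install pika-master-slave-cluster ./pika-master-slave-cluster
```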



⚠️ Potential issue

Add shebang and proper error handling

The script needs several improvements for reliability and portability:

  1. Missing shebang line
  2. No error handling
  3. Relative paths could fail depending on execution context

Consider this improved implementation:

+#!/bin/bash
+set -eo pipefail
+
+# Script directory
+SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
+
+# Function to check if chart exists
+check_chart() {
+    local chart_path="$1"
+    if [ ! -d "$chart_path" ]; then
+        echo "Error: Chart not found at $chart_path"
+        exit 1
+    fi
+}
+
+# Check charts exist
+check_chart "${SCRIPT_DIR}/pika-master-slave"
+check_chart "${SCRIPT_DIR}/pika-master-slave-cluster"
+
+# Install charts with error handling
+echo "Installing pika-master-slave chart..."
+if ! helm install pika-master-slave "${SCRIPT_DIR}/pika-master-slave"; then
+    echo "Failed to install pika-master-slave chart"
+    exit 1
+fi
+
+echo "Installing pika-master-slave-cluster chart..."
+if ! helm install pika-master-slave-cluster "${SCRIPT_DIR}/pika-master-slave-cluster"; then
+    echo "Failed to install pika-master-slave-cluster chart"
+    exit 1
+fi
+
+echo "Installation completed successfully"
-helm install pika-master-slave ./pika-master-slave && helm install pika-master-slave-cluster ./pika-master-slave-cluster

Comment on lines 1 to 9
kbcli backuprepo create my-repo \
--provider s3 \
--region <...>\
--bucket <...> \
--endpoint <...>\
--access-key-id <...> \
--secret-access-key <...> \
--access-method <...> \
--default \

⚠️ Potential issue

Consider security implications of exposing credentials in command

The command contains sensitive credentials that should not be exposed in plain text. Consider:

  1. Using environment variables or credential files
  2. Supporting AWS IAM roles for authentication
  3. Adding documentation about secure credential management

Example secure usage:

export S3_ACCESS_KEY=<access_key>
export S3_SECRET_KEY=<secret_key>

kbcli backuprepo create my-repo \
  --provider s3 \
  --region ${S3_REGION} \
  --bucket ${S3_BUCKET} \
  --endpoint ${S3_ENDPOINT} \
  --access-key-id ${S3_ACCESS_KEY} \
  --secret-access-key ${S3_SECRET_KEY} \
  --access-method ${S3_ACCESS_METHOD} \
  --default

Comment on lines +15 to +20
backupFile="${DP_BACKUP_NAME}.tar.zst"
if [ "$(datasafed list ${backupFile})" == "${backupFile}" ]; then
datasafed pull -d zstd-fastest "${backupFile}" - | tar -xvf - -C ${DATA_DIR}
else
datasafed pull "${DP_BACKUP_NAME}.tar.gz" - | tar -xzvf - -C ${DATA_DIR}
fi

⚠️ Potential issue

Verify backup file existence before restoration

The script should explicitly verify the backup file's existence before attempting restoration.

Add explicit validation:

 backupFile="${DP_BACKUP_NAME}.tar.zst"
+if ! datasafed list | grep -q "${backupFile}\|${DP_BACKUP_NAME}.tar.gz"; then
+  echo "Error: Backup file not found in repository"
+  exit 1
+fi
+
 if [ "$(datasafed list ${backupFile})" == "${backupFile}" ]; then

Comment on lines +1 to +2
set -e
set -o pipefail

⚠️ Potential issue

Add shebang and improve error handling setup

The script is missing a shebang line and could benefit from stricter error handling.

Apply this diff to improve the script initialization:

+#!/bin/bash
+set -euo pipefail
-set -e
-set -o pipefail

sleep 1
done

cd ${DATA_DIR}

⚠️ Potential issue

Add error handling for directory change

The cd command should include error handling as suggested by shellcheck.

Add error handling for directory change:

-cd ${DATA_DIR}
+cd "${DATA_DIR}" || {
+    echo "Error: Failed to change to directory ${DATA_DIR}"
+    exit 1
+}

Comment on lines +31 to +36
if [ -d "log" ] || [ -d "db" ]; then
tar -cvf - ./log ./db | datasafed push -z zstd-fastest - "${DP_BACKUP_NAME}.tar.zst"
else
echo "no log db"
exit 1
fi

🛠️ Refactor suggestion

Improve error message and add size validation

The error message for missing directories is too brief, and there's no validation of directory sizes.

Enhance the validation and error handling:

+# Check directory existence and size
+MIN_BACKUP_SIZE=1024  # 1KB minimum
+
 if [ -d "log" ] || [ -d "db" ]; then
+    total_size=$(du -sb ./log ./db 2>/dev/null | awk '{sum += $1} END {print sum}')
+    if [ "$total_size" -lt "$MIN_BACKUP_SIZE" ]; then
+        echo "Warning: Backup size ($total_size bytes) is unusually small"
+    fi
     tar -cvf - ./log ./db | datasafed push -z zstd-fastest - "${DP_BACKUP_NAME}.tar.zst"
 else
-    echo "no log db"
+    echo "Error: Required directories 'log' and/or 'db' not found in ${DATA_DIR}"
+    echo "Please verify that the Redis instance is properly configured and running"
     exit 1
 fi

Comment on lines +21 to +27
while true; do
end_save=$(${connect_url} LASTSAVE)
if [ $end_save -ne $last_save ];then
break
fi
sleep 1
done

⚠️ Potential issue

Add timeout to BGSAVE wait loop

The while loop waiting for BGSAVE completion lacks a timeout mechanism.

Add a timeout to prevent infinite waiting:

+MAX_WAIT_SECONDS=300
+start_time=$(date +%s)
 while true; do
   end_save=$(${connect_url} LASTSAVE)
   if [ $end_save -ne $last_save ];then
      break
   fi
+  current_time=$(date +%s)
+  if [ $((current_time - start_time)) -gt $MAX_WAIT_SECONDS ]; then
+    echo "Error: BGSAVE operation timed out after ${MAX_WAIT_SECONDS} seconds"
+    exit 1
+  fi
   sleep 1
 done

Comment on lines +81 to +93
### Back up and restore a cluster
Ensure that the default BackupRepo is defined
Fellow the kubeblock docs [kubeblocks](https://www.kubeblocks.io/docs/preview/user_docs/maintenance/backup-and-restore/backup/backup-repo)

create backup
```bash
kbcli cluster backup pika-master-slave-cluster --method datafile
```

Select a backup and create a cluster.

```bash
kbcli cluster restore <clusterName> --backup <backup-name>

🛠️ Refactor suggestion

Documentation needs enhancement for better user guidance.

The backup and restore section requires several improvements:

  1. Add prerequisites and setup steps for BackupRepo
  2. Include the command to view existing backups: kbcli cluster list-backups
  3. Provide a complete example of the restore command with actual values
  4. Fix typo: "Fellow" should be "Follow"
  5. Add proper closure for the last code block

Here's the suggested improvement:

 ### Back up and restore a cluster
-Ensure that the default BackupRepo is defined
-Fellow the kubeblock docs [kubeblocks](https://www.kubeblocks.io/docs/preview/user_docs/maintenance/backup-and-restore/backup/backup-repo)
+#### Prerequisites
+1. Ensure that the default BackupRepo is defined:
+   ```bash
+   kbcli backuprepo create my-repo
+   ```
+2. Follow the Kubeblocks docs for detailed setup: [Backup Repository](https://www.kubeblocks.io/docs/preview/user_docs/maintenance/backup-and-restore/backup/backup-repo)
 
-create backup
+#### Create a backup
 ```bash
 kbcli cluster backup pika-master-slave-cluster --method datafile
 ```
 
-Select a backup and create a cluster.
+#### List available backups
+```bash
+kbcli cluster list-backups
+```
+
+#### Restore from a backup
 ```bash
-kbcli cluster restore <clusterName> --backup <backup-name>
+kbcli cluster restore test-pika --backup backup-default-pika-master-slave-cluster-20241203170414
+```


Comment on lines +36 to +53
roleProbe:
builtinHandler: custom
customHandler:
image: {{ include "redis.image" .}}
exec:
command:
- /bin/bash
- -c
- |
IFS=',' read -r -a pod_name_array <<< ${KB_CLUSTER_COMPONENT_POD_NAME_LIST}
# Ergodic name array
headless=".pika-master-slave-cluster-pika-headless"
for i in "${!pod_name_array[@]}"; do
pod_name=${pod_name_array[${i}]}
# Extracting the digital portion of the pod
pod_number=$(echo "${pod_name}" | grep -o '[0-9]*')
# If the pod number is greater than 0
if (( pod_number > 0 )); then
pod0_svc=${pod_name_array[0]}${headless}
echo "exec:redis-cli -h ${pod_name}${headless} -p 9221 slaveof ${pod0_svc}:9221"
redis-cli -h "${pod_name}${headless}" -p 9221 slaveof "${pod0_svc}" 9221
fi
done
preCondition: ComponentReady
- /bin/bash
- -c
- |
replication_info=$(redis-cli -h 127.0.0.1 -p 9221 info replication)
role=$(echo "${replication_info}" | awk -F':' '/^role:/ {print $2}' | tr -d '[:space:]')
if [ "$role"x = "master"x ]; then
echo -n "master"
else
echo -n "slave"
fi
periodSeconds: 1
timeoutSeconds: 1

🛠️ Refactor suggestion

Enhance roleProbe reliability with proper error handling.

The current implementation has several potential reliability issues:

  1. No error handling for redis-cli command failures
  2. No timeout handling for redis-cli command
  3. Hardcoded port number should be parameterized

Suggested improvement:

       customHandler:
         image: {{ include "redis.image" .}}
         exec:
           command:
             - /bin/bash
             - -c
             - |
-              replication_info=$(redis-cli -h 127.0.0.1 -p 9221 info replication)
+              REDIS_PORT=${REDIS_PORT:-9221}
+              timeout 5s redis-cli -h 127.0.0.1 -p ${REDIS_PORT} info replication > /tmp/replication_info || {
+                echo "Failed to get replication info" >&2
+                exit 1
+              }
+              replication_info=$(cat /tmp/replication_info)
               role=$(echo "${replication_info}" | awk -F':' '/^role:/ {print $2}' | tr -d '[:space:]')
+              if [ -z "$role" ]; then
+                echo "Failed to determine role" >&2
+                exit 1
+              fi
               if [ "$role"x = "master"x ]; then
                 echo -n "master"
               else
                 echo -n "slave"
-              fi  
+              fi

Comment on lines +92 to +112
- name: redis-cli-runner
image: {{ include "redis.image" . }}
imagePullPolicy: {{ include "pika.imagePullPolicy" . }}
command:
- /bin/bash
- -c
- |
while true; do
hostname=$KB_POD_NAME
if [[ "$hostname" =~ -([0-9]+)$ ]]; then
ordinal=${BASH_REMATCH[1]}
if [[ $ordinal -ge 1 && $ordinal -le 9 ]]; then
replication_info=$(redis-cli -h 127.0.0.1 -p 9221 info replication)
role=$(echo "${replication_info}" | awk -F':' '/^role:/ {print $2}' | tr -d '[:space:]')
if [ "$role"x = "master"x ]; then
redis-cli -h 127.0.0.1 -p 9221 slaveof pika-master-slave-cluster-pika-0.pika-master-slave-cluster-pika-headless 9221
fi
fi
fi
sleep 10
done

🛠️ Refactor suggestion

Improve redis-cli-runner container reliability and maintainability.

The current implementation has several areas for improvement:

  1. Infinite loop without proper termination
  2. Missing error handling for redis-cli commands
  3. Hardcoded sleep interval
  4. Magic numbers in ordinal check
  5. Limited logging for debugging

Suggested improvement:

       - name: redis-cli-runner
         image: {{ include "redis.image" . }}
         imagePullPolicy: {{ include "pika.imagePullPolicy" . }}
+        env:
+          - name: REDIS_PORT
+            value: "9221"
+          - name: SLEEP_INTERVAL
+            value: "10"
+          - name: MAX_REPLICAS
+            value: "9"
         command:
           - /bin/bash
           - -c
           - |
+            set -eo pipefail
+            
+            log() {
+              echo "$(date -u +"%Y-%m-%d %H:%M:%S") $1" >&2
+            }
+            
             while true; do
               hostname=$KB_POD_NAME
               if [[ "$hostname" =~ -([0-9]+)$ ]]; then
                 ordinal=${BASH_REMATCH[1]}
-                if [[ $ordinal -ge 1 && $ordinal -le 9 ]]; then
+                if [[ $ordinal -ge 1 && $ordinal -le ${MAX_REPLICAS} ]]; then
+                  log "Checking replication status for ordinal $ordinal"
-                  replication_info=$(redis-cli -h 127.0.0.1 -p 9221 info replication)
+                  replication_info=$(timeout 5s redis-cli -h 127.0.0.1 -p ${REDIS_PORT} info replication) || {
+                    log "Failed to get replication info"
+                    sleep ${SLEEP_INTERVAL}
+                    continue
+                  }
                   role=$(echo "${replication_info}" | awk -F':' '/^role:/ {print $2}' | tr -d '[:space:]')
                   if [ "$role"x = "master"x ]; then
-                    redis-cli -h 127.0.0.1 -p 9221 slaveof pika-master-slave-cluster-pika-0.pika-master-slave-cluster-pika-headless 9221
+                    log "Setting up replication for ordinal $ordinal"
+                    timeout 5s redis-cli -h 127.0.0.1 -p ${REDIS_PORT} slaveof pika-master-slave-cluster-pika-0.pika-master-slave-cluster-pika-headless ${REDIS_PORT} || {
+                      log "Failed to set up replication"
+                    }
                   fi
                 fi
               fi
-              sleep 10
+              sleep ${SLEEP_INTERVAL}
             done

@chejinge changed the title from "finish backup and restore" to "feat:finish backup and restore" Dec 6, 2024
@chejinge added the 3.5.6 and 4.0.2 labels and removed the 📒 Documentation label Dec 6, 2024
@baerwang merged commit 713e6fb into OpenAtomFoundation:unstable Dec 9, 2024
18 of 20 checks passed