- Python packages:
# Install required Python packages
pip3 install ruamel.yaml
- Terraform (latest version)
- OCI CLI (latest version)
Generate SSH keys without a passphrase (required for automated deployment):
# Generate SSH key pair without passphrase
ssh-keygen -t rsa -b 4096 -N "" -C "your_email@example.com"
- Create a file to store node details:
# Create nodes configuration file
touch /path/to/your/nodes.yaml
# Initial content should be:
nodes: []
- Update the following paths in your configuration files:
# In finalise_yaml.py
cluster_config_file = "/path/to/your/nodes.yaml"
# In main.tf
python3 -c "... /path/to/your/nodes.yaml ..." #at line 128
- Update the following variables in `terraform.tfvars`:
# Instance naming
worker_instance_display_name = "your-worker-name"
# OCI Authentication
user_ocid = "ocid1.user.oc1..your-user-ocid"
private_key_path = "/path/to/your/.oci/oci_private_key.pem"
fingerprint = "your:oci:api:key:fingerprint"
# SSH Configuration
ssh_public_keys = <<EOT
your-ssh-public-key-content
EOT
ssh_private_key_file = "/path/to/your/.ssh/id_rsa"
- Make the deployment script executable:
chmod +x create-nodes.sh
- Run the script:
./create-nodes.sh
- Choose from the menu:
  - Option 1: Deploy worker nodes
  - Option 2: Destroy infrastructure
  - Option 3: Exit
- For deployment:
  - Enter the number of worker nodes when prompted
  - Review the terraform plan
  - Confirm to apply
- For destruction:
  - Confirm destruction
  - Wait for completion
- Ensure all paths in configuration files are absolute paths
- SSH keys must be generated without a passphrase
- The nodes.yaml file must exist and be writable
- Ensure proper OCI permissions for resource creation/destruction
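The notes above can be checked before running the deployment script. The following is a minimal sketch, not part of the repository: the function name `preflight` and its arguments are hypothetical, and it only covers the path and `nodes.yaml` requirements (it does not verify the SSH key passphrase or OCI permissions).

```python
import os


def preflight(paths, nodes_file):
    """Return a list of problems; an empty list means every check passed.

    `paths` maps a setting name (for example "private_key_path" or
    "ssh_private_key_file" from terraform.tfvars) to its configured value.
    Both arguments are hypothetical inputs for this sketch.
    """
    problems = []

    # All paths in the configuration files must be absolute.
    for label, path in paths.items():
        if not os.path.isabs(path):
            problems.append(f"{label}: not an absolute path: {path}")

    # The nodes file must exist and be writable.
    if not os.path.isfile(nodes_file):
        problems.append(f"nodes file does not exist: {nodes_file}")
    elif not os.access(nodes_file, os.W_OK):
        problems.append(f"nodes file is not writable: {nodes_file}")

    return problems
```

A caller could print each returned problem and exit before invoking `create-nodes.sh` when the list is not empty.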
- Python 3.6+ and required packages:
pip3 install oci paramiko pyyaml python-dotenv
- OCI CLI configuration and authentication
- Edit the configuration in `oci_node_manager.py`. Update the following values in the `.env` file:
# .env.example
USER_OCID=your-user-ocid
FINGERPRINT=your-api-key-fingerprint
KEY_FILE_PATH=/path/to/your/oci_private_key.pem
SSH_PUBLIC_KEY_PATH=/path/to/your/id_rsa.pub
SSH_PRIVATE_KEY_PATH=/path/to/your/id_rsa
- Make the script executable:
chmod +x oci_node_manager.py
- Go to the function `create_instance` in `oci_node_manager.py` and modify the shape and size of the instances:
shape="VM.Standard.E4.Flex",
shape_config=oci.core.models.LaunchInstanceShapeConfigDetails(
ocpus=1,
memory_in_gbs=4
)
- Deploy a single node (the default base name is "rafay-paas"):
./oci_node_manager.py deploy --count 1
- Deploy multiple nodes:
./oci_node_manager.py deploy --count 5
- Deploy with custom instance naming:
./oci_node_manager.py deploy --count 3 --basename custom-worker # Creates: custom-worker-1, custom-worker-2, custom-worker-3
- Deploy with custom concurrency:
./oci_node_manager.py deploy --count 10 --concurrent 8
- Combine multiple flags:
./oci_node_manager.py deploy --count 5 --basename prod-node --concurrent 3
| Flag | Description | Default | Example |
|---|---|---|---|
| `--count` | Number of nodes to deploy | 1 | `--count 5` |
| `--basename` | Base name for instance naming | rafay-paas | `--basename worker` |
| `--concurrent` | Maximum concurrent operations | 5 | `--concurrent 8` |
- Nodes are automatically numbered sequentially
- Numbering continues from the last used index in nodes.yaml
- Example sequence with `--basename rafay-paas`:
  - First deployment: rafay-paas-1, rafay-paas-2, rafay-paas-3
  - Second deployment: rafay-paas-4, rafay-paas-5, rafay-paas-6
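The numbering rule above can be illustrated with a short sketch. It is not taken from `oci_node_manager.py`: the function name `next_hostnames` is hypothetical, and it assumes the index is tracked per base name and that the existing hostnames have already been read from nodes.yaml.

```python
import re


def next_hostnames(existing, basename, count):
    """Return `count` new hostnames continuing after the highest used index.

    `existing` is a list of hostnames already recorded (assumed to come
    from nodes.yaml); names that do not match `<basename>-<number>` are
    ignored.
    """
    pattern = re.compile(rf"^{re.escape(basename)}-(\d+)$")
    used = []
    for name in existing:
        match = pattern.match(name)
        if match:
            used.append(int(match.group(1)))

    # Start at 1 when no node with this base name exists yet.
    start = max(used, default=0) + 1
    return [f"{basename}-{i}" for i in range(start, start + count)]


first = next_hostnames([], "rafay-paas", 3)
second = next_hostnames(first, "rafay-paas", 3)
print(first)   # ['rafay-paas-1', 'rafay-paas-2', 'rafay-paas-3']
print(second)  # ['rafay-paas-4', 'rafay-paas-5', 'rafay-paas-6']
```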
- Stop specific nodes by hostname:
./oci_node_manager.py stop --hostnames rafay-paas-1 rafay-paas-2
- Start specific nodes by hostname:
./oci_node_manager.py start --hostnames rafay-paas-1 rafay-paas-2
- Destroy specific nodes by hostname:
./oci_node_manager.py destroy --hostnames rafay-paas-1 rafay-paas-2
- Destroy all deployed nodes:
./oci_node_manager.py destroy
- Destroy with custom concurrency:
./oci_node_manager.py destroy --concurrent 8
- Combine multiple flags:
./oci_node_manager.py destroy --hostnames rafay-paas-1 rafay-paas-2 --concurrent 3
| Flag | Description | Default | Example |
|---|---|---|---|
| `--hostnames` | Specific nodes to destroy | None (destroys all) | `--hostnames worker-1 worker-2` |
| `--concurrent` | Maximum concurrent operations | 5 | `--concurrent 8` |
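For readers who want to see how the documented flags fit together, here is a sketch of a command-line parser built with `argparse`. It mirrors the flags and defaults listed in the tables and the subcommands shown in the examples, but it is an assumption about the interface, not the parser used by `oci_node_manager.py`; `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented subcommands and flags."""
    parser = argparse.ArgumentParser(prog="oci_node_manager.py")
    sub = parser.add_subparsers(dest="command")
    sub.required = True  # a subcommand must be given

    deploy = sub.add_parser("deploy", help="Deploy worker nodes")
    deploy.add_argument("--count", type=int, default=1,
                        help="Number of nodes to deploy")
    deploy.add_argument("--basename", default="rafay-paas",
                        help="Base name for instance naming")
    deploy.add_argument("--concurrent", type=int, default=5,
                        help="Maximum concurrent operations")

    destroy = sub.add_parser("destroy", help="Destroy nodes")
    destroy.add_argument("--hostnames", nargs="+", default=None,
                         help="Specific nodes to destroy (default: all)")
    destroy.add_argument("--concurrent", type=int, default=5,
                         help="Maximum concurrent operations")

    # stop/start are shown only with --hostnames in the examples.
    for name in ("stop", "start"):
        cmd = sub.add_parser(name, help=f"{name.capitalize()} nodes")
        cmd.add_argument("--hostnames", nargs="+", default=None)

    return parser


args = build_parser().parse_args(
    ["deploy", "--count", "5", "--basename", "prod-node", "--concurrent", "3"]
)
print(args.command, args.count, args.basename, args.concurrent)
```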
- Concurrent node creation and deletion
- Automatic YAML configuration generation
- Progress tracking and detailed feedback
- Error handling and recovery
- Configurable concurrency limits
- Manage nodes by hostname for easier identification
- `nodes.yaml`: Contains node configuration details
- `instance_ids.txt`: Backup of instance IDs for tracking
- The script uses the VM.Standard.E4.Flex shape with 1 OCPU and 4 GB of RAM by default
- Instances are named as "rafay-paas-1", "rafay-paas-2", etc.
- The script automatically configures iptables rules on the instances
- Maximum concurrent operations can be adjusted using the `--concurrent` flag
- Default concurrency is set to 5 to respect API rate limits
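One common way to cap concurrent operations is a thread pool whose size equals the limit. The sketch below shows that pattern under stated assumptions: `run_bounded` and the `operation` callable are hypothetical, and the actual script may bound concurrency differently.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_bounded(operation, items, max_concurrent=5):
    """Run `operation(item)` for every item, at most `max_concurrent` at a time.

    Returns (results, errors): two dicts keyed by item, so one failed
    operation does not prevent the others from finishing.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        futures = {pool.submit(operation, item): item for item in items}
        for future in as_completed(futures):
            item = futures[future]
            try:
                results[item] = future.result()
            except Exception as exc:  # record the failure and keep going
                errors[item] = exc
    return results, errors


def fake_launch(hostname):
    """Stand-in for a real API call; placeholder logic only."""
    return f"{hostname}: launched"


results, errors = run_bounded(fake_launch, ["rafay-paas-1", "rafay-paas-2"], 5)
print(results, errors)
```

Keeping the pool size at the default of 5 means no more than five calls are in flight at once, which is the behaviour the `--concurrent` flag describes.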
- If deployment fails:
  - Check OCI credentials and permissions
  - Verify subnet and image OCIDs
  - Ensure SSH keys are properly configured
- If destroy operation fails:
  - Check if instances still exist in OCI console
  - Verify the YAML file and instance_ids.txt are present
  - Ensure OCI API access is working
Advantages:
- Faster concurrent operations
- Real-time progress feedback
- Simpler configuration
- Direct OCI API interaction
Disadvantages:
- Fewer infrastructure-as-code features
- Manual state management
- Limited to specific use case
Choose the method that best suits your needs:
- Use the Terraform method for an infrastructure-as-code approach
- Use the Python script for quick deployments and better concurrency
- Copy the `.env.example` file to `.env`:
cp .env.example .env
- Fill in the `.env` file with your actual configuration values.
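A quick way to confirm that `.env` has been filled in is to look for missing keys or leftover placeholder values. The following is a sketch using only the standard library; `parse_env` and `missing_keys` are hypothetical helpers, the key list covers only the keys shown in this document (the real `.env.example` may define more), and the script itself may load the file differently, for instance with python-dotenv.

```python
REQUIRED_KEYS = (
    "USER_OCID",
    "FINGERPRINT",
    "KEY_FILE_PATH",
    "SSH_PUBLIC_KEY_PATH",
    "SSH_PRIVATE_KEY_PATH",
)

# Prefixes used by the placeholder values in .env.example.
PLACEHOLDER_PREFIXES = ("your-", "/path/to/your/")


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values):
    """Return required keys that are absent, empty, or still placeholders."""
    missing = []
    for key in REQUIRED_KEYS:
        value = values.get(key, "")
        if not value or value.startswith(PLACEHOLDER_PREFIXES):
            missing.append(key)
    return missing
```

For example, `missing_keys(parse_env(open(".env").read()))` returns an empty list once every value has been replaced.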