Important
Copyright 2024, Clumio, a Commvault Company. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Bulk restores are used to restore multiple resources from different originating locations (AWS account and region pairing) and/or times to one or more target locations.
Based upon the source definition, Clumio will find the set of appropriate backups for the identified resources and restore those resources using the user-provided target information.
The inputs, in JSON format, required to run the bulk restore automation can be defined ahead of time or crafted/updated when the restore is needed. Inputs fall into two categories: source information and target information.
Source information can include the AWS account, AWS Region, AWS resource tags, and a datetime search window.
Target information consists of the resource-specific AWS infrastructure elements in the target location that are required to deploy that type of AWS resource. The only additional value you need is a Clumio API token, which validates your permission to run the automation based upon your Clumio login.
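As a rough illustration of how the two input categories fit together, the sketch below assembles an input document in Python. The parameter names come from the tables in this README, but the `restore_sets`, `source`, and `target` grouping keys, the account numbers, and the tag values are hypothetical placeholders; the example_step_function_inputs.json file in the repository remains the authoritative layout.

```python
import json


def build_restore_input(bearer_token, source, target, debug=0):
    """Assemble one input document from a source and a target definition."""
    return {
        "bear": bearer_token,
        "debug": debug,
        # Hypothetical grouping keys: check the example JSON file for the
        # layout the state machine actually expects.
        "restore_sets": [{"source": source, "target": target}],
    }


# Placeholder values for illustration only.
source = {
    "source_account": "111111111111",
    "source_region": "us-east-1",
    "search_direction": "before",
    "end_search_day_offset": 1,
    "search_tag_key": "environment",
    "search_tag_value": "production",
}
target = {
    "target_account": "222222222222",
    "target_region": "us-west-2",
    "target_aws_az": "us-west-2a",
}

restore_input = build_restore_input("<clumio-api-token>", source, target)
print(json.dumps(restore_input, indent=2))
```

Keeping the source and target definitions as separate dictionaries makes it straightforward to prepare them ahead of time and only fill in the token when a restore is needed.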
To greatly simplify the process of running the recovery, the bulk restore automation leverages a serverless architecture (AWS Lambda functions) and a state machine (AWS Step Functions). This scales out the recovery process to initiate all restores at the same time.
Limits on the number of concurrent restores and the performance of those restores depend upon the resource types being restored. See https://help.clumio.com/docs/clumio-service-limits
This solution can be deployed anywhere in AWS and does not need access to either the original AWS source location or the target locations. Besides the AWS resources mentioned above, logging in AWS CloudWatch, and an S3 bucket used temporarily to deploy the solution, the only other AWS resource involved is an AWS secret, which can optionally be used to store your Clumio API token.
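If the token is kept in an AWS secret, reading it could look like the sketch below. It assumes the secret lives in AWS Secrets Manager and is read with the boto3 `get_secret_value` call; the function name `read_clumio_token`, the `json_key` option, and the secret name in the usage comment are hypothetical and not part of this repository.

```python
import json


def read_clumio_token(secrets_client, secret_id, json_key=None):
    """Return the Clumio API token stored in an AWS Secrets Manager secret.

    secrets_client is a boto3 "secretsmanager" client (or any object with a
    compatible get_secret_value method). If json_key is given, the secret is
    parsed as JSON and that key is returned; otherwise the raw string is.
    """
    response = secrets_client.get_secret_value(SecretId=secret_id)
    secret = response["SecretString"]
    if json_key is None:
        return secret
    return json.loads(secret)[json_key]


# Usage against AWS (requires boto3 and credentials with read access):
#   import boto3
#   client = boto3.client("secretsmanager")
#   token = read_clumio_token(client, "my-clumio-token-secret")  # hypothetical name
```

Passing the client in as an argument keeps the function easy to exercise with a stub, without touching a real AWS account.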
Note
The files included in this GitHub repository are for information purposes only. This Python code represents the contents of the Lambda functions used by the state machine. This code, along with the non-default Python packages, is bundled in the ZIP file that is required to run the CFT.
Note
The JSON file example_step_function_inputs.json is an example of the inputs required to run the step function. Modify these inputs to reflect your environment.
Note
An IAM role that has permissions to execute the step function and the Lambda functions (and to write to CloudWatch for logging purposes) must be identified or created before you deploy the CFT template. If required, you can modify the permissions of this IAM role after all of the resources have been created in order to achieve least privilege. If you use the AWS secret to store your Clumio API token, this IAM role also needs read access to the secret.
Examples of both the role trust relationship and the policy can be found in the examples folder.
Note
The clumio_bulk_restore_deploy_cft.yaml file is the CloudFormation (CFT) deployment template. Deploy this CFT template to set up the solution.
To build, you will need a Unix-type shell (bash, zsh, ...), Python 3.12, make, and zip.
make build
This fetches the dependencies and generates the zip file clumio_bulk_restore.zip under the build directory, alongside the clumio_bulk_restore_deploy_cft.yaml CloudFormation template.
The zip file must be uploaded to an S3 bucket where it can be accessed by the CloudFormation template when you deploy the solution.
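The upload can be done from the AWS console or CLI; as one possible alternative, the sketch below uses the boto3 S3 client's `upload_file` method. The helper name `upload_bundle`, the default object key, and the bucket name in the usage comment are assumptions made for illustration.

```python
from pathlib import Path


def upload_bundle(s3_client, bucket, zip_path="build/clumio_bulk_restore.zip", key=None):
    """Upload the built zip file to S3 and return its s3:// location.

    s3_client is a boto3 "s3" client (or any object with a compatible
    upload_file method). The object key defaults to the zip file name.
    """
    path = Path(zip_path)
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found; run 'make build' first")
    key = key or path.name
    s3_client.upload_file(str(path), bucket, key)
    return f"s3://{bucket}/{key}"


# Usage against AWS (requires boto3 and write access to the bucket):
#   import boto3
#   print(upload_bundle(boto3.client("s3"), "my-deployment-bucket"))  # hypothetical bucket
```

The bucket you pass here is the same one you later provide as a parameter when deploying the CloudFormation template.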
Tip
- Clumio backups must exist (and not be expired) for all resources that are to be restored.
- Identify an S3 bucket where the zip file can be copied.
- Identify an IAM role that has the ability to run both the Lambda functions and the state machine.
- Copy the ZIP file from the git repository to the S3 bucket.
- Run the CFT YAML file. You will need to enter the S3 bucket and IAM role as parameters to run the CFT YAML file.
- Create an input JSON file for the state machine based upon the example JSON and the descriptions below.
- Execute the state machine and pass it your input JSON.
- If the input file has multiple restore sets, the restore automation will start multiple discovery threads, one for each restore set.
- Each discovery thread will discover a set of backups to be restored and will start a separate restore thread for each backup.
- Each restore thread will log a job status when the restore completes/fails. This status will include information about the source of the restore.
- The final step/state of the step function will show a combined status for every restore.
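The execution step in the list above can also be driven from a script. The following is a minimal sketch using the boto3 Step Functions client calls `start_execution` and `describe_execution`; the helper names and the ARN shown in the usage comment are hypothetical, and the shape of the final combined status is whatever the state machine returns.

```python
import json
import time


def start_bulk_restore(sfn_client, state_machine_arn, restore_input):
    """Start the state machine with the given input and return the execution ARN."""
    response = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(restore_input),
    )
    return response["executionArn"]


def wait_for_result(sfn_client, execution_arn, poll_seconds=30):
    """Poll until the execution leaves RUNNING; return (status, parsed output)."""
    while True:
        description = sfn_client.describe_execution(executionArn=execution_arn)
        if description["status"] != "RUNNING":
            break
        time.sleep(poll_seconds)
    output = description.get("output")  # only present on successful executions
    return description["status"], json.loads(output) if output else None


# Usage against AWS (requires boto3 and permission to run the state machine):
#   import boto3
#   sfn = boto3.client("stepfunctions")
#   arn = start_bulk_restore(sfn, "<state-machine-arn>", restore_input)
#   status, combined = wait_for_result(sfn, arn)
```

Because each restore thread logs its own job status, the output parsed here is only the combined summary from the final state; per-restore details remain in CloudWatch.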
Warning
FOR EXAMPLE PURPOSES ONLY
Base Input Parameter | Description |
---|---|
bear | Clumio API bearer token https://help.clumio.com/docs/api-tokens |
debug | Set to a non-zero value to debug issues |
Source/Search Input Parameter | Description |
---|---|
source_account | AWS account from which the resources were backed up |
source_region | AWS region from which the resources were backed up |
search_direction | When choosing backups based upon a point in time, look for backups "before" or "after" this point in time |
"before" | Set a search window from the point in time to the current time |
"after" | Set a search window from the point in time to the max search time |
end_search_day_offset | If searching "before" a point in time this represents the offset from the current day to the point in time |
start_search_day_offset | If searching "before" a point in time this is not used |
start_search_day_offset | If searching "after" a point in time this represents the offset from the current day to the point in time |
end_search_day_offset | If searching "after" a point in time this represents the offset from the current day to the max search time |
search_tag_key | If searching by AWS tag set search key and value |
search_tag_value | If searching by AWS tag set search key and value |
Target Input Parameter EBS | Description |
---|---|
target_account | AWS account where the ebs resource is to be restored |
target_region | AWS region where the ebs resource is to be restored |
target_aws_az | required, infrastructure value for restore AWS AZ |
target_iops | optional, infrastructure value for EBS iops setting. Should only be used if target_volume_type is one of gp3, io1, or io2. |
target_volume_type | optional, infrastructure value for EBS volume type setting. Required if target_iops is set. |
target_kms_key_native_id | optional, infrastructure value for restore AWS KMS key id |
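
The EBS rules above (a required availability zone, and `target_iops` only together with a `target_volume_type` of gp3, io1, or io2) can be checked before an execution is started. This is a hypothetical pre-flight helper, not part of the automation; treating `target_account` and `target_region` as required is an assumption, since the table does not mark them explicitly.

```python
IOPS_VOLUME_TYPES = {"gp3", "io1", "io2"}
# Assumption: account and region are treated as required alongside the AZ.
REQUIRED_EBS_KEYS = ("target_account", "target_region", "target_aws_az")


def check_ebs_target(target):
    """Return a list of problems found in an EBS target definition."""
    problems = [
        f"missing required key: {key}"
        for key in REQUIRED_EBS_KEYS
        if not target.get(key)
    ]
    if "target_iops" in target:
        volume_type = target.get("target_volume_type")
        if volume_type is None:
            problems.append("target_volume_type is required when target_iops is set")
        elif volume_type not in IOPS_VOLUME_TYPES:
            problems.append(
                f"target_iops is not supported for volume type {volume_type!r}"
            )
    return problems


example = {
    "target_account": "222222222222",
    "target_region": "us-west-2",
    "target_aws_az": "us-west-2a",
    "target_iops": 3000,
}
for problem in check_ebs_target(example):
    print(problem)
```

An empty list means the definition passes these checks; it does not guarantee the restore will succeed, since optional values may still be needed depending on the original resource.
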
Target Input Parameter RDS | Description |
---|---|
target_account | AWS account where the RDS resource is to be restored |
target_region | AWS region where the RDS resource is to be restored |
target_subnet_group_name | required, infrastructure value for RDS Subnet group name |
target_rds_name | required, infrastructure value for RDS instance/cluster name |
target_security_group_native_id | optional, infrastructure value for RDS Security Group List |
target_kms_key_native_id | optional, infrastructure value for RDS AWS KMS key id |
Target Input Parameter EC2 | Description |
---|---|
target_account | AWS account where the EC2 resource is to be restored |
target_region | AWS region where the EC2 resource is to be restored |
target_aws_az | required, infrastructure value for restore AWS AZ |
target_vpc_native_id | required, infrastructure value for EC2 VPC id |
target_subnet_native_id | required, infrastructure value for EC2 Subnet id |
target_kms_key_native_id | optional, infrastructure value for EC2 AWS KMS key id |
target_iam_instance_profile_name | optional, infrastructure value for EC2 IAM instance profile name |
target_key_pair_name | optional, infrastructure value for EC2 Key pair name |
target_security_group_native_id | optional, infrastructure value for EC2 Security Group List |
Target Input Parameter DynamoDB | Description |
---|---|
target_account | AWS account where the DynamoDB resource is to be restored |
target_region | AWS region where the DynamoDB resource is to be restored |
change_set_name | required, infrastructure value for DynamoDB table name component |
Target Input Parameter ProtectionGroup | Description |
---|---|
search_pg_name | Required. The name of protection group that the restore is going to be done from. |
target_bucket | Required. The name of the bucket that the resource will be restored to. |
search_bucket_names | Optional. The list of bucket names within the protection group to be restored. |
search_object_filters | Optional. A specification dict to filter the objects to restore. |
target_prefix | Optional. The prefix to add to the restored object. |
Note
Optional infrastructure target values may still be required based upon the configuration of the original backed up resource.