User Guide

Brian Presley edited this page Nov 17, 2023 · 18 revisions

User Guide: End-to-End Migration Process

This User Guide outlines the process for successfully performing an end-to-end migration. The solution offered in this repository caters to several specific scenarios:

  1. Migrating existing or historical data from one cluster to another.
  2. Transferring ongoing or live traffic between clusters.
  3. Conducting a comprehensive migration involving both existing and live data.
  4. Upgrading an existing cluster.
  5. Comparing an existing cluster with a prospective new one.

In this guide, we focus on scenarios 1 and 2, guiding you through the migration of historical data from a source cluster while concurrently handling live production traffic, which will be captured and redirected to a target cluster. It's crucial to note that migration strategies are not universally applicable. This guide provides a detailed methodology, predicated on certain assumptions detailed throughout, emphasizing the importance of robust engineering practices and a systematic approach to ensure a successful migration.

Key Components of the Solution

Elasticsearch/OpenSearch Source

Assumption: The starting point is an Elasticsearch 7.x/OpenSearch 1.x cluster managed on AWS EC2.

Your source cluster in this solution operates on Elasticsearch or OpenSearch, hosted on EC2 instances or similar computing environments. A proxy is set up to interact with this source cluster, either positioned in front of or directly on the coordinating nodes of the cluster.

Capture Proxy

This component is designed for HTTP RESTful traffic, playing a dual role. It not only forwards traffic to the source cluster but also splits and channels this traffic to a stream-processing service for later playback.

Traffic Replayer

Acting as a traffic simulation tool, the Traffic Replayer replays recorded request traffic to a target cluster, mirroring real-world workload patterns. It links original requests and their responses to those directed at the target cluster, facilitating comparative analysis.

Historical Data Migration Container

This container is tasked with a one-time operation to transfer index metadata and historical data from the source to the target cluster. It compares indices between clusters to identify those requiring migration and employs the open-source Data Prepper.
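
The comparison step described above can be pictured as a set difference over index names. The following is a minimal sketch of that idea only, not the container's actual logic, which this guide does not describe. The function name indices_to_migrate and the two input files (one index name per line) are assumptions made for illustration.

```shell
# Sketch only: prints the index names that appear in the source list but not
# in the target list. Both arguments are files with one index name per line.
indices_to_migrate() {
    source_list=$1
    target_list=$2
    src_sorted=$(mktemp)
    tgt_sorted=$(mktemp)
    # comm needs sorted input; use the same locale for sort and comm.
    LC_ALL=C sort -u "$source_list" > "$src_sorted"
    LC_ALL=C sort -u "$target_list" > "$tgt_sorted"
    # -23 suppresses lines unique to the target and lines common to both,
    # leaving only the names that exist on the source alone.
    LC_ALL=C comm -23 "$src_sorted" "$tgt_sorted"
    rm -f "$src_sorted" "$tgt_sorted"
}
```

One way to produce the input files, assuming both clusters are reachable and SOURCE_URL and TARGET_URL (hypothetical variables) hold their endpoints, is curl -s "$SOURCE_URL/_cat/indices?h=index" > source.txt, and likewise for the target. This sketch compares names only; it does not check whether an index that exists on both sides holds the same documents.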

Migration Management Console

Operational within the Elastic Container Service (ECS) on AWS Fargate, this console is a containerized platform. It orchestrates the deployment of the Migration Assistant for Amazon OpenSearch Service, alongside a variety of tools to streamline the migration process.

Architecture Overview

The solution architecture, adaptable for cloud deployment, unfolds as follows:

  1. Incoming traffic reaches the existing cluster, targeting each coordinator node.
  2. A Capture Proxy is placed before each coordinator node for traffic capture, storing data in an event stream.
  3. With the continuous capture setup, historical data backfill is initiated.
  4. Post-backfill, the captured traffic is replayed using the Traffic Replayer.
  5. The results from directing traffic to both the original and new clusters are then evaluated.

Assumption: This architecture is based on AWS cloud infrastructure, but most of the tools are designed to be cloud-independent. A local containerized version of this solution is also available.

Deploying to AWS (covered later in the guide) will deploy the following into your AWS account:

[Image: diagram of the resources deployed into the AWS account]

  1. Traffic is directed to the existing cluster, reaching each coordinator node.
  2. A Capture Proxy is added before each coordinator node in the cluster, allowing for traffic capture and storage in Amazon MSK.
  3. Once continuous traffic capture is in place, the user initiates a historical backfill.
  4. Following the backfill, the user replays the captured traffic using a Traffic Replayer.
  5. The user evaluates the outcomes from routing traffic to both the original and the new cluster.
  6. After confirming that the new cluster behaves as expected, the user tears down all of the solution’s stacks and can retire the old cluster’s legacy infrastructure, keeping only the new cluster.

Installing the proxy

Capture Proxy

How to attach a Capture Proxy on a coordinator node.

Follow the documentation for deploying the solution. Then, on a cluster with at least two coordinator nodes, attach a Capture Proxy to a node by following the steps below. Note that this is one method for installing the Capture Proxy on a node, and the steps may vary depending on your environment.

These are the prerequisites to being able to attach the Capture Proxy:

  • Make sure that your MSK client is accessible by the coordinator nodes in the cluster.

    • Add the following IAM policy to the node/EC2 instance so that it can store the captured traffic in Kafka.
    • From the AWS Console, go to the EC2 instance page, click IAM Role, click Add permissions, choose Create inline policy, click JSON view, then add the following policy (replace <region>, <account-id>, and <stage>):
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": "kafka-cluster:Connect",
                "Resource": "arn:aws:kafka:<region>:<account-id>:cluster/migration-msk-cluster-<stage>/*",
                "Effect": "Allow"
            },
            {
                "Action": [
                    "kafka-cluster:CreateTopic",
                    "kafka-cluster:DescribeTopic",
                    "kafka-cluster:WriteData"
                ],
                "Resource": "arn:aws:kafka:<region>:<account-id>:topic/migration-msk-cluster-<stage>/*",
                "Effect": "Allow"
            }
        ]
    }
  • Verify Java installation is accessible.

    • From the Linux command line of that EC2 instance, check that the JAVA_HOME environment variable is set properly (echo $JAVA_HOME). If it is not, the following command may set it correctly:

      export JAVA_HOME=$(dirname "$(dirname "$(type -p java)")")

      • If that doesn’t work, then find the java directory on your node and set it as $JAVA_HOME
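
Both prerequisites above lend themselves to a small amount of scripting. The sketch below prints the inline policy with the placeholders filled in, and resolves a JAVA_HOME candidate by following symlinks (on many distributions the java found on the PATH is a symlink, in which case the one-liner above may point at the wrong directory). The function names are hypothetical, and readlink -f is assumed to be available, as it is with GNU coreutils.

```shell
# Sketch only. Function names are hypothetical helpers, not part of the solution.

# Prints the inline IAM policy from this guide with <region>, <account-id>,
# and <stage> filled in from the three arguments.
render_msk_policy() {
    region=$1
    account_id=$2
    stage=$3
    arn_prefix="arn:aws:kafka:${region}:${account_id}"
    cluster_name="migration-msk-cluster-${stage}"
    cat <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "kafka-cluster:Connect",
            "Resource": "${arn_prefix}:cluster/${cluster_name}/*",
            "Effect": "Allow"
        },
        {
            "Action": [
                "kafka-cluster:CreateTopic",
                "kafka-cluster:DescribeTopic",
                "kafka-cluster:WriteData"
            ],
            "Resource": "${arn_prefix}:topic/${cluster_name}/*",
            "Effect": "Allow"
        }
    ]
}
EOF
}

# Prints a JAVA_HOME candidate: the directory two levels above the real
# (symlink-resolved) java binary found on the PATH. Fails if java is not found.
resolve_java_home() {
    java_bin=$(command -v java) || return 1
    java_real=$(readlink -f "$java_bin") || return 1
    dirname "$(dirname "$java_real")"
}
```

For example, render_msk_policy us-east-1 123456789012 dev prints a policy that can be pasted into the JSON view (the region, account ID, and stage shown here are placeholder values), and export JAVA_HOME="$(resolve_java_home)" sets the variable for the current shell session.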

Follow these steps to attach a Capture Proxy on the node.

  1. Log in to one of the coordinator nodes for command line access.

  2. Update node’s port setting.

    1. Update elasticsearch.yml/opensearch.yml. Add this line to the node’s config file: http.port: 19200
  3. Restart the Elasticsearch/OpenSearch process so that it binds to the newly configured port. For example, if systemctl is available on your Linux distribution, you can run the following (note: depending on your installation of Elasticsearch, this method may not work for you):

    1. sudo systemctl restart elasticsearch.service
  4. Verify that the process is bound to the new port. Run netstat -tapn to check that the new port is being listened on. If it is not, Elasticsearch/OpenSearch may not be running, in which case you must start the process again. (Depending on your setup, the way to start or restart the process may differ.)

  5. Test the new port by sending any request, e.g., curl https://localhost:19200 (or the http:// equivalent, depending on how your cluster is configured).

  6. Download Capture Proxy:

    1. Go to the OpenSearch Migrations latest releases page: https://github.com/opensearch-project/opensearch-migrations/releases/latest
    2. Copy the link for the Capture Proxy tar file that matches your instance’s architecture.
    3. curl -L <capture-proxy-tar-file-link> --output CaptureProxyX64.tar.gz
    4. Unpack the solution tarball: tar -xvf CaptureProxyX64.tar.gz
    5. cd CaptureProxyX64/bin
  7. Running the Capture Proxy:

    1. nohup ./CaptureProxyX64 --kafkaConnection <msk-endpoint> --destinationUri http://localhost:19200 --listenPort 9200 --enableMSKAuth --insecureDestination &

    Explanation of parameters in the command above:

    • --kafkaConnection: your MSK client endpoint.
    • --destinationUri: URI of the server that the Capture Proxy is capturing traffic for.
    • --listenPort: Exposed port for clients to connect to this proxy (the port that the node was originally listening on).
    • --enableMSKAuth: Enables SASL Kafka properties required for connecting to MSK with IAM auth.
    • --insecureDestination: Do not check the destination server’s certificate.
  8. Test the port that the Capture Proxy is now listening to.

    1. curl https://localhost:9200 (or the http:// equivalent, depending on how your cluster is configured).
    2. You should get the same response when sending a request to either port (9200 or 19200). The difference is that traffic sent to the port the Capture Proxy is listening on is captured and sent to your MSK client, as well as forwarded to the new Elasticsearch port.
  9. Verify requests are sent to Kafka

    • Verify that a new topic has been created
      1. Log in to the Migration Console container.
      2. Go to the Kafka tools directory: cd kafka-tools/kafka/bin
      3. Run the following command to list the Kafka topics, and confirm that a new topic was created:

         ./kafka-topics.sh --bootstrap-server "$MIGRATION_KAFKA_BROKER_ENDPOINTS" --list --command-config ../../aws/msk-iam-auth.properties
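
Steps 4 and 8 above both come down to confirming that a port is in the LISTEN state. As a sketch, assuming the netstat -tapn output format of the net-tools package on Linux, the check can be scripted so that both ports are verified in one pass. The function names are hypothetical; ports 19200 and 9200 are the ones used in this guide.

```shell
# Sketch only: reads `netstat -tapn` output on stdin and succeeds if a TCP
# socket is in the LISTEN state on the given port.
port_is_listening() {
    want_port=$1
    awk -v suffix=":${want_port}" '
        # Field 4 is the local address, field 6 is the socket state.
        $6 == "LISTEN" {
            addr = $4
            if (length(addr) >= length(suffix) &&
                substr(addr, length(addr) - length(suffix) + 1) == suffix) {
                found = 1
            }
        }
        END { exit(found ? 0 : 1) }
    '
}

# Reads `netstat -tapn` output on stdin and reports on both ports from this
# guide: 19200 (Elasticsearch/OpenSearch) and 9200 (Capture Proxy).
check_proxy_ports() {
    listing=$(cat)
    for port in 19200 9200; do
        if printf '%s\n' "$listing" | port_is_listening "$port"; then
            echo "port $port: listening"
        else
            echo "port $port: NOT listening"
        fi
    done
}
```

For example: sudo netstat -tapn | check_proxy_ports. This confirms only that something is listening on each port; the curl requests in steps 5 and 8 are still needed to confirm the responses.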