Skip to content

Commit

Permalink
Merge branch 'main' into MoreContextOnThrows
Browse files Browse the repository at this point in the history
  • Loading branch information
gregschohn authored Dec 8, 2023
2 parents 0d9c2f9 + 908f360 commit 84e54b4
Show file tree
Hide file tree
Showing 3 changed files with 79 additions and 75 deletions.
123 changes: 49 additions & 74 deletions TrafficCapture/trafficCaptureProxyServer/README.md
Original file line number Diff line number Diff line change
@@ -1,81 +1,56 @@
# Capture Proxy

## How to attach a Capture Proxy on a coordinator node.
## Installing Capture Proxy on Coordinator nodes

Follow documentation for [deploying solution](../../deployment/README.md). Then, on a cluster with at least two coordinator nodes, the user can attach a Capture Proxy on a node by following these steps:
Please note that this is one method for installing the Capture Proxy on a node, and that these steps may vary depending on your environment.
Follow documentation for [deploying solution](../../deployment/README.md) to set up the Kafka cluster which the Capture Proxy will send captured traffic to. Then, following the steps below, attach the Capture Proxy on each coordinator node of the source cluster, so that all incoming traffic can be captured. If the source cluster has more than one coordinator node, this can be done in a rolling restart fashion where there is no downtime for the source cluster, otherwise a single node cluster should expect downtime as the Elasticsearch/OpenSearch process restarts.


These are the **prerequisites** to being able to attach the Capture Proxy:

* **Make sure that your MSK client is accessible by the coordinator nodes in the cluster*
* Add the following IAM policy to the node/EC2 instance so that it’s able to store the captured traffic in Kafka:
* From the AWS Console, go to the EC2 instance page, click on **IAM Role**, click on **Add permissions**, choose **Create inline policy**, click on **JSON VIEW** then add the following policy (replace region and account-id).

```json
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "kafka-cluster:Connect",
"Resource": "arn:aws:kafka:<region>:<account-id>:cluster/migration-msk-cluster-<stage>/*",
"Effect": "Allow"
},
{
"Action": [
"kafka-cluster:CreateTopic",
"kafka-cluster:DescribeTopic",
"kafka-cluster:WriteData"
],
"Resource": "arn:aws:kafka:<region>:<account-id>:topic/migration-msk-cluster-<stage>/*",
"Effect": "Allow"
}
]
}
```

* **Verify Java installation is accessible**.
* From linux command line of that EC2 instance, Check that the JAVA_HOME environment variable is set properly `(echo $JAVA_HOME)`, if not, then try running the following command that might help set it correctly:

`JAVA_HOME=$(dirname "$(dirname "$(type -p java)")")`
* If that doesn’t work, then find the java directory on your node and set it as $JAVA_HOME
**Note**: For AWS deployments, see required instructions for setting up IAM and Security Groups [here](../../deployment/cdk/opensearch-service-migration/README.md#configuring-capture-proxy-iam-and-security-groups) </br>
**Note**: This is one method for installing the Capture Proxy on a node, and that these steps may vary depending on your environment.

### Follow these steps to attach a Capture Proxy on the node.

1. **Log in to one of the coordinator nodes** for command line access.
2. **Update node’s port setting**.
1. **Update elasticsearch.yml/opensearch.yml**. Add this line to the node’s config file:
http.port: 19200
3. **Restart Elasticsearch/OpenSearch process** so that the process will bind to the newly configured port. For example, if systemctl is available on your linux distribution you can run the following (Note: depending on your installation of Elasticsearch, these methods may not work for you)
1. `sudo systemctl restart elasticsearch.service`

4. **Verify process is bound to new port**. Run netstat -tapn to see if the new port is being listened on.
If the new port is not there, then there is a chance that Elasticsearch/ OpenSearch is not running, in that case, you must start the process again. (Depending on your setup, restarting/starting the Elasticsearch process may differ)
5. **Test the new port** by sending any kind of traffic or request, e.g; curl https://localhost:19200 or http://
6. **Download Capture Proxy**:
1. Go to the Opensearch Migrations latest releases page: https://github.com/opensearch-project/opensearch-migrations/releases/latest
2. Copy the link for the Capture Proxy tar file, mind your instance’s architecture.
3. `curl -L0 <capture-proxy-tar-file-link> --output CaptureProxyX64.tar.gz`
4. Unpack solution tarball: `tar -xvf CaptureProxyX64.tar.gz`
5. `cd CaptureProxyX64/bin`
7. **Running the Capture Proxy**:
1. `nohup ./CaptureProxyX64 --kafkaConnection <msk-endpoint> --destinationUri http://localhost:19200 —listenPort 9200 —enableMSKAuth --insecureDestination &`

**Explanation of parameters** in the command above:

* **--kafkaConnection**: your MSK client endpoint.
* **--destinationUri**: URI of the server that the Capture Proxy is capturing traffic for.
* **--listenPort**: Exposed port for clients to connect to this proxy. (The original port that the node was listening to)
* **--enableMSKAuth**: Enables SASL Kafka properties required for connecting to MSK with IAM auth.
* **--insecureDestination**: Do not check the destination server’s certificate.

8. **Test the port** that the Capture Proxy is now listening to.
1. `curl https://localhost:9200` or `http://`
2. You should expect the same response when sending a request to either ports (9200, 19200), except that the traffic sent to the port that the Capture Proxy is listening to, will be captured and sent to your MSK Client, also forwarded to the new Elasticsearch port.
9. **Verify requests are sent to Kafka**
* **Verify that a new topic has been created**
1. Log in to the Migration Console container.
2. Go the Kafka tools directory
cd kafka-tools/kafka/bin
3. Run the following command to list the Kafka topics, and confirm that a new topic was created.
`./kafka-topics.sh --bootstrap-server "$MIGRATION_KAFKA_BROKER_ENDPOINTS" --list --command-config ../../aws/msk-iam-auth.properties`
1. Log in to one of the coordinator nodes for command line access.
2. Locate the elasticsearch/opensearch directory
* On EC2 this may look like `/home/ec2-user/elasticsearch`
3. Update node’s port setting in elasticsearch.yml/opensearch.yml, e.g. `/home/ec2-user/elasticsearch/config/elasticsearch.yml`.
* Add this line to the node’s config file: `http.port: 19200`
* This will allow incoming traffic to enter through the Capture proxy at the normal port (typically `9200`) and then be passed to Elasticsearch/OpenSearch at the `19200` port
4. Restart Elasticsearch/OpenSearch process so that the process will bind to the newly configured port.
* Depending on the installation method used, how you should restart Elasticsearch/OpenSearch will differ:
* For **Debian packages with systemctl** (Most common) you can run the following command: `sudo systemctl restart elasticsearch.service`
* For **Debian packages with service** you can run the following command: `sudo -i service elasticsearch restart`
* For **Tarball** you should find the running process (e.g. `ps aux | grep elasticsearch`) and stop it. Then start the process again as you normally do.
5. Verify process is bound to new port. Run `netstat -tapn | grep LISTEN` to see if the new port is being listened on and that the old port is no longer there.
* If the new port is not there, then there is a chance that Elasticsearch/OpenSearch is not running, in that case, you must start the process again and verify that it has started properly.
6. Test the new port by sending any kind of traffic or request, e.g. `curl https://localhost:19200` or if security is not enabled `curl http://localhost:19200`
7. Verify the JAVA_HOME environment variable has been set (`echo $JAVA_HOME`) as this will be necessary for the Capture Proxy to execute.
* If Java is not already available on your node, your elasticsearch/opensearch folder may have a bundled JDK you can use, e.g. `export JAVA_HOME=/home/ec2-user/elasticsearch/jdk` (this can also be added to your shell startup script to be available for all sessions)
8. Download the Capture Proxy tar file
1. Go to the `opensearch-migrations` repository's latest releases page: https://github.com/opensearch-project/opensearch-migrations/releases/latest
2. Copy the link for the Capture Proxy tar file, mind your node’s architecture.
3. Download the tar file in a persistent folder, using the link from the previous step `curl -L0 <CAPTURE-PROXY-TAR-LINK> --output CaptureProxyX64.tar.gz`
4. Unpack the tar file: `tar -xvf CaptureProxyX64.tar.gz`
9. Start the Capture Proxy:
1. Access the Capture Proxy shell script directory `cd trafficCaptureProxyServer/bin`
2. Run the Capture Proxy command
* Depending on how you are running Kafka, the command needed will differ:
* For **default/Docker Kafka** clusters `nohup ./trafficCaptureProxyServer --kafkaConnection <KAFKA_BROKERS> --destinationUri http://localhost:19200 --listenPort 9200 --insecureDestination &`
* The `KAFKA_BROKERS` referenced here will vary based on setup, but for a default docker setup would be `kafka:9092`
* For **AWS MSK(Kafka)** clusters `nohup ./trafficCaptureProxyServer --kafkaConnection <KAFKA_BROKERS> --destinationUri http://localhost:19200 --listenPort 9200 --enableMSKAuth --insecureDestination &`
* The `KAFKA_BROKERS` referenced here can be obtained from the AWS Console(MSK -> Clusters -> Select Cluster -> View Client Information -> Copy Private endpoint)
* This command will start the Capture Proxy in the background and allow it to continue past the lifetime of the shell session.
* :warning: If the machine running the Elasticsearch/OpenSearch process is restarted, the Capture Proxy will need to be started again
* Explanation of parameters in the command above:
* --kafkaConnection: Your Kafka broker endpoint(s) as a string with multiple brokers delimited by a ',' e.g. `"broker1:9098,broker2:9098"`.
* --destinationUri: URI of the server that the Capture Proxy is capturing traffic for.
* --listenPort: Exposed port for clients to connect to this proxy. (The original port that the node was listening to)
* --enableMSKAuth: Enables SASL Kafka properties required for connecting to Kafka with IAM auth. **Note**: Only valid for AWS MSK setups
* --insecureDestination: Do not check the destination server’s certificate.
10. Test that the original Elasticsearch/OpenSearch port for the node is available again.
* `curl https://localhost:9200` or if security is not enabled `curl http://localhost:9200`
* You should expect the same response when sending a request to either ports (`9200`, `19200`), with the difference being that requests that are sent to the `9200` port now pass through the Capture Proxy whose first priority is to forward the request to the Elasticsearch/OpenSearch process, and whose second priority is to send the captured requests to Kafka.
11. Verify requests are sent to Kafka
1. Log in to the Migration Console container. More details [here](../../deployment/cdk/opensearch-service-migration/README.md#executing-commands-on-a-deployed-service) for AWS deployments
2. Run the following command to list the Kafka topics, and confirm that the `logging-traffic-topic` has been created.
* For **default/Docker Kafka** clusters `./kafka-tools/kafka/bin/kafka-topics.sh --bootstrap-server "$MIGRATION_KAFKA_BROKER_ENDPOINTS" --list`
* For **AWS MSK(Kafka)** clusters `./kafka-tools/kafka/bin/kafka-topics.sh --bootstrap-server "$MIGRATION_KAFKA_BROKER_ENDPOINTS" --list --command-config kafka-tools/aws/msk-iam-auth.properties`
2 changes: 1 addition & 1 deletion VERSION
Original file line number Diff line number Diff line change
@@ -1 +1 @@
1.0.0
1.0.1
29 changes: 29 additions & 0 deletions deployment/cdk/opensearch-service-migration/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -200,6 +200,35 @@ Run the `accessAnalyticsDashboard` script, and then open https://localhost:8157/
./accessAnalyticsDashboard.sh dev us-east-1
```

## Configuring Capture Proxy IAM and Security Groups
Although this CDK does not set up the Capture Proxy on source cluster nodes (except in the case of the demo solution), the Capture Proxy instances do need to communicate with resources deployed by this CDK (e.g. Kafka) which this section covers

#### Capture Proxy on OpenSearch/Elasticsearch nodes
Before [setting up Capture Proxy instances](../../../TrafficCapture/trafficCaptureProxyServer/README.md#how-to-attach-a-capture-proxy-on-a-coordinator-node) on the source cluster, the IAM policies and Security Groups for the nodes should allow access to the Migration tooling:
1. The coordinator nodes should add the `migrationMSKSecurityGroup` security group to allow access to Kafka
2. The IAM role used by the coordinator nodes should have permissions to publish captured traffic to Kafka. A template policy to use, can be seen below
* This can be added through the AWS Console (IAM Role -> Add permissions -> Create inline policy -> JSON view)
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "kafka-cluster:Connect",
"Resource": "arn:aws:kafka:<REGION>:<ACCOUNT-ID>:cluster/migration-msk-cluster-<STAGE>/*",
"Effect": "Allow"
},
{
"Action": [
"kafka-cluster:CreateTopic",
"kafka-cluster:DescribeTopic",
"kafka-cluster:WriteData"
],
"Resource": "arn:aws:kafka:<REGION>:<ACCOUNT-ID>:topic/migration-msk-cluster-<STAGE>/*",
"Effect": "Allow"
}
]
}
```

## Tearing down CDK
To remove all the CDK stack(s) which get created during a deployment we can execute a command similar to below
Expand Down

0 comments on commit 84e54b4

Please sign in to comment.