Second Dinner stress-test-loader
This is the Second Dinner stress-test-loader. It is a small golang application for executing stress tests (and arguably any executable) in the cloud, plus the infrastructure code (packer, pulumi, etc.) needed to deploy it. We have open-sourced this code to contribute to the development community.
Currently, this setup targets AWS, but it can be ported to other clouds if needed.
Directory structure:
- stress-test-loader (golang service that can load any stress-test-client, plus packer templates for creating AMIs)
- infra-pulumi (.NET Pulumi code for deploying AMIs to create stress-test-loader ec2 instances)
- build-container (Docker container for running the stress test loader in GitHub Actions)
- simple-stress-test-client (A simple stress test client that pings any host)
To take advantage of AWS's arm64 offering, we build both the golang executable and the packer AMI image for arm64.
The following example demonstrates building the stress-test-loader ami.
To secure the GRPC connection between the client and server of the stress test loader, we use SSL to authenticate clients. Please follow the steps below to generate the certificates.
cd stress-test-loader
- Generate a private key and self-signed certificate for a CA
./gen_ca.sh
- Generate certificates for the server and the client
./gen_cert.sh
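To illustrate how certificates like these are typically consumed, here is a minimal sketch of a Go client building a TLS configuration that trusts only the CA and presents a client certificate. It is not taken from the stress-test-loader source: the function name newClientTLSConfig is hypothetical, and the three PEM file paths are passed on the command line because the file names written by gen_ca.sh and gen_cert.sh are not stated here.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"os"
)

// newClientTLSConfig (hypothetical helper) builds a TLS configuration that
// trusts only the given CA and presents the given client certificate.
func newClientTLSConfig(caPEM, certPEM, keyPEM []byte) (*tls.Config, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no CA certificate found in PEM input")
	}
	cert, err := tls.X509KeyPair(certPEM, keyPEM)
	if err != nil {
		return nil, fmt.Errorf("load client key pair: %w", err)
	}
	return &tls.Config{
		RootCAs:      pool,
		Certificates: []tls.Certificate{cert},
		MinVersion:   tls.VersionTLS12,
	}, nil
}

func main() {
	// Assumed usage: the CA certificate, client certificate, and client key
	// paths are given as arguments.
	if len(os.Args) != 4 {
		fmt.Println("usage: tlscheck <ca.pem> <client-cert.pem> <client-key.pem>")
		return
	}
	var pems [3][]byte
	for i, path := range os.Args[1:] {
		data, err := os.ReadFile(path)
		if err != nil {
			fmt.Println("read:", err)
			return
		}
		pems[i] = data
	}
	if _, err := newClientTLSConfig(pems[0], pems[1], pems[2]); err != nil {
		fmt.Println("invalid certificate material:", err)
		return
	}
	fmt.Println("certificate material loaded")
}
```

With grpc-go, a configuration like this is usually wrapped with credentials.NewTLS before dialing; whether the loader does exactly that is an assumption.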
cd stress-test-loader
source cicd/ami/build-stress-test-loader.sh
- If everything worked according to plan, you will see a message like
AMIs were created: ami-XXXXXXXXXXXXXXX
Once you have created an AWS AMI for the stress test, you can use infra-pulumi to create as many EC2 instances as your AWS account allows.
[Note] We migrated from Terraform to Pulumi for the following reasons:
- Language Flexibility: While Terraform uses its own domain-specific language (HCL), Pulumi allows infrastructure provisioning and management using popular general-purpose languages like C#, JavaScript, Python, TypeScript, and Go.
- Declarative and Imperative Approaches: While Terraform focuses on the declarative paradigm, Pulumi offers both declarative and imperative approaches. Pulumi's support for the imperative style allows for more fine-grained control and dynamic changes during provisioning. In our case, we would like to provision the infrastructure in multiple regions. In Terraform, we had to hardcode each region in the code, which was inflexible and hard to maintain. In Pulumi, we can read the list of regions from environment variables and implement a loop to achieve this.
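The region loop described above can be sketched as follows. The repository's Pulumi program is written in .NET, so this Go version only illustrates the idea; the environment variable name REGIONS, the fallback value, and the helper parseRegions are assumptions made for this example.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseRegions splits a comma-separated list, trimming whitespace and
// dropping empty entries.
func parseRegions(raw string) []string {
	var regions []string
	for _, part := range strings.Split(raw, ",") {
		if r := strings.TrimSpace(part); r != "" {
			regions = append(regions, r)
		}
	}
	return regions
}

func main() {
	// REGIONS is an assumed variable name for this sketch.
	raw := os.Getenv("REGIONS")
	if raw == "" {
		raw = "us-east-1, us-west-2" // fallback so the example runs as-is
	}
	for _, region := range parseRegions(raw) {
		// A real program would create that region's resources here.
		fmt.Println("would provision region:", region)
	}
}
```

Adding or removing a region then becomes a configuration change rather than a code change.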
- The following variables are required:
- public_key: your ssh public key;
- stress_test_loader_allowed_cidr: your machine's public IP; when running in GitHub Actions, this should be the GitHub runner's public IP (feel free to check out our workflow);
- s3_client_bucket_name: the name of your AWS S3 bucket to store the stress test client executable;
- s3_log_bucket_name: the name of your AWS S3 bucket to store the logs;
- desired_capacity: the number of ec2 instances to create in a region;
- regions: the AWS regions to create ec2 instances, separated by commas (e.g., "us-east-1, us-west-2");
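The name stress_test_loader_allowed_cidr suggests CIDR notation, while the description asks for a public IP. Assuming a CIDR block is what is ultimately needed (this is not confirmed here), the following sketch normalizes either form; the helper toCIDR is hypothetical.

```go
package main

import (
	"fmt"
	"net/netip"
)

// toCIDR accepts a bare IP address or a CIDR block and returns CIDR notation.
// A bare address becomes a single-host block (/32 for IPv4, /128 for IPv6).
func toCIDR(s string) (string, error) {
	if prefix, err := netip.ParsePrefix(s); err == nil {
		// Masked clears any host bits, e.g. 10.1.2.3/8 becomes 10.0.0.0/8.
		return prefix.Masked().String(), nil
	}
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return "", fmt.Errorf("%q is neither an IP address nor a CIDR block", s)
	}
	return netip.PrefixFrom(addr, addr.BitLen()).String(), nil
}

func main() {
	// 203.0.113.7 is a documentation-only address used for illustration.
	for _, in := range []string{"203.0.113.7", "10.1.2.3/8", "not-an-ip"} {
		cidr, err := toCIDR(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(in, "->", cidr)
	}
}
```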
- Update the variables in infra-pulumi/Infra.Pulumi/Infra.Pulumi/Config.cs.
cd infra-pulumi/Infra.Pulumi/Infra.Pulumi
- Set up the infrastructure
dotnet run --project-name stress-test-loader-pulumi --stack-name dev
- If everything worked according to plan, you will see a message like
Diagnostics:
pulumi:pulumi:Stack (stress-test-loader-pulumi-dev):
Downloading provider: aws
Resources:
+ 51 created
Duration: 4m20s
- The public IP addresses of all the ec2 instances will be stored in /tmp/IP.json
- to destroy the infrastructure after the stress test:
dotnet run --project-name stress-test-loader-pulumi --stack-name dev --destroy
- Build your stress test client as an arm64 executable. This can be a directory with libraries and one entry executable. The executable can take any number of environment variables as configuration. We are going to use simple-stress-test-client as an example.
cd simple-stress-test-client/StressTest
dotnet publish -r linux-arm64 --self-contained true -c Release
cd bin/Release
cd "$(ls -d */ | head -n 1)"
cd linux-arm64/publish
tar czf /tmp/simple-stress-test-client.tgz ./
- copy the tgz file to an S3 bucket
aws s3 cp /tmp/simple-stress-test-client.tgz s3://stress-test-client-s3/simple-stress-test-client.tgz
- Build a stress-test-loader config json. For example, stresstest.json:
{
"s3": "stress-test-client-s3",
"s3key": "simple-stress-test-client.tgz",
"loadtestExec": "StressTest",
"envVariableList": [
{
"EnvName": "num_pings",
"EnvValue": "10"
},
{
"EnvName": "ping_interval",
"EnvValue": "500"
},
{
"EnvName": "host",
"EnvValue": "https://www.google.com/"
}
]
}
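As a way to sanity-check a config file before sending it, the sketch below parses the JSON shape shown above and flattens envVariableList into NAME=value pairs, the form Go's os/exec expects for a child process environment. The type and function names are hypothetical, and treating s3, s3key, and loadtestExec as required is an assumption rather than documented loader behavior.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// envVariable mirrors one entry of envVariableList.
type envVariable struct {
	EnvName  string `json:"EnvName"`
	EnvValue string `json:"EnvValue"`
}

// stressTestConfig mirrors the top-level keys of stresstest.json.
type stressTestConfig struct {
	S3              string        `json:"s3"`
	S3Key           string        `json:"s3key"`
	LoadtestExec    string        `json:"loadtestExec"`
	EnvVariableList []envVariable `json:"envVariableList"`
}

// parseConfig decodes the JSON and rejects configs with missing core fields
// (the required-field rule is an assumption of this sketch).
func parseConfig(data []byte) (stressTestConfig, error) {
	var cfg stressTestConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return cfg, fmt.Errorf("parse config: %w", err)
	}
	if cfg.S3 == "" || cfg.S3Key == "" || cfg.LoadtestExec == "" {
		return cfg, errors.New("s3, s3key and loadtestExec must all be set")
	}
	return cfg, nil
}

// environ returns the variables as NAME=value strings.
func (c stressTestConfig) environ() []string {
	env := make([]string, 0, len(c.EnvVariableList))
	for _, v := range c.EnvVariableList {
		env = append(env, v.EnvName+"="+v.EnvValue)
	}
	return env
}

func main() {
	sample := []byte(`{
		"s3": "stress-test-client-s3",
		"s3key": "simple-stress-test-client.tgz",
		"loadtestExec": "StressTest",
		"envVariableList": [
			{"EnvName": "num_pings", "EnvValue": "10"},
			{"EnvName": "ping_interval", "EnvValue": "500"}
		]
	}`)
	cfg, err := parseConfig(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.LoadtestExec, cfg.environ())
}
```

Note that EnvValue is a string even for numeric settings such as num_pings, so the client is responsible for converting it.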
cd stress-test-loader/client
- Run stress test
go run main.go stresstest.json /tmp/IP.json
- You can have the stress test loader client wait for the stress tests to finish by specifying a maximum time to wait
- If you gave an ssh public key, you can ssh into the ec2 instance and check its systemd service log
journalctl -f -u stress*
- If you are running our simple-stress-test-client, you can check the log
cat /tmp/stresstest-log
- To check the status of the stress tests (running or finished)
go run main.go -p /tmp/IP.json
- To stop the stress tests
go run main.go -s /tmp/IP.json
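Waiting for the tests to finish with a maximum wait time amounts to polling the status until it reports completion or a deadline passes. The sketch below shows that pattern in general terms; it is not the loader client's actual code. The helper waitUntilDone is hypothetical, and the done callback stands in for a real status query.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitUntilDone polls done every interval until it returns true or timeout
// elapses. It reports whether done returned true before the deadline.
func waitUntilDone(timeout, interval time.Duration, done func() bool) bool {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	if done() {
		return true
	}
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return false
		case <-ticker.C:
			if done() {
				return true
			}
		}
	}
}

func main() {
	// Stand-in status check: reports completion on the third poll.
	polls := 0
	finished := waitUntilDone(2*time.Second, 100*time.Millisecond, func() bool {
		polls++
		return polls >= 3
	})
	fmt.Println("finished before timeout:", finished)
}
```

If the deadline passes first, the caller can then decide to stop the tests, which matches the stop command listed above.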
We provide three reusable GitHub Actions workflows: _stress-test-packer-build.yaml, _pulumi-set-up.yaml, and _run-stress-test.yaml. You can fork this repo and call these workflows from your own repo.
This workflow builds the stress test loader, generates certificates for the SSL connection between stress test loader clients and servers, and creates an AMI using Packer. To call this workflow, you will need to provide the following secrets:
- TARGET_ACCOUNT: Your AWS Account ID.
- CA_KEY: The private key of your certificate authority (CA) used in SSL. You may generate a new key by
openssl genrsa -out example.org.key 2048
This workflow sets up the infrastructure on AWS for stress test clients. The infrastructure includes S3 buckets (may already exist), IAM roles and policies, VPC, and AutoScaling Groups in one or more regions. To call this workflow, you will need to provide the following input:
- REGIONS: The AWS regions to create ec2 instances, separated by commas (e.g., "us-east-1, us-west-2").
- DESIRED_CAPACITY: The number of ec2 instances to create in a region.
And the following secrets:
- GITHUB_ACTION_PULUMI_ACCESS_TOKEN: Your Pulumi Cloud access token.
- STRESSTESTLOADER_S3_CLIENT_BUCKET_NAME: The name of your S3 bucket that stores the stress test client.
- STRESSTESTLOADER_S3_LOG_BUCKET_NAME: The name of your S3 bucket that will store the stress test logs.
- TARGET_ACCOUNT: Your AWS Account ID.
- CA_KEY: The private key of your certificate authority (CA) used in SSL. This should be the same one you used for the Packer build.
This workflow will produce one output:
- EC2_IP: The public IP addresses of the EC2 instances created by this workflow.
This workflow starts the stress tests on the launched EC2 instances. It will wait until all stress tests are finished or the specified timeout. Optionally, it can ssh into the EC2 instances and fetch the logs. In the end, it will destroy the infrastructure. To call this workflow, you will need to provide the following input:
- GET_RESULTS: A boolean. If set, the stress test loader will ssh into the EC2 instances and print the last 100 lines of /tmp/stress-test-log to the GitHub Actions console.
- EC2_IP: The public IP addresses of the EC2 instances output by _pulumi-set-up.yaml.
- REGIONS: The AWS regions to create ec2 instances, separated by commas (e.g., "us-east-1, us-west-2").
- DESIRED_CAPACITY: The number of ec2 instances to create in a region.
And the following secrets:
- GITHUB_ACTION_PULUMI_ACCESS_TOKEN: Your Pulumi Cloud access token.
- STRESSTESTLOADER_S3_CLIENT_BUCKET_NAME: The name of your S3 bucket that stores the stress test client.
- STRESSTESTLOADER_S3_LOG_BUCKET_NAME: The name of your S3 bucket that will store the stress test logs.
- TARGET_ACCOUNT: Your AWS Account ID.
- STRESS_TEST_JSON: The config json for the stress test loader. Please see the section "Create a stress test client configuration json" in this README.
- STRESS_TEST_TOTAL_TIME: The maximum time (in seconds) to run the stress test. After the specified time, any running stress test will be stopped by the loader.
- SSH_PRIVATE_KEY: Your ssh private key (the counterpart of the public key supplied as public_key).
- CA_KEY: The private key of your certificate authority (CA) used in SSL. This should be the same one you used for the Packer build.