Fully automate stress cluster buildout and add support for azure file share mounting (#2106)

- Fully automate cluster buildout. Add azure file share mount to stress tests.
    - Moving the test/ad-hoc cluster back to the playground subscription
    - Upgrading kubernetes cluster version to 1.21.x to pull in support for the azure csi file driver
    - Adding high memory agent nodes to the base deployment
    - Enabling node autoscaler in the base deployment
- Publish stress watcher image in CI. Run docker build on PR
    - Using common image location across stress clusters to simplify buildout+deployment
- Add stress test debug file share usage example

Resolves #1903
benbp authored Oct 22, 2021
1 parent 262e4e5 commit dd1c8ca
Showing 32 changed files with 569 additions and 163 deletions.
4 changes: 2 additions & 2 deletions eng/common/scripts/stress-testing/deploy-stress-tests.ps1
@@ -76,9 +76,9 @@ function DeployStressTests(
[string]$environment = 'test',
[string]$repository = 'images',
[boolean]$pushImages = $false,
-[string]$clusterGroup = 'rg-stress-test-cluster-',
+[string]$clusterGroup = 'rg-stress-cluster-test',
[string]$deployId = 'local',
-[string]$subscription = 'Azure SDK Test Resources'
+[string]$subscription = 'Azure SDK Developer Playground'
) {
if ($PSCmdlet.ParameterSetName -eq 'DoLogin') {
Login $subscription $clusterGroup $pushImages
21 changes: 20 additions & 1 deletion eng/containers/ci.yml
@@ -22,6 +22,12 @@ parameters:
dockerFile: 'tools/test-proxy/docker/dockerfile-win'
stableTags:
- 'latest'
- name: stress_watcher
pool: 'ubuntu-20.04'
dockerRepo: 'stress/watcher'
dockerFile: 'tools/stress-cluster/services/Stress.Watcher/Dockerfile'
stableTags:
- 'latest'

trigger:
branches:
@@ -32,8 +38,18 @@ trigger:
- eng/containers/
- tools/test-proxy/docker/
- tools/keyvault-mock-attestation/Dockerfile
- tools/stress-cluster/services/Stress.Watcher/Dockerfile

-pr: none
+pr:
+branches:
+include:
+- main
+paths:
+include:
+- eng/containers/
+- tools/test-proxy/docker/
+- tools/keyvault-mock-attestation/Dockerfile
+- tools/stress-cluster/services/Stress.Watcher/Dockerfile

variables:
- name: containerRegistry
@@ -64,6 +80,7 @@ jobs:

- task: Docker@2
displayName: Push ${{ config.name }}:$(imageTag)
condition: and(succeeded(), ne(variables['Build.Reason'], 'PullRequest'))
inputs:
containerRegistry: $(containerRegistry)
repository: ${{ config.dockerRepo }}
@@ -81,6 +98,8 @@ jobs:

- task: Docker@2
displayName: Push ${{ config.name }}:${{ stableTag }}
condition: and(succeeded(), ne(variables['Build.Reason'], 'PullRequest'))

inputs:
containerRegistry: $(containerRegistry)
repository: ${{ config.dockerRepo }}
29 changes: 27 additions & 2 deletions tools/stress-cluster/chaos/README.md
@@ -10,6 +10,7 @@ The chaos environment is an AKS cluster (Azure Kubernetes Service) with several
* [Creating a Stress Test](#creating-a-stress-test)
* [Layout](#layout)
* [Stress Test Secrets](#stress-test-secrets)
* [Stress Test File Share](#stress-test-file-share)
* [Stress Test Azure Resources](#stress-test-azure-resources)
* [Helm Chart Dependencies](#helm-chart-dependencies)
* [Job Manifest](#job-manifest)
@@ -41,12 +42,14 @@ You will need the following tools to create and run tests:

## Access

-To access the cluster, run the following:
+To access the cluster, run the following. These commands are unnecessary for stress test deployment but can be useful
+for verifying permissions and directly interacting with containers via the kubernetes command line tool `kubectl`. For
+running the build and deployment script, see [Deploying a Stress Test](#deploying-a-stress-test).

```
az login
# Download the kubeconfig for the cluster
-az aks get-credentials --subscription "Azure SDK Test Resources" -g rg-stress-test-cluster- -n stress-test
+az aks get-credentials --subscription "Azure SDK Developer Playground" -g rg-stress-cluster-test -n stress-test
```

You should now be able to access the cluster. To verify, you should see a list of namespaces when running the command:
@@ -198,6 +201,28 @@ APPINSIGHTS_INSTRUMENTATIONKEY=<value>
RESOURCE_GROUP=<value>
```

### Stress Test File Share

Stress tests are encouraged to use app insights logs and metrics as much as possible for diagnostics. However, there
are times when larger files (such as verbose logs, heap dumps, packet captures, etc.) need to be persisted for
longer than the lifespan of the test itself.

All stress tests have an azure file share automatically mounted into the container by default. The share is global to
all tests in the cluster, and the path to it is available via the environment variable `$DEBUG_SHARE`.
The `$DEBUG_SHARE` path includes the namespace and pod name of the test in order to avoid path overlaps with other
tests. The `$DEBUG_SHARE_ROOT` path is also available, which points to the root of the file share, but this directory
should only be used in special circumstances and with caution.

NOTE: The share directory path MUST be created by the test before using it.

After writing debug files to the share, the files can be viewed by navigating to the [file share
portal](https://aka.ms/azsdk/stress/share),
selecting the `<namespace>/<pod name>` directory, and clicking the download link for any files in that directory.

See
[stress-debug-share-example](https://github.com/Azure/azure-sdk-tools/tree/main/tools/stress-cluster/chaos/examples/stress-debug-share-example)
for example usage.
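
As a minimal sketch of the steps above, a test entrypoint might create its share directory and write a debug file as
follows. This assumes a POSIX shell; the fallback path and the file name `verbose.log` are hypothetical and exist only
so the snippet can run outside the cluster, where `$DEBUG_SHARE` is not set.

```shell
#!/bin/sh
set -eu

# Inside the cluster, DEBUG_SHARE is provided by the container environment.
# The fallback below is a hypothetical local path so the sketch runs anywhere.
DEBUG_SHARE="${DEBUG_SHARE:-${TMPDIR:-/tmp}/debug-share/examples/example-pod}"

# The share directory is not created automatically; create it before writing.
mkdir -p "$DEBUG_SHARE"

# Persist a (hypothetical) verbose log so it outlives the test pod.
log_file="$DEBUG_SHARE/verbose.log"
echo "debug share example success" > "$log_file"

cat "$log_file"
```

Quoting `"$DEBUG_SHARE"` keeps the commands safe if the namespace or pod name ever contains unexpected characters.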

### Stress Test Azure Resources

Stress test resources can either be defined as azure bicep files, or an ARM template directly, provided there is
@@ -1,6 +1,6 @@
dependencies:
- name: stress-test-addons
repository: https://stresstestcharts.blob.core.windows.net/helm/
-version: 0.1.6
-digest: sha256:b97697ef5f303eec43e9a94fca8e312d20b8aed71318250499344aeca9880d31
-generated: "2021-08-16T12:57:01.466377-04:00"
+version: 0.1.9
+digest: sha256:2a32027871497958af15562a675bad47f4e29523cb18a91ce17b5078eaf9bbdf
+generated: "2021-10-15T13:37:14.6487529-04:00"
@@ -10,5 +10,5 @@ annotations:

dependencies:
- name: stress-test-addons
-version: 0.1.7
+version: 0.1.9
repository: https://stresstestcharts.blob.core.windows.net/helm/
@@ -0,0 +1,6 @@
dependencies:
- name: stress-test-addons
repository: https://stresstestcharts.blob.core.windows.net/helm/
version: 0.1.9
digest: sha256:2a32027871497958af15562a675bad47f4e29523cb18a91ce17b5078eaf9bbdf
generated: "2021-10-15T13:23:41.8857818-04:00"
@@ -0,0 +1,14 @@
apiVersion: v2
name: debug-share-example
description: An example stress test chart that uses a file share for debugging (e.g. for large log files, heap dumps)
version: 0.1.1
appVersion: v0.1
annotations:
stressTest: 'true' # enable auto-discovery of this test via `find-all-stress-packages.ps1`
example: 'true' # enable auto-discovery filtering `find-all-stress-packages.ps1 -filters @{example='true'}`
namespace: 'examples'

dependencies:
- name: stress-test-addons
version: 0.1.9
repository: https://stresstestcharts.blob.core.windows.net/helm/
@@ -0,0 +1,23 @@
{{- include "stress-test-addons.env-job-template.from-pod" (list . "stress.deploy-example") -}}
{{- define "stress.deploy-example" -}}
metadata:
labels:
testName: "debug-share-example"
spec:
containers:
- name: debug-share-example
image: busybox
command: ['sh', '-c']
args:
- |
set -ex;
mkdir -p $DEBUG_SHARE;
cd $DEBUG_SHARE;
pwd;
ls -R $DEBUG_SHARE_ROOT;
echo "debug share example success" > success;
cat success;
# The file share is mounted by default at the path $DEBUG_SHARE
# when including the container-env template
{{- include "stress-test-addons.container-env" . | nindent 6 }}
{{- end -}}
@@ -1,6 +1,6 @@
dependencies:
- name: stress-test-addons
repository: https://stresstestcharts.blob.core.windows.net/helm/
-version: 0.1.6
-digest: sha256:b97697ef5f303eec43e9a94fca8e312d20b8aed71318250499344aeca9880d31
-generated: "2021-08-13T17:24:51.4285458-04:00"
+version: 0.1.9
+digest: sha256:2a32027871497958af15562a675bad47f4e29523cb18a91ce17b5078eaf9bbdf
+generated: "2021-10-18T17:44:55.9281601-04:00"
@@ -10,5 +10,5 @@ annotations:

dependencies:
- name: stress-test-addons
-version: 0.1.7
+version: 0.1.9
repository: https://stresstestcharts.blob.core.windows.net/helm/
@@ -4,8 +4,8 @@
"metadata": {
"_generator": {
"name": "bicep",
-"version": "0.4.63.48766",
-"templateHash": "13987799099034517242"
+"version": "0.4.613.9944",
+"templateHash": "9940417978769654920"
}
},
"parameters": {