enhancement(tests): Kubernetes E2E test framework (#2702)
* Correct test-integration-kubernetes at Makefile

Signed-off-by: MOZGIII <[email protected]>

* Fix the tag overwrite logic at scripts/deploy-kubernetes-test.sh

Signed-off-by: MOZGIII <[email protected]>

* Make scripts/test-integration-kubernetes.sh more tweakable

Signed-off-by: MOZGIII <[email protected]>

* Reorder namespace and global config deletion command

The idea is that namespace removal takes the longest, so we'd rather leave it
hanging than the config deletion. Then, if the user gets tired of waiting and
sends a SIGINT, we don't leave the global config dangling - just the
namespace removal, which will complete in the background.
So it's just a user experience improvement.

Signed-off-by: MOZGIII <[email protected]>

* Add kubernetes-test-framework

Signed-off-by: MOZGIII <[email protected]>

* Implement a first PoC kubernetes test

Signed-off-by: MOZGIII <[email protected]>

* K8s integration test is really an e2e test, rename accordingly

Signed-off-by: MOZGIII <[email protected]>

* Do not even publish container image at CI since we use "none" minikube driver

Signed-off-by: MOZGIII <[email protected]>

* Isolate kubernetes e2e tests via required-features

Signed-off-by: MOZGIII <[email protected]>

* Add lock to the test framework

Signed-off-by: MOZGIII <[email protected]>

* Add some test cases to k8s e2e tests

Signed-off-by: MOZGIII <[email protected]>

* Add the ability to use quick debug builds in e2e tests

Useful to speed up the development cycles

Signed-off-by: MOZGIII <[email protected]>

* Use a single thread for test

Signed-off-by: MOZGIII <[email protected]>

* Made test framework async

Signed-off-by: MOZGIII <[email protected]>

* Allow specifying scope

Signed-off-by: MOZGIII <[email protected]>

* Correct arguments preparation for cargo test at scripts/test-e2e-kubernetes.sh

Signed-off-by: MOZGIII <[email protected]>

* Get rid of $(RUN) at test-e2e-kubernetes target at Makefile

Signed-off-by: MOZGIII <[email protected]>

* Set LOG at distribution/kubernetes/vector-namespaced.yaml

Signed-off-by: MOZGIII <[email protected]>

* Add a test to validate the pods are properly excluded

This took a while to implement, and required that we make the framework async.

Signed-off-by: MOZGIII <[email protected]>

* Fix a typo

Signed-off-by: MOZGIII <[email protected]>

* Add test to assert we properly collect logs from multiple namespaces

Signed-off-by: MOZGIII <[email protected]>

* Polish the test framework API

Signed-off-by: MOZGIII <[email protected]>

* Add E2E tests section to the contribution guide

Signed-off-by: MOZGIII <[email protected]>

* Kubernetes E2E tests are no longer experimental, should work consistently

Signed-off-by: MOZGIII <[email protected]>

* Add kubernetes version to the test name

Signed-off-by: MOZGIII <[email protected]>

* Bump minikube

Signed-off-by: MOZGIII <[email protected]>

* Bump kubernetes releases

Signed-off-by: MOZGIII <[email protected]>

* Use minikube cache instead of manually moving image around

Signed-off-by: MOZGIII <[email protected]>

* Test against multiple container runtimes

Signed-off-by: MOZGIII <[email protected]>

* Remove unused repeating_echo_cmd

Signed-off-by: MOZGIII <[email protected]>

* Display timeout

Signed-off-by: MOZGIII <[email protected]>

* Shorter title

Signed-off-by: MOZGIII <[email protected]>

* Switch to docker driver at minikube

Signed-off-by: MOZGIII <[email protected]>

* Remove the no_newline_at_eol test

Turns out, this test was invalid. The root cause is that, in
essence, Kubernetes expects logs to consist of lines, with a line being
defined as in POSIX - a sequence of characters *ending with \n*.
Thus it's *not valid* to emit a log line without the terminating newline
symbol in Kubernetes.
One effect of this is that when using the CRI log format, lines won't be
considered complete until a newline character arrives - and the
additional content before the newline will be added to the log line that's
missing the newline.

Given all of the above, there's no reason for this test to exist. The
reason it was added was a behaviour detail of the docker log driver,
but it's a mere implementation detail, and we should abstract from it.

The original statement of the test is also ill-posed because, as explained
above, a non-partial message (and, generally speaking, any message)
that doesn't end with a newline isn't a valid log line in the first place.

Signed-off-by: MOZGIII <[email protected]>

* Increase timeout to rollout vector to 30s

Signed-off-by: MOZGIII <[email protected]>

* Temporarily disable crio

Signed-off-by: MOZGIII <[email protected]>

* Apply workaround for CRIO

Signed-off-by: MOZGIII <[email protected]>

* Fix clippy

Signed-off-by: MOZGIII <[email protected]>

* Unset log level in skaffold dev config to fallback to the one set in container

Signed-off-by: MOZGIII <[email protected]>

* Add exec_tail to the test framework

Signed-off-by: MOZGIII <[email protected]>

* Fix a typo at the comment

Signed-off-by: MOZGIII <[email protected]>

* Fix the typos and styling at the crate doccomment

Signed-off-by: MOZGIII <[email protected]>

* Bump k8s versions for E2E tests at CI

Signed-off-by: MOZGIII <[email protected]>

* Rename template params to pascal case

Signed-off-by: MOZGIII <[email protected]>

* Remove Drop from ResourceFile

Signed-off-by: MOZGIII <[email protected]>

* Proper authors

Signed-off-by: MOZGIII <[email protected]>

* Rename crate to k8s-test-framework

More in-line with the naming patterns of the rest of the k8s-related
crates.

Signed-off-by: MOZGIII <[email protected]>

* Correct kubectl comment at the interface

Signed-off-by: MOZGIII <[email protected]>

* Bumped k8s and minikube versions at CI

Signed-off-by: MOZGIII <[email protected]>

* Add a comment explaining the timeout at pod filtering test

Signed-off-by: MOZGIII <[email protected]>

* Rollback minikube to 0.11.0

Signed-off-by: MOZGIII <[email protected]>

* Update CONTRIBUTING.md

Co-authored-by: Ana Hobden <[email protected]>
Signed-off-by: MOZGIII <[email protected]>

* Update distribution/kubernetes/vector-namespaced.yaml

Co-authored-by: Ana Hobden <[email protected]>
Signed-off-by: MOZGIII <[email protected]>

* Fix an error at CONTRIBUTING.md

Signed-off-by: MOZGIII <[email protected]>

* Remove a trivial line from the doc

Signed-off-by: MOZGIII <[email protected]>

* Make a second attempt to start up minikube if the first one failed

Signed-off-by: MOZGIII <[email protected]>

* Print minikube logs if it fails to start

Signed-off-by: MOZGIII <[email protected]>

* Provide a default for CONTAINER_IMAGE_REPO if USE_MINIKUBE_CACHE is set

Signed-off-by: MOZGIII <[email protected]>

* Update the CONTRIBUTING.md for CONTAINER_IMAGE_REPO default if USE_MINIKUBE_CACHE is set

Signed-off-by: MOZGIII <[email protected]>

* Increase all rollout/wait timeouts to one minute

Signed-off-by: MOZGIII <[email protected]>

* Fix syntax error around minikube start command

Signed-off-by: MOZGIII <[email protected]>

* Rollback k8s v1.16.13 to v1.16.12 at CI

Signed-off-by: MOZGIII <[email protected]>

* Add minikube cache autodetection

Signed-off-by: MOZGIII <[email protected]>

* Document USE_MINIKUBE_CACHE=auto mode

Signed-off-by: MOZGIII <[email protected]>

* Add a note on minikube bug to CONTRIBUTING.md

Signed-off-by: MOZGIII <[email protected]>

* Add a note on minikube on ZFS to CONTRIBUTING.md

Signed-off-by: MOZGIII <[email protected]>

* Fix the doc comment at scripts/deploy-kubernetes-test.sh

Signed-off-by: MOZGIII <[email protected]>

* Apply a workaround for kubectl from snap

Signed-off-by: MOZGIII <[email protected]>

* Extract and reuse scripts/skaffold-dockerignore.sh

Signed-off-by: MOZGIII <[email protected]>

Co-authored-by: Ana Hobden <[email protected]>
MOZGIII and Hoverbear authored Jul 29, 2020
1 parent d64c8b1 commit 6dafbeb
Showing 32 changed files with 1,855 additions and 152 deletions.
54 changes: 41 additions & 13 deletions .github/workflows/tests.yml
@@ -267,21 +267,49 @@ jobs:
- run: make slim-builds
- run: make test-integration-splunk

test-integration-kubernetes:
name: Integration - Linux, Kubernetes, flaky
# This is an experimental test. Allow it to fail without failing the whole
# workflow, but keep it executing on every build to gather stats.
continue-on-error: true
test-e2e-kubernetes:
name: E2E - K8s ${{ matrix.kubernetes_version }} / ${{ matrix.container_runtime }}
runs-on: ubuntu-latest
strategy:
matrix:
kubernetes:
- v1.18.2
- v1.17.5
- v1.16.9
- v1.15.11
- v1.14.10
minikube_version:
- 'v1.11.0' # https://github.com/kubernetes/minikube/issues/8799
kubernetes_version:
- 'v1.18.6'
- 'v1.17.9'
- 'v1.16.12' # v1.16.13 is broken, see https://github.com/kubernetes/minikube/issues/8840
- 'v1.15.12'
- 'v1.14.10'
container_runtime:
- docker
- containerd
- crio
fail-fast: false
steps:
- name: Temporarily off
run: "true"
- name: Setup Minikube
run: |
set -xeuo pipefail
curl -Lo kubectl \
'https://storage.googleapis.com/kubernetes-release/release/${{ matrix.kubernetes_version }}/bin/linux/amd64/kubectl'
sudo install kubectl /usr/local/bin/
curl -Lo minikube \
'https://storage.googleapis.com/minikube/releases/${{ matrix.minikube_version }}/minikube-linux-amd64'
sudo install minikube /usr/local/bin/
minikube config set profile minikube
minikube config set vm-driver docker
minikube config set kubernetes-version '${{ matrix.kubernetes_version }}'
minikube config set container-runtime '${{ matrix.container_runtime }}'
# Start minikube, try again once if fails and print logs if the second
# attempt fails too.
minikube start || minikube delete && minikube start || minikube logs
kubectl cluster-info
- name: Checkout
uses: actions/checkout@v1
- run: USE_CONTAINER=none make slim-builds
- run: make test-e2e-kubernetes
env:
USE_MINIKUBE_CACHE: "true"
PACKAGE_DEB_USE_CONTAINER: docker
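
A note on the `Setup Minikube` step: in the chain `minikube start || minikube delete && minikube start || minikube logs`, the shell gives `||` and `&&` equal precedence and groups them left to right, so it reads as `((start || delete) && start) || logs` and runs `minikube start` a second time even when the first attempt succeeds. A minimal sketch of the retry flow the comment describes, written with explicit control flow, might look like the following. The `start_with_retry` function and the stub commands are hypothetical and only stand in for the real `minikube` calls; they are not part of the workflow.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the workflow): run START; on failure run RESET
# and then START once more; if that also fails, run LOGS and report failure.
start_with_retry() {
  local start="$1" reset="$2" logs="$3"
  if "$start"; then
    return 0
  fi
  if "$reset" && "$start"; then
    return 0
  fi
  "$logs"
  return 1
}

# Stubs standing in for `minikube start`, `minikube delete` and `minikube logs`.
# The stubbed start fails on the first call and succeeds on the second.
attempts=0
stub_start() { attempts=$((attempts + 1)); [ "$attempts" -ge 2 ]; }
stub_delete() { echo "cluster deleted"; }
stub_logs() { echo "printing logs"; }

start_with_retry stub_start stub_delete stub_logs
echo "started after ${attempts} attempt(s)"
```

Unlike the one-line chain, this form starts the cluster only once on the happy path and returns a non-zero status after the logs are printed, so a failed setup is visible immediately.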
96 changes: 92 additions & 4 deletions CONTRIBUTING.md
@@ -37,6 +37,8 @@ expanding into more specifics.
1. [Benchmarking](#benchmarking)
1. [Profiling](#profiling)
1. [Kubernetes](#kubernetes)
1. [Dev flow](#kubernetes-dev-flow)
1. [E2E tests](#kubernetes-e2e-tests)
1. [Humans](#humans)
1. [Documentation](#documentation)
1. [Changelog](#changelog)
@@ -550,13 +552,15 @@ navigated in your favorite web browser.

### Kubernetes

#### Kubernetes Dev Flow

There is a special flow for when you develop portions of Vector that are
designed to work with Kubernetes, like `kubernetes_logs` source or the
`deployment/kubernetes/*.yaml` configs.

This flow facilitates building Vector and deploying it into a cluster.

#### Requirements
##### Requirements

There are some extra requirements besides what you'd normally need to work on
Vector:
@@ -570,7 +574,7 @@ Vector:
* [`minikube`](https://minikube.sigs.k8s.io/)-powered or other k8s cluster
* [`cargo watch`](https://github.com/passcod/cargo-watch)

#### The dev flow
##### The dev flow

Once you have the requirements, use the `scripts/skaffold.sh dev` command.

@@ -596,7 +600,7 @@ the cluster state and exit.
`scripts/skaffold.sh` wraps `skaffold`, you can use other `skaffold` subcommands
if it fits you better.

#### Troubleshooting
##### Troubleshooting

You might need to tweak `skaffold`, here are some hints:

@@ -614,7 +618,7 @@ You might need to tweak `skaffold`, here are some hints:
* For the rest of the `skaffold` tweaks you might want to apply check out
[this page](https://skaffold.dev/docs/environment/).

#### Going through the dev flow manually
##### Going through the dev flow manually

In some cases `skaffold` may not work. It's possible to go through the dev flow
manually, without `skaffold`.
@@ -627,6 +631,90 @@ required.
Essentially, the steps you have to take to deploy manually are the same that
`skaffold` will perform, and they're outlined at the previous section.

#### Kubernetes E2E tests

Kubernetes integration has a lot of parts that can go wrong.

To cope with the complexity and ensure we maintain high quality, we use
E2E (end-to-end) tests.

> E2E tests normally run at CI, so there's typically no need to run them
> manually.

##### Requirements

* `kubernetes` cluster (`minikube` has special support, but any cluster should
work)
* `docker`
* `kubectl`
* `bash`

Vector release artifacts are prepared for E2E tests, so the ability to do that
is required too, see Vector [docs](https://vector.dev) for more details.

> Note: `minikube` has a bug in the latest versions that affects our test
> process - see https://github.com/kubernetes/minikube/issues/8799.
> Use version `1.11.0` for now.

> Note: `minikube` has trouble running on ZFS systems. If you're using ZFS, we
> suggest using a cloud cluster or [`microk8s`](https://microk8s.io/) with a local
> registry.

##### Running the E2E tests

To run the E2E tests, use the following command:

```shell
CONTAINER_IMAGE_REPO=<your name>/vector-test make test-e2e-kubernetes
```

Where `CONTAINER_IMAGE_REPO` is the docker image repo name to use, without the
part after the `:` (the tag). Replace `<your name>` with your Docker Hub username.

You can also pass additional parameters to adjust the behavior of the test:

* `QUICK_BUILD=true` - use development build and a skaffold image from the dev
flow instead of a production docker image. Significantly speeds up the
preparation process, but doesn't guarantee the correctness in the release
build. Useful for development of the tests or Vector code to speed up the
iteration cycles.

* `USE_MINIKUBE_CACHE=true` - instead of pushing the built docker image to the
registry under the specified name, directly load the image into
a `minikube`-controlled cluster node.
Requires you to test against a `minikube` cluster. Eliminates the need to have
a registry to run tests.
When `USE_MINIKUBE_CACHE=true` is set, we provide a default value for the
`CONTAINER_IMAGE_REPO` so it can be omitted.
Can be set to `auto` (default) to automatically detect whether to use
`minikube cache` or not, based on the current `kubectl` context. To opt-out,
set `USE_MINIKUBE_CACHE=false`.

* `CONTAINER_IMAGE=<your name>/vector-test:tag` - completely skip the step
of building the Vector docker image, and use the specified image instead.
Useful to speed up iterations when you already have a Vector docker
image you want to test against.

* `SKIP_CONTAINER_IMAGE_PUBLISHING=true` - completely skip the image publishing
step. Useful to speed up iterations when you know
the Vector image you want to test is already available to the cluster you're
testing against.

* `SCOPE` - pass a filter to the `cargo test` command to select which tests to
run, effectively equivalent to `cargo test -- $SCOPE`.

Passing additional parameters is done like so:

```shell
QUICK_BUILD=true USE_MINIKUBE_CACHE=true make test-e2e-kubernetes
```

or

```shell
QUICK_BUILD=true CONTAINER_IMAGE_REPO=<your name>/vector-test make test-e2e-kubernetes
```
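
The `USE_MINIKUBE_CACHE=auto` mode described in the parameter list can be pictured as a small decision function. The sketch below is an assumption about how such detection might work, not the logic of `scripts/test-e2e-kubernetes.sh`: the `resolve_minikube_cache` name is hypothetical, and it assumes that the `kubectl` context of a `minikube` cluster is named `minikube` (the default profile name).

```shell
#!/usr/bin/env bash

# Hypothetical sketch: decide whether to load the image via `minikube cache`.
# $1 - the USE_MINIKUBE_CACHE value (`true`, `false` or `auto`)
# $2 - the name of the current kubectl context
resolve_minikube_cache() {
  local mode="${1:-auto}" context="${2:-}"
  case "$mode" in
    true | false)
      # An explicit choice always wins over detection.
      echo "$mode"
      ;;
    auto)
      # Assumption: a minikube-managed cluster uses the context name "minikube".
      if [ "$context" = "minikube" ]; then
        echo "true"
      else
        echo "false"
      fi
      ;;
    *)
      echo "unsupported USE_MINIKUBE_CACHE value: $mode" >&2
      return 1
      ;;
  esac
}

resolve_minikube_cache auto minikube
resolve_minikube_cache auto my-cloud-cluster
resolve_minikube_cache false minikube
```

In a real script the second argument would typically come from `kubectl config current-context`.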

## Humans

After making your change, you'll want to prepare it for Vector's users
12 changes: 12 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

10 changes: 9 additions & 1 deletion Cargo.toml
@@ -31,6 +31,7 @@ members = [
"lib/file-source",
"lib/tracing-limit",
"lib/vector-wasm",
"lib/k8s-test-framework",
]

[dependencies]
@@ -195,6 +196,7 @@ tokio-test = "0.2"
tokio = { version = "0.2", features = ["test-util"] }
assert_cmd = "1.0"
reqwest = { version = "0.10.6", features = ["json"] }
k8s-test-framework = { version = "0.1", path = "lib/k8s-test-framework" }

[features]
# Default features for *-unknown-linux-gnu and *-apple-darwin
@@ -431,11 +433,13 @@ kafka-integration-tests = ["sources-kafka", "sinks-kafka"]
loki-integration-tests = ["sinks-loki"]
pulsar-integration-tests = ["sinks-pulsar"]
splunk-integration-tests = ["sinks-splunk_hec", "warp"]
kubernetes-integration-tests = ["sources-kubernetes-logs"]

shutdown-tests = ["sources","sinks-console","sinks-prometheus","sinks-blackhole","unix","rdkafka","transforms-log_to_metric","transforms-lua"]
disable-resolv-conf = []

# E2E tests
kubernetes-e2e-tests = ["k8s-openapi"]

[[bench]]
name = "bench"
harness = false
@@ -453,5 +457,9 @@ name = "wasm"
harness = false
required-features = ["transforms-wasm", "transforms-lua"]

[[test]]
name = "kubernetes-e2e"
required-features = ["kubernetes-e2e-tests"]

[patch.'https://github.com/tower-rs/tower']
tower-layer = "0.3"
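
Because the `kubernetes-e2e` test target declares `required-features = ["kubernetes-e2e-tests"]`, Cargo skips it unless that feature is enabled. A sketch of a matching invocation follows. The `e2e_cargo_args` helper is hypothetical, the exact flags used by `scripts/test-e2e-kubernetes.sh` are not shown in this diff, and `--test-threads 1` is an assumption based on the "Use a single thread for test" commit note.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the cargo arguments, one per line, for running
# the `kubernetes-e2e` test target with an optional test name filter.
e2e_cargo_args() {
  local scope="${1:-}"
  local args=(test --test kubernetes-e2e --features kubernetes-e2e-tests --)
  # Assumption: the tests share one cluster, so run them on a single thread.
  args+=(--test-threads 1)
  # An empty scope means "run every test in the target".
  if [ -n "$scope" ]; then
    args+=("$scope")
  fi
  printf '%s\n' "${args[@]}"
}

# Show the command line for a filter named `multiple_ns` (an example name).
echo "cargo $(e2e_cargo_args multiple_ns | tr '\n' ' ')"
```

Everything after the `--` separator is handed to the test binary rather than to Cargo, which is why the `SCOPE` filter goes there.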
6 changes: 3 additions & 3 deletions Makefile
@@ -281,9 +281,9 @@ ifeq ($(AUTODESPAWN), true)
${MAYBE_ENVIRONMENT_EXEC} $(CONTAINER_TOOL)-compose stop
endif

PACKAGE_DEB_USE_CONTAINER ?= "$(USE_CONTAINER)"
test-integration-kubernetes: ## Runs Kubernetes integration tests (Sorry, no `ENVIRONMENT=true` support)
PACKAGE_DEB_USE_CONTAINER="$(PACKAGE_DEB_USE_CONTAINER)" USE_CONTAINER=none $(RUN) test-integration-kubernetes
PACKAGE_DEB_USE_CONTAINER ?= $(USE_CONTAINER)
test-e2e-kubernetes: ## Runs Kubernetes E2E tests (Sorry, no `ENVIRONMENT=true` support)
PACKAGE_DEB_USE_CONTAINER="$(PACKAGE_DEB_USE_CONTAINER)" scripts/test-e2e-kubernetes.sh

test-shutdown: ## Runs shutdown tests
ifeq ($(AUTOSPAWN), true)
5 changes: 5 additions & 0 deletions distribution/kubernetes/vector-namespaced.yaml
@@ -49,6 +49,11 @@ spec:
valueFrom:
fieldRef:
fieldPath: metadata.namespace
# Set a reasonable log level to avoid issues with internal logs
# overwriting console output at E2E tests. Feel free to change at
# a real deployment.
- name: LOG
value: info
volumeMounts:
- name: var-log
mountPath: /var/log/
3 changes: 3 additions & 0 deletions kustomization.yaml
@@ -8,3 +8,6 @@ resources:
- skaffold/manifests/namespace.yaml
- skaffold/manifests/config.yaml
- distribution/kubernetes/vector-namespaced.yaml

patchesStrategicMerge:
- skaffold/manifests/patches/env.yaml
16 changes: 16 additions & 0 deletions lib/k8s-test-framework/Cargo.toml
@@ -0,0 +1,16 @@
[package]
name = "k8s-test-framework"
version = "0.1.0"
authors = ["Vector Contributors <[email protected]>"]
edition = "2018"
description = "Kubernetes Test Framework used to test Vector in Kubernetes"

[dependencies]
k8s-openapi = { version = "0.9", default-features = false, features = ["v1_15"] }
serde_json = "1"
tempfile = "3"
once_cell = "1"
tokio = { version = "0.2", features = ["process", "io-util"] }

[dev-dependencies]
tokio = { version = "0.2", features = ["macros", "rt-threaded"] }
32 changes: 32 additions & 0 deletions lib/k8s-test-framework/src/exec_tail.rs
@@ -0,0 +1,32 @@
//! Perform a log lookup.
use super::{Reader, Result};
use std::process::Stdio;
use tokio::process::Command;

/// Exec a `tail` command reading the specified `file` within a `Container`
/// in a `Pod` of a specified `resource` at the specified `namespace` via the
/// specified `kubectl_command`.
/// Returns a [`Reader`] that manages the reading process.
pub fn exec_tail(
    kubectl_command: &str,
    namespace: &str,
    resource: &str,
    file: &str,
) -> Result<Reader> {
    let mut command = Command::new(kubectl_command);

    command.stdin(Stdio::null()).stderr(Stdio::inherit());

    command.arg("exec");
    command.arg("-n").arg(namespace);
    command.arg(resource);
    command.arg("--");
    command.arg("tail");
    command.arg("--follow=name");
    command.arg("--retry");
    command.arg(file);

    let reader = Reader::spawn(command)?;
    Ok(reader)
}
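
For reference, the process that `exec_tail` spawns corresponds to the shell command sketched below. The argument order mirrors the Rust code above; the `exec_tail_cmd` helper and the sample namespace, resource and file values are hypothetical and serve only as an illustration.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the command line that `exec_tail` assembles.
# $1 - kubectl command, $2 - namespace, $3 - resource, $4 - file to tail
exec_tail_cmd() {
  local kubectl_command="$1" namespace="$2" resource="$3" file="$4"
  # With GNU tail, `--follow=name --retry` keeps following the file by name
  # and keeps trying to open it if it is missing or gets replaced.
  local cmd=(
    "$kubectl_command" exec
    -n "$namespace"
    "$resource"
    --
    tail --follow=name --retry "$file"
  )
  printf '%s\n' "${cmd[*]}"
}

exec_tail_cmd kubectl test-ns daemonset/vector /tmp/example.log
```

The `--` separator matters here: it tells `kubectl exec` that everything after it belongs to the command run inside the container, so the `tail` flags are not parsed by `kubectl` itself.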