Integration test for cni-repair-controller #316
Conversation
Force-pushed from `9fbaa2e` to `e4fb23b`
(Note this will fail until linkerd/linkerd2#11699 lands)

The `integration-cni-plugin.yml` workflow (formerly known as `cni-plugin-integration.yml`) has been expanded to run the new recipe `cni-repair-controller-integration`, which performs the following steps:

- Rebuilds the `linkerd-cni-repair-controller` crate and `cni-plugin`
- Creates a new cluster at version `v1.27.6-k3s1` (the version required for Calico to work)
- Triggers a new `./cni-repair-controller/integration/run.sh` script, which:
  - Installs Calico
  - Installs the latest linkerd-edge CLI
  - Installs `linkerd-cni` and waits for it to become ready
  - Installs the linkerd control plane in CNI mode
  - Installs a `pause` DaemonSet

The `linkerd-cni` instance has been configured to include an extra initContainer that delays its start by 15s. Since we waited for it to become ready, this doesn't affect the initial install. But then a new node is added to the cluster, and this delay allows the new `pause` DaemonSet replica to start before the full CNI config is ready, so we can observe its failure to come up. Once the new `linkerd-cni` replica becomes ready, we observe how the failed `pause` replica is replaced by a new healthy one. A rough sketch of this added-node phase appears below.
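For illustration, here is a minimal sketch of what that added-node phase of such a script might look like. The cluster name, node name, and label selectors are hypothetical, not taken from the actual `run.sh`, and it assumes a k3d cluster and a `pause` DaemonSet labeled `app=pause`:

```sh
#!/usr/bin/env bash
set -euo pipefail

cluster=cni-test  # hypothetical cluster name

# Add a second node; the 15s sleep initContainer in linkerd-cni means the
# pause pod scheduled here starts before the CNI config is fully written.
k3d node create node2 --cluster "$cluster"

# The new pause replica should fail to come up (no network sandbox yet).
kubectl wait pod --for=condition=Ready=false --timeout=60s \
  --selector app=pause --field-selector spec.nodeName=k3d-node2-0

# Once the new linkerd-cni replica is ready, the repair controller should
# delete the broken pod so the DaemonSet replaces it with a healthy one.
kubectl wait pod --namespace linkerd-cni --for=condition=Ready --timeout=120s \
  --selector k8s-app=linkerd-cni
kubectl wait pod --for=condition=Ready --timeout=120s \
  --selector app=pause --field-selector spec.nodeName=k3d-node2-0
```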
Force-pushed from `e4fb23b` to `89c3415`
.dockerignore (Outdated)

```diff
@@ -1,2 +1 @@
 rust-toolchain
-target/
```
Was this change accidental? Or do we need the target dir in the context when building an image?
Good catch. This was a leftover from an iteration where the binary wasn't built inside the same Dockerfile.
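For context, building the binary inside the Dockerfile typically means a multi-stage build along these lines, so the host's `target/` directory doesn't need to be in the build context at all. This is a sketch, not the actual Dockerfile in this PR; the base images and paths are assumptions:

```dockerfile
# Build stage: compile the crate inside the image, so the host's target/
# directory is irrelevant to the build context.
FROM rust:1.74 AS build
WORKDIR /src
COPY . .
RUN cargo build --release --package linkerd-cni-repair-controller

# Runtime stage: carry over only the resulting binary.
FROM debian:bookworm-slim
COPY --from=build /src/target/release/linkerd-cni-repair-controller /usr/local/bin/
ENTRYPOINT ["/usr/local/bin/linkerd-cni-repair-controller"]
```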
```yaml
# the full CNI config is ready and enter a failure mode
extraInitContainers:
- name: sleep
  image: busybox
```
on the nitpicky side, the CNI plugin runs alpine, can we re-use the same image so we don't pull busybox in tests?
Good thinking 👍
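The agreed change would presumably look something like this in the test values. This is a sketch: the image tag and the `command` field are assumptions, since the diff excerpt above does not show them:

```yaml
extraInitContainers:
- name: sleep
  # Reuse the alpine image the CNI plugin already runs on,
  # rather than pulling busybox just for the test.
  image: alpine:3.18
  command: ["sleep", "15"]
```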