Add cni-repair-controller to linkerd-cni DaemonSet #11699
Conversation
Followup to linkerd/linkerd2-proxy-init#306. Fixes #11073.

This adds the `reinitialize-pods` container to the `linkerd-cni` DaemonSet, along with its config in `values.yaml`. The `linkerd-cni` version is also bumped so the image contains the new binary for this controller.

## TO-DOs

- Integration test
Could this also be marked with the Fixes #11735 tag?
(Note this will fail until linkerd/linkerd2#11699 lands)

The `integration-cni-plugin.yml` workflow (formerly known as `cni-plugin-integration.yml`) has been expanded to run the new recipe `reinitialize-pods-integration`, which performs the following steps:

- Rebuilds the `linkerd-reinitialize-pods` crate and `cni-plugin`. The `Dockerfile-cni-plugin` file has been refactored to have two main targets, `runtime` and `runtime-test`, the latter picking up the `linkerd-reinitialize-pods` binary that has just been built locally.
- Creates a new cluster at version `v1.27.6-k3s1` (version required for Calico to work)
- Triggers a new `./reinitialize-pods/integration/run.sh` script, which:
  - Installs Calico
  - Installs the latest linkerd-edge CLI
  - Installs `linkerd-cni` and waits for it to become ready
  - Installs the linkerd control plane in CNI mode
  - Installs a `pause` DaemonSet

The `linkerd-cni` instance has been configured to include an extra initContainer that delays its start by 15s. Since we waited for it to become ready, this doesn't affect the initial install. A new node is then added to the cluster, and this delay allows the new `pause` DaemonSet replica to start before the full CNI config is ready, so we can observe its failure to come up. Once the new `linkerd-cni` replica becomes ready, we observe the failed `pause` replica being replaced by a new healthy one.
This is perfect. When can I test this?
This should be shipped with an edge release in the coming weeks 🙂
Fixes linkerd/linkerd2#11073

This fixes the issue of injected pods that cannot acquire a proper network config because `linkerd-cni` and/or the cluster's network CNI haven't fully started. They are left in a permanent crash loop and, once the CNI is ready, they need to be restarted externally, which is what this controller does.

This controller, `linkerd-cni-repair-controller`, watches events for pods on the current node that have been injected but are in a terminated state and whose `linkerd-network-validator` container exited with code 95, and deletes them so they can restart with a proper network config. The controller is to be deployed as an additional container in the `linkerd-cni` DaemonSet (addressed in linkerd/linkerd2#11699).

This exposes two custom counter metrics: `linkerd_cni_repair_controller_queue_overflow` (in the spirit of the destination controller's `endpoint_updates_queue_overflow`) and `linkerd_cni_repair_controller_deleted`.
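To make the trigger condition concrete, here is a minimal sketch of the pod-status shape the controller reacts to. The pod name, node name, and field placement are illustrative assumptions (the validator is shown under `initContainerStatuses`, where it typically runs in CNI mode); only the container name and exit code 95 come from the description above.

```yaml
# Hypothetical pod illustrating the repair condition: an injected pod on this
# node whose linkerd-network-validator terminated with exit code 95.
apiVersion: v1
kind: Pod
metadata:
  name: example-injected-pod            # placeholder name
  annotations:
    linkerd.io/inject: enabled
spec:
  nodeName: node-1                      # the controller only watches its own node
status:
  initContainerStatuses:
    - name: linkerd-network-validator
      state:
        terminated:
          exitCode: 95                  # validation failed: iptables rules never arrived
```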
(Note this will fail until linkerd/linkerd2#11699 lands)

The `integration-cni-plugin.yml` workflow (formerly known as `cni-plugin-integration.yml`) has been expanded to run the new recipe `cni-repair-controller-integration`, which performs the following steps:

- Rebuilds the `linkerd-cni-repair-controller` crate and `cni-plugin`
- Creates a new cluster at version `v1.27.6-k3s1` (version required for Calico to work)
- Triggers a new `./cni-repair-controller/integration/run.sh` script, which:
  - Installs Calico
  - Installs the latest linkerd-edge CLI
  - Installs `linkerd-cni` and waits for it to become ready
  - Installs the linkerd control plane in CNI mode
  - Installs a `pause` DaemonSet

The `linkerd-cni` instance has been configured to include an extra initContainer that delays its start by 15s. Since we waited for it to become ready, this doesn't affect the initial install. A new node is then added to the cluster, and this delay allows the new `pause` DaemonSet replica to start before the full CNI config is ready, so we can observe its failure to come up. Once the new `linkerd-cni` replica becomes ready, we observe the failed `pause` replica being replaced by a new healthy one.
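As a rough illustration of the delay trick described above, the extra initContainer could look something like the snippet below. The container name, image, and exact wiring into the chart are assumptions for illustration only; the integration script configures this in its own way.

```yaml
# Hypothetical extra initContainer on the linkerd-cni DaemonSet pod spec,
# used only by the integration test to delay startup by 15 seconds.
initContainers:
  - name: startup-delay
    image: busybox:1.36
    command: ["sleep", "15"]
```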
charts/linkerd2-cni/values.yaml
# Defaults to system-cluster-critical so it signals the scheduler to start
# before application pods, but after CNI plugins (whose priorityClassName is
# system-node-critical). This isn't strictly enforced.
priorityClassName: "system-cluster-critical"
Is this needed for the cni-repair controller specifically? If not, can you pull it out into a separate change?
If CNI plugins should run at system-node-critical, why wouldn't the Linkerd CNI run at system-node-critical? If we don't have that as a default now, is there a reason for that? I.e. are there any downsides to setting this as a default?
If we omit this change from this PR, the PR feels less risky to me.
It's not required for this PR. The reasoning for using `system-cluster-critical` was to allow the cluster's main CNI plugin to run first, reducing the chance of hitting the race condition the repair controller attempts to fix. But according to my testing, these class names are either best-effort or the prioritization mechanism is simply not implemented as advertised. I'll remove this for now.
This edge release introduces a number of fixes and improvements. Most notably, it introduces a new `cni-repair-controller` binary to the CNI plugin image. The controller will automatically restart pods that have not received their iptables configuration.

* Removed shortnames from Tap API resources to avoid colliding with existing Kubernetes resources ([#11816]; fixes [#11784])
* Introduced a new ExternalWorkload CRD to support the upcoming mesh expansion feature ([#11805])
* Changed `MeshTLSAuthentication` resource validation to allow SPIFFE URI identities ([#11882])
* Introduced a new `cni-repair-controller` to the `linkerd-cni` DaemonSet to automatically restart misconfigured pods that are missing iptables rules ([#11699]; fixes [#11073])
* Fixed a `"duplicate metrics"` warning in the multicluster service-mirror component ([#11875]; fixes [#11839])
* Added metric labels and weights to `linkerd diagnostics endpoints` json output ([#11889])
* Changed how `Server` updates are handled in the destination service. The change ensures that during a cluster resync, consumers won't be overloaded by redundant updates ([#11907])
* Changed `linkerd install` error output to add a newline when a Kubernetes client cannot be successfully initialised ([#11917])

[#11816]: #11816
[#11784]: #11784
[#11805]: #11805
[#11882]: #11882
[#11699]: #11699
[#11073]: #11073
[#11875]: #11875
[#11839]: #11839
[#11889]: #11889
[#11907]: #11907
[#11917]: #11917

Signed-off-by: Matei David <[email protected]>
This stable release adds a cni-repair-controller, which fixes the issue of injected pods that cannot acquire a proper network config because linkerd-cni and/or the cluster's network CNI haven't fully started ([#11699]). It also fixes a bug in the destination controller where having a large number of Server resources could cause the destination controller to use an excessive amount of CPU ([#11907]). Finally, it fixes a conflict with tap resource shortnames which was causing warnings from kubectl v1.29.0+ ([#11816]).

[#11699]: #11699
[#11907]: #11907
[#11816]: #11816
Followup to linkerd/linkerd2-proxy-init#306. Fixes #11073.

This adds the `cni-repair-controller` container to the `linkerd-cni` DaemonSet, along with its config in `values.yaml`. Note that this is disabled by default; to enable it, set `repairController.enabled=true`.

Also, the `linkerd-cni` version is bumped so the image contains the new binary for this controller.

Finally, `priorityClassName: system-cluster-critical` was added to the DaemonSet, which should signal the scheduler to give it priority over application pods, but this has proven not to be reliable, hence the need for the new controller.
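For reference, a minimal sketch of what the corresponding `values.yaml` section might look like, under the assumption that only the `repairController.enabled` flag is as described above; everything else (comments, defaults) is illustrative.

```yaml
# charts/linkerd2-cni/values.yaml (sketch; field names beyond `enabled` are assumptions)
repairController:
  # Disabled by default; set to true to add the cni-repair-controller
  # container to the linkerd-cni DaemonSet.
  enabled: false
```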