
Add cni-repair-controller to linkerd-cni DaemonSet #11699

Merged · 6 commits · Jan 5, 2024
16 changes: 13 additions & 3 deletions charts/linkerd2-cni/README.md
@@ -31,19 +31,29 @@ Kubernetes: `>=1.21.0-0`
| ignoreOutboundPorts | string | `""` | Default set of outbound ports to skip via iptables |
| image.name | string | `"cr.l5d.io/linkerd/cni-plugin"` | Docker image for the CNI plugin |
| image.pullPolicy | string | `"IfNotPresent"` | Pull policy for the linkerd-cni container |
| image.version | string | `"v1.2.2"` | Tag for the CNI container Docker image |
| image.version | string | `"v1.3.0"` | Tag for the CNI container Docker image |
| imagePullSecrets | list | `[]` | |
| inboundProxyPort | int | `4143` | Inbound port for the proxy container |
| logLevel | string | `"info"` | Log level for the CNI plugin |
| outboundProxyPort | int | `4140` | Outbound port for the proxy container |
| podLabels | object | `{}` | Additional labels to add to all pods |
| portsToRedirect | string | `""` | Ports to redirect to proxy |
| priorityClassName | string | `""` | Kubernetes priorityClassName for the CNI plugin's Pods |
| priorityClassName | string | `"system-cluster-critical"` | Kubernetes priorityClassName for the CNI plugin's Pods. Defaults to system-cluster-critical so it signals the scheduler to start before application pods, but after CNI plugins (whose priorityClassName is system-node-critical). This isn't strictly enforced. |
| privileged | bool | `false` | Run the install-cni container in privileged mode |
| proxyAdminPort | int | `4191` | Admin port for the proxy container |
| proxyControlPort | int | `4190` | Control port for the proxy container |
| proxyUID | int | `2102` | User id under which the proxy shall be run |
| resources | object | `{"cpu":{"limit":"","request":""},"ephemeral-storage":{"limit":"","request":""},"memory":{"limit":"","request":""}}` | Resource requests and limits for linkerd-cni daemonset containers |
| repairController | object | `{"enableSecurityContext":true,"enabled":false,"logFormat":"plain","logLevel":"info","resources":{"cpu":{"limit":"","request":""},"ephemeral-storage":{"limit":"","request":""},"memory":{"limit":"","request":""}}}` | The cni-repair-controller scans pods on each node to find those that have been injected by linkerd, and whose linkerd-network-validator container has failed. This is usually caused by a race between linkerd-cni and the CNI plugin used in the cluster. This controller deletes those failed pods so they can restart and retry acquiring a proper network config. |
| repairController.enableSecurityContext | bool | `true` | Include a securityContext in the repair-controller container |
| repairController.logFormat | string | plain | Log format (`plain` or `json`) for the repair-controller container |
| repairController.logLevel | string | info | Log level for the repair-controller container |
| repairController.resources.cpu.limit | string | `""` | Maximum amount of CPU units that the repair-controller container can use |
| repairController.resources.cpu.request | string | `""` | Amount of CPU units that the repair-controller container requests |
| repairController.resources.ephemeral-storage.limit | string | `""` | Maximum amount of ephemeral storage that the repair-controller container can use |
| repairController.resources.ephemeral-storage.request | string | `""` | Amount of ephemeral storage that the repair-controller container requests |
| repairController.resources.memory.limit | string | `""` | Maximum amount of memory that the repair-controller container can use |
| repairController.resources.memory.request | string | `""` | Amount of memory that the repair-controller container requests |
| resources | object | `{"cpu":{"limit":"","request":""},"ephemeral-storage":{"limit":"","request":""},"memory":{"limit":"","request":""}}` | Resource requests and limits for linkerd-cni daemonset container |
| resources.cpu.limit | string | `""` | Maximum amount of CPU units that the cni container can use |
| resources.cpu.request | string | `""` | Amount of CPU units that the cni container requests |
| resources.ephemeral-storage.limit | string | `""` | Maximum amount of ephemeral storage that the cni container can use |
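The `repairController.*` entries above are new in this release. As a minimal sketch, a values override that turns the controller on could look like the following — the file name, repo alias, and resource figures are illustrative, not chart defaults:

```yaml
# repair-controller-values.yaml -- illustrative override; apply with, e.g.:
#   helm upgrade linkerd2-cni linkerd/linkerd2-cni -f repair-controller-values.yaml
repairController:
  enabled: true
  # optional: emit structured logs instead of the default "plain"
  logFormat: json
  resources:
    cpu:
      request: "10m"
    memory:
      limit: "64Mi"
      request: "32Mi"
```
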
59 changes: 59 additions & 0 deletions charts/linkerd2-cni/templates/cni-plugin.yaml
@@ -112,6 +112,12 @@ rules:
- apiGroups: [""]
  resources: ["pods", "nodes", "namespaces", "services"]
  verbs: ["list", "get", "watch"]
# Added for the repair controller: it needs to delete failed pods, and to
# create Events (presumably to record those deletions).
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["delete"]
- apiGroups: ["events.k8s.io"]
  resources: ["events"]
  verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
@@ -274,6 +280,59 @@ spec:
        {{- if .Values.resources }}
        {{- include "partials.resources" .Values.resources | nindent 8 }}
        {{- end }}
      {{- if .Values.repairController.enabled }}
      # This container watches for pods whose linkerd-network-validator
      # container failed, probably because of a race condition while setting up
      # the CNI plugin chain, and deletes those pods so they can try acquiring a
      # proper network config again.
      - name: repair-controller
        image: {{ .Values.image.name -}}:{{- .Values.image.version }}
        imagePullPolicy: {{ .Values.image.pullPolicy }}
        env:
        - name: LINKERD_CNI_REPAIR_CONTROLLER_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: LINKERD_CNI_REPAIR_CONTROLLER_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        command:
        - /usr/lib/linkerd/linkerd-cni-repair-controller
        args:
        - --admin-addr=0.0.0.0:9990
        - --log-format
        - {{ .Values.repairController.logFormat }}
        - --log-level
        - {{ .Values.repairController.logLevel }}
        livenessProbe:
          httpGet:
            path: /live
            port: admin-http
        readinessProbe:
          failureThreshold: 7
          httpGet:
            path: /ready
            port: admin-http
          initialDelaySeconds: 10
        ports:
        - containerPort: 9990
          name: admin-http
        # The securityContext is included only when
        # repairController.enableSecurityContext is true.
        {{- if .Values.repairController.enableSecurityContext }}
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          privileged: false
          readOnlyRootFilesystem: true
          seccompProfile:
            type: RuntimeDefault
        {{- end }}
        # Use the repair controller's own resource settings here, not the main
        # linkerd-cni container's.
        {{- if .Values.repairController.resources }}
        {{- include "partials.resources" .Values.repairController.resources | nindent 8 }}
        {{- end }}
      {{- end }}
      volumes:
      {{- if ne .Values.destCNIBinDir .Values.destCNINetDir }}
      - name: cni-bin-dir
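For context, the kind of pod the controller targets would look roughly like this — a sketch of the assumed status shape, not output captured from a real cluster:

```yaml
# Sketch: a linkerd-injected pod left broken because its
# linkerd-network-validator container failed, typically because the
# cluster's CNI plugin wasn't ready when the pod's network was set up.
# The repair controller deletes such pods so their owner (e.g. a
# ReplicaSet) recreates them with a correct network config.
status:
  initContainerStatuses:
  - name: linkerd-network-validator
    state:
      terminated:
        exitCode: 1
        reason: Error
```
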
60 changes: 42 additions & 18 deletions charts/linkerd2-cni/values.yaml
@@ -26,8 +26,11 @@ destCNINetDir: "/etc/cni/net.d"
destCNIBinDir: "/opt/cni/bin"
# -- Configures the CNI plugin to use the -w flag for the iptables command
useWaitFlag: false
# -- Kubernetes priorityClassName for the CNI plugin's Pods
priorityClassName: ""
# -- Kubernetes priorityClassName for the CNI plugin's Pods.
# Defaults to system-cluster-critical so it signals the scheduler to start
# before application pods, but after CNI plugins (whose priorityClassName is
# system-node-critical). This isn't strictly enforced.
priorityClassName: "system-cluster-critical"
Review comment (Member):

> Is this needed for the cni-repair-controller specifically? If not, can you pull it out into a separate change?
>
> If CNI plugins should run at system-node-critical, why wouldn't the Linkerd CNI also run at system-node-critical? If we don't have that as a default now, is there a reason for that, i.e. are there any downsides to setting it as a default?
>
> If we omit this change from this PR, this change feels less risky to me.

Reply (Member, Author):

> It's not required for this PR. The reasoning for using system-cluster-critical was to let the cluster's main CNI plugin run first, lessening the chance of hitting the race condition the repair controller attempts to fix. But according to my testing, these class names are either best-effort or the prioritization mechanism is simply not implemented as advertised. I'll remove this for now.

# -- Add a PSP resource and bind it to the linkerd-cni ServiceAccounts.
# Note PSP has been deprecated since k8s v1.21
@@ -53,7 +56,7 @@ image:
  # -- Docker image for the CNI plugin
  name: "cr.l5d.io/linkerd/cni-plugin"
  # -- Tag for the CNI container Docker image
  version: "v1.2.2"
  version: "v1.3.0"
  # -- Pull policy for the linkerd-cni container
  pullPolicy: IfNotPresent

@@ -71,22 +74,43 @@ imagePullSecrets: []

# -- Add additional initContainers to the daemonset
extraInitContainers: []
# - name: wait-for-other-cni
# image: busybox:1.33
# command:
# - /bin/sh
# - -xc
# - |
# for i in $(seq 1 180); do
# test -f /host/etc/cni/net.d/10-aws.conflist && exit 0
# sleep 1
# done
# exit 1
# volumeMounts:
# - mountPath: /host/etc/cni/net.d
# name: cni-net-dir

# -- Resource requests and limits for linkerd-cni daemonset containers
# -- The cni-repair-controller scans pods on each node to find those that have
# been injected by linkerd, and whose linkerd-network-validator container has
# failed. This is usually caused by a race between linkerd-cni and the CNI
# plugin used in the cluster. This controller deletes those failed pods so they
# can restart and retry acquiring a proper network config.
repairController:
  enabled: false

  # -- Log level for the repair-controller container
  # @default -- info
  logLevel: info
  # -- Log format (`plain` or `json`) for the repair-controller container
  # @default -- plain
  logFormat: plain

  # -- Include a securityContext in the repair-controller container
  enableSecurityContext: true

  resources:
    cpu:
      # -- Maximum amount of CPU units that the repair-controller container can use
      limit: ""
      # -- Amount of CPU units that the repair-controller container requests
      request: ""
    memory:
      # -- Maximum amount of memory that the repair-controller container can use
      limit: ""
      # -- Amount of memory that the repair-controller container requests
      request: ""
    ephemeral-storage:
      # -- Maximum amount of ephemeral storage that the repair-controller container can use
      limit: ""
      # -- Amount of ephemeral storage that the repair-controller container requests
      request: ""

# -- Resource requests and limits for linkerd-cni daemonset container
resources:
  cpu:
    # -- Maximum amount of CPU units that the cni container can use
7 changes: 7 additions & 0 deletions cli/cmd/testdata/install-cni-plugin_default.golden


6 changes: 6 additions & 0 deletions cli/cmd/testdata/install-cni-plugin_fully_configured.golden


7 changes: 7 additions & 0 deletions cli/cmd/testdata/install-cni-plugin_skip_ports.golden


9 changes: 8 additions & 1 deletion cli/cmd/testdata/install_cni_helm_default_output.golden


6 changes: 6 additions & 0 deletions cli/cmd/testdata/install_cni_helm_override_output.golden


10 changes: 10 additions & 0 deletions pkg/charts/cni/values.go
@@ -35,6 +35,15 @@ type Resources struct {
	EphemeralStorage Constraints `json:"ephemeral-storage"`
}

// RepairController contains the config for the repair-controller container
type RepairController struct {
	Image                 Image     `json:"image"`
	LogLevel              string    `json:"logLevel"`
	LogFormat             string    `json:"logFormat"`
	EnableSecurityContext bool      `json:"enableSecurityContext"`
	Resources             Resources `json:"resources"`
}

// Values contains the top-level elements in the cni Helm chart
type Values struct {
	InboundProxyPort uint `json:"inboundProxyPort"`
@@ -60,6 +69,7 @@ type Values struct {
	EnablePSP        bool             `json:"enablePSP"`
	Privileged       bool             `json:"privileged"`
	Resources        Resources        `json:"resources"`
	RepairController RepairController `json:"repairController"`
}

// NewValues returns a new instance of the Values type.
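For orientation, the struct's json tags mirror the chart's values keys, so a fragment like the following — purely illustrative — is what would unmarshal into `RepairController`:

```yaml
# Illustrative values fragment; each key matches a json tag on the
# RepairController struct above.
repairController:
  logLevel: info
  logFormat: plain
  enableSecurityContext: true
  resources:
    cpu:
      limit: ""
      request: ""
```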