stable-2.14.9 #11949

Merged
merged 5 commits into from
Jan 19, 2024

@adleong adleong commented Jan 18, 2024

This stable release adds a cni-repair-controller which fixes the issue of
injected pods that cannot acquire proper network config because linkerd-cni
and/or the cluster's network CNI haven't fully started (#11699). It also
fixes a bug in the destination controller where having a large number of
Server resources could cause the destination controller to use an excessive
amount of CPU (#11907). Finally, it fixes a conflict with tap resource
shortnames which was causing warnings from kubectl v1.29.0+ (#11816).

alpeb and others added 4 commits January 18, 2024 17:49
Followup to linkerd/linkerd2-proxy-init#306
Fixes #11073

This adds the `reinitialize-pods` container to the `linkerd-cni`
DaemonSet, along with its config in `values.yaml`.

The `linkerd-cni` version is also bumped so that the image contains the
new binary for this controller.
#11907)

Whenever the destination controller's informer receives an update of a Server resource, it checks every portPublisher in the endpointsWatcher to see if the Server selects any pods in that servicePort and updates those pods' opaque protocol field. Regardless of whether any pods were matched or the opaque protocol changed, an update is sent to each listener. This results in an update to every endpointTranslator each time a Server is updated. During a resync, we get an update for every Server in the cluster, which results in N updates to each endpointTranslator, where N is the number of Servers in the cluster.

If N is greater than 100, it becomes possible that these N updates could overflow the endpointTranslator update queue if the queue is not being drained fast enough.

We change this to only send the update for a Server if at least one of the servicePort addresses was selected by that Server AND its opaque protocol field changed.
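
As a rough illustration of the filtering described above, here is a minimal, self-contained sketch. It is not the destination controller's actual code: the `address` and `server` types and the `selects`, `applyServer`, and `resync` helpers are hypothetical stand-ins, and label matching is simplified to exact key/value equality. It only counts how many updates a single listener would receive during a resync, with and without the filter.

```go
package main

import "fmt"

// address is a hypothetical stand-in for one endpoint address in a servicePort.
type address struct {
	podLabels      map[string]string
	opaqueProtocol bool
}

// server is a hypothetical, simplified Server resource: a label selector plus
// whether selected pods should be marked as opaque.
type server struct {
	selector map[string]string
	opaque   bool
}

// selects reports whether every key/value pair in selector is present in labels.
func selects(selector, labels map[string]string) bool {
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// applyServer updates the opaque protocol field of every address selected by
// srv and reports whether at least one selected address actually changed.
func applyServer(addrs []*address, srv server) bool {
	changed := false
	for _, a := range addrs {
		if !selects(srv.selector, a.podLabels) {
			continue
		}
		if a.opaqueProtocol != srv.opaque {
			a.opaqueProtocol = srv.opaque
			changed = true
		}
	}
	return changed
}

// resync replays every Server against addrs and returns how many updates
// would be sent to one listener. filter=false models the old behavior
// (always send); filter=true models the behavior described in this commit.
func resync(addrs []*address, servers []server, filter bool) int {
	sent := 0
	for _, srv := range servers {
		changed := applyServer(addrs, srv)
		if !filter || changed {
			sent++
		}
	}
	return sent
}

func main() {
	newAddrs := func() []*address {
		return []*address{{podLabels: map[string]string{"app": "web"}}}
	}

	// 199 Servers that select unrelated pods, plus one that selects "web".
	servers := make([]server, 0, 200)
	for i := 0; i < 199; i++ {
		sel := map[string]string{"app": fmt.Sprintf("other-%d", i)}
		servers = append(servers, server{selector: sel, opaque: true})
	}
	servers = append(servers, server{selector: map[string]string{"app": "web"}, opaque: true})

	fmt.Println("updates without filter:", resync(newAddrs(), servers, false))
	fmt.Println("updates with filter:", resync(newAddrs(), servers, true))
}
```

In this toy setup the unfiltered resync sends 200 updates to the listener, while the filtered one sends 1, since only one Server selects the address and changes its opaque protocol field.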

Signed-off-by: Alex Leong <[email protected]>
The Tap API resource shortnames were colliding with existing Kubernetes
resources (e.g. `po`, `deploy`, etc), causing warnings from kubectl
v1.29.0+.

Remove the shortnames from the Tap APIService handlers.

To validate:
```bash
bin/k3d cluster create

# install latest edge
curl https://run.linkerd.io/install-edge | sh
linkerd install --crds | kubectl apply -f -
linkerd install        | kubectl apply -f -
linkerd check
linkerd viz install    | kubectl apply -f -
linkerd check

# observe shortnames
kubectl api-resources --api-group=tap.linkerd.io

# with kubectl v1.29.0+, observe "Warning: short name..."
kubectl get po

# build the tap image, load it into the cluster, and swap it in
TAP_IMAGE=$(bin/docker-build-tap)
bin/k3d image load "$TAP_IMAGE"
kubectl -n linkerd-viz set image deploy/tap tap="$TAP_IMAGE"

# verify shortnames are no longer present
kubectl api-resources --api-group=tap.linkerd.io

# with kubectl v1.29.0+, observe no warning
kubectl get po
```

Fixes #11784

Signed-off-by: Andrew Seigner <[email protected]>
We released a new version of the CNI plugin. The chart has been updated
to reference the new version; however, some of the tests and the Go
`version` pkg still reference the old version (v1.2.2). When installing
through the CLI, I noticed that even though the chart renders a container
for the new repair controller, the image used is still v1.2.2, so the
container fails to start because the binary is missing.

This change bumps the version to v1.3.0 everywhere.

Signed-off-by: Matei David <[email protected]>
@adleong adleong requested a review from a team as a code owner January 18, 2024 20:38
@adleong adleong force-pushed the alex/stable-2.14.9 branch from a070ab3 to 6b917e5 Compare January 19, 2024 00:05
Signed-off-by: Alex Leong <[email protected]>
@adleong adleong force-pushed the alex/stable-2.14.9 branch from 6b917e5 to 3469999 Compare January 19, 2024 00:06

@zaharidichev zaharidichev left a comment

LGTM

@adleong adleong merged commit 2aae59b into release/stable-2.14 Jan 19, 2024
36 checks passed
@adleong adleong deleted the alex/stable-2.14.9 branch January 19, 2024 18:30