Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes to server selector are not reflected in opaqueness #11995

Closed
adleong opened this issue Jan 26, 2024 · 0 comments · Fixed by #12031 or #12053
Closed

Changes to server selector are not reflected in opaqueness #11995

adleong opened this issue Jan 26, 2024 · 0 comments · Fixed by #12031 or #12053
Labels

Comments

@adleong
Copy link
Member

adleong commented Jan 26, 2024

What is the issue?

If a Server is marking a Pod's port as opaque and then the Server's podSelector is updated to no longer select that Pod, then the Pod's port should no longer be marked as opaque. However, this update does not result in any updates from the destination API's Get stream and the port remains marked as opaque.

How can it be reproduced?

  1. Create a Pod, a Service which selects that pod, and a Server which selects the pod and marks a port as opaque
  2. Initiate a Get stream for the service and notice that the endpoint is marked with opaque transport:
go run controller/script/destination-client/main.go -method get -path test.emojivoto.svc.cluster.local:8080  --token '{"nodeName":"alex-worker"}'
INFO[0000] Add:
INFO[0000] labels: map[namespace:emojivoto service:test]
INFO[0000] - 10.42.0.163:8080
INFO[0000]   - labels: map[control_plane_ns:linkerd deployment:test pod:test-5d75589c45-wv5kz pod_template_hash:5d75589c45 serviceaccount:default zone:]
INFO[0000]   - protocol hint: UNKNOWN
INFO[0000]   - identity: dns_like_identity:{name:"default.emojivoto.serviceaccount.identity.linkerd.cluster.local"}  server_name:{name:"default.emojivoto.serviceaccount.identity.linkerd.cluster.local"}
INFO[0000]   - opaque transport port: 4143
  1. Edit the Server's selector so that it no longer selects the Pod
  2. Notice no updates on the Get stream.

Logs, error output, etc

This is because we ignore Server updates if the Server's selector does not select the pod:

https://github.com/linkerd/linkerd2/blob/main/controller/api/destination/watcher/endpoints_watcher.go#L1251

output of linkerd check -o short

N/A

Environment

reproduced on commit: 796bb85
k3d version v5.6.0
k3s version v1.27.4-k3s1 (default)

Possible solution

No response

Additional context

No response

Would you like to work on fixing this bug?

None

@adleong adleong added the bug label Jan 26, 2024
adleong added a commit that referenced this issue Feb 5, 2024
Fixes #11995

If a Server is marking a Pod's port as opaque and then the Server's podSelector is updated to no longer select that Pod, then the Pod's port should no longer be marked as opaque. However, this update does not result in any updates from the destination API's Get stream and the port remains marked as opaque.

We fix this by updating the endpoint watcher's handling of Server updates to consider both the old and the new Server.

Signed-off-by: Alex Leong <[email protected]>
mateiidavid added a commit that referenced this issue Feb 9, 2024
This release addresses some issues in the destination service that could cause
it to behave unexpectedly when processing updates.

* Fixed a race condition in the destination service that could cause panics
  under very specific conditions ([#12022]; fixes [#12010])
* Changed how updates to a `Server` selector are handled in the destination
  service. When a `Server` that marks a port as opaque no longer selects a
  resource, the resource's opaqueness will reverted to default settings
  ([#12031]; fixes [#11995])
* Introduced Helm configuration values for liveness and readiness probe
  timeouts and delays ([#11458]; fixes [#11453]) (thanks @jan-kantert!)

[#12010]: #12010
[#12022]: #12022
[#11995]: #11995
[#12031]: #12031
[#11453]: #11453
[#11458]: #11458

Signed-off-by: Matei David <[email protected]>
adleong added a commit that referenced this issue Feb 17, 2024
Fixes #11995

If a Server is marking a Pod's port as opaque and then the Server's podSelector is updated to no longer select that Pod, then the Pod's port should no longer be marked as opaque. However, this update does not result in any updates from the destination API's Get stream and the port remains marked as opaque.

We fix this by updating the endpoint watcher's handling of Server updates to consider both the old and the new Server.

Signed-off-by: Alex Leong <[email protected]>
adleong added a commit that referenced this issue Feb 20, 2024
This stable release back-ports bugfixes and improvements from recent edge
releases.

* Introduced support for arbitrary labels in the `podMonitors` field in the
  control plane Helm chart (thanks @jseiser!) ([#11222]; fixes [#11175])
* Added a `prometheusUrl` field for the heartbeat job in the control plane Helm
  chart (thanks @david972!) ([#11343]; fixes [#11342])
* Updated the Destination controller to return `INVALID_ARGUMENT` status codes
  properly when a `ServiceProfile` is requested for a service that does not
  exist. ([#11980])
* Reduced the load on the Destination controller by only processing Server
  updates on workloads affected by the Server ([#12017])
* Changed how updates to a `Server` selector are handled in the destination
  service. When a `Server` that marks a port as opaque no longer selects a
  resource, the resource's opaqueness will reverted to default settings
  ([#12031]; fixes [#11995])
* Fixed a race condition in the destination service that could cause panics
  under very specific conditions ([#12022]; fixes [#12010])
* Fixed an issue where inbound policy could be incorrect after certain policy
  resources are deleted ([#12088])

[#11222]: #11222
[#11175]: #11175
[#11343]: #11343
[#11342]: #11342
[#11980]: #11980
[#12017]: #12017
[#11995]: #11995
[#12031]: #12031
[#12010]: #12010
[#12022]: #12022
[#12088]: #12088

Signed-off-by: Alex Leong <[email protected]>
Signed-off-by: David ALEXANDRE <[email protected]>
Signed-off-by: Justin S <[email protected]>
Co-authored-by: Oliver Gould <[email protected]>
Co-authored-by: Alejandro Pedraza <[email protected]>
Co-authored-by: David ALEXANDRE <[email protected]>
Co-authored-by: Justin Seiser <[email protected]>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 7, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant