Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP-2086: Add internalTrafficPolicy: PreferLocal #3010

Closed

Conversation

danwinship
Copy link
Contributor

@danwinship danwinship commented Oct 14, 2021

One-line PR description:
Adds PreferLocal option to internalTrafficPolicy

Issue link: #2086

Other comments:

OpenShift is already implementing this behavior as a hardcoded hack for DNS. The behavior of internalTrafficPolicy: Local is useless for us; OpenShift clusters do software updates regularly, and we don't want to cause rolling DNS failures across the cluster every time we update the CoreDNS DaemonSet.

I see in some previous discussions people talked about "SNAT semantics" as being a problem, but that only applies to external traffic policy; internal traffic doesn't get SNATted either way.

I left the second user story as still wanting Local semantics, not PreferLocal, since I guess some people may still want that sometimes...

cc @andrewsykim @robscott @aojea @Miciah @thockin

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Oct 14, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: danwinship
To complete the pull request process, please assign caseydavenport after the PR has been reviewed.
You can assign the PR to them by writing /assign @caseydavenport in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/network Categorizes an issue or PR as relevant to SIG Network. labels Oct 14, 2021
@aojea
Copy link
Member

aojea commented Oct 14, 2021

These was requested lately in slack and some other gihub issues ... and is a thing in Openshift, so there is proof of feasibility
LGTM

@robscott
Copy link
Member

Is it going to confuse users if we only have PreferLocal for internal but not external traffic policy?

@danwinship
Copy link
Contributor Author

Maybe... but adding externalTrafficPolicy: PreferLocal might also be confusing, since the current messaging around externalTrafficPolicy: Local is all about preserving source IP, which PreferLocal would have to not guarantee. There'd need to be some doc rewriting...

FWIW, OpenShift actually also has prefer-local external traffic policy as a local hack. A bit more detail here but it's basically the same as the internal case for DNS: we want to use externalTrafficPolicy: Local for efficiency, and don't particularly care about the no-SNAT thing, but care a lot about not having any disruption when endpoints change.

(Maybe PreferLocal would be less confusing if we said that it never preserves the source IP, just like Cluster, rather than sometimes preserving it and sometimes not? ie, it would SNAT the traffic even when sending it to a local endpoint.)

@robscott
Copy link
Member

we want to use externalTrafficPolicy: Local for efficiency, and don't particularly care about the no-SNAT thing, but care a lot about not having any disruption when endpoints change.

I think this is a very common use of the feature.

Maybe PreferLocal would be less confusing if we said that it never preserves the source IP, just like Cluster, rather than sometimes preserving it and sometimes not? ie, it would SNAT the traffic even when sending it to a local endpoint.

💯 completely agree with this.

@andrewsykim
Copy link
Member

Fwiw the original implementation of internalTrafficPolicy had PreferLocal as an option, but we removed it since we weren't sure how practical it would be w.r.t topology aware routing.

@danwinship
Copy link
Contributor Author

Fwiw the original implementation of internalTrafficPolicy had PreferLocal as an option, but we removed it since we weren't sure how practical it would be w.r.t topology aware routing.

How so? I guess, if topology aware routing worked down to the node level, then you wouldn't need PreferLocal, since the topology would take care of it. Is that what you mean? Or did you just mean "what to do when traffic policy and topology conflict"? In that case, the KEPs say that internalTrafficPolicy: Local overrides topology for internal traffic, and externalTrafficPolicy: Local overrides topology for external traffic. I guess we'd have to clarify that that applies to PreferLocal too.

@danwinship
Copy link
Contributor Author

So it seems like maybe I should close this PR, let internal traffic policy (with just Cluster vs Local) go to GA on its original schedule, and file a new PR adding PreferLocal for both internal and external policy?

@danwinship
Copy link
Contributor Author

So it seems like maybe I should close this PR, let internal traffic policy (with just Cluster vs Local) go to GA on its original schedule, and file a new PR adding PreferLocal for both internal and external policy?

Right. Further discussion on #3016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/network Categorizes an issue or PR as relevant to SIG Network. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants