Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clear conntrack entries on 0 -> 1 endpoint transition with externalIPs #75265

Merged
merged 1 commit into from
Mar 18, 2019

Conversation

JacobTanenbaum
Copy link
Contributor

@JacobTanenbaum JacobTanenbaum commented Mar 11, 2019

As part of the endpoint creation process when going from 0 -> 1 conntrack entries
are cleared. This is to prevent an existing conntrack entry from preventing traffic
to the service. Currently the system ignores the existance of the services external IP
addresses, which exposes that errant behavior

This adds the externalIP addresses of udp services to the list of conntrack entries that
get cleared. Allowing traffic to flow

Signed-off-by: Jacob Tanenbaum [email protected]

/kind bug

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

UDP Service conntrack entries for ExternalIPs are now correctly cleared when endpoints are added 

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 11, 2019
@k8s-ci-robot
Copy link
Contributor

Hi @JacobTanenbaum. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 11, 2019
@k8s-ci-robot k8s-ci-robot requested review from freehan and thockin March 11, 2019 16:01
@JacobTanenbaum
Copy link
Contributor Author

@dcbw PTAL

@dcbw
Copy link
Member

dcbw commented Mar 11, 2019

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 11, 2019
@dcbw
Copy link
Member

dcbw commented Mar 11, 2019

/area kube-proxy

@dcbw
Copy link
Member

dcbw commented Mar 11, 2019

/lgtm

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. release-note-none Denotes a PR that doesn't merit a release note. labels Mar 11, 2019
@dcbw
Copy link
Member

dcbw commented Mar 12, 2019

@thockin @freehan does this one look OK?

@dcbw
Copy link
Member

dcbw commented Mar 12, 2019

/priority important-soon

@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Mar 12, 2019
@dcbw
Copy link
Member

dcbw commented Mar 14, 2019

/milestone v1.14

@spiffxp
Copy link
Member

spiffxp commented Mar 14, 2019

/hold
Can we get an explanation as to why this is important enough to introduce without an issue during code freeze? What user-facing bug does this fix? Is this something that can wait until v1.15 or v1.14.1?

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 14, 2019
@JacobTanenbaum
Copy link
Contributor Author

@spiffxp without this addition udp services with external IPs are broken if there is a stream of packets active when all endpoints go down and are restarted

@dcbw
Copy link
Member

dcbw commented Mar 14, 2019

What user-facing bug this fix? Is this something that can wait until v1.15 or v1.14.1?

@spiffxp if at all possible we'd like to get this fix into v1.14. If that's not possible, we might be OK with an /approve-for-1.14.1

@bowei
Copy link
Member

bowei commented Mar 14, 2019

/assign @freehan

@freehan
Copy link
Contributor

freehan commented Mar 14, 2019

LGTM

@freehan
Copy link
Contributor

freehan commented Mar 14, 2019

This PR #73323 should have covered the case right?

@JacobTanenbaum
Copy link
Contributor Author

@freehan ideally it should have, PR #73323 cleans up conntrack entries for entries in endpointUpdateResult.StaleEndpoints which handles when an endpoint gets deleted. It does not account for clearing on endpoint creation (which only needs conntrack to be cleared when going from 0 -> 1 endpoints backing the service) which is accounted for in endpointUpdateResult.StaleServiceNames

@nikopen
Copy link
Contributor

nikopen commented Mar 15, 2019

looks fairly important for me for stability of 1.14.0.

/approve

leaving the hold to @spiffxp

As part of the endpoint creation process when going from 0 -> 1 conntrack entries
are cleared. This is to prevent an existing conntrack entry from preventing traffic
to the service. Currently the system ignores the existance of the services external IP
addresses, which exposes that errant behavior

This adds the externalIP addresses of udp services to the list of conntrack entries that
get cleared. Allowing traffic to flow

Signed-off-by: Jacob Tanenbaum <[email protected]>
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 15, 2019
@dcbw
Copy link
Member

dcbw commented Mar 15, 2019

@JacobTanenbaum thanks for the update that also fixes IPVS.
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 15, 2019
@dcbw
Copy link
Member

dcbw commented Mar 15, 2019

/retest

2 similar comments
@JacobTanenbaum
Copy link
Contributor Author

/retest

@JacobTanenbaum
Copy link
Contributor Author

/retest

@spiffxp
Copy link
Member

spiffxp commented Mar 15, 2019

/hold cancel
thank you for the explanation and context all

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 15, 2019
@dcbw
Copy link
Member

dcbw commented Mar 15, 2019

@spiffxp ok thanks! we'll work on getting some review and approval. @thockin can you take a look?

@nikopen
Copy link
Contributor

nikopen commented Mar 17, 2019

/retest

@xmudrii
Copy link
Member

xmudrii commented Mar 18, 2019

@JacobTanenbaum @thockin The Code Thaw is starting tomorrow, Tuesday PST EOD. Would it be possible for this PR to get reviewed and approved by then? Otherwise it'll have to compete with bunch of other PRs post-thaw.

@@ -673,6 +673,9 @@ func (proxier *Proxier) syncProxyRules() {
if svcInfo, ok := proxier.serviceMap[svcPortName]; ok && svcInfo != nil && svcInfo.GetProtocol() == v1.ProtocolUDP {
klog.V(2).Infof("Stale udp service %v -> %s", svcPortName, svcInfo.ClusterIPString())
staleServices.Insert(svcInfo.ClusterIPString())
for _, extIP := range svcInfo.ExternalIPStrings() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we catch status.ingress.ip too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thockin yes we should catch status.ingress.ip also. I will submit another PR

Copy link
Member

@thockin thockin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving for velcoity but is this complete?

/approve
/lgtm

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JacobTanenbaum, nikopen, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 18, 2019
@JacobTanenbaum
Copy link
Contributor Author

/retest

@k8s-ci-robot k8s-ci-robot merged commit aa9cbd1 into kubernetes:master Mar 18, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kube-proxy cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/network Categorizes an issue or PR as relevant to SIG Network. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants