Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix status update in case of connection errors #3267

Merged
merged 2 commits into from
Oct 29, 2018

Conversation

aledbf
Copy link
Member

@aledbf aledbf commented Oct 18, 2018

What this PR does / why we need it:

In case of any error (like the API server not being temporarily available) updating the ingress status, the leader election code stops working. This change forces the termination of the running elector, starting a new one.

Which issue this PR fixes:

fixes #3180
fixes #3033

Test image: quay.io/aledbf/nginx-ingress-controller:0.415

To test this we need to simulate network issues:

  1. In a terminal run kubectl proxy --address=0.0.0.0 --accept-hosts=.*
  2. In a different terminal:
export HOST_IP=<IP eth0 or ne0>

docker run --net=host -it \
    -e POD_NAMESPACE=ingress-nginx \
    -e POD_NAME=local \
    quay.io/aledbf/nginx-ingress-controller:0.415 \
    /nginx-ingress-controller \
    --configmap=ingress-nginx/nginx-configuration \
    --apiserver-host=http://$HOST_IP:8001 \
    --publish-status-address=1.1.1.1
  1. wait a minute and terminate kubectl proxy
  2. wait a minute and start kubectl proxy again

the election of a new leader should appear in the log after the reconnection

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 18, 2018
@aledbf aledbf force-pushed the fix-status-update branch from 24cfe47 to b931f5e Compare October 18, 2018 20:10
@aledbf aledbf force-pushed the fix-status-update branch 2 times, most recently from 831074c to 7db1db0 Compare October 19, 2018 01:06
@aledbf aledbf removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 19, 2018
@aledbf aledbf changed the title WIP: Fix status update in case of connection errors Fix status update in case of connection errors Oct 19, 2018
@ElvinEfendi
Copy link
Member

No regression test?

@aledbf
Copy link
Member Author

aledbf commented Oct 24, 2018

No regression test?

Writing one :)

@aledbf aledbf force-pushed the fix-status-update branch 10 times, most recently from 4540fea to 8522c80 Compare October 25, 2018 21:35
@codecov-io
Copy link

Codecov Report

Merging #3267 into master will decrease coverage by 0.04%.
The diff coverage is 65.45%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3267      +/-   ##
==========================================
- Coverage   48.36%   48.31%   -0.05%     
==========================================
  Files          75       75              
  Lines        5554     5561       +7     
==========================================
+ Hits         2686     2687       +1     
- Misses       2523     2529       +6     
  Partials      345      345
Impacted Files Coverage Δ
internal/ingress/controller/nginx.go 15.71% <0%> (ø) ⬆️
internal/ingress/status/status.go 71.42% <66.66%> (-1.65%) ⬇️
internal/watch/file_watcher.go 80.76% <0%> (-3.85%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bc6f2e7...8522c80. Read the comment docs.

@aledbf aledbf added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 25, 2018
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 25, 2018
@aledbf aledbf force-pushed the fix-status-update branch from 431f54b to 5f357e5 Compare October 25, 2018 22:24
@aledbf aledbf force-pushed the fix-status-update branch 5 times, most recently from 411f16e to 91156fa Compare October 26, 2018 16:42
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Oct 26, 2018
@aledbf aledbf force-pushed the fix-status-update branch 9 times, most recently from 1c1cd8d to a5934a4 Compare October 27, 2018 01:21
@aledbf aledbf removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 27, 2018
@aledbf
Copy link
Member Author

aledbf commented Oct 27, 2018

@ElvinEfendi this is now ready with e2e test 😉
Log of e2e test pod https://gist.github.com/aledbf/c8670927ef144c0523bf7fd763f5c2c2

@@ -333,6 +352,13 @@ func (s *statusSync) updateStatus(newIngressPoint []apiv1.LoadBalancerIngress) {
sort.SliceStable(newIngressPoint, lessLoadBalancerIngress(newIngressPoint))

for _, ing := range ings {
curIPs := ing.Status.LoadBalancer.Ingress
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block of code was moved here to avoid running unnecessary goroutines without an actual update to run

@aledbf aledbf force-pushed the fix-status-update branch from a5934a4 to 6a10e02 Compare October 29, 2018 15:58
@aledbf aledbf force-pushed the fix-status-update branch from 6a10e02 to fed013a Compare October 29, 2018 16:01
@ElvinEfendi
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 29, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aledbf, ElvinEfendi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ElvinEfendi
Copy link
Member

thanks for the regression test ❤️

@k8s-ci-robot k8s-ci-robot merged commit a06f724 into kubernetes:master Oct 29, 2018
@aledbf aledbf deleted the fix-status-update branch October 29, 2018 19:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
5 participants