ingress-nginx crashes on reload of configuration #4284

Closed
ac-hibbert opened this issue Jul 8, 2019 · 8 comments

@ac-hibbert

Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/.):

What keywords did you search in NGINX Ingress controller issues before filing this one? (If you have found any duplicates, you should instead reply there.):

This is a follow-up to #4041, which was closed by PR #4091. I have tested with the latest version, 0.25.0, and the problem still occurs.

I also found some related issues:

#3459
#3457
#3737
#3684

Is this a BUG REPORT or FEATURE REQUEST? (choose one):

NGINX Ingress controller version:
0.25.0

Kubernetes version (use kubectl version):

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.2", GitCommit:"66049e3b21efe110454d67df4fa62b08ea79a19b", GitTreeState:"clean", BuildDate:"2019-05-16T18:55:03Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.7-eks-c57ff8", GitCommit:"c57ff8e35590932c652433fab07988da79265d5b", GitTreeState:"clean", BuildDate:"2019-06-07T20:43:03Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: AWS EKS
  • OS (e.g. from /etc/os-release): AL2
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

What happened:

NGINX crashes when the configuration is reloaded:

I0708 17:23:27.231397       8 controller.go:133] Configuration changes detected, backend reload required.
I0708 17:23:27.231983       8 controller.go:133] Configuration changes detected, backend reload required.
W0708 17:23:27.231633       7 controller.go:309] Error getting Service "jenkins-andy/jenkins-jnlp": no object matching key "jenkins-andy/jenkins-jnlp" in local store
I0708 17:23:27.231695       7 controller.go:133] Configuration changes detected, backend reload required.
I0708 17:23:27.309345       8 controller.go:149] Backend successfully reloaded.
[08/Jul/2019:17:23:27 +0000]TCP200000.000
I0708 17:23:27.310684       7 controller.go:149] Backend successfully reloaded.
[08/Jul/2019:17:23:27 +0000]TCP200000.000
I0708 17:23:27.337758       8 controller.go:149] Backend successfully reloaded.
[08/Jul/2019:17:23:27 +0000]TCP200000.000
I0708 17:23:27.332694       8 controller.go:149] Backend successfully reloaded.
[08/Jul/2019:17:23:27 +0000]TCP200000.000
I0708 17:23:27.465950       7 event.go:258] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"ingress-nginx", Name:"tcp-services", UID:"86ae9cfe-458c-11e9-9bac-0afa1cd96c8a", APIVersion:"v1", ResourceVersion:"30052709", FieldPath:""}): type: 'Normal' reason: 'UPDATE' ConfigMap ingress-nginx/tcp-services
I0708 17:23:27.465645       8 event.go:258] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"ingress-nginx", Name:"tcp-services", UID:"86ae9cfe-458c-11e9-9bac-0afa1cd96c8a", APIVersion:"v1", ResourceVersion:"30052709", FieldPath:""}): type: 'Normal' reason: 'UPDATE' ConfigMap ingress-nginx/tcp-services
I0708 17:23:27.465569       8 event.go:258] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"ingress-nginx", Name:"tcp-services", UID:"86ae9cfe-458c-11e9-9bac-0afa1cd96c8a", APIVersion:"v1", ResourceVersion:"30052709", FieldPath:""}): type: 'Normal' reason: 'UPDATE' ConfigMap ingress-nginx/tcp-services
I0708 17:23:27.466535       8 event.go:258] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"ingress-nginx", Name:"tcp-services", UID:"86ae9cfe-458c-11e9-9bac-0afa1cd96c8a", APIVersion:"v1", ResourceVersion:"30052709", FieldPath:""}): type: 'Normal' reason: 'UPDATE' ConfigMap ingress-nginx/tcp-services
I0708 17:23:28.919548       7 main.go:154] Received SIGTERM, shutting down
I0708 17:23:28.919601       7 nginx.go:402] Shutting down controller queues
I0708 17:23:28.919617       7 status.go:117] updating status of Ingress rules (remove)
I0708 17:23:28.943567       7 nginx.go:418] Stopping NGINX process
2019/07/08 17:23:28 [notice] 455#455: signal process started
E0708 17:23:34.860702       7 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: connection refused
I0708 17:23:34.976083       7 nginx.go:431] NGINX process has stopped
I0708 17:23:34.976100       7 main.go:162] Handled quit, awaiting Pod deletion
E0708 17:23:38.047482       7 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: connection refused
I0708 17:23:38.252232       8 main.go:154] Received SIGTERM, shutting down
I0708 17:23:38.252264       8 nginx.go:402] Shutting down controller queues
I0708 17:23:38.252282       8 status.go:117] updating status of Ingress rules (remove)
I0708 17:23:38.263363       8 nginx.go:418] Stopping NGINX process
2019/07/08 17:23:38 [notice] 457#457: signal process started
E0708 17:23:38.912718       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: connection refused
E0708 17:23:39.764432       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: connection refused
I0708 17:23:39.788563       8 main.go:154] Received SIGTERM, shutting down
I0708 17:23:39.788594       8 nginx.go:402] Shutting down controller queues
I0708 17:23:39.788607       8 status.go:117] updating status of Ingress rules (remove)
I0708 17:23:39.807639       8 nginx.go:418] Stopping NGINX process
2019/07/08 17:23:39 [notice] 457#457: signal process started
I0708 17:23:40.304388       8 nginx.go:431] NGINX process has stopped
I0708 17:23:40.304406       8 main.go:162] Handled quit, awaiting Pod deletion
W0708 17:23:40.564490       8 controller.go:1129] SSL certificate for server "jenkins-marmccor.dev-cdaas.umbrella.com" is about to expire (2019-06-04 19:04:05 +0000 UTC)
E0708 17:23:40.574011       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: connection refused
I0708 17:23:42.850962       8 nginx.go:431] NGINX process has stopped
I0708 17:23:42.850979       8 main.go:162] Handled quit, awaiting Pod deletion
W0708 17:23:43.897846       8 controller.go:1129] SSL certificate for server "jenkins-marmccor.dev-cdaas.umbrella.com" is about to expire (2019-06-04 19:04:05 +0000 UTC)
E0708 17:23:44.860781       7 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: connection refused
I0708 17:23:44.976238       7 main.go:165] Exiting with 0
[08/Jul/2019:17:23:45 +0000]TCP2002352330.001
I0708 17:23:46.437513       8 main.go:154] Received SIGTERM, shutting down
I0708 17:23:46.437545       8 nginx.go:402] Shutting down controller queues
I0708 17:23:46.437571       8 status.go:117] updating status of Ingress rules (remove)
I0708 17:23:46.464289       8 nginx.go:418] Stopping NGINX process
2019/07/08 17:23:46 [notice] 457#457: signal process started
I0708 17:23:48.506632       8 nginx.go:431] NGINX process has stopped
I0708 17:23:48.506652       8 main.go:162] Handled quit, awaiting Pod deletion
E0708 17:23:48.572382       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: connection refused
E0708 17:23:48.910265       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: connection refused
E0708 17:23:49.764411       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: connection refused
rpc error: code = Unknown desc = Error: No such container: 691b0eaffc35f7aac64854e9ed0330dac09c1bcdc370f76fdbb3622f12cfa5f8
I0708 17:23:50.304541       8 main.go:165] Exiting with 0
E0708 17:23:50.573989       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: connection refused
I0708 17:23:52.851102       8 main.go:165] Exiting with 0
E0708 17:23:53.405254       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: connection refused
rpc error: code = Unknown desc = Error: No such container: 780ee4713a3ec3f59eb26a0986e098c828925424e8549fde145558196affbe38
E0708 17:23:56.164227       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: connection refused
rpc error: code = Unknown desc = Error: No such container: 3f23efe22e028944d66691dc468b1af2bf9f49f2a24a67d5967984f0c8b92fec
I0708 17:23:58.506779       8 main.go:165] Exiting with 0
rpc error: code = Unknown desc = Error: No such container: a8de6da4abda8a9fc42c098fa57eb310f4d9088bab2f37ff7b132445caa74532
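
The log above appears to interleave output from several controller processes (the numeric field after the timestamp is 7 on some lines and 8 on others), which makes the shutdown sequence hard to follow. A minimal Python sketch, assuming the lines follow the usual klog header layout (severity letter, mmdd, time, id, file:line), can pull out just the shutdown events. The names `parse` and `shutdown_timeline` are hypothetical helpers, not part of ingress-nginx.

```python
import re

# Assumed klog header layout: severity letter, mmdd, time, numeric id,
# source file:line, then "] " and the message.
KLOG = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<pid>\d+) (?P<src>[\w.]+:\d+)\] (?P<msg>.*)$"
)


def parse(line):
    """Return the header fields of a klog line as a dict, or None if it does not match."""
    match = KLOG.match(line)
    return match.groupdict() if match else None


def shutdown_timeline(lines):
    """Collect (time, id, message) tuples for SIGTERM and exit events, sorted by time."""
    events = []
    for line in lines:
        rec = parse(line)
        if rec is None:
            continue  # NGINX notices, access-log lines, and rpc errors are skipped
        if rec["msg"].startswith(("Received SIGTERM", "Exiting with")):
            events.append((rec["time"], rec["pid"], rec["msg"]))
    return sorted(events)


# A few lines copied from the log in this issue.
sample = [
    "I0708 17:23:27.309345       8 controller.go:149] Backend successfully reloaded.",
    "I0708 17:23:28.919548       7 main.go:154] Received SIGTERM, shutting down",
    "2019/07/08 17:23:28 [notice] 455#455: signal process started",
    "I0708 17:23:44.976238       7 main.go:165] Exiting with 0",
]

for event in shutdown_timeline(sample):
    print(event)
```

Sorting by the time string is only safe within a single day, which holds for this excerpt.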

What you expected to happen:

The pod stays up while its configuration is reloaded.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know:

@aledbf
Member

aledbf commented Jul 8, 2019

main.go:154] Received SIGTERM, shutting down

This means the pod is not passing the probes (readiness/liveness).

@ac-hibbert
Author

Specifically, I am removing a namespace (along with its pods, ingress, etc.) and modifying tcp-services, nginx-ingress-controller (deployment) and ingress-nginx (service) to remove the TCP ports, which triggers the reload.

My setup is the same as mandatory.yaml etc. from this GitHub repo. It seems to me that the health checks are failing because the pods have been terminated, and that the pods were terminated because of the reload.

I was under the impression that this had been fixed.

@aledbf
Member

aledbf commented Jul 8, 2019

nginx-ingress-controller (deployment) and

If you change the deployment, the running pod will be replaced. Why are you doing this? (This is not specific to ingress-nginx; it applies to any deployment in k8s.)
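
To illustrate the point about deployments: Kubernetes starts a rollout, replacing the running pods, whenever the Deployment's pod template (`.spec.template`) changes, and removing a container port is such a change. The Python sketch below only models that comparison on plain dictionaries. The helper `triggers_rollout`, the `jnlp` port name, and port 50000 are assumptions for illustration and are not taken from the reporter's manifests.

```python
import copy


def triggers_rollout(old, new):
    """Return True if the pod template differs between two Deployment specs."""
    return old["spec"]["template"] != new["spec"]["template"]


# Simplified stand-in for the controller Deployment (hypothetical values).
deployment = {
    "spec": {
        "replicas": 2,
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "nginx-ingress-controller",
                        "ports": [
                            {"name": "http", "containerPort": 80},
                            {"name": "https", "containerPort": 443},
                            # Hypothetical TCP port exposed for a removed service.
                            {"name": "jnlp", "containerPort": 50000},
                        ],
                    }
                ]
            }
        },
    }
}

# Patching the Deployment to drop the TCP port changes the pod template.
patched = copy.deepcopy(deployment)
patched["spec"]["template"]["spec"]["containers"][0]["ports"].pop()

# Scaling changes only .spec.replicas, so the template is untouched.
scaled = copy.deepcopy(deployment)
scaled["spec"]["replicas"] = 3

print(triggers_rollout(deployment, patched))  # True
print(triggers_rollout(deployment, scaled))   # False
```

Under this model, the patch that removes the port would be expected to replace the controller pods, which would account for SIGTERM lines like the ones in the log.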

@ac-hibbert
Author

Ah, good point. I patch the deployment to delete the port of the service I have removed.

@aledbf
Member

aledbf commented Jul 8, 2019

@Hibbert can we close this issue?

@ac-hibbert
Author

That bit I understand, although the problem does not occur only when the deployment is reconfigured. It is also when I delete the apps namespace. I am using ingress-nginx to route JNLP traffic through to the jenkins master when I am running this. The reload seems to cause connectivity problem:-

Cannot contact i-08cb100f0b67f1286: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on JNLP4-connect connection from ip-10-207-56-171.ec2.internal/10.207.56.171:51970 failed. The channel is closing down or has closed down

@aledbf
Member

aledbf commented Jul 8, 2019

It is also when I delete the apps namespace. I am using ingress-nginx to route JNLP traffic through to the jenkins master when I am running this.

That's expected. You are deleting the app being exposed. There is no pod running.

The reload seems to cause connectivity problem:-

Which reload?

@aledbf
Member

aledbf commented Sep 3, 2019

Closing. This is fixed in master (#4487).
If you want to test the fix, you can use the image quay.io/kubernetes-ingress-controller/nginx-ingress-controller:dev

@aledbf aledbf closed this as completed Sep 3, 2019