Health check error #3993

Closed
ghost opened this issue Apr 10, 2019 · 37 comments

Comments

@ghost

ghost commented Apr 10, 2019

Hello,

My ingress controller suddenly stopped working. This is the message that I get. I have deployed it in the past following exactly the instructions here: https://kubernetes.github.io/ingress-nginx/deploy/
Everything was working, but after I restarted Kubernetes and Docker it no longer works. I tried redeploying it, but the error persists. I am running on CentOS 7.

healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
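
For context, the `connect: no such file or directory` part of this message is the operating system reporting that the Unix socket path does not exist, which usually means the process that should be listening there is not running. Below is a minimal Python sketch of the same kind of check. The function name `probe_unix_socket` is hypothetical, and this is not the controller's actual Go code.

```python
import socket


def probe_unix_socket(path: str, timeout: float = 1.0) -> str:
    """Try to connect to a Unix domain socket and describe the outcome."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
    except FileNotFoundError:
        # Same condition as "connect: no such file or directory" in the log.
        return "missing"
    except ConnectionRefusedError:
        # The socket file exists but nothing is accepting connections on it.
        return "refused"
    except OSError as exc:
        return f"error: {exc}"
    finally:
        sock.close()
    return "ok"


if __name__ == "__main__":
    # Path taken from the error message above.
    print(probe_unix_socket("/tmp/nginx-status-server.sock"))
```

A result of `missing` from inside the controller container would point at NGINX not having started, rather than at the health checker itself.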

@aledbf
Member

aledbf commented Apr 10, 2019

@thzois please use the issue template.

@aledbf
Member

aledbf commented Apr 10, 2019

Please post the ingress controller pod logs to see exactly what's happening.

@rimusz

rimusz commented Apr 17, 2019

I'm seeing exactly the same error in a GKE cluster.

@aledbf
Member

aledbf commented Apr 17, 2019

@rimusz can you post the ingress controller pod log and the describe pod output?

@rimusz

rimusz commented Apr 17, 2019

sure

$ kubectl describe pod gcstg-use1-nginx-ingress-controller-dn28b
Name:               gcstg-use1-nginx-ingress-controller-dn28b
Namespace:          gcstg-use1
Priority:           0
PriorityClassName:  <none>
Node:               gke-k8s-saas-us-east1-app-pool-07b3da5e-1m8r/192.168.21.16
Start Time:         Wed, 17 Apr 2019 14:57:21 +0300
Labels:             app=nginx-ingress
                    component=controller
                    controller-revision-hash=3331872658
                    pod-template-generation=1
                    release=gcstg-use1-nginx-ingress
Annotations:        <none>
Status:             Running
IP:                 10.96.4.94
Controlled By:      DaemonSet/gcstg-use1-nginx-ingress-controller
Containers:
  nginx-ingress-controller:
    Container ID:  docker://fd519c290a450324b8b973856527ab5a1e6da7ae7ea4c02d9f69e31ea75dc35f
    Image:         quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.24.1
    Image ID:      docker-pullable://docker.jfrog.io/kubernetes-ingress-controller/nginx-ingress-controller@sha256:76861d167e4e3db18f2672fd3435396aaa898ddf4d1128375d7c93b91c59f87f
    Ports:         80/TCP, 443/TCP, 18080/TCP, 10254/TCP
    Host Ports:    80/TCP, 443/TCP, 18080/TCP, 0/TCP
    Args:
      /nginx-ingress-controller
      --default-backend-service=gcstg-use1/gcstg-use1-nginx-ingress-default-backend
      --election-id=ingress-controller-leader
      --ingress-class=nginx
      --configmap=gcstg-use1/gcstg-use1-nginx-ingress-controller
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 17 Apr 2019 15:14:05 +0300
      Finished:     Wed, 17 Apr 2019 15:14:54 +0300
    Ready:          False
    Restart Count:  9
    Liveness:       http-get http://:10254/healthz delay=10s timeout=10s period=10s #success=1 #failure=3
    Readiness:      http-get http://:10254/healthz delay=10s timeout=10s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       gcstg-use1-nginx-ingress-controller-dn28b (v1:metadata.name)
      POD_NAMESPACE:  gcstg-use1 (v1:metadata.namespace)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from gcstg-use1-nginx-ingress-token-dzmdd (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  gcstg-use1-nginx-ingress-token-dzmdd:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  gcstg-use1-nginx-ingress-token-dzmdd
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedule
Events:
  Type     Reason     Age                   From                                                   Message
  ----     ------     ----                  ----                                                   -------
  Warning  Unhealthy  16m (x8 over 17m)     kubelet, gke-k8s-saas-us-east1-app-pool-07b3da5e-1m8r  Readiness probe failed: HTTP probe failed with statuscode: 500
  Normal   Pulled     16m (x3 over 17m)     kubelet, gke-k8s-saas-us-east1-app-pool-07b3da5e-1m8r  Container image "quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.24.1" already present on machine
  Normal   Created    16m (x3 over 17m)     kubelet, gke-k8s-saas-us-east1-app-pool-07b3da5e-1m8r  Created container
  Normal   Started    16m (x3 over 17m)     kubelet, gke-k8s-saas-us-east1-app-pool-07b3da5e-1m8r  Started container
  Normal   Killing    12m (x6 over 17m)     kubelet, gke-k8s-saas-us-east1-app-pool-07b3da5e-1m8r  Killing container with id docker://nginx-ingress-controller:Container failed liveness probe.. Container will be killed and recreated.
  Warning  Unhealthy  7m47s (x22 over 17m)  kubelet, gke-k8s-saas-us-east1-app-pool-07b3da5e-1m8r  Liveness probe failed: HTTP probe failed with statuscode: 500
  Warning  BackOff    2m46s (x44 over 12m)  kubelet, gke-k8s-saas-us-east1-app-pool-07b3da5e-1m8r  Back-off restarting failed container

let me fetch pod's log as well, and remove sensitive stuff

@rimusz

rimusz commented Apr 17, 2019

pod log:

$ kubectl logs gcstg-use1-nginx-ingress-controller-dn28b
-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:    0.24.1
  Build:      git-ce418168f
  Repository: https://github.com/kubernetes/ingress-nginx
-------------------------------------------------------------------------------

I0417 12:14:05.324179       8 flags.go:185] Watching for Ingress class: nginx
W0417 12:14:05.324448       8 flags.go:214] SSL certificate chain completion is disabled (--enable-ssl-chain-completion=false)
nginx version: nginx/1.15.10
W0417 12:14:05.333045       8 client_config.go:549] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0417 12:14:05.333342       8 main.go:205] Creating API client for https://10.94.0.1:443
I0417 12:14:05.348097       8 main.go:249] Running in Kubernetes cluster version v1.11+ (v1.11.7-gke.12) - git (clean) commit 06f08e60069231bd21bdf673cf0595aac80b99f6 - platform linux/amd64
I0417 12:14:05.350047       8 main.go:102] Validated gcstg-use1/gcstg-use1-nginx-ingress-default-backend as the default backend.
I0417 12:14:05.595193       8 main.go:124] Created fake certificate with PemFileName: /etc/ingress-controller/ssl/default-fake-certificate.pem
I0417 12:14:05.624584       8 nginx.go:265] Starting NGINX Ingress controller
...
I0417 12:14:06.854240       8 event.go:209] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"xxxx", Name:"xxxx-xxxx-xxxx-server", UID:"c5e0675e-1790-11e9-99c1-4201ac100003", APIVersion:"extensions/v1beta1", ResourceVersion:"89377026", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress xxxx/xxxx-xxxx-xxxx-server
I0417 12:14:06.854563       8 backend_ssl.go:68] Adding Secret "xxxx/xxxx-info-secret" to the local store
I0417 12:14:06.854751       8 event.go:209] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"gcstg-use1", Name:"gcstg-use1-xxxx-monitoring-page", UID:"98c47e93-6017-11e9-935b-4201ac100009", APIVersion:"extensions/v1beta1", ResourceVersion:"89377033", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress gcstg-use1/gcstg-use1-xxxx-monitoring-page
I0417 12:14:06.855147       8 backend_ssl.go:68] Adding Secret "gcstg-use1/gcstg-use1-xxx-xxxx-info-secret" to the local store
I0417 12:14:06.931716       8 nginx.go:311] Starting NGINX process
I0417 12:14:06.931797       8 leaderelection.go:217] attempting to acquire leader lease  gcstg-use1/ingress-controller-leader-nginx...
W0417 12:14:06.936231       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
W0417 12:14:06.936369       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
I0417 12:14:06.939031       8 leaderelection.go:227] successfully acquired lease gcstg-use1/ingress-controller-leader-nginx
I0417 12:14:06.941224       8 status.go:86] new leader elected: gcstg-use1-nginx-ingress-controller-dn28b
I0417 12:14:06.941335       8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:07.198944       8 controller.go:182] Unexpected failure reloading the backend:

-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:07 [notice] 54#54: ModSecurity-nginx v1.0.0
2019/04/17 12:14:07 [emerg] 54#54: invalid number of arguments in "set" directive in /tmp/nginx-cfg919104772:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg919104772:4321
nginx: configuration file /tmp/nginx-cfg919104772 test failed

-------------------------------------------------------------------------------
W0417 12:14:07.198995       8 queue.go:130] requeuing initial-sync, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:07 [notice] 54#54: ModSecurity-nginx v1.0.0
2019/04/17 12:14:07 [emerg] 54#54: invalid number of arguments in "set" directive in /tmp/nginx-cfg919104772:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg919104772:4321
nginx: configuration file /tmp/nginx-cfg919104772 test failed

-------------------------------------------------------------------------------
W0417 12:14:10.269739       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
W0417 12:14:10.269883       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
I0417 12:14:10.270525       8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:10.512571       8 controller.go:182] Unexpected failure reloading the backend:

-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:10 [notice] 62#62: ModSecurity-nginx v1.0.0
2019/04/17 12:14:10 [emerg] 62#62: invalid number of arguments in "set" directive in /tmp/nginx-cfg612748947:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg612748947:4321
nginx: configuration file /tmp/nginx-cfg612748947 test failed

-------------------------------------------------------------------------------
W0417 12:14:10.512619       8 queue.go:130] requeuing kuku/jmeter-reporter, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:10 [notice] 62#62: ModSecurity-nginx v1.0.0
2019/04/17 12:14:10 [emerg] 62#62: invalid number of arguments in "set" directive in /tmp/nginx-cfg612748947:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg612748947:4321
nginx: configuration file /tmp/nginx-cfg612748947 test failed

-------------------------------------------------------------------------------
W0417 12:14:13.603063       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
W0417 12:14:13.603133       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
I0417 12:14:13.603661       8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:13.853688       8 controller.go:182] Unexpected failure reloading the backend:

-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:13 [notice] 74#74: ModSecurity-nginx v1.0.0
2019/04/17 12:14:13 [emerg] 74#74: invalid number of arguments in "set" directive in /tmp/nginx-cfg020726998:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg020726998:4321
nginx: configuration file /tmp/nginx-cfg020726998 test failed

-------------------------------------------------------------------------------
W0417 12:14:13.853736       8 queue.go:130] requeuing xxxxgcp01/xxxxgcp01-xxxx-rabbitmq-ha, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:13 [notice] 74#74: ModSecurity-nginx v1.0.0
2019/04/17 12:14:13 [emerg] 74#74: invalid number of arguments in "set" directive in /tmp/nginx-cfg020726998:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg020726998:4321
nginx: configuration file /tmp/nginx-cfg020726998 test failed

-------------------------------------------------------------------------------
W0417 12:14:16.936792       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
W0417 12:14:16.937059       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
I0417 12:14:16.938282       8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:17.205245       8 controller.go:182] Unexpected failure reloading the backend:

-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:17 [notice] 81#81: ModSecurity-nginx v1.0.0
2019/04/17 12:14:17 [emerg] 81#81: invalid number of arguments in "set" directive in /tmp/nginx-cfg147961405:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg147961405:4321
nginx: configuration file /tmp/nginx-cfg147961405 test failed

-------------------------------------------------------------------------------
W0417 12:14:17.205297       8 queue.go:130] requeuing gcstg-use1/xxxx.info-nfs, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:17 [notice] 81#81: ModSecurity-nginx v1.0.0
2019/04/17 12:14:17 [emerg] 81#81: invalid number of arguments in "set" directive in /tmp/nginx-cfg147961405:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg147961405:4321
nginx: configuration file /tmp/nginx-cfg147961405 test failed

-------------------------------------------------------------------------------
W0417 12:14:20.269750       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
W0417 12:14:20.269825       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
I0417 12:14:20.270655       8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:20.522354       8 controller.go:182] Unexpected failure reloading the backend:

-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:20 [notice] 88#88: ModSecurity-nginx v1.0.0
2019/04/17 12:14:20 [emerg] 88#88: invalid number of arguments in "set" directive in /tmp/nginx-cfg161907320:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg161907320:4321
nginx: configuration file /tmp/nginx-cfg161907320 test failed

-------------------------------------------------------------------------------
W0417 12:14:20.522398       8 queue.go:130] requeuing xxxxcentralgcstguse1/xxxxcentralgcstguse1-xxxx-rabbitmq-ha-discovery, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:20 [notice] 88#88: ModSecurity-nginx v1.0.0
2019/04/17 12:14:20 [emerg] 88#88: invalid number of arguments in "set" directive in /tmp/nginx-cfg161907320:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg161907320:4321
nginx: configuration file /tmp/nginx-cfg161907320 test failed

-------------------------------------------------------------------------------
E0417 12:14:22.590609       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
E0417 12:14:22.592768       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
W0417 12:14:23.603091       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
W0417 12:14:23.603158       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
I0417 12:14:23.603716       8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:23.859675       8 controller.go:182] Unexpected failure reloading the backend:

-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:23 [notice] 95#95: ModSecurity-nginx v1.0.0
2019/04/17 12:14:23 [emerg] 95#95: invalid number of arguments in "set" directive in /tmp/nginx-cfg482805111:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg482805111:4321
nginx: configuration file /tmp/nginx-cfg482805111 test failed

-------------------------------------------------------------------------------
W0417 12:14:23.859737       8 queue.go:130] requeuing kube-system/gcp-controller-manager, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:23 [notice] 95#95: ModSecurity-nginx v1.0.0
2019/04/17 12:14:23 [emerg] 95#95: invalid number of arguments in "set" directive in /tmp/nginx-cfg482805111:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg482805111:4321
nginx: configuration file /tmp/nginx-cfg482805111 test failed

-------------------------------------------------------------------------------
W0417 12:14:26.936343       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
W0417 12:14:26.936410       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
I0417 12:14:26.936987       8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:27.223886       8 controller.go:182] Unexpected failure reloading the backend:

-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:27 [notice] 102#102: ModSecurity-nginx v1.0.0
2019/04/17 12:14:27 [emerg] 102#102: invalid number of arguments in "set" directive in /tmp/nginx-cfg479136874:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg479136874:4321
nginx: configuration file /tmp/nginx-cfg479136874 test failed

-------------------------------------------------------------------------------
W0417 12:14:27.223950       8 queue.go:130] requeuing xxxxgcp7/xxxxgcp7-xxxx-xxxx-server, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:27 [notice] 102#102: ModSecurity-nginx v1.0.0
2019/04/17 12:14:27 [emerg] 102#102: invalid number of arguments in "set" directive in /tmp/nginx-cfg479136874:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg479136874:4321
nginx: configuration file /tmp/nginx-cfg479136874 test failed

-------------------------------------------------------------------------------
W0417 12:14:30.269805       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
W0417 12:14:30.269957       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
I0417 12:14:30.270695       8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:30.534347       8 controller.go:182] Unexpected failure reloading the backend:

-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:30 [notice] 110#110: ModSecurity-nginx v1.0.0
2019/04/17 12:14:30 [emerg] 110#110: invalid number of arguments in "set" directive in /tmp/nginx-cfg755472065:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg755472065:4321
nginx: configuration file /tmp/nginx-cfg755472065 test failed

-------------------------------------------------------------------------------
W0417 12:14:30.534398       8 queue.go:130] requeuing xxxx02/xxxx02-xxxx-rabbitmq-ha-discovery, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:30 [notice] 110#110: ModSecurity-nginx v1.0.0
2019/04/17 12:14:30 [emerg] 110#110: invalid number of arguments in "set" directive in /tmp/nginx-cfg755472065:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg755472065:4321
nginx: configuration file /tmp/nginx-cfg755472065 test failed

-------------------------------------------------------------------------------
E0417 12:14:32.590578       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
E0417 12:14:32.592722       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
W0417 12:14:33.603075       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
W0417 12:14:33.603131       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
I0417 12:14:33.603649       8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:33.892027       8 controller.go:182] Unexpected failure reloading the backend:

-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:33 [notice] 117#117: ModSecurity-nginx v1.0.0
2019/04/17 12:14:33 [emerg] 117#117: invalid number of arguments in "set" directive in /tmp/nginx-cfg153138988:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg153138988:4321
nginx: configuration file /tmp/nginx-cfg153138988 test failed

-------------------------------------------------------------------------------
W0417 12:14:33.892083       8 queue.go:130] requeuing xxxx/xxxx-xxxx-xxxx-persist, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:33 [notice] 117#117: ModSecurity-nginx v1.0.0
2019/04/17 12:14:33 [emerg] 117#117: invalid number of arguments in "set" directive in /tmp/nginx-cfg153138988:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg153138988:4321
nginx: configuration file /tmp/nginx-cfg153138988 test failed

-------------------------------------------------------------------------------
W0417 12:14:36.936405       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
W0417 12:14:36.936474       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
I0417 12:14:36.937014       8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:37.240681       8 controller.go:182] Unexpected failure reloading the backend:

-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:37 [notice] 126#126: ModSecurity-nginx v1.0.0
2019/04/17 12:14:37 [emerg] 126#126: invalid number of arguments in "set" directive in /tmp/nginx-cfg213387931:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg213387931:4321
nginx: configuration file /tmp/nginx-cfg213387931 test failed

-------------------------------------------------------------------------------
W0417 12:14:37.240730       8 queue.go:130] requeuing shlomidemo70/shlomidemo70-xxxx-rabbitmq-ha-discovery, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:37 [notice] 126#126: ModSecurity-nginx v1.0.0
2019/04/17 12:14:37 [emerg] 126#126: invalid number of arguments in "set" directive in /tmp/nginx-cfg213387931:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg213387931:4321
nginx: configuration file /tmp/nginx-cfg213387931 test failed

-------------------------------------------------------------------------------
W0417 12:14:40.269796       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-prometheus" does not have any active Endpoint.
W0417 12:14:40.269922       8 controller.go:797] Service "gcstg-use1/gcstg-use1-prom-alertmanager" does not have any active Endpoint.
I0417 12:14:40.270550       8 controller.go:170] Configuration changes detected, backend reload required.
E0417 12:14:40.504945       8 controller.go:182] Unexpected failure reloading the backend:

-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:40 [notice] 133#133: ModSecurity-nginx v1.0.0
2019/04/17 12:14:40 [emerg] 133#133: invalid number of arguments in "set" directive in /tmp/nginx-cfg059340094:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg059340094:4321
nginx: configuration file /tmp/nginx-cfg059340094 test failed

-------------------------------------------------------------------------------
W0417 12:14:40.504990       8 queue.go:130] requeuing xxxx02/xxxx02-xxxx-xxxx-indexer, err
-------------------------------------------------------------------------------
Error: exit status 1
2019/04/17 12:14:40 [notice] 133#133: ModSecurity-nginx v1.0.0
2019/04/17 12:14:40 [emerg] 133#133: invalid number of arguments in "set" directive in /tmp/nginx-cfg059340094:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg059340094:4321
nginx: configuration file /tmp/nginx-cfg059340094 test failed

-------------------------------------------------------------------------------
E0417 12:14:42.590939       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
E0417 12:14:42.592859       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
I0417 12:14:42.893735       8 main.go:172] Received SIGTERM, shutting down
I0417 12:14:42.893814       8 nginx.go:387] Shutting down controller queues
I0417 12:14:42.893866       8 status.go:116] updating status of Ingress rules (remove)
I0417 12:14:42.912037       8 nginx.go:395] Stopping NGINX process
2019/04/17 12:14:42 [notice] 134#134: signal process started
I0417 12:14:43.925234       8 nginx.go:408] NGINX process has stopped
I0417 12:14:43.925284       8 main.go:180] Handled quit, awaiting Pod deletion
E0417 12:14:52.593093       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
I0417 12:14:53.925529       8 main.go:183] Exiting with 0
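
The timing in this log looks consistent with the probe settings in the `kubectl describe` output (period=10s, #failure=3): the health check fails at 12:14:22, 12:14:32 and 12:14:42, and SIGTERM arrives right after the third failure. Here is a rough Python sketch that pulls those rounds out of klog-style lines. The helper name `failures_before_sigterm` is hypothetical, the sample reuses lines from the log above, and real kubelet scheduling adds jitter that this ignores.

```python
import re
from datetime import datetime

# klog-style prefix: severity letter, MMDD, then HH:MM:SS.microseconds
KLOG_RE = re.compile(r"^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s")


def failures_before_sigterm(lines, window_seconds=1.0):
    """Return the times of distinct healthcheck-failure rounds before SIGTERM.

    Errors logged within `window_seconds` of each other count as one round,
    because the liveness and readiness probes hit the same endpoint.
    """
    rounds = []
    for line in lines:
        m = KLOG_RE.match(line)
        if not m:
            continue
        ts = datetime.strptime(m.group(2), "%H:%M:%S.%f")
        if "Received SIGTERM" in line:
            break
        if "healthcheck error" in line:
            if not rounds or (ts - rounds[-1]).total_seconds() > window_seconds:
                rounds.append(ts)
    return [r.strftime("%H:%M:%S") for r in rounds]


MSG = ("healthcheck error: Get http+unix://nginx-status/healthz: dial unix "
       "/tmp/nginx-status-server.sock: connect: no such file or directory")
SAMPLE = [
    f"E0417 12:14:22.590609       8 checker.go:41] {MSG}",
    f"E0417 12:14:22.592768       8 checker.go:41] {MSG}",
    f"E0417 12:14:32.590578       8 checker.go:41] {MSG}",
    f"E0417 12:14:32.592722       8 checker.go:41] {MSG}",
    f"E0417 12:14:42.590939       8 checker.go:41] {MSG}",
    f"E0417 12:14:42.592859       8 checker.go:41] {MSG}",
    "I0417 12:14:42.893735       8 main.go:172] Received SIGTERM, shutting down",
]

if __name__ == "__main__":
    print(failures_before_sigterm(SAMPLE))
```

Three rounds ten seconds apart, followed immediately by SIGTERM, would fit the liveness probe reaching its failure threshold and the kubelet restarting the container.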

@aledbf
Member

aledbf commented Apr 17, 2019

2019/04/17 12:14:40 [emerg] 133#133: invalid number of arguments in "set" directive in /tmp/nginx-cfg059340094:4321

@rimusz in your case the issue is related to a bad configuration. Are you using custom snippets?

Note: this is not going to be an issue in 0.25 thanks to #3802
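
Because the same `[emerg]` line repeats on every reload attempt, it can help to reduce the log to its distinct NGINX errors. Here is a small Python sketch, assuming the log text has been saved locally. `distinct_emerg_errors` is a hypothetical helper, and the temporary file name is dropped from the key only so that repeats collapse.

```python
import re

# Matches both forms seen in the log, with or without the "pid#pid:" prefix, e.g.
#   nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg919104772:4321
EMERG_RE = re.compile(
    r"\[emerg\] (?:\d+#\d+: )?(?P<msg>.+?) in (?P<file>/\S+):(?P<line>\d+)$"
)


def distinct_emerg_errors(log_text):
    """Map (message, config line number) to how many times it was logged."""
    seen = {}
    for raw in log_text.splitlines():
        m = EMERG_RE.search(raw.strip())
        if not m:
            continue
        key = (m.group("msg"), int(m.group("line")))
        seen[key] = seen.get(key, 0) + 1
    return seen


SAMPLE_LOG = """\
2019/04/17 12:14:07 [emerg] 54#54: invalid number of arguments in "set" directive in /tmp/nginx-cfg919104772:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg919104772:4321
2019/04/17 12:14:10 [emerg] 62#62: invalid number of arguments in "set" directive in /tmp/nginx-cfg612748947:4321
nginx: [emerg] invalid number of arguments in "set" directive in /tmp/nginx-cfg612748947:4321
"""

if __name__ == "__main__":
    for (msg, line), count in distinct_emerg_errors(SAMPLE_LOG).items():
        print(f"{count}x line {line}: {msg}")
```

On the log above this collapses to a single error at config line 4321, which suggests one bad `set` directive rather than several unrelated problems.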

@rimusz

rimusz commented Apr 17, 2019

no, we don't use any custom snippets there

any timeline for 0.25 release?

@aledbf
Member

aledbf commented Apr 17, 2019

no, we don't use any custom snippets there

Ok, then use kubectl exec <ing pod> -- cat /tmp/nginx-cfg059340094 to see exactly what's wrong
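
Once the generated file is available locally (for example by redirecting the output of the `kubectl exec` command to a file), printing a few lines around the reported line number narrows things down. Here is a sketch in Python. `context_around` is a hypothetical helper, and the sample config text is invented purely to exercise it; it is not taken from the real file.

```python
def context_around(text, line_no, radius=2):
    """Return (number, line) pairs within `radius` of `line_no` (1-indexed)."""
    lines = text.splitlines()
    start = max(1, line_no - radius)
    end = min(len(lines), line_no + radius)
    return [(n, lines[n - 1]) for n in range(start, end + 1)]


# Invented sample: a `set` directive with a missing value, which is one way
# NGINX can end up reporting an invalid number of arguments for "set".
SAMPLE = "\n".join([
    "location / {",
    '    set $namespace "demo";',
    "    set $service_name ;",
    "    proxy_pass http://upstream_balancer;",
    "}",
])

if __name__ == "__main__":
    for number, line in context_around(SAMPLE, 3, radius=1):
        print(f"{number:>5}: {line}")
```

For the error in this thread the call would use `line_no=4321` on the saved config text.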

any timeline for 0.25 release?

~3 weeks

@ghost
Author

ghost commented May 1, 2019

Hello, I have not managed to reproduce the issue, which is why I didn't post anything. I restarted the controller 2-3 times and the issue was resolved.

@mmingorance-dh

I also face this issue in nginx-ingress 0.24.1

I0603 08:23:49.076078       8 nginx.go:311] Starting NGINX process
I0603 08:23:49.076198       8 leaderelection.go:217] attempting to acquire leader lease  utils/ingress-controller-leader-nginx...
I0603 08:23:49.079065       8 status.go:86] new leader elected: nginx-ingress-controller-84bb6995c5-rhzx6
I0603 08:23:49.114287       8 controller.go:170] Configuration changes detected, backend reload required.
2019/06/03 08:24:05 Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
W0603 08:24:05.078110       8 nginx_status.go:172] unexpected error obtaining nginx status info: Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
E0603 08:24:17.493531       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
2019/06/03 08:24:32 Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
W0603 08:24:32.776660       8 nginx_status.go:172] unexpected error obtaining nginx status info: Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
E0603 08:24:47.493559       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
E0603 08:24:55.022540       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
2019/06/03 08:25:05 Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
W0603 08:25:05.080689       8 nginx_status.go:172] unexpected error obtaining nginx status info: Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
E0603 08:25:17.493466       8 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory

I only faced this issue in my production environment and not in my staging/qa environments.
The most interesting thing is that eventually after some restarts, the pods start working.

@kwladyka

kwladyka commented Jul 8, 2019

The same issue. It started happening immediately after I deleted the etingroup-sync pod so that the ReplicaSet would recreate it automatically. That pod itself works fine.

kubectl --context=etingroup-production -n nginx-ingress log -f pod/nginx-ingress-controller-589f7bc68f-g2bx6
-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:    0.24.1
  Build:      git-ce418168f
  Repository: https://github.com/kubernetes/ingress-nginx
-------------------------------------------------------------------------------

I0708 10:11:31.107184       6 flags.go:185] Watching for Ingress class: nginx
W0708 10:11:31.107602       6 flags.go:214] SSL certificate chain completion is disabled (--enable-ssl-chain-completion=false)
nginx version: nginx/1.15.10
W0708 10:11:33.680844       6 client_config.go:549] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0708 10:11:33.683650       6 main.go:205] Creating API client for https://10.55.240.1:443
I0708 10:11:35.775897       6 main.go:249] Running in Kubernetes cluster version v1.13+ (v1.13.6-gke.13) - git (clean) commit fcbc1d20b6bca1936c0317743055ac75aef608ce - platform linux/amd64
I0708 10:11:35.784068       6 main.go:102] Validated nginx-ingress/nginx-ingress-default-backend as the default backend.
I0708 10:11:49.038667       6 main.go:124] Created fake certificate with PemFileName: /etc/ingress-controller/ssl/default-fake-certificate.pem
W0708 10:11:51.214288       6 store.go:613] Unexpected error reading configuration configmap: configmaps "nginx-ingress-controller" not found
I0708 10:11:51.713804       6 nginx.go:265] Starting NGINX Ingress controller
E0708 10:11:54.582229       6 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
I0708 10:11:57.173844       6 event.go:209] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"argocd", Name:"argocd-server-http-ingress", UID:"1e332eb9-8451-11e9-9fe8-42010a8400b2", APIVersion:"extensions/v1beta1", ResourceVersion:"13010210", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress argocd/argocd-server-http-ingress
I0708 10:11:57.957309       6 nginx.go:311] Starting NGINX process
I0708 10:11:57.959214       6 leaderelection.go:217] attempting to acquire leader lease  nginx-ingress/ingress-controller-leader-nginx...
I0708 10:11:58.130491       6 controller.go:170] Configuration changes detected, backend reload required.
I0708 10:11:58.978471       6 backend_ssl.go:68] Adding Secret "argocd/argocd-secret" to the local store
I0708 10:11:59.730692       6 backend_ssl.go:68] Adding Secret "concourse/concourse-etingroup-pl-tls" to the local store
I0708 10:11:59.723495       6 leaderelection.go:227] successfully acquired lease nginx-ingress/ingress-controller-leader-nginx
I0708 10:11:59.723578       6 status.go:86] new leader elected: nginx-ingress-controller-589f7bc68f-g2bx6
I0708 10:12:00.438162       6 event.go:209] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"etingroup", Name:"etingroup-sync", UID:"35893ce3-84ae-11e9-90aa-42010a840106", APIVersion:"extensions/v1beta1", ResourceVersion:"13010211", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress etingroup/etingroup-sync
I0708 10:12:00.438460       6 event.go:209] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"concourse", Name:"concourse-web", UID:"22990b6f-8f96-11e9-90aa-42010a840106", APIVersion:"extensions/v1beta1", ResourceVersion:"13010209", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress concourse/concourse-web
E0708 10:12:00.441464       6 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
I0708 10:12:00.448210       6 backend_ssl.go:68] Adding Secret "etingroup/main-etingroup-pl-tls" to the local store
I0708 10:12:00.560179       6 main.go:172] Received SIGTERM, shutting down
I0708 10:12:00.561090       6 nginx.go:387] Shutting down controller queues
I0708 10:12:00.562061       6 status.go:116] updating status of Ingress rules (remove)
W0708 10:12:00.606436       6 template.go:108] unexpected error cleaning template: signal: terminated
E0708 10:12:00.612075       6 controller.go:182] Unexpected failure reloading the backend:
invalid NGINX configuration (empty)
W0708 10:12:00.612360       6 queue.go:130] requeuing initial-sync, err invalid NGINX configuration (empty)
I0708 10:12:00.616718       6 status.go:135] removing address from ingress status ([35.195.XXX.XX])
I0708 10:12:00.617685       6 nginx.go:395] Stopping NGINX process
I0708 10:12:00.620943       6 status.go:295] updating Ingress argocd/argocd-server-http-ingress status from [] to [{35.195.XXX.XX }]
I0708 10:12:00.621944       6 status.go:295] updating Ingress etingroup/etingroup-sync status from [] to [{35.195.XXX.XX }]
I0708 10:12:00.623930       6 status.go:295] updating Ingress concourse/concourse-web status from [] to [{35.195.XXX.XX }]
2019/07/08 10:12:00 [notice] 35#35: signal process started
E0708 10:12:04.107941       6 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
I0708 10:12:05.757746       6 nginx.go:408] NGINX process has stopped
I0708 10:12:05.761038       6 main.go:180] Handled quit, awaiting Pod deletion
E0708 10:12:14.324338       6 checker.go:41] healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
I0708 10:12:15.937613       6 main.go:183] Exiting with 0
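
For readers unfamiliar with the message repeated in the log above: `dial unix /tmp/nginx-status-server.sock: connect: no such file or directory` is the error a client gets when the Unix socket file it dials does not exist (for example, because the process that should create it has not started listening yet). The following minimal Python sketch reproduces that class of error with a throwaway socket path. It is not the controller's Go code, and `probe_unix_socket` and `demo` are hypothetical names used only for illustration.

```python
import os
import socket
import tempfile


def probe_unix_socket(path, timeout=1.0):
    """Try to connect to a Unix domain socket; return 'ok' or the error text."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.settimeout(timeout)
    try:
        client.connect(path)
        return "ok"
    except OSError as exc:
        # A missing socket file surfaces here as FileNotFoundError (ENOENT).
        return f"error: {exc.strerror or exc}"
    finally:
        client.close()


def demo():
    """Probe a socket path before and after a listener exists."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "status.sock")  # illustrative path only
        before = probe_unix_socket(path)          # nothing is listening yet
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)                         # creates the socket file
        server.listen(1)
        try:
            after = probe_unix_socket(path)       # listener exists now
        finally:
            server.close()
        return before, after


if __name__ == "__main__":
    print(demo())
```

Read this way, the repeated healthcheck errors would mean the health check ran while nothing was listening on the status socket, which is consistent with them appearing before "Starting NGINX process" and after "Stopping NGINX process" in the log.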

values.yaml for helm

nginx-ingress:
    controller:
        service:
            externalTrafficPolicy: "Local"
            loadBalancerIP: "35.195.XXX.XX"

        publishService:
            enabled: true

* X is censored :)

The issue appears in version 1.6.18 of the Helm chart.

It fixed itself after about 15 minutes.

kubectl --context=etingroup-production -n nginx-ingress describe  pod/nginx-ingress-controller-589f7bc68f-g2bx6
Name:           nginx-ingress-controller-589f7bc68f-g2bx6
Namespace:      nginx-ingress
Priority:       0
Node:           gke-production-pool-1-6cb6f205-5rft/10.132.0.5
Start Time:     Mon, 24 Jun 2019 18:55:28 +0200
Labels:         app=nginx-ingress
                component=controller
                pod-template-hash=589f7bc68f
                release=nginx-ingress
Annotations:    <none>
Status:         Running
IP:             10.52.1.35
Controlled By:  ReplicaSet/nginx-ingress-controller-589f7bc68f
Containers:
  nginx-ingress-controller:
    Container ID:  docker://8c469aea0a1ab17bdc4a9849686eb30c8d7810355e4de3935f52f2e7067f4c4a
    Image:         quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.24.1
    Image ID:      docker-pullable://quay.io/kubernetes-ingress-controller/nginx-ingress-controller@sha256:76861d167e4e3db18f2672fd3435396aaa898ddf4d1128375d7c93b91c59f87f
    Ports:         80/TCP, 443/TCP
    Host Ports:    0/TCP, 0/TCP
    Args:
      /nginx-ingress-controller
      --default-backend-service=nginx-ingress/nginx-ingress-default-backend
      --publish-service=nginx-ingress/nginx-ingress-controller
      --election-id=ingress-controller-leader
      --ingress-class=nginx
      --configmap=nginx-ingress/nginx-ingress-controller
    State:          Running
      Started:      Mon, 08 Jul 2019 12:31:01 +0200
    Last State:     Terminated
      Reason:       Error
      Exit Code:    143
      Started:      Mon, 08 Jul 2019 12:25:13 +0200
      Finished:     Mon, 08 Jul 2019 12:25:52 +0200
    Ready:          True
    Restart Count:  68
    Liveness:       http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       nginx-ingress-controller-589f7bc68f-g2bx6 (v1:metadata.name)
      POD_NAMESPACE:  nginx-ingress (v1:metadata.namespace)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from nginx-ingress-token-n59h8 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  nginx-ingress-token-n59h8:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  nginx-ingress-token-n59h8
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                   From                                          Message
  ----     ------     ----                  ----                                          -------
  Normal   Pulled     35m (x59 over 13d)    kubelet, gke-production-pool-1-6cb6f205-5rft  Container image "quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.24.1" already present on machine
  Normal   Created    35m (x59 over 13d)    kubelet, gke-production-pool-1-6cb6f205-5rft  Created container
  Normal   Killing    35m (x58 over 9d)     kubelet, gke-production-pool-1-6cb6f205-5rft  Killing container with id docker://nginx-ingress-controller:Container failed liveness probe.. Container will be killed and recreated.
  Warning  Unhealthy  35m (x38 over 9d)     kubelet, gke-production-pool-1-6cb6f205-5rft  Readiness probe failed: HTTP probe failed with statuscode: 500
  Warning  Unhealthy  33m (x391 over 13d)   kubelet, gke-production-pool-1-6cb6f205-5rft  Readiness probe failed: Get http://10.52.1.35:10254/healthz: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
  Warning  Unhealthy  32m (x328 over 13d)   kubelet, gke-production-pool-1-6cb6f205-5rft  Liveness probe failed: Get http://10.52.1.35:10254/healthz: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
  Warning  Unhealthy  27m (x147 over 9d)    kubelet, gke-production-pool-1-6cb6f205-5rft  Liveness probe failed: Get http://10.52.1.35:10254/healthz: dial tcp 10.52.1.35:10254: connect: connection refused
  Normal   Started    17m (x66 over 13d)    kubelet, gke-production-pool-1-6cb6f205-5rft  Started container
  Warning  BackOff    7m48s (x348 over 9d)  kubelet, gke-production-pool-1-6cb6f205-5rft  Back-off restarting failed container
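
The probe settings in the describe output above (`delay=10s timeout=1s period=10s #success=1 #failure=3`) give a rough idea of how little slack the controller has: any `/healthz` response slower than the 1s timeout counts as a failure, and the kubelet restarts the container after 3 consecutive liveness failures. A minimal Python sketch of that arithmetic follows; `parse_probe` and `restart_window_seconds` are hypothetical helpers, and the window is an approximation rather than the kubelet's exact timing.

```python
import re


def parse_probe(description):
    """Extract the numeric probe settings from a `kubectl describe` probe line."""
    return {name: int(value) for name, value in re.findall(r"(\w+)=(\d+)", description)}


def restart_window_seconds(probe):
    """Approximate time of continuous probe failure before a restart is triggered."""
    return probe["period"] * probe["failure"]


LIVENESS = "http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3"

probe = parse_probe(LIVENESS)
print(probe["timeout"])               # 1
print(restart_window_seconds(probe))  # 30
```

Under these settings, roughly 30 seconds of slow or failing health checks is enough for a restart, which would fit the "Client.Timeout exceeded" and "Container failed liveness probe" events listed above.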

But considering the number of Unhealthy and BackOff events, this is not the first time it has been down.

Oh, one extra thing that may help reproduce it: these are the Ingress annotations for the Service in front of this etingroup-sync Pod.

  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/whitelist-source-range: "213.XXX.XXX.XXX/32,85.XXX.XXX.XXX/32"
    certmanager.k8s.io/cluster-issuer: "letsencrypt"

@kwladyka

This time it started after updating another application, one without the nginx.ingress.kubernetes.io/whitelist-source-range annotation.

For those who also have this issue: do you use publishService and externalTrafficPolicy for your nginx-ingress? I just updated to the newest ingress, but the issue still exists. It would be great if somebody could fix it; it is a very critical bug.

@kwladyka

Has anybody solved it? I still suffer from this issue.

@Aloush-ha

I have the same issue, version 0.25.1.

@eljefedelrodeodeljefe

I think this issue is not being treated with enough tenacity. Honestly I wonder how one can use this in production, as even our staging tests for months keep failing on issues like this.

Restarting the controller is not a fix; it's sysadmin patch work.

@mmingorance-dh

I've been facing this issue too, and it went away after I fixed a couple of problems in my ingress resources.
In most cases, I found ingress resources deployed in my cluster whose backends had no endpoints available, or were not deployed at all.
After deleting those useless and problematic ingress resources, nginx started up normally.

@eljefedelrodeodeljefe Regarding your comment about how one can use this in production, I have to say that we have been running this component in production for 3 years and so far it hasn't caused any outage.

@kwladyka

@mmingorance-dh are you sure it is the same issue?

When it goes down, the whole ingress is down, not just one service. It works most of the time, but randomly it doesn't. I have only a few services in the cluster.

@eljefedelrodeodeljefe Do you use publishService and externalTrafficPolicy for your nginx-ingress? Maybe that is the cause. Probably not many people use them, I guess.

@mmingorance-dh

@kwladyka not the same issue then, as we don't use publishService or externalTrafficPolicy.

@umar1201

umar1201 commented Sep 9, 2019

I am also getting this error:

healthcheck error: Get http+unix://nginx-status/healthz: dial unix /tmp/nginx-status-server.sock: i/o timeout

The issue is intermittent: after some time it starts working again, without any change being made.

Any solution to overcome this?

@aledbf
Member

aledbf commented Sep 9, 2019

Closing. Fixed in master #4531

aledbf closed this as completed on Sep 9, 2019
@umar1201

Closing. Fixed in master #4531

Can you please give me the image tag?
Currently I'm using: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.25.1

Last time you gave me something like this, for example:
if you want to test the fix, you can use the image quay.io/kubernetes-ingress-controller/nginx-ingress-controller:dev

@aledbf
Member

aledbf commented Sep 11, 2019

quay.io/kubernetes-ingress-controller/nginx-ingress-controller:dev

@umar1201

Thanks for your quick response. Last question: is it okay to use the :dev image in production?

Or should we wait for :0.25.2 or something?

@kwladyka

It is not OK to use dev in production in any project ;)

@umar1201

What do you suggest? Should we wait for 0.25.2 or something?

Because we are getting errors, and they are intermittent.

@kwladyka

In my case I will wait. I don't see any other choice.

@OGKevin

OGKevin commented Sep 20, 2019

So, I've just run into this issue as well.

However, my logs indicate that the backend reload failed, but they do not say why:

Unexpected failure reloading the backend:
2019/09/20 06:26:51 Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory

unexpected error obtaining nginx status info: Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
nginx: [alert] kill(52, 1) failed (3: No such process)

2019/09/20 06:26:50 [alert] 2096#2096: kill(52, 1) failed (3: No such process)

2019/09/20 06:26:50 [notice] 2096#2096: signal process started

2019/09/20 06:26:50 [notice] 2096#2096: ModSecurity-nginx v1.0.0

nginx: [alert] kill(52, 1) failed (3: No such process)

2019/09/20 06:26:50 [alert] 2096#2096: kill(52, 1) failed (3: No such process)

2019/09/20 06:26:50 [notice] 2096#2096: signal process started

2019/09/20 06:26:50 [notice] 2096#2096: ModSecurity-nginx v1.0.0

exit status 1

requeuing nginx-private/nginx-private-nginx-ingress-controller-metrics, err exit status 1
Unexpected failure reloading the backend:
Configuration changes detected, backend reload required.
Configuration changes detected, backend reload required.
2019/09/20 06:26:48 Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory

2019/09/20 06:26:48 Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory

unexpected error obtaining nginx status info: Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
unexpected error obtaining nginx status info: Get http+unix://nginx-status/nginx_status: dial unix /tmp/nginx-status-server.sock: connect: no such file or directory
-------------------------------------------------------------------------------



2019/09/20 06:26:47 [notice] 122#122: ModSecurity-nginx v1.0.0

Error: signal: killed

-------------------------------------------------------------------------------

-------------------------------------------------------------------------------



2019/09/20 06:26:47 [notice] 122#122: ModSecurity-nginx v1.0.0

Error: signal: killed

-------------------------------------------------------------------------------


Will this also be solved in the next release or is this a different issue?

@aledbf
Member

aledbf commented Sep 20, 2019

@OGKevin yes, you can check this using the image quay.io/kubernetes-ingress-controller/nginx-ingress-controller:dev

@adam-qin

@OGKevin yes, you can check this using the image quay.io/kubernetes-ingress-controller/nginx-ingress-controller:dev

dev does not seem to work, but while trying to solve this problem I found another error that says: cannot list resource "ingresses" in API group "networking.k8s.io". So I added the following, and the error went away. The ingress-controller has not restarted since then.

  - apiGroups:
      - "extensions"
      - "networking.k8s.io"

@aledbf
Member

aledbf commented Sep 22, 2019

dev does not seem to work, but while trying to solve this problem I found another error that says: cannot list resource "ingresses" in API group "networking.k8s.io". So I added the following, and the error went away. The ingress-controller has not restarted since then.

It seems you are using k8s >= 1.14. For that reason, you need to update the roles to be able to use that API: https://github.com/kubernetes/ingress-nginx/blob/master/deploy/static/rbac.yaml#L53

@jurrian

jurrian commented Oct 21, 2019

Eventually we fixed this by upgrading the nodes to 1.14.7-gke.10. After that, running `for i in $(seq 1 200); do curl localhost:10254/healthz; done` inside the ingress-nginx container finished in a few seconds, whereas before it took minutes. It could well be that the upgrade reset whatever the root cause was, which is still unknown to me. Or maybe nginx-ingress-controller:0.26.1 somehow works better with the newer Kubernetes version.
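
The timing check described above can also be scripted so that it reports elapsed time and the number of failed requests in one go. This is a minimal Python sketch of the same idea, not part of ingress-nginx; `time_health_checks` is a hypothetical helper, and the URL in the usage comment assumes the health port 10254 mentioned in this thread.

```python
import http.client
import time
import urllib.request


def time_health_checks(url, count=200, timeout=1.0):
    """Issue `count` sequential GET requests; return (total_seconds, failures)."""
    failures = 0
    start = time.monotonic()
    for _ in range(count):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if resp.status != 200:
                    failures += 1
        except (OSError, http.client.HTTPException):
            # Covers refused connections, timeouts and HTTP error responses.
            failures += 1
    return time.monotonic() - start, failures


# Example (run inside the controller container or against a port-forward):
# total, failed = time_health_checks("http://localhost:10254/healthz")
# print(f"{total:.1f}s total, {failed} failed")
```

With the 1s timeout used here, a run that takes minutes for 200 requests would suggest that individual health checks are close to, or over, the probe timeout.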

@kwladyka

kwladyka commented Nov 1, 2019

I still have this issue:

kubectl --context=etingroup-production get node
NAME                                  STATUS   ROLES    AGE   VERSION
gke-production-pool-1-ce587bf0-rxwq   Ready    <none>   31m   v1.14.7-gke.10

image


ingress version
`tag: "0.26.1"`

Is it possible that it fails because a third-party pod behind an nginx-ingress service fails? Will nginx-ingress fail because a third-party app fails?

@aasier

aasier commented Jan 25, 2020

I have the same behavior with AKS 1.14.8 and nginx-controller 0.27.1 + HPA. @kwladyka

@ZzzJing

ZzzJing commented Apr 27, 2020

So, does anybody know the root cause?

@kwladyka

@ZzzJing check #4735

@jack-manju

Hello, I have not managed to reproduce the issue that's why I didn't post anything. I restarted the controller 2-3 times and the issue was resolved.

it worked for me
