Nginx OOM #3314

Closed · yvespp opened this issue Oct 29, 2018 · 39 comments · Fixed by #3338

@yvespp commented Oct 29, 2018

Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/.): yes

What keywords did you search in NGINX Ingress controller issues before filing this one? (If you have found any duplicates, you should instead reply there.): memory, oom


Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

NGINX Ingress controller version: 0.20.0 and 0.19.0

Kubernetes version (use kubectl version): v1.11.4

Environment:

  • Cloud provider or hardware configuration: 128 GB RAM; 20 CPUs
  • OS (e.g. from /etc/os-release): RHEL 7.5
  • Kernel (e.g. uname -a): 3.10.0
  • Install tools:
  • Others:

What happened: Nginx memory usage grows slowly and then suddenly spikes until the Pod gets OOM killed by Kubernetes. Inside the Pod I can see that the memory is used by nginx itself.

Here is the memory graph of one Pod over the last 24h:
[screenshot: memory usage graph of one pod]

All pods over the last 24h:
[screenshot: memory usage graph of all pods]

In the log of one of the crashed Pods I see this:

2018/10/28 23:44:13 [alert] 63#63: worker process 96931 exited on signal 9
2018/10/28 23:44:14 [alert] 63#63: worker process 96964 exited on signal 9
2018/10/28 23:44:14 [alert] 63#63: worker process 96997 exited on signal 9
2018/10/28 23:44:14 [alert] 63#63: worker process 96998 exited on signal 9
2018/10/28 23:44:14 [alert] 63#63: worker process 97064 exited on signal 9
2018/10/28 23:44:15 [error] 97065#97065: *754462 lua entry thread aborted: memory allocation error: not enough memory
stack traceback:
coroutine 0:
        [C]: in function 'ffi_str'
        /usr/local/lib/lua/resty/core/shdict.lua:242: in function 'get_backends_data'
        /etc/nginx/lua/balancer.lua:100: in function </etc/nginx/lua/balancer.lua:99>, context: ngx.timer
2018/10/28 23:44:15 [alert] 63#63: worker process 97131 exited on signal 9
2018/10/28 23:44:15 [alert] 63#63: worker process 97065 exited on signal 9
2018/10/28 23:44:16 [alert] 63#63: worker process 97229 exited on signal 9
2018/10/28 23:44:16 [alert] 63#63: worker process 97164 exited on signal 9
2018/10/28 23:44:16 [alert] 63#63: worker process 97295 exited on signal 9
2018/10/28 23:44:16 [alert] 63#63: worker process 97130 exited on signal 9
2018/10/28 23:44:17 [alert] 63#63: worker process 97262 exited on signal 9

With version 0.17.1 in the same cluster and dynamic config we had no such problems. 0.19.0 showed the same OOM behavior. I didn't try 0.18.

What you expected to happen: No OOMs

How to reproduce it (as minimally and precisely as possible): I don't know what causes it; it doesn't happen on other, smaller clusters in our environment. On the affected cluster we have 1121 Ingresses, 1384 Services and 1808 Pods.

Anything else we need to know:
Nginx Config Map:

  upstream-max-fails: "3"
  upstream-fail-timeout: "5"
  proxy-read-timeout: "300"
  proxy-send-timeout: "300"
  proxy-connect-timeout: "10"
  use-gzip: "false"
  client-body-buffer-size: "64k"
  server-tokens: "false"
  proxy-body-size: "20m"
  client_max_body_size: "20m"
  worker-shutdown-timeout: "300s"
  worker-processes: "4"

Controller flags:

      containers:
      - args:
        - /nginx-ingress-controller
        - --default-ssl-certificate=$(POD_NAMESPACE)/ingress-cert
        - --configmap=$(POD_NAMESPACE)/nginx-ingress-conf
        - --enable-ssl-chain-completion=false
        - --update-status=false
        - --publish-service=$(POD_NAMESPACE)/nginx-ingress

How can I debug this? Can I somehow see how much memory the lua module uses?

@aledbf (Member) commented Oct 29, 2018

@yvespp first, please update to the latest version 0.20.0

Also, using the setting worker-shutdown-timeout: "300s" means that after a reload nginx can consume up to twice the memory for up to five minutes, because the old worker processes are kept alive that long.

@yvespp (Author) commented Oct 29, 2018

@aledbf we're currently using 0.20.0; the graphs and logs are from that version.
Using worker-shutdown-timeout: "300s" was not a problem in 0.17.1.

@ElvinEfendi (Member) commented:

@yvespp can you increase the number of Nginx workers (e.g. set it to 12) and try to reproduce this again? I wonder if you're hitting LuaJIT's memory limit.

In your big cluster can you consistently reproduce this?

One potentially related change since 0.17.0 is https://github.com/kubernetes/ingress-nginx/pull/2804/files#diff-cde3fffe2425ad7efaa8add1d05ae2c0R744 which increased the payload size.

Can I somehow see how much memory the lua module uses?

You can use a server snippet to configure something like

location /gc {
    content_by_lua '
        ngx.say(string.format("GC: %dKB", collectgarbage("count")))
    ';
}

to see memory usage.
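
One way to expose that location is via the nginx.ingress.kubernetes.io/server-snippet annotation on an Ingress (a minimal sketch; the Ingress name, host and backend service are illustrative, not taken from this issue):

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: lua-gc-debug                         # hypothetical name
  annotations:
    nginx.ingress.kubernetes.io/server-snippet: |
      location /gc {
          content_by_lua '
              ngx.say(string.format("GC: %dKB", collectgarbage("count")))
          ';
      }
spec:
  rules:
  - host: debug.example.com                  # hypothetical host
    http:
      paths:
      - path: /
        backend:
          serviceName: default-http-backend  # hypothetical service
          servicePort: 80

Note that collectgarbage("count") reports the Lua VM of whichever worker answers the request, so the value varies per worker.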

@ElvinEfendi (Member) commented:

Cloud provider or hardware configuration: 128 GB RAM; 20 CPUs

@yvespp how much of this are you requesting for the ingress-nginx pods?

In the provided logs I don't see OOMKill, are you seeing this in kernel logs?

@yvespp (Author) commented Oct 29, 2018

Cloud provider or hardware configuration: 128 GB RAM; 20 CPUs

@yvespp how much of this are you requesting for the ingress-nginx pods?

In the provided logs I don't see OOMKill, are you seeing this in kernel logs?

Here is the resources config for the nginx-ingress-controller container:

    resources:
      limits:
        cpu: "4"
        memory: 4Gi
      requests:
        cpu: 300m
        memory: 1200Mi

I see the OOMs in the Pod status and the event log of Kubernetes. Status (edited):

kubectl -n cop-nginx-ingress get pods nginx-ingress-h45jc -o yaml
...
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: 2018-10-26T06:12:18Z
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: 2018-10-29T12:33:04Z
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: null
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: 2018-10-26T06:12:18Z
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://1b1d125edca8485ece9504e0170298a5dcb1542599b7d256da2ae2d146e8e09b
    image: docker-registry.mycorp.com/kubernetes-ingress-controller/nginx-ingress-controller:0.20.0
    imageID: docker-pullable://docker-registry.mycorp.com/kubernetes-ingress-controller/nginx-ingress-controller@sha256:97bbe36d965aedce82f744669d2f78d2e4564c43809d43e80111cecebcb952d0
    lastState:
      terminated:
        containerID: docker://c2b80f88c5e81e1625d050d32b35f062ef681b5b5afb6256f4cef957f69fd624
        exitCode: 143
        finishedAt: 2018-10-29T12:32:27Z
        reason: OOMKilled
        startedAt: 2018-10-28T23:45:56Z
    name: nginx-ingress
    ready: true
    restartCount: 5
    state:
      running:
        startedAt: 2018-10-29T12:32:35Z
  startTime: 2018-10-26T06:12:18Z

Memory started to increase from 12:05 on. Here is a more complete log of a crash:

kubectl -n cop-nginx-ingress logs -p nginx-ingress-h45jc | grep -v remote_addr= | grep -v Service
...
2018/10/29 12:28:08 [warn] 77674#77674: *637227 a client request body is buffered to a temporary file /tmp/client-body/0000002289, client: 127.0.0.1, server: , request: "POST /configuration/backends HTTP/1.1", host: "localhost:18080"
I1029 12:28:08.298694      10 controller.go:212] Dynamic reconfiguration succeeded.
I1029 12:28:08.813040      10 controller.go:177] Configuration changes detected, backend reload required.
I1029 12:28:15.261921      10 controller.go:195] Backend successfully reloaded.
2018/10/29 12:28:15 [warn] 77867#77867: *637340 a client request body is buffered to a temporary file /tmp/client-body/0000002290, client: 127.0.0.1, server: , request: "POST /configuration/backends HTTP/1.1", host: "localhost:18080"
I1029 12:28:15.300504      10 controller.go:212] Dynamic reconfiguration succeeded.
I1029 12:28:18.138385      10 controller.go:175] Changes handled by the dynamic configuration, skipping backend reload.
2018/10/29 12:28:18 [warn] 77899#77899: *637400 a client request body is buffered to a temporary file /tmp/client-body/0000002291, client: 127.0.0.1, server: , request: "POST /configuration/backends HTTP/1.1", host: "localhost:18080"
I1029 12:28:18.191473      10 controller.go:212] Dynamic reconfiguration succeeded.
I1029 12:29:39.456278      10 controller.go:175] Changes handled by the dynamic configuration, skipping backend reload.
2018/10/29 12:29:39 [warn] 78008#78008: *638546 a client request body is buffered to a temporary file /tmp/client-body/0000002292, client: 127.0.0.1, server: , request: "POST /configuration/backends HTTP/1.1", host: "localhost:18080"
I1029 12:29:39.493177      10 controller.go:212] Dynamic reconfiguration succeeded.
I1029 12:29:56.063475      10 controller.go:177] Configuration changes detected, backend reload required.
I1029 12:30:05.139111      10 controller.go:195] Backend successfully reloaded.
2018/10/29 12:30:05 [warn] 78008#78008: *638855 a client request body is buffered to a temporary file /tmp/client-body/0000002293, client: 127.0.0.1, server: , request: "POST /configuration/backends HTTP/1.1", host: "localhost:18080"
I1029 12:30:05.194413      10 controller.go:212] Dynamic reconfiguration succeeded.
I1029 12:30:05.863656      10 controller.go:177] Configuration changes detected, backend reload required.
2018/10/29 12:30:08 [alert] 60#60: worker process 78204 exited on signal 9
2018/10/29 12:30:08 [alert] 60#60: worker process 78239 exited on signal 9
2018/10/29 12:30:09 [alert] 60#60: worker process 78274 exited on signal 9
E1029 12:30:09.307321      10 controller.go:189] Unexpected failure reloading the backend:

-------------------------------------------------------------------------------
Error: exit status 1
nginx: the configuration file /tmp/nginx-cfg613797303 syntax is ok
2018/10/29 12:30:07 [emerg] 78196#78196: bind() to 0.0.0.0:80 failed (38: Function not implemented)
nginx: [emerg] bind() to 0.0.0.0:80 failed (38: Function not implemented)
nginx: configuration file /tmp/nginx-cfg613797303 test failed

-------------------------------------------------------------------------------
W1029 12:30:09.307362      10 queue.go:130] requeuing sss-y/sss-businesslocks-service, err
-------------------------------------------------------------------------------
Error: exit status 1
nginx: the configuration file /tmp/nginx-cfg613797303 syntax is ok
2018/10/29 12:30:07 [emerg] 78196#78196: bind() to 0.0.0.0:80 failed (38: Function not implemented)
nginx: [emerg] bind() to 0.0.0.0:80 failed (38: Function not implemented)
nginx: configuration file /tmp/nginx-cfg613797303 test failed

-------------------------------------------------------------------------------
2018/10/29 12:30:09 [alert] 60#60: worker process 78208 exited on signal 9
I1029 12:30:09.409400      10 controller.go:177] Configuration changes detected, backend reload required.
2018/10/29 12:30:12 [alert] 60#60: worker process 78339 exited on signal 9
2018/10/29 12:30:12 [alert] 60#60: worker process 78408 exited on signal 9
2018/10/29 12:30:12 [alert] 60#60: worker process 78372 exited on signal 9
I1029 12:30:16.191852      10 controller.go:195] Backend successfully reloaded.
2018/10/29 12:30:16 [warn] 78532#78532: *638999 a client request body is buffered to a temporary file /tmp/client-body/0000002294, client: 127.0.0.1, server: , request: "POST /configuration/backends HTTP/1.1", host: "localhost:18080"
I1029 12:30:16.222203      10 controller.go:212] Dynamic reconfiguration succeeded.
I1029 12:30:16.906015      10 controller.go:177] Configuration changes detected, backend reload required.
2018/10/29 12:30:16 [alert] 60#60: fork() failed while spawning "worker process" (12: Cannot allocate memory)
2018/10/29 12:30:16 [alert] 60#60: sendmsg() failed (9: Bad file descriptor)
2018/10/29 12:30:16 [alert] 60#60: sendmsg() failed (9: Bad file descriptor)
2018/10/29 12:30:16 [alert] 60#60: sendmsg() failed (9: Bad file descriptor)
2018/10/29 12:30:16 [alert] 60#60: sendmsg() failed (9: Bad file descriptor)
2018/10/29 12:30:16 [alert] 60#60: sendmsg() failed (9: Bad file descriptor)
2018/10/29 12:30:16 [alert] 60#60: sendmsg() failed (9: Bad file descriptor)
2018/10/29 12:30:16 [alert] 60#60: sendmsg() failed (9: Bad file descriptor)
2018/10/29 12:30:16 [alert] 60#60: sendmsg() failed (9: Bad file descriptor)
2018/10/29 12:30:16 [alert] 60#60: sendmsg() failed (9: Bad file descriptor)
2018/10/29 12:30:16 [alert] 60#60: sendmsg() failed (9: Bad file descriptor)
2018/10/29 12:30:19 [alert] 60#60: worker process 78490 exited on signal 9
2018/10/29 12:30:19 [alert] 60#60: worker process 78491 exited on signal 9
2018/10/29 12:30:19 [alert] 60#60: worker process 78532 exited on signal 9
2018/10/29 12:30:19 [alert] 60#60: worker process 78409 exited on signal 9
I1029 12:30:23.660487      10 controller.go:195] Backend successfully reloaded.
2018/10/29 12:30:23 [warn] 78620#78620: *639100 a client request body is buffered to a temporary file /tmp/client-body/0000002295, client: 127.0.0.1, server: , request: "POST /configuration/backends HTTP/1.1", host: "localhost:18080"
I1029 12:30:23.686493      10 controller.go:212] Dynamic reconfiguration succeeded.
I1029 12:30:25.611392      10 controller.go:177] Configuration changes detected, backend reload required.
2018/10/29 12:30:23 [alert] 60#60: fork() failed while spawning "worker process" (12: Cannot allocate memory)
2018/10/29 12:30:23 [alert] 60#60: sendmsg() failed (9: Bad file descriptor)
2018/10/29 12:30:23 [alert] 60#60: sendmsg() failed (9: Bad file descriptor)
2018/10/29 12:30:23 [alert] 60#60: sendmsg() failed (9: Bad file descriptor)
2018/10/29 12:30:23 [alert] 60#60: sendmsg() failed (9: Bad file descriptor)
2018/10/29 12:30:23 [alert] 60#60: sendmsg() failed (9: Bad file descriptor)
2018/10/29 12:30:23 [alert] 60#60: sendmsg() failed (9: Bad file descriptor)
2018/10/29 12:30:23 [alert] 60#60: sendmsg() failed (9: Bad file descriptor)
2018/10/29 12:30:23 [alert] 60#60: sendmsg() failed (9: Bad file descriptor)
2018/10/29 12:30:23 [alert] 60#60: sendmsg() failed (9: Bad file descriptor)
2018/10/29 12:30:26 [alert] 60#60: unknown process 78738 exited on signal 9
2018/10/29 12:30:26 [alert] 60#60: worker process 78741 exited on signal 9
2018/10/29 12:30:26 [alert] 60#60: worker process 78748 exited on signal 9
2018/10/29 12:30:26 [alert] 60#60: worker process 78760 exited on signal 9
2018/10/29 12:30:27 [alert] 60#60: worker process 78832 exited on signal 9
2018/10/29 12:30:27 [alert] 60#60: worker process 78826 exited on signal 9
2018/10/29 12:30:27 [alert] 60#60: worker process 78879 exited on signal 9
E1029 12:30:27.363689      10 controller.go:189] Unexpected failure reloading the backend:

-------------------------------------------------------------------------------
Error: exit status 1
Error reading file /etc/nginx/geoip/GeoLiteCity.dat
2018/10/29 12:30:27 [emerg] 78880#78880: GeoIP_open("/etc/nginx/geoip/GeoLiteCity.dat") failed in /tmp/nginx-cfg731001452:79
nginx: [emerg] GeoIP_open("/etc/nginx/geoip/GeoLiteCity.dat") failed in /tmp/nginx-cfg731001452:79
nginx: configuration file /tmp/nginx-cfg731001452 test failed

-------------------------------------------------------------------------------
W1029 12:30:27.363728      10 queue.go:130] requeuing rocketchat-u/rocketchat-u-rocketchat, err
-------------------------------------------------------------------------------
Error: exit status 1
Error reading file /etc/nginx/geoip/GeoLiteCity.dat
2018/10/29 12:30:27 [emerg] 78880#78880: GeoIP_open("/etc/nginx/geoip/GeoLiteCity.dat") failed in /tmp/nginx-cfg731001452:79
nginx: [emerg] GeoIP_open("/etc/nginx/geoip/GeoLiteCity.dat") failed in /tmp/nginx-cfg731001452:79
nginx: configuration file /tmp/nginx-cfg731001452 test failed

-------------------------------------------------------------------------------
2018/10/29 12:30:27 [alert] 60#60: worker process 78881 exited on signal 9
2018/10/29 12:30:27 [alert] 60#60: worker process 78882 exited on signal 9
2018/10/29 12:30:27 [alert] 60#60: worker process 78915 exited on signal 9
2018/10/29 12:30:27 [alert] 60#60: worker process 78927 exited on signal 9
2018/10/29 12:30:27 [alert] 60#60: worker process 78928 exited on signal 9
2018/10/29 12:30:28 [alert] 60#60: worker process 78994 exited on signal 9
2018/10/29 12:30:28 [alert] 60#60: worker process 78961 exited on signal 9
2018/10/29 12:30:28 [alert] 60#60: worker process 78996 exited on signal 9
2018/10/29 12:30:28 [alert] 60#60: worker process 79029 exited on signal 9
I1029 12:30:28.982260      10 controller.go:177] Configuration changes detected, backend reload required.
I1029 12:30:35.463575      10 controller.go:195] Backend successfully reloaded.
I1029 12:30:36.054688      10 controller.go:175] Changes handled by the dynamic configuration, skipping backend reload.
2018/10/29 12:30:38 [alert] 60#60: worker process 79194 exited on signal 9
2018/10/29 12:30:38 [alert] 60#60: worker process 79201 exited on signal 9
2018/10/29 12:30:38 [alert] 60#60: unknown process 79190 exited on signal 9
2018/10/29 12:30:38 [alert] 60#60: worker process 79228 exited on signal 9
W1029 12:30:38.955128      10 controller.go:214] Dynamic reconfiguration failed: Post http://localhost:18080/configuration/backends: read tcp 127.0.0.1:60278->127.0.0.1:18080: read: connection reset by peer
W1029 12:30:38.955198      10 controller.go:214] Dynamic reconfiguration failed: Post http://localhost:18080/configuration/backends: read tcp 127.0.0.1:60308->127.0.0.1:18080: read: connection reset by peer
2018/10/29 12:30:39 [alert] 60#60: worker process 79256 exited on signal 9
2018/10/29 12:30:39 [alert] 60#60: worker process 79315 exited on signal 9
I1029 12:30:39.370822      10 controller.go:177] Configuration changes detected, backend reload required.
I1029 12:30:45.836171      10 controller.go:195] Backend successfully reloaded.
2018/10/29 12:30:49 [alert] 60#60: worker process 79350 exited on signal 9
2018/10/29 12:30:49 [alert] 60#60: worker process 79372 exited on signal 9
2018/10/29 12:30:49 [alert] 60#60: worker process 79385 exited on signal 9
2018/10/29 12:30:49 [alert] 60#60: worker process 79403 exited on signal 9
2018/10/29 12:30:49 [warn] 79521#79521: *639294 a client request body is buffered to a temporary file /tmp/client-body/0000002296, client: 127.0.0.1, server: , request: "POST /configuration/backends HTTP/1.1", host: "localhost:18080"
I1029 12:30:49.871323      10 controller.go:212] Dynamic reconfiguration succeeded.
2018/10/29 12:30:50 [alert] 60#60: unknown process 79510 exited on signal 9
2018/10/29 12:30:51 [alert] 60#60: worker process 79030 exited on signal 9
I1029 12:32:25.748396      10 main.go:163] Received SIGTERM, shutting down
I1029 12:32:25.748446      10 nginx.go:341] Shutting down controller queues
I1029 12:32:25.748469      10 nginx.go:349] Stopping NGINX process
I1029 12:32:25.749021      10 main.go:167] Error during shutdown: signal: terminated
I1029 12:32:25.749040      10 main.go:171] Handled quit, awaiting Pod deletion

@aledbf (Member) commented Oct 29, 2018

@yvespp I suggest you remove the limits and add the /gc location mentioned by @ElvinEfendi.
This will allow us to see whether the issue is related to the limits or to Lua.

@aledbf (Member) commented Oct 29, 2018

Also, every time you see Backend successfully reloaded it means nginx will double the memory usage until the old worker processes are removed.

@yvespp (Author) commented Oct 29, 2018

@yvespp can you increase the number of Nginx workers (e.g. set it to 12) and try to reproduce this again? I wonder if you're hitting LuaJIT's memory limit.

In your big cluster can you consistently reproduce this?

One potentially related change since 0.17.0 is https://github.com/kubernetes/ingress-nginx/pull/2804/files#diff-cde3fffe2425ad7efaa8add1d05ae2c0R744 which increased the payload size.

Can I somehow see how much memory the lua module uses?

You can use a server snippet to configure something like

location /gc {
    content_by_lua '
        ngx.say(string.format("GC: %dKB", collectgarbage("count")))
    ';
}

to see memory usage.

We had it running with the number of Nginx workers not set and it crashed as well.
OOMs happen very regularly; in the last 6h we had about 14 Pod crashes with 25 Nginx replicas running.

Will add the server snippet and increase the memory limit.

@yvespp (Author) commented Oct 30, 2018

With a resource limit of 10 GB the Controller Pods no longer get OOM killed and restarted.
The output from the server snippet fluctuates between 4 and 12 MB and doesn't grow over time, so the Lua part seems to work fine.

Still, sometimes nginx suddenly starts to use a lot of memory and after a while releases it again. E.g.:
[screenshot: memory usage graph showing a temporary spike]
From about 01:00 to 03:30 memory usage increased steadily from 1.6 to 4.1 GB and then dropped back to 1.7 GB. Pods also have liveness and readiness probe failures:

Readiness probe failed: Get http://10.32.50.226:10254/healthz: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Liveness probe failed: Get http://10.32.50.226:10254/healthz: net/http: request canceled (Client.Timeout exceeded while awaiting headers)

In the logs I don't see anything special:

I1030 01:17:48.587764 9 controller.go:177] Configuration changes detected, backend reload required.
--
I1030 01:17:55.568008 9 controller.go:195] Backend successfully reloaded.
I1030 01:17:55.613569 9 controller.go:212] Dynamic reconfiguration succeeded.
I1030 01:18:46.501962 9 controller.go:177] Configuration changes detected, backend reload required.
I1030 01:18:54.522308 9 controller.go:195] Backend successfully reloaded.
I1030 01:18:54.560216 9 controller.go:212] Dynamic reconfiguration succeeded.
I1030 01:18:55.343100 9 controller.go:177] Configuration changes detected, backend reload required.
I1030 01:19:05.102516 9 controller.go:195] Backend successfully reloaded.
I1030 01:19:05.161724 9 controller.go:212] Dynamic reconfiguration succeeded.
I1030 01:19:06.689191 9 controller.go:177] Configuration changes detected, backend reload required.
I1030 01:19:17.662288 9 controller.go:195] Backend successfully reloaded.
I1030 01:19:17.756550 9 controller.go:212] Dynamic reconfiguration succeeded.
I1030 01:20:06.288051 9 controller.go:177] Configuration changes detected, backend reload required.
I1030 01:20:19.065371 9 controller.go:195] Backend successfully reloaded.
I1030 01:20:19.162346 9 controller.go:212] Dynamic reconfiguration succeeded.
I1030 01:20:20.162305 9 controller.go:175] Changes handled by the dynamic configuration, skipping backend reload.
I1030 01:20:20.263043 9 controller.go:212] Dynamic reconfiguration succeeded.
I1030 01:20:23.425881 9 controller.go:175] Changes handled by the dynamic configuration, skipping backend reload.
I1030 01:20:23.512980 9 controller.go:212] Dynamic reconfiguration succeeded.
I1030 01:20:30.176249 9 controller.go:177] Configuration changes detected, backend reload required.
I1030 01:20:40.319298 9 controller.go:195] Backend successfully reloaded.
I1030 01:20:40.384064 9 controller.go:212] Dynamic reconfiguration succeeded.
I1030 01:21:07.995160 9 controller.go:177] Configuration changes detected, backend reload required.

How does it decide whether to do a dynamic reload or not? It seems to me there are a lot of non-dynamic reloads... at night we shouldn't have any config changes at all.
Maybe a reload trips up Nginx?

Thanks for your tips so far, do you have more of them? :-)

@aledbf (Member) commented Oct 30, 2018

How does it decide if it does a dynamic reload or not?

If the change is related to pods being created or removed, a dynamic reload is enough. If there's a change in the configuration, a full reload is required.

To see the reason for the reload you can add the flag --v=2 to the ingress controller deployment. With that flag you will see the diff of the configuration file in the pod logs.
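
Applied to the container args already shown in this issue, that would look like this (excerpt):

      containers:
      - args:
        - /nginx-ingress-controller
        - --default-ssl-certificate=$(POD_NAMESPACE)/ingress-cert
        - --configmap=$(POD_NAMESPACE)/nginx-ingress-conf
        - --enable-ssl-chain-completion=false
        - --update-status=false
        - --publish-service=$(POD_NAMESPACE)/nginx-ingress
        - --v=2                              # log the nginx.conf diff for every reload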

@aledbf (Member) commented Oct 30, 2018

Maybe a reload trips up Nginx?

No, as I mentioned, when a reload is triggered nginx creates new worker processes, keeping the old ones alive until worker-shutdown-timeout: "300s" expires or the old processes have no connections left.
This is why you see the memory spikes when a reload happens.
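
Since the old workers linger for the full worker-shutdown-timeout, one way to shorten the post-reload memory overlap is to lower that value in the nginx-ingress-conf ConfigMap (a sketch; the value is illustrative and trades cutting off long-running requests sooner for faster cleanup):

  # nginx-ingress-conf ConfigMap data (excerpt)
  worker-shutdown-timeout: "60s"             # old worker processes are reaped sooner after each reload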

@yvespp (Author) commented Oct 31, 2018

Yesterday I turned dynamic config and geoip off. Now the controller runs much better: it uses less memory and shows no memory spikes so far.
[screenshot: memory usage graph after disabling dynamic config and geoip]
The vertical red dotted lines are when an external health check fails.
The memory consumption still goes up slowly over time.

I will try with dynamic mode on and geoip off to see if it has any effect. Will also turn on logging of reloads.

@yvespp (Author) commented Oct 31, 2018

Most reloads in our environment seem to happen because an Ingress no longer has any active pods:

I1031 13:10:41.963160      10 nginx.go:635] NGINX configuration diff:
--- /etc/nginx/nginx.conf       2018-10-31 13:08:58.263079675 +0000
+++ /tmp/new-nginx-cfg494125317 2018-10-31 13:10:41.913227544 +0000
@@ -1,5 +1,5 @@

-# Configuration checksum: 17280006272735157270
+# Configuration checksum: 13910001803964264479

 # setup custom paths that do not require root access
 pid /tmp/nginx.pid;
@@ -12739,7 +12739,7 @@

                        port_in_redirect off;

-                       set $proxy_upstream_name "rocket-b-rocketchat-service-80";
+                       set $proxy_upstream_name "";


@@ -12801,9 +12801,8 @@
                        proxy_next_upstream                     error timeout;
                        proxy_next_upstream_tries               3;

-                       proxy_pass http://upstream_balancer;
-
-                       proxy_redirect                          off;
+                       # No endpoints available for the request
+                       return 503;

                }

This is a dev cluster and many developers are deploying stuff that is not configured correctly or not stable yet, so this is expected.
Maybe the Lua part could be enhanced to also handle this case and avoid a reload?

@yvespp (Author) commented Nov 2, 2018

We now have one pod running with dynamic config on and all others with it off. After running fine for 4 days, almost all non-dynamic controllers started to increase their memory usage at the same time:
[screenshot: memory usage graph across controllers]
The white line is the controller with dynamic config on; its base memory consumption is higher than the rest. It was restarted 3 times today. From the log:

2018/10/31 12:55:59 [emerg] 67#67: invalid number of arguments in "client_max_body_size" directive in /tmp/nginx-cfg906928627:662
--
nginx: [emerg] invalid number of arguments in "client_max_body_size" directive in /tmp/nginx-cfg906928627:662
2018/10/31 12:55:59 [emerg] 67#67: invalid number of arguments in "client_max_body_size" directive in /tmp/nginx-cfg906928627:662
nginx: [emerg] invalid number of arguments in "client_max_body_size" directive in /tmp/nginx-cfg906928627:662

In a container of a non-dynamic controller I can see that the memory is used by the nginx workers:

www-data@ld1c0422:/etc/nginx$ ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
www-data     1  0.0  0.0   6680   656 ?        Ss   Oct30   0:00 /usr/bin/dumb-i
www-data     8  0.0  0.0   8204  1784 ?        Ss   Oct30   0:00 /bin/bash /entr
www-data     9  9.8  0.4 897168 542636 ?       Sl   Oct30 412:38 /nginx-ingress-
www-data    60  7.2  2.2 3028212 2990512 ?     S    Oct30 305:38 nginx: master p
www-data 25265  0.0  0.0   8336  2232 pts/0    Ss   12:39   0:00 bash
www-data 26247  0.3  2.2 3274252 2973344 ?     Sl   12:43   0:00 nginx: worker p
www-data 26921  1.6  2.2 3297528 2996552 ?     Sl   12:44   0:00 nginx: worker p
www-data 26954  1.8  2.2 3297528 2996512 ?     Sl   12:44   0:00 nginx: worker p
www-data 26987  1.9  2.2 3297528 2996516 ?     Sl   12:44   0:00 nginx: worker p
www-data 27020  1.6  2.2 3297528 2996464 ?     Sl   12:44   0:00 nginx: worker p
www-data 27053  0.0  0.0  18196  1712 pts/0    R+   12:45   0:00 ps aux

In the logs I see nothing special... The reloads only take a few seconds and can't account for the increased memory usage. One non-dynamic controller (purple in the graph) temporarily uses more memory than the other containers.

In summary: the memory explosion also happens without dynamic config reload, but in a different way and later. We also had no external health check failures with the non-dynamic controllers.
I still have no clue what triggers it or why it happens...

@aledbf can you please reopen the issue? Thanks!

@aledbf (Member) commented Nov 2, 2018

@yvespp please use quay.io/kubernetes-ingress-controller/nginx-ingress-controller:dev.021118
This image contains current master with the fix that avoids reloads for 503 (endpoints not available)
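
In the DaemonSet/Deployment manifest this is just an image swap (a sketch; the container name is taken from the Pod status earlier in this issue):

      containers:
      - name: nginx-ingress
        image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:dev.021118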

@yvespp (Author) commented Nov 5, 2018

I'm now running with tag dev.021118.
There are still some reloads when there are no backends anymore:

W1105 16:41:06.247141      10 controller.go:676] Service "att-development/att-attachmentstore-service" does not have any active Endpoint.
...

I1105 16:41:06.305846      10 controller.go:164] Configuration changes detected, backend reload required.
I1105 16:41:06.917888      10 util.go:68] rlimit.max=1048576
I1105 16:41:06.917939      10 nginx.go:523] Maximum number of open file descriptors: 261120
I1105 16:41:09.079949      10 nginx.go:629] NGINX configuration diff:
--- /etc/nginx/nginx.conf       2018-11-05 16:39:52.464488749 +0000
+++ /tmp/new-nginx-cfg079505607 2018-11-05 16:41:09.050636415 +0000
@@ -1,5 +1,5 @@

-# Configuration checksum: 11004987907417578821
+# Configuration checksum: 12582047228611976621

 # setup custom paths that do not require root access
 pid /tmp/nginx.pid;
@@ -16359,7 +16359,7 @@

                        port_in_redirect off;

-                       set $proxy_upstream_name "att-development-att-attachmentstore-service-80";
+                       set $proxy_upstream_name "";

I also see diffs where proxy_upstream_name gets set.

I'll leave it running for now to see what the memory does.

@yvespp (Author) commented Nov 6, 2018

Here is the memory graph of dev.021118 so far (19h):
[screenshot: memory usage graph]
One Nginx crashed with exit code 143:

kubectl -n cop-nginx-ingress describe pod nginx-ingress-gwfq6
...

    State:          Running
      Started:      Tue, 06 Nov 2018 08:54:19 +0100
    Last State:     Terminated
      Reason:       Error
      Exit Code:    143
      Started:      Mon, 05 Nov 2018 17:36:08 +0100
      Finished:     Tue, 06 Nov 2018 08:53:57 +0100
    Ready:          True
    Restart Count:  1

The cause was probably that Kubernetes killed it because the liveness probe failed, but the container did not handle that correctly: https://serverfault.com/questions/695849/services-remain-in-failed-state-after-stopped-with-systemctl

2018-11-06 08:52:12.789	Liveness probe failed: Get http://10.32.50.177:10254/healthz: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2018-11-06 08:52:21.596	Liveness probe failed: Get http://10.32.50.177:10254/healthz: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2018-11-06 08:52:27.389	Readiness probe failed: Get http://10.32.50.177:10254/healthz: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2018-11-06 08:52:32.189	Liveness probe failed: Get http://10.32.50.177:10254/healthz: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2018-11-06 08:52:47.293	Readiness probe failed: Get http://10.32.50.177:10254/healthz: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2018-11-06 08:53:58.484	Readiness probe failed: Get http://10.32.50.177:10254/healthz: dial tcp 10.32.50.177:10254: connect: connection refused
2018-11-06 08:53:58.495	Killing container with id docker://nginx-ingress:Container failed liveness probe.. Container will be killed and recreated.
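
If the controller is healthy but too slow to answer /healthz under memory pressure, one mitigation (a sketch; the values are illustrative and not taken from this issue) is to relax the probe settings in the controller manifest:

        livenessProbe:
          httpGet:
            path: /healthz
            port: 10254
          timeoutSeconds: 10                 # default is 1s; give a busy controller more time to answer
          failureThreshold: 5                # tolerate a few slow responses before the container is killed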

Also the external health check failed twice (red dotted line in graph), once before the crash from above.

I had to roll back to version 0.20.0 with dynamic reload off because some of our ingresses still use the annotation nginx.ingress.kubernetes.io/secure-backends: "true" and it doesn't work in dev.021118 anymore. Will it also be dropped in the final version?

@aledbf (Member) commented Nov 6, 2018

@yvespp that annotation was deprecated in 0.18.0 and the code was removed after the release of 0.20.0 in #3203.

The replacement is called nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/#backend-protocol

This annotation is "generic" and allows us to indicate more than just a secure backend.
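
For example (a minimal sketch; the Ingress name, host and service are illustrative):

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: secure-backend-example               # hypothetical name
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  rules:
  - host: app.example.com                    # hypothetical host
    http:
      paths:
      - path: /
        backend:
          serviceName: app-service           # hypothetical service
          servicePort: 443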

@aledbf (Member) commented Nov 6, 2018

@yvespp you should update your annotations

aledbf reopened this on Nov 6, 2018

@aledbf (Member) commented Nov 6, 2018

@yvespp please use quay.io/kubernetes-ingress-controller/nginx-ingress-controller:dev.110318
This image contains current master with #3367

@evilezh commented Nov 8, 2018

@yvespp I had similar issues, so I took the time and rewrote the deployment for myself.
Here is some of what I changed in the config file:

proxy-request-buffering: "off"
worker-shutdown-timeout: "60s"
use-geoip: "false"

and to the startup flags I added:

--enable-dynamic-certificates
--enable-ssl-chain-completion=false

I did not test which of those extra options solved my issues; my suspicion is about proxy-request-buffering: "off". I'm running the latest 0.20.0.
If that does not help, I can share all options I've configured.

P.S. I had memory leaks even running 0.16.2 (some long-running pods took over 30 GB of memory :) )

@AlexPereverzyev commented:

After about 5 minutes of constant load (~9000 req/sec), 0.20.0 starts leaking memory at a rate of about 1-1.5 GB/hour.

After a day of such load the nginx-ingress-controller process consumes all remaining RAM and gets killed and restarted by Kubernetes. My Nginx ingress, upstream server and load runner are all 4-CPU, 32 GB machines.

--enable-dynamic-configuration=false completely heals the leak (RAM is stable at about 200 MB), but ingress metrics are gone as well (see #3053).

If you are happy to sacrifice the metrics, don't forget to raise error_log_level to error so the logs don't get polluted with warnings from the Lua monitor script.
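
Put together, the changes described above would look roughly like this (a sketch; error-log-level is the ConfigMap key I assume is meant by error_log_level):

      # controller container args (excerpt)
      - args:
        - /nginx-ingress-controller
        - --enable-dynamic-configuration=false   # disables the Lua balancer and, per #3053, its metrics

  # nginx ConfigMap data (excerpt)
  error-log-level: "error"                       # assumed key; silences the warn-level messages from the Lua monitor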

@ElvinEfendi (Member) commented:

to not pollute logs with warnings from Lua monitor script.

@AlexPereverzyev what warnings are you seeing from Lua?

@AlexPereverzyev commented:

@ElvinEfendi, I'm seeing lots of "omitting metrics for the request, current batch is full" warnings.

@ElvinEfendi (Member) commented:

@AlexPereverzyev can you post your Nginx configuration? I'm specifically interested in whether you have

    init_worker_by_lua_block {
        balancer.init_worker()
        monitor.init_worker()
    }

in the config. Also, are you seeing some metrics or no metrics at all?

@AlexPereverzyev commented:

@ElvinEfendi, please find the config files attached: configs.zip

The Lua block is not in the config, but it seems it's intended to be there:

{{ if $all.DynamicConfigurationEnabled }}
init_worker_by_lua_block {
    balancer.init_worker()
    monitor.init_worker()
}
{{ end }}

I can't see ingress metrics coming from monitor.lua, but I can see Nginx metrics (connections, RAM) and the config reloads counter.

Adding the following to the nginx template works and I can see ingress metrics again (it also brings back the memory issue):

init_worker_by_lua_block {
    monitor.init_worker()
}

Interestingly, the ingress controller is able to release used memory if the load is turned off (at least to some degree):

[screenshot: ingress-nginx RAM usage over time]

@ElvinEfendi (Member) commented Nov 18, 2018

Adding the following to nginx template works and I can see ingress metrics again

Do you still see "omitting metrics for the request, current batch is full"?

(it also brings back the memory issue):

Does the memory keep growing unboundedly? It's expected for the memory to grow under load (not unboundedly though) because we batch the metrics in Lua per Nginx worker before flushing.

Interestingly, ingress controller is able to release used memory if the load is turned off (at least to some degree):

this aligns with what I said above.

--

The Lua block is not in the config, but seems it's intended:

Are you using a custom configuration template? If not, then it seems like a nasty bug introduced in the version you're using; sorry for that.

@AlexPereverzyev commented:

@ElvinEfendi, it turns out the ingress controller is sensitive to the volume of data sent from the Nginx workers to /tmp/prometheus-nginx.socket. In my particular case, each of the 8 workers sends ~400 kB of metrics per second, which seems to be too much for the ingress controller to handle.

Better hardware or less load can probably mitigate the issue, but here are the monitor.lua changes which finally worked for me (ingress container memory consumption is stable at ~100 MB):

-- trimmed-down metrics(): the commented-out fields are dropped to shrink the payload each worker sends
local function metrics()
  return {
    host = ngx.var.host or "-",
    -- namespace = ngx.var.namespace or "-",
    ingress = ngx.var.ingress_name or "-",
    -- service = ngx.var.service_name or "-",
    -- path = ngx.var.location_path or "-",
    method = ngx.var.request_method or "-",
    status = ngx.var.status or "-",
    -- requestLength = tonumber(ngx.var.request_length) or -1,
    requestTime = tonumber(ngx.var.request_time) or -1,
    -- responseLength = tonumber(ngx.var.bytes_sent) or -1,
    -- endpoint = ngx.var.upstream_addr or "-",
    -- upstreamLatency = tonumber(ngx.var.upstream_connect_time) or -1,
    -- upstreamResponseTime = tonumber(ngx.var.upstream_response_time) or -1,
    -- upstreamResponseLength = tonumber(ngx.var.upstream_response_length) or -1,
    -- upstreamStatus = ngx.var.upstream_status or "-",
  }
end

I've also switched from 8 to 4 workers, but it doesn't really make any difference except a flat reduction in memory consumption.

@ElvinEfendi (Member) commented:

@AlexPereverzyev thanks for debugging this further!

In my particular case, each of 8 workers sends ~400kB of metrics per second, which seems too much for ingress controller to handle.

With this, are you seeing constant (unbounded) memory growth, or does it grow to some point and then stay there?

I wonder if allocating more CPUs would help with this; the theory is that the controller probably cannot process the data sent fast enough, so it gets queued at the socket level. It would be interesting to look at queuing on /tmp/prometheus-nginx.socket while you experience the memory growth.

@AlexPereverzyev commented:

with this are you seeing constant (unbounded) memory growth or does it grow until some point and stays still?

@ElvinEfendi, ingress-nginx-controller constantly consumes more and more memory until it gets killed and restarted. The explanation and fix are already available in #3436.

@aledbf (Member) commented Nov 19, 2018

ingress-nginx-controller constantly consumes more and more memory until it gets killed and restarted. Though the explanation and fix are already available

What's the behavior with #3436?

@aledbf (Member) commented Nov 19, 2018

@AlexPereverzyev can you share information about the traffic? (number of ingresses, hostnames, number of replicas of the exposed services and RPS being handled by the ingress controller)

@AlexPereverzyev commented:

@aledbf, I can test the fix if you can provide an image. The environment description is below:

Hardware (same for all nodes)

Intel Core i7-3770 4 cores CPU
32 GB RAM
HDD
Gigabit Ethernet

Ingress

Single node; the ingress controller YAML can be found in configs.zip above, but you should set --enable-dynamic-configuration=true.

Has up to 10 ingresses, but only one is used for testing.

Upstream

Single node, running 8 containers of REST service (web-api) with latency ~25ms or less.

Load Runner

Single node, JMeter 5 test script: 100 threads calling the REST service endpoint with a JSON payload, in a loop.

The resulting ingress throughput is ~9000 RPS, with ~2.5MB incoming and ~6MB outgoing traffic.

There is also separate node to run Prometheus and Grafana.

@aledbf (Member) commented Nov 23, 2018

@yvespp please update to 0.21.0

@ghost commented Feb 20, 2019

Hey folks,

Sorry to bother you on a closed issue, but I'm having this kind of trouble right now with 0.21.0 (didn't try 0.22.0 because of #3788).

I'll post some graphs that display 'RAM memory available' on the hosts.

Every once in a while we have a 'reload peak'. After every peak, the nginx process increases its size and it never goes back down. It obviously consumes massive memory because of unfinished connections, which would be fine if the nginx process didn't keep growing!

This is a sample from one of our servers:

[screenshot: available RAM graph of one server]

I have a VM dedicated to testing the Nginx Ingress Controller that receives no connections at all. It just stays up, syncing the nginx configuration all day long without receiving a single client connection.

Both machines have 40GB of RAM.

[screenshot: available RAM graph of the idle test VM]

(ignore the sudden peak at the end, it was a manual restart I applied).

Since the machine is not used by anyone, you'll see that there are no massive peaks of memory use; but the nginx processes keep getting larger and larger as well, which shows it has nothing to do with other possible kinds of load. I'd guess it's some leak related to Lua?

A freshly started nginx machine has its worker processes at around 600 MB of RAM each:

# ps aux | fgrep nginx
33       130821  2.4  1.3 1024336 565288 ?      Sl   18:50   2:27 nginx: worker process
33       130822  2.4  1.3 1024848 565432 ?      Sl   18:50   2:26 nginx: worker process

After a week, they are massively bloated (8+ GB):

33       128719  6.0 20.7 8980916 8525148 ?     Rl   18:09   8:26 nginx: worker process
33       128720  6.1 20.7 8982628 8526760 ?     Rl   18:09   8:34 nginx: worker process

At these levels, we run dangerously close to this happening:

Feb 20 13:06:33 spessrvvpre00016.estaleiro.serpro docker[2079]: 2019/02/20 16:06:29 [alert] 41#41: fork() failed while spawning "worker process" (12: Cannot allocate memory)
Feb 20 13:06:33 spessrvvpre00016.estaleiro.serpro docker[2079]: 2019/02/20 16:06:29 [alert] 41#41: fork() failed while spawning "worker process" (12: Cannot allocate memory)

When this happens, since we run outside Kubernetes, the ingress-controller units are kept alive, all processes are alive, the port is up, but it turns into a black hole (accepting everything but never answering). This situation is unrecoverable on its own; something has to restart the service (as a Kubernetes liveness check would do).

@aledbf (Member) commented Feb 20, 2019

@mrrandrade how many nginx workers are configured? (or running).

@aledbf (Member) commented Feb 20, 2019

@mrrandrade Now that #3788 is closed, please check again, and if the issue persists open a new issue.

@hmtai commented May 12, 2020

Marking this here because we also hit the same issue. When nginx reloads, memory usage spikes to about 56 GB.
@aledbf We use nginx-ingress-controller 0.23.4. Is this an nginx bug or an nginx-ingress-controller bug? Can you give me some suggestions about this?

@aledbf (Member) commented May 12, 2020

We use nginx-ingress-controller 0.23.4

Please update to 0.32.0. You are using a version released on Feb 27, 2019. There are a lot of fixes related to reloads and multiple NGINX updates.
