
uWSGI running in a Kind cluster on Fedora 33 uses over 8Gi of memory #2175

Closed · RedRoserade opened this issue Apr 2, 2021 · 12 comments
Labels: kind/bug

@RedRoserade

What happened:

I have a Docker image for a Python web app that runs with uWSGI. If I run it through docker run, everything works fine. However, running the same Docker image on a Kind cluster causes the pod, or more specifically uWSGI, to consume >8Gi of memory on boot, even with a minimal example.

The same image can run through docker run with --memory set to under 512M without issues.

This seems to affect only Fedora 33; the same image running on a Kind cluster on an Ubuntu 20.10 machine with the same Docker version (20.10.5 community) runs as expected.

After some debugging, it seems to affect only uWSGI running with --http, where the extra HTTP server process is what consumes the absurd amount of memory. If I run it with --http-socket instead, it runs fine, since no dedicated HTTP server process is launched, but that is not equivalent and is at most a workaround.
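For clarity, the two invocations look roughly like this (a sketch only; the port, app file, and callable names are placeholders matching the repro steps below):

```sh
# Spawns the dedicated "uWSGI http" router process; this is the one consuming the memory here.
uwsgi --http :8080 --wsgi-file app.py --callable app

# Workaround: the workers speak HTTP directly on the socket and no separate router
# process is started; typically used behind a proxy, so it is not equivalent.
uwsgi --http-socket :8080 --wsgi-file app.py --callable app
```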

When the pod is run with a memory limit set (512Mi), looking at dmesg -T shows the OOM killer being triggered on the HTTP server process (i.e., when running uwsgi --http).

I also tried running this on an OpenShift cluster, with no issues there. Destroying and recreating the Kind cluster did not help either.

What you expected to happen:

The pod should boot and consume a reasonable amount of memory regardless of operating system.

How to reproduce it (as minimally and precisely as possible):

I created a repository with instructions and dmesg logs, here: https://github.com/RedRoserade/kind-uwsgi-error-example

But here are some basic instructions for a manual test:

  • Create a pod spec for a python:3.8-buster image, and kubectl exec -it <pod> -- bash into it.
  • Install, through pip, uwsgi and flask.
  • Create a minimal Flask application; it only needs a single endpoint (see the sketch after this list).
  • Run it through uWSGI with uwsgi --http :8080 --callable <app-variable> --wsgi-file <your-app-file.py>.
  • Try to curl http://localhost:8080. On Fedora 33 curl never succeeds, and dmesg -T shows OOM logs.
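
A rough sketch of those steps in one place (the pod name, file name, and app variable are placeholders):

```sh
# Exec into a pod running the python:3.8-buster image (pod name is illustrative).
kubectl exec -it uwsgi-test -- bash

# Inside the container:
pip install uwsgi flask

# Minimal single-endpoint Flask app.
cat > app.py <<'EOF'
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    return "ok"
EOF

# Start uWSGI with its dedicated HTTP server process (the one that gets OOM-killed here).
uwsgi --http :8080 --wsgi-file app.py --callable app

# In a second `kubectl exec` shell into the same pod:
curl http://localhost:8080
```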

Anything else we need to know?:

Not that I'm aware of.

Environment:

  • kind version (use `kind version`): 0.10.0
  • Kubernetes version (use `kubectl version`): 1.20.5
  • Docker version (use `docker info`): 20.10.5
  • OS (e.g. from /etc/os-release): Fedora 33, kernel 5.11.10-200.fc33.x86_64
RedRoserade added the kind/bug label on Apr 2, 2021
@BenTheElder
Member

You probably have swap enabled, in which case kubelet would ordinarily refuse to run, but kind configures it to continue running. However, this comes at the cost of memory limits not working.

We can't fix this in kind. There is, however, an upstream KEP to allow running with swap enabled.

@BenTheElder
Member

See also #1963 for a semi-related issue that is non-trivial to resolve. We don't have the cooperation of the necessary downstream components, so nodes cannot be properly restricted at the node level either (versus, e.g., a VM-based solution).

@BenTheElder
Member

kubernetes/kubernetes#53533 for the upstream issue regarding swap.

IIRC, if you disable swap, memory limits will work. We don't do this automatically, given that it's a global system option with tradeoffs.
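
In case it helps anyone checking this, a minimal sketch of verifying and temporarily disabling swap (host-level commands; how swap comes back at boot depends on the distro, e.g. an fstab entry or Fedora's swap-on-zram):

```sh
# Check whether any swap devices are currently active.
swapon --show

# Temporarily disable all swap; reverts on reboot or with `swapon -a`.
sudo swapoff -a
```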

@RedRoserade
Author

I just tried re-running the test case on a fresh kind cluster after swapoff -a, and it produces the same issue:

[screenshot omitted: dmesg output showing the uwsgi process being OOM-killed]

The zombie uwsgi process tried to overallocate and it got killed.

Note that I later tried running this exact image on a k3s-based single-node cluster on the same machine, and the uwsgi process did not allocate 8Gi of RAM there either (see my attached repo).

@BenTheElder
Member

k3s also brings in a different containerd and kubernetes version (and a forked one at that...), but that's good to know.

I appreciate the detailed repro repo, thanks! But I don't have quick access to a Fedora machine at the moment to actually reproduce this on my end.

I'm going to be out for the next week and unfortunately have pretty limited time before then. It's very curious that it works on Ubuntu 20.10 but not Fedora. cc @aojea

@aojea
Contributor

aojea commented Apr 2, 2021

This sounds like #760.
My Fedora desktop has a high value for:

sudo sysctl fs.nr_open
fs.nr_open = 1073741816

@RedRoserade
Author

> This sounds like #760.
> My Fedora desktop has a high value for:
>
> sudo sysctl fs.nr_open
> fs.nr_open = 1073741816

Indeed, setting it to 1048576 via sudo sysctl -w fs.nr_open=1048576 fixes the issue. That is the same value as in my Ubuntu installation.

For now I'll set it to that on my system.
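
A sketch of applying it persistently, assuming the usual sysctl drop-in mechanism (the drop-in file name is arbitrary):

```sh
# Apply immediately.
sudo sysctl -w fs.nr_open=1048576

# Persist across reboots.
echo "fs.nr_open = 1048576" | sudo tee /etc/sysctl.d/99-nr-open.conf
sudo sysctl --system
```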

From what I understand, the 1048576 limit was reverted here: #1799?

@aojea
Contributor

aojea commented Apr 2, 2021

Short history: most (I would say all) of these issues are application bugs, where the app allocates memory based on the number of file descriptors; see #760 (comment).

On the other hand, if we hardcode a kernel value, we don't know if other apps will break; maybe someone will need a high number of file descriptors and we would be capping them...

It would be interesting to know why Fedora goes with this high number, too...

@RedRoserade
Author

I did a bit more digging and found something, though I don't know if it's related. On both my Fedora and Ubuntu systems, ulimit -n is set to 1024 per user, but in the containers and pods it is set to whatever sysctl defines. This explains why running uwsgi outside of any container doesn't consume >8Gi of memory.

I've also found that the containerd.service unit sets LimitNOFILE=1048576, which is why docker run was not running into issues on Fedora (the docker.service has it set to infinity, but Docker now uses containerd, correct?), despite the default high limit on Fedora:

systemctl cat containerd.service
...
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
LimitNOFILE=1048576
...

A podman run seems to have this limit set at 524288 as per the uwsgi logs, but I didn't dig to see exactly where.
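
For reference, a sketch of how the limit the process actually sees can be compared across these environments (image and pod names are the illustrative ones from the repro above):

```sh
# Kernel-wide ceiling on the Fedora host.
sysctl fs.nr_open

# Limit inside a plain `docker run` container (inherits docker/containerd's LimitNOFILE).
docker run --rm python:3.8-buster bash -c 'ulimit -n'

# Limit inside a pod on the kind cluster (pod name is a placeholder).
kubectl exec uwsgi-test -- bash -c 'ulimit -n'
```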

I have filed an issue on the uwsgi project regarding this: unbit/uwsgi#2299, as I think that it is really an application problem and not necessarily kind's.

@aojea
Contributor

aojea commented Apr 13, 2021

Should we close this then, @RedRoserade?

@RedRoserade
Author

Yes, I think this can be closed. Thank you for helping me debug this!

@aojea
Contributor

aojea commented Apr 13, 2021

You are welcome.

@aojea aojea closed this as completed Apr 13, 2021
tiborsimko added a commit to tiborsimko/reana that referenced this issue Mar 7, 2023
Fixes memory consumption of the "uWSGI http 1" process that was rising
above 8 GiB on systems like Fedora 37 (locally) and Fedora CoreOS 36 (in
the cloud) due to very high file descriptor limits (`fs.nr_open =
1073741816`). See <kubernetes-sigs/kind#2175>
and <unbit/uwsgi#2299>.

Sets the uWSGI `max-fd` value to 1048576 as per
<https://github.com/kubernetes-sigs/kind/pull/1799/files>. If need be,
we can make it configurable via Helm chart values later.
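
For anyone hitting this outside of that repository, the workaround amounts to capping uWSGI's own file descriptor limit. A sketch reusing the minimal app from the repro above (the 1048576 value is taken from the commit message; max-fd is passed here as the equivalent command-line flag):

```sh
# Same minimal app as in the repro, but capping the file descriptor limit
# that uWSGI sizes its per-connection structures from.
uwsgi --http :8080 --wsgi-file app.py --callable app --max-fd 1048576
```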