containerd-shim processes are leaking inotify instances with cgroups v2 #563

Closed
george-angel opened this issue Nov 29, 2021 · 18 comments
Labels
area/cgroup2 Issues uncovered through the migration to cgroup2. kind/bug Something isn't working

Comments

@george-angel

This is a duplicate of containerd/containerd#5670

But I wanted to raise an issue with Flatcar anyway:

  1. For visibility to other FC users who might run into it
  2. Perhaps other users don't see this issue or have found a workaround

Since 2983.2.0 defaults to cgroupsv2, we saw this issue frequently enough that we had to roll back.

A client application process might log something like this:

failed to create fsnotify watcher: too many open files

You can lessen the issue by raising the default limit, e.g. fs.inotify.max_user_instances=8192, but sooner or later nodes still run out.
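A minimal sketch of how the limit mentioned above could be inspected and raised. The sysctl name is the one from this report; the helper names and the drop-in file name `99-inotify.conf` are illustrative assumptions. This only buys time while the leak itself is still present.

```shell
# Print the current per-user limit on inotify instances.
# The optional path argument exists only to make the helper easy to test.
current_inotify_limit() {
  cat "${1:-/proc/sys/fs/inotify/max_user_instances}"
}

# Emit a sysctl.d drop-in line for the requested limit.
inotify_dropin() {
  printf 'fs.inotify.max_user_instances = %s\n' "$1"
}

# Raise the limit on the running system (needs root):
#   sudo sysctl -w fs.inotify.max_user_instances=8192
#
# Persist it across reboots (the file name is an assumption):
#   inotify_dropin 8192 | sudo tee /etc/sysctl.d/99-inotify.conf
#   sudo sysctl --system
```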

@george-angel george-angel added the kind/bug Something isn't working label Nov 29, 2021
@pothos
Member

pothos commented Nov 29, 2021

Hi,
thanks for raising this here. My first question regarding a workaround is whether the shim is really required, because I read a comment saying it is only needed for live restore. It would be good to try setting no_shim = true, following https://www.flatcar-linux.org/docs/latest/container-runtimes/customizing-docker/#use-a-custom-containerd-configuration

@george-angel
Author

george-angel commented Nov 29, 2021

I'm going to try this now, but this is interesting: https://github.com/flatcar-linux/coreos-overlay/blob/main/app-emulation/containerd/files/config.toml#L26-L28

The comment suggests not running with a shim, but the setting defaults to false.

@pothos
Member

pothos commented Nov 29, 2021

no_shim = false means it uses the shim ;) - but yes, the comment above is confusing.

I think you would change it to true under [plugins."containerd.runtime.v1.linux"], but it may be worth checking which sections your config dump output actually contains (I don't have a PhD in containerd config.toml-ology, so don't trust what I say).
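Since the exact section name is uncertain, one way to check is to search the effective configuration for the key itself. The sketch below reads a TOML config on stdin and prints every section header under which a no_shim key appears; the helper name `find_no_shim` is hypothetical, and the parsing is a simple line-based approximation rather than a real TOML parser.

```shell
# Print "<section header>: <no_shim line>" for each no_shim key found.
# Reads a containerd config (TOML) from stdin; commented-out keys are ignored.
find_no_shim() {
  awk '
    # Remember the most recent section header, with indentation stripped.
    /^[[:space:]]*\[/ { section = $0; sub(/^[[:space:]]+/, "", section) }
    # Report the key together with the section it belongs to.
    /^[[:space:]]*no_shim[[:space:]]*=/ {
      line = $0; sub(/^[[:space:]]+/, "", line)
      print section ": " line
    }
  '
}

# Usage on a node (assumes the containerd binary is on PATH):
#   containerd config dump | find_no_shim
```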

@pothos
Member

pothos commented Nov 29, 2021

I guess the comment wording (also in https://github.com/containerd/containerd/blob/main/docs/ops.md#linux-runtime-plugin) was done that way to match the config name and what it does when enabled, not the value false that is set…

@george-angel
Author

george-angel commented Nov 29, 2021

I'm going to try removing the shim and report back 👍

@jbehling

jbehling commented Dec 1, 2021

I'm going to try removing the shim and report back 👍

Any update on this? We are experiencing the same issue and are curious to know whether removing the shim is a viable option.

@george-angel
Author

Sorry - not yet.

I'm deploying a count metric for inotify fds on 2 clusters and no_shim on 1 cluster right now.

I will report tomorrow if I can see any difference.
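A rough sketch of what such a count could look like, assuming GNU find and a Linux /proc. It counts file descriptors that point at an inotify instance; descriptors inherited across fork are counted once per process, so the number can overstate the true instance count. The function name is illustrative.

```shell
# Count open inotify file descriptors visible under /proc.
# Without root only the caller's own processes are readable, so run the
# script as root to get a node-wide number.
count_inotify_fds() {
  { find /proc/[0-9]*/fd -lname 'anon_inode:inotify' 2>/dev/null || true; } | wc -l
}

count_inotify_fds
```

Sampling this value periodically on a node should show whether it keeps growing while the number of running containers stays flat.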

@jepio
Member

jepio commented Dec 1, 2021

Thanks for the upstream bug reference. This is very easily reproducible: start a pod with /bin/false as the command under k8s, and every CrashLoop leaks an inotify instance and a goroutine blocked in inotify_read. I'm testing a fix and will submit an upstream bugfix once I have validated it.
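The reproduction described above could be set up roughly as follows. This is a sketch, not the exact test that was used: the pod name, container name, and the busybox image are assumptions, chosen only because that image ships /bin/false.

```shell
# Print a manifest for a pod whose only container exits immediately with a
# non-zero status, so the kubelet keeps restarting it (CrashLoopBackOff).
# Pod name, container name, and image are illustrative assumptions.
crashloop_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: inotify-leak-repro
spec:
  restartPolicy: Always
  containers:
  - name: fail
    image: busybox
    command: ["/bin/false"]
EOF
}

# On a test cluster:
#   crashloop_manifest | kubectl apply -f -
#   kubectl get pod inotify-leak-repro -w   # watch the restart count climb
```

Comparing the node's inotify descriptor count before and after a number of restarts should show whether an affected containerd version is leaking.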

@jbehling

jbehling commented Dec 3, 2021

@jepio Any update on the timeline for a bugfix?

@jepio
Member

jepio commented Dec 6, 2021

The upstream PRs have been submitted. I'm waiting for reviews, then merge and release, and then we'll pick it into Flatcar. I don't know how long that might take overall.

@jbehling

jbehling commented Dec 6, 2021

Thanks, can you link to the upstream PRs?

@jepio
Member

jepio commented Dec 8, 2021

containerd/cgroups#212 is the initial one. After this, the changes will need to be vendored into containerd/containerd (second PR).

@jepio jepio added the area/cgroup2 Issues uncovered through the migration to cgroup2. label Jan 6, 2022
@jepio
Member

jepio commented Feb 17, 2022

The inotify leak fix has been merged and is part of containerd 1.6.0. This will be a part of the next alpha release (flatcar-archive/coreos-overlay#1650).
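To check whether a node already runs a containerd at or above 1.6.0, something like the sketch below might help. It assumes GNU sort (for -V) and that `containerd --version` prints the version as its third whitespace-separated field, optionally prefixed with "v"; both helper names are hypothetical, and pre-release suffixes are not handled.

```shell
# Succeed if version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Extract the version from `containerd --version` output read on stdin,
# assuming it is the third field, with an optional leading "v" removed.
containerd_version() {
  awk '{ sub(/^v/, "", $3); print $3; exit }'
}

# Usage on a node:
#   v=$(containerd --version | containerd_version)
#   if version_ge "$v" 1.6.0; then echo "at or above 1.6.0"; else echo "older than 1.6.0"; fi
```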

@kmmanto

kmmanto commented Sep 23, 2022

We're still experiencing this using containerd 1.6.6

containerd --version
containerd github.com/containerd/containerd 1.6.6 d0d56c1a4ace8bae8c7c98d28ba98f0537ebe704
Client:
 Context:    default
 Debug Mode: false

Server:
 Containers: 469
  Running: 74
  Paused: 0
  Stopped: 395
 Images: 36
 Server Version: 20.10.14
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: false
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: d0d56c1a4ace8bae8c7c98d28ba98f0537ebe704
 runc version: 886750b989c082700828ec1d3bbb1b397219bfac
 init version: 
 Security Options:
  seccomp
   Profile: default
  selinux
  cgroupns
 Kernel Version: 5.15.63-flatcar
 Operating System: Flatcar Container Linux by Kinvolk 3227.2.2 (Oklo)
 OSType: linux
 Architecture: x86_64
 CPUs: 32
 Total Memory: 125.8GiB
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

@jepio
Member

jepio commented Sep 23, 2022

@kmmanto can you provide more details to back that up? One thing to note is that with cgroupsv2 you will require at least 1 inotify instance per container, and 2+ in the case of a Kubernetes pod. So together with systemd's internal inotify usage, the default fs.inotify.max_user_instances limit of 128 may need to be increased.
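To judge whether the limit is simply too low for the workload, it can help to compare per-user usage with the configured limit. The following is a sketch under a few assumptions: GNU find and stat are available, the owner of /proc/PID is treated as the user the instance is accounted to, and descriptors inherited across fork are counted once per process, so the numbers are an upper-bound approximation. The function name is illustrative.

```shell
# For each user, print "<user> <inotify fds in use>/<per-user limit>".
# Run the script as root to see all processes on the node.
inotify_usage_by_user() {
  limit=$(cat /proc/sys/fs/inotify/max_user_instances 2>/dev/null || echo unknown)
  { find /proc/[0-9]*/fd -lname 'anon_inode:inotify' 2>/dev/null || true; } |
    cut -d/ -f3 |
    while read -r pid; do
      # Map each PID to the user owning its /proc entry; skip exited PIDs.
      stat -c '%U' "/proc/$pid" 2>/dev/null || true
    done |
    sort | uniq -c | sort -nr |
    while read -r count user; do
      printf '%s %s/%s\n' "$user" "$count" "$limit"
    done
}

inotify_usage_by_user
```

If the count for root sits close to the limit while the number of containers is stable, raising the limit is probably enough; if it keeps climbing, that points at a leak instead.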

@kmmanto

kmmanto commented Sep 30, 2022

@jepio This is one of the logs of a pod running on a Flatcar node in OpenStack. Running kubectl logs -f <pod_name> prints this and then exits.

I, [2022-09-30T11:16:52.004621 #1]  INFO -- : Finished   'health_check.alive'
I, [2022-09-30T11:17:52.002410 #1]  INFO -- : Triggering 'health_check.alive'
I, [2022-09-30T11:17:52.002940 #1]  INFO -- : Finished 'health_check.alive' duration_ms=0 error=nil
I, [2022-09-30T11:17:52.003031 #1]  INFO -- : Finished   'health_check.alive'
failed to create fsnotify watcher: too many open files

I increased fs.inotify.max_user_instances to 8192 as suggested by the OP and will monitor whether the issue comes back.

@jepio
Member

jepio commented Sep 30, 2022

When you hit this, try running this command and paste the output here:

sudo find /proc/*/fd -lname anon_inode:inotify | cut -d/ -f3 | xargs -I '{}' -- ps --no-headers -o '%p %U %c %a %P' -p '{}' | uniq -c | sort -nr

@tormath1
Contributor

tormath1 commented Sep 8, 2023

As the main issue seems to have been fixed since containerd 1.6.0, and the current version of containerd on stable is 1.6.16, I'm going ahead and closing this issue.

Do not hesitate to reopen this issue or create a new one if you have an issue with containerd.

6 participants