This repository has been archived by the owner on Nov 27, 2023. It is now read-only.

dnsmasq process in wrong cgroup #21

Open
AlbanBedel opened this issue May 19, 2020 · 5 comments

Comments

@AlbanBedel

When a pod/container is started directly with the podman command, the dnsmasq process created by dnsname ends up in the calling user's cgroup:

$ sudo podman start hello
$ cat /proc/$(sudo cat /run/containers/cni/dnsname/podman/pidfile)/cgroup
12:pids:/user.slice/user-1000.slice/user@1000.service
11:memory:/user.slice/user-1000.slice/user@1000.service
...

That's probably not ideal as the dnsmasq process is then bound to the user slice.

But when the systemd service files generated by podman are used, the dnsmasq process ends up in the service's cgroup:

$ cd /run/systemd/system
$ sudo podman generate systemd --files --name hello
$ sudo systemctl start container-hello.service
$ cat /proc/$(sudo cat /run/containers/cni/dnsname/podman/pidfile)/cgroup
12:pids:/system.slice/container-hello.service
11:memory:/system.slice/container-hello.service
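
The two checks above can be wrapped in a small helper. This is only a sketch for reading `/proc/<pid>/cgroup` output in the `ID:controllers:path` form shown in the listings; the function names `cgroup_path` and `cgroup_kind` are made up for illustration and are not part of podman or dnsname.

```shell
#!/bin/sh
# Print the cgroup path of one controller, given /proc/<pid>/cgroup-style
# input on stdin. Each line has the form "hierarchy-ID:controller-list:path".
# (Hypothetical helper, not part of podman or dnsname.)
cgroup_path() {
    controller=$1
    awk -F: -v c="$controller" '{
        n = split($2, names, ",")
        for (i = 1; i <= n; i++)
            if (names[i] == c) {
                # Drop the first two fields and keep the whole path.
                sub(/^[^:]*:[^:]*:/, "")
                print
                exit
            }
    }'
}

# Classify a cgroup path by the top-level slice it lives under.
cgroup_kind() {
    case $1 in
        /user.slice/*)   echo user ;;
        /system.slice/*) echo system ;;
        *)               echo other ;;
    esac
}
```

For example, `cgroup_kind "$(cgroup_path pids < /proc/$(sudo cat /run/containers/cni/dnsname/podman/pidfile)/cgroup)"` would report `system` for the second listing above.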

This is problematic as the dnsmasq process and the container have totally different life cycles. I noticed this problem while trying to start containers using transient units. Transient units are normally removed automatically when they are stopped, but if the dnsmasq process is still running because another container needs it, it prevents the transient unit it was started in from being destroyed.
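
To see which processes are keeping a stopped unit's cgroup alive, one option is to read the cgroup's `cgroup.procs` file. The following is a minimal sketch; `list_cgroup_procs` is a hypothetical helper name, and the directory to pass in depends on the cgroup layout (on a cgroup v1 system like the one in the listings it would presumably be something like `/sys/fs/cgroup/pids/system.slice/container-hello.service`, which is an assumption rather than something confirmed in this report).

```shell
#!/bin/sh
# List the processes still attached to a cgroup directory,
# one "PID COMMAND" pair per line.
# (Hypothetical helper, not part of podman or dnsname.)
list_cgroup_procs() {
    cgroup_dir=$1
    while read -r pid; do
        # /proc/<pid>/comm holds the short command name; fall back to "?"
        # if the process has exited in the meantime.
        comm=$(cat "/proc/$pid/comm" 2>/dev/null) || comm='?'
        printf '%s %s\n' "$pid" "$comm"
    done < "$cgroup_dir/cgroup.procs"
}
```

If a line naming dnsmasq shows up for a unit whose container has already stopped, that is the situation described above.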

I can probably work around this problem in some way for my setup, but I think the dnsmasq process, or any other long-running process related to a cni network, should be in a cgroup whose life cycle matches the cni network's life cycle.

@baude
Member

baude commented May 19, 2020

@mheon WDYT?

@mheon
Member

mheon commented May 19, 2020

We could try moving it into the container's cgroup, but that's problematic because dnsmasq should outlive any single container as long as another container in the network is started and running.

Best way is likely to make a scope exclusively for dnsmasq (podman-dnsmasq-$NETWORK.service maybe?) under Libpod's default cgroup parent. We have code to do this for cgroupfs and systemd in Podman (we use it for making pod cgroups, but it could easily be repurposed for this).
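
As a rough illustration of the idea (not how dnsname or Podman implement it), a per-network unit could be approximated by hand with `systemd-run`. The sketch below only builds the command line and does not run it. The helper name `dnsmasq_unit_cmd`, the `machine.slice` default parent, and the `dnsmasq.conf` path next to the pidfile are all assumptions made for this example.

```shell
#!/bin/sh
# Build (but do not run) a systemd-run command that would start dnsmasq
# for one CNI network in its own transient service.
# (Hypothetical helper; slice and config path are assumptions.)
dnsmasq_unit_cmd() {
    network=$1
    slice=${2:-machine.slice}                               # assumed cgroup parent
    conf=/run/containers/cni/dnsname/$network/dnsmasq.conf  # assumed config path
    printf 'systemd-run --unit=podman-dnsmasq-%s.service --slice=%s dnsmasq --keep-in-foreground --conf-file=%s\n' \
        "$network" "$slice" "$conf"
}
```

Printing the command instead of executing it keeps the sketch safe to try; running the printed line as root would place dnsmasq in a unit whose life cycle is independent of any single container.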

@carbolymer

Is there any workaround for this? This makes using more than one network in podman impossible.

@AlbanBedel
Author

@mheon and what about #12? Each pod can have a unique combination of networks attached; to support that we would probably need a dnsmasq process per pod anyway. It's a larger change, but that would solve both bugs at once.

On the other hand, it would make sense to have a generic solution to handle the case where a cni plugin starts a process that should outlive the pod it was started for.

@mheon
Member

mheon commented Apr 15, 2021

@AlbanBedel We're presently discussing an extensive rearchitecture/rewrite of dnsname to resolve that, which should also resolve this. I'm just waiting for the OK to go ahead and get started on it.
