From 85276077eba9473a5de7522c6275e494d2dbc654 Mon Sep 17 00:00:00 2001
From: Kir Kolyshkin
Date: Mon, 27 Sep 2021 14:34:45 -0700
Subject: [PATCH] config-linux: MAY reject an unfit cgroup

It makes sense for a runtime to reject a cgroup which is frozen
(for both new and existing containers), otherwise the runtime
command (create/run/exec) may end up being stuck.

It makes sense for a runtime to make sure the cgroup for a new container
is empty (i.e. there are no processes in it), and reject it otherwise.
The scenario in which a non-empty cgroup is used for a new container
has multiple problems, for example:

* If two or more containers share the same cgroup, and each container
  has its own limits configured, the order of container starts
  ultimately determines whose limits will be effectively applied.

* If two or more containers share the same cgroup, and one of the
  containers is paused/unpaused, all others are paused, too.

* If cgroup.kill is used to forcefully kill the container, it will also
  kill other processes that are not part of this container but merely
  belong to the same cgroup.

* When a systemd cgroup manager is used, this becomes even worse. For
  example, a stop (or even a failed start) of any container results in
  a stopTransientUnit command being sent to systemd, and so (depending
  on unit properties) other containers can receive SIGTERM, be killed
  after a timeout etc.

* Many other bad scenarios are possible, as the implicit assumption
  of 1:1 container:cgroup mapping is broken.

https://github.com/opencontainers/runc/issues/3132
https://github.com/containers/crun/issues/716

Signed-off-by: Kir Kolyshkin
---
 config-linux.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/config-linux.md b/config-linux.md
index b4f8b7c67..c427748a0 100644
--- a/config-linux.md
+++ b/config-linux.md
@@ -171,6 +171,14 @@ Also known as cgroups, they are used to restrict resource usage for a container
 cgroups provide controls (through controllers) to restrict cpu, memory, IO, pids, network and RDMA resources for the container.
 For more information, see the [kernel cgroups documentation][cgroup-v1].
 
+A runtime MAY refuse to create or start a new container, or a process inside an
+existing container, if its cgroup (the one which the container process is to be
+put in) is considered not fit for purpose. Examples include an existing frozen
+or (for a new container) non-empty cgroup. The reason for this is that
+accepting such configurations could cause container operation outcomes that
+users may not anticipate or understand, such as operation on one container
+inadvertently affecting other containers.
+
 ### Cgroups Path
 
 **`cgroupsPath`** (string, OPTIONAL) path to the cgroups.
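
For illustration only, the check that the patch permits could look roughly like the sketch below. It is not taken from runc or crun. It assumes a cgroup v2 directory layout in which `cgroup.freeze` contains `1` when a freeze has been requested and `cgroup.procs` lists one PID per line. The names `read_lines` and `cgroup_unfit_reason` are hypothetical helpers introduced here.

```python
import os
import tempfile


def read_lines(path):
    """Return the non-empty lines of a control file, or [] if the file is missing."""
    try:
        with open(path) as f:
            return [line.strip() for line in f if line.strip()]
    except FileNotFoundError:
        return []


def cgroup_unfit_reason(cgroup_dir, new_container):
    """Return a short reason if the cgroup looks unfit, otherwise None.

    Assumption: cgroup v2 semantics for cgroup.freeze and cgroup.procs.
    A frozen cgroup is rejected for both new and existing containers;
    a non-empty cgroup is rejected only when a new container is created.
    """
    if read_lines(os.path.join(cgroup_dir, "cgroup.freeze")) == ["1"]:
        return "cgroup is frozen"
    if new_container and read_lines(os.path.join(cgroup_dir, "cgroup.procs")):
        return "cgroup is not empty"
    return None


if __name__ == "__main__":
    # Simulate a cgroup directory with plain files instead of touching /sys/fs/cgroup.
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "cgroup.procs"), "w") as f:
            f.write("1234\n")
        print("create:", cgroup_unfit_reason(d, new_container=True))
        print("exec:  ", cgroup_unfit_reason(d, new_container=False))
```

Because the patch uses MAY, a runtime could also choose to skip such a check or to only warn; the sketch shows the rejecting variant.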