--cgroup-parent has no effect #10173

Closed
yaqwsx opened this issue Apr 29, 2021 · 20 comments · Fixed by #10177
Labels
kind/bug: Categorizes issue or PR as related to a bug.
locked - please file new issue/PR: Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

yaqwsx commented Apr 29, 2021

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

Podman creates the container's cgroup under the cgroup of the parent process instead of under the one specified via --cgroup-parent.

Steps to reproduce the issue:

Follow this script:

#!/usr/bin/env bash

set -e

# Create a systemd delegated scope
busctl call --user \
    org.freedesktop.systemd1 /org/freedesktop/systemd1 \
    org.freedesktop.systemd1.Manager StartTransientUnit \
    'ssa(sv)a(sa(sv))' TEST.scope \
    fail 4 PIDs au 1 $$ \
    Delegate b 1 MemoryAccounting b 1 CPUAccounting b 1 0
CGROUP=$(cat /proc/$$/cgroup | cut -d":" -f3)
echo "Base cgroup: $CGROUP"
FSBASE=/sys/fs/cgroup$CGROUP

# Create manager subgroup
mkdir $FSBASE/manager
echo $$ > $FSBASE/manager/cgroup.procs

# Enable controllers
echo "+cpu +memory" > $FSBASE/cgroup.subtree_control

# Create group for container
mkdir $FSBASE/container
echo "+cpu +memory" > $FSBASE/container/cgroup.subtree_control

# Create a container
CONNAME=$(podman container create \
            --cgroup-parent $CGROUP/container --cgroup-manager cgroupfs \
            ubuntu:20.04 /bin/bash -c 'echo XXX; sleep 5')
echo "Starting a container:"
podman container start $CONNAME

# Get the process from container
PID=$(ps ax | grep "echo XXX" | head -n1 | cut -d" " -f2)
echo "We get the cgroup of the parent process instead of the specified one via --cgroup-parent:"
cat /proc/$PID/cgroup
echo "But the configuration is correct:"
podman inspect $CONNAME | grep "CgroupParent"
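
Parsing `ps` output to find the container process depends on column widths and on which process matches first. A minimal sketch of an alternative, assuming the container is still running and the host uses the unified cgroup v2 hierarchy: ask Podman for the container's PID with `podman inspect --format '{{.State.Pid}}'` and read that process's `/proc/<pid>/cgroup` entry. The function names `cgroup_v2_path` and `container_cgroup` are made up for this example.

```shell
# Print the cgroup v2 path recorded in a /proc/<pid>/cgroup-style file.
# On a unified (v2) hierarchy the entry has the form "0::<path>".
cgroup_v2_path() {
    # Strip the "0::" prefix and print only the lines that had it.
    sed -n 's/^0:://p' "$1"
}

# Hypothetical helper: look up the container's init PID through
# podman inspect and print that process's cgroup path.
container_cgroup() {
    local name="$1" pid
    pid=$(podman inspect --format '{{.State.Pid}}' "$name") || return 1
    cgroup_v2_path "/proc/$pid/cgroup"
}
```

In the script above this would be called as `container_cgroup "$CONNAME"` while the container's `sleep 5` is still running.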

Describe the results you received:

Podman creates the container's cgroup under the cgroup of the parent process instead of under the one specified via --cgroup-parent.

Describe the results you expected:

Podman creates the container's cgroup under the cgroup specified via --cgroup-parent.

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

Version:      3.1.2
API Version:  3.1.2
Go Version:   go1.15.2
Built:        Thu Jan  1 01:00:00 1970
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.20.1
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: 'conmon: /usr/libexec/podman/conmon'
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.27, commit: '
  cpus: 8
  distribution:
    distribution: ubuntu
    version: "20.04"
  eventLogger: journald
  hostname: zoidber
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.8.0-50-generic
  linkmode: dynamic
  memFree: 233103360
  memTotal: 16541982720
  ociRuntime:
    name: crun
    package: 'crun: /usr/bin/crun'
    path: /usr/bin/crun
    version: |-
      crun version 0.19.1.3-9b83-dirty
      commit: 33851ada2cc9bf3945915565bf3c2df97facb92c
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    selinuxEnabled: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: 'slirp4netns: /usr/bin/slirp4netns'
    version: |-
      slirp4netns version 1.1.8
      commit: unknown
      libslirp: 4.3.1-git
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.4.3
  swapFree: 16524886016
  swapTotal: 17179865088
  uptime: 39h 49m 9.36s (Approximately 1.62 days)
registries:
  search:
  - docker.io
  - quay.io
store:
  configFile: /home/xmrazek7/.config/containers/storage.conf
  containerStore:
    number: 14
    paused: 0
    running: 3
    stopped: 11
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: 'fuse-overlayfs: /usr/bin/fuse-overlayfs'
      Version: |-
        fusermount3 version: 3.9.0
        fuse-overlayfs: version 1.4
        FUSE library version 3.9.0
        using FUSE kernel interface version 7.31
  graphRoot: /home/xmrazek7/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 52
  runRoot: /run/user/1000/containers
  volumePath: /home/xmrazek7/.local/share/containers/storage/volumes
version:
  APIVersion: 3.1.2
  Built: 0
  BuiltTime: Thu Jan  1 01:00:00 1970
  GitCommit: ""
  GoVersion: go1.15.2
  OsArch: linux/amd64
  Version: 3.1.2

Package info (e.g. output of rpm -q podman or apt list podman):

podman/unknown,now 100:3.1.2-1 amd64 [installed]
podman/unknown 100:3.1.2-1 arm64
podman/unknown 100:3.1.2-1 armhf
podman/unknown 100:3.1.2-1 s390x

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/master/troubleshooting.md)

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

Ubuntu 20.04 on a desktop

openshift-ci-robot added the kind/bug label Apr 29, 2021

mheon (Member) commented Apr 29, 2021

You probably need to use --cgroup-manager=cgroupfs with the container so we use the cgroup you created, instead of creating a sub-cgroup via systemd. However, I would expect an error on invalid cgroup, so we may just be ignoring it completely somewhere.

yaqwsx (Author) commented Apr 29, 2021

I already invoke Podman with --cgroup-manager cgroupfs (see the script reproducing the behavior). If I don't specify it, I get an error that a systemd slice was expected. Or do you mean that I should specify it somewhere else?

mheon (Member) commented Apr 29, 2021

Are you running the script as root, or a non-root user?

yaqwsx (Author) commented Apr 29, 2021

Non-root - that is why I acquire the scope via systemd. My goal is to collect accounting information when the container exits, which is why I would like it to run in a separate cgroup.
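
For the accounting use case, one option is to read counters from a cgroup directory that outlives the container. A minimal sketch, assuming cgroup v2, where each cgroup directory exposes a `cpu.stat` file containing a `usage_usec` line; `cgroup_cpu_usage_usec` is a hypothetical helper name, not part of Podman or systemd.

```shell
# Print the total CPU time in microseconds recorded for a cgroup.
# $1 is the cgroup directory, e.g. a path below /sys/fs/cgroup.
cgroup_cpu_usage_usec() {
    # cpu.stat holds "key value" pairs, one per line.
    awk '$1 == "usage_usec" { print $2 }' "$1/cpu.stat"
}
```

With the reproduction script's variables this could be called as `cgroup_cpu_usage_usec "$FSBASE/container"` after the container has finished. As far as the cgroup v2 documentation describes, these counters are hierarchical, so the parent's figure should include time spent in child cgroups, but that is worth verifying on the target kernel.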

mheon (Member) commented Apr 29, 2021

Out of curiosity: can you try the runc OCI runtime with --runtime runc? I wonder if this is a crun bug, not a Podman one.

yaqwsx (Author) commented Apr 29, 2021

Could you be more specific? I am rather new to how containers are implemented, so I am probably missing the relation between podman and runc. Where should I specify --runtime runc? And what should I try to execute?

mheon (Member) commented Apr 29, 2021

Add the --runtime runc argument to your podman container create command. This switches the OCI runtime implementation - Podman doesn't actually make the cgroup itself, we simply pass a configuration including the requested cgroup down to an OCI runtime. The default (which, per podman info, you seem to be using) is crun, but there is also runc (the similarity between the names is unfortunate). I'm thinking that we're seeing a bug in crun where it's ignoring the cgroup config Podman passes to it.

yaqwsx (Author) commented Apr 29, 2021

When I issue the podman commands with:

  • --runtime runc, podman creates the cgroup TEST.scope/<containerId> and the container runs there. The <containerId> cgroup disappears when podman exits, so I have no way of collecting the accounting information.
  • --runtime crun, podman uses the cgroup of the parent process, TEST.scope/manager.

I expected the container to run under TEST.scope/container/.

mheon (Member) commented Apr 29, 2021

The crun bit does sound like a bug, then. @giuseppe PTAL

yaqwsx (Author) commented Apr 29, 2021

And is the runc case expected behavior? It runs the container in a different cgroup than the one specified in --cgroup-parent. If so, what should I pass to podman to run in the desired cgroup?

mheon (Member) commented Apr 29, 2021

The --cgroup-parent specifies a parent cgroup, under which we will create our own cgroup - the OCI runtime wants to configure the cgroup in a specific way, and also does not want to modify system cgroup configuration, so it creates a child cgroup which it can manage itself.
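
The relationship described here can be checked mechanically: the cgroup path of the container process should be the `--cgroup-parent` path or a descendant of it. A small sketch of such a check, working on path strings only; `cgroup_is_under` is a name invented for this example.

```shell
# Succeed if cgroup path $1 equals, or is nested under, parent path $2.
cgroup_is_under() {
    # Drop a trailing slash so "/a/b" and "/a/b/" compare the same way.
    local path="${1%/}" parent="${2%/}"
    # Appending "/" prevents "/a/container2" from matching parent "/a/container".
    case "$path/" in
        "$parent"/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For the reproduction script, the first argument would be the path read from `/proc/$PID/cgroup` and the second `"$CGROUP/container"`.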

yaqwsx (Author) commented Apr 29, 2021

Yes, but it creates it outside the specified --cgroup-parent. I would expect the OCI runtime to create a cgroup, but inside the specified --cgroup-parent. Or am I missing something?

mheon (Member) commented Apr 29, 2021

Ah, yes - that definitely does sound like a bug.

yaqwsx (Author) commented Apr 29, 2021

No matter how deep I make the hierarchy of cgroups, runc always places its cgroup into the first parent cgroup that is managed by systemd (i.e., TEST.scope in my test case above). Is it possible that I am missing something obvious in the configuration of the cgroups?

mheon (Member) commented Apr 29, 2021

Looking at the code further: it looks like we're not actually doing anything with cgroups (just setting the path to "") in the rootless Podman + cgroupfs case. @giuseppe Did we do this for a specific reason? And if so, should cgroupfs + --cgroup-parent cause an error when rootless?

yaqwsx (Author) commented Apr 29, 2021

Is there a reason why --cgroup-parent should not work on rootless containers? Especially with cgroups v2?

giuseppe (Member) commented

> Looking at the code further: it looks like we're not actually doing anything with cgroups (just setting the path to "") in the rootless Podman + cgroupfs case. @giuseppe Did we do this for a specific reason? And if so, should cgroupfs + --cgroup-parent cause an error when rootless?

I think that is an error. If --cgroup-parent was specified, then we need to honor it.

giuseppe (Member) commented

I think we should have:

diff --git a/libpod/container_internal_linux.go b/libpod/container_internal_linux.go
index eb70f92a9..254fd2fe6 100644
--- a/libpod/container_internal_linux.go
+++ b/libpod/container_internal_linux.go
@@ -2224,12 +2224,11 @@ func (c *Container) getOCICgroupPath() (string, error) {
        }
        cgroupManager := c.CgroupManager()
        switch {
+       case c.config.CgroupParent != "":
+               return c.config.CgroupParent, nil
        case (rootless.IsRootless() && (cgroupManager == config.CgroupfsCgroupsManager || !unified)) || c.config.NoCgroups:
                return "", nil
        case c.config.CgroupsMode == cgroupSplit:
-               if c.config.CgroupParent != "" {
-                       return c.config.CgroupParent, nil
-               }
                selfCgroup, err := utils.GetOwnCgroup()
                if err != nil {
                        return "", err
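
The effect of moving the `CgroupParent` check to the top of the switch can be illustrated with a simplified model. This is only a sketch of the case ordering visible in the diff above, written in shell for quick experimentation; it is not the libpod code, the function name `oci_cgroup_path_model` and its arguments are invented here, and every case after the first two is left unmodeled.

```shell
# Simplified model of the first cases of getOCICgroupPath.
# Arguments:
#   $1 patched     1 to model the patched order, 0 for the original
#   $2 parent      value of --cgroup-parent (may be empty)
#   $3 rootless    1 if running rootless
#   $4 manager     "cgroupfs" or "systemd"
#   $5 unified     1 on cgroup v2
#   $6 no_cgroups  1 if cgroups are disabled for the container
oci_cgroup_path_model() {
    local patched="$1" parent="$2" rootless="$3" manager="$4" unified="$5" no_cgroups="$6"
    # Patched order: an explicit parent wins before any other check.
    if [ "$patched" = 1 ] && [ -n "$parent" ]; then
        printf '%s\n' "$parent"
        return 0
    fi
    # Rootless with cgroupfs (or without cgroup v2), or cgroups disabled:
    # an empty path is returned, so the runtime gets no cgroup to use.
    if { [ "$rootless" = 1 ] && { [ "$manager" = cgroupfs ] || [ "$unified" != 1 ]; }; } \
        || [ "$no_cgroups" = 1 ]; then
        printf '\n'
        return 0
    fi
    # Remaining cases (split mode, systemd, ...) are not modeled here.
    printf '%s\n' '(not modeled)'
}
```

With the reporter's setup (rootless, cgroupfs, cgroup v2), the original order yields an empty path and the patched order yields the requested parent.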

giuseppe (Member) commented

PR here: #10177

giuseppe added a commit to giuseppe/libpod that referenced this issue May 3, 2021
if --cgroup-parent is specified, always honor it without doing any
detection whether cgroups are supported or not.

Closes: containers#10173

Signed-off-by: Giuseppe Scrivano <[email protected]>

yaqwsx (Author) commented May 12, 2021

It works, thank you.

github-actions bot added the locked - please file new issue/PR label Sep 22, 2023
github-actions bot locked as resolved and limited conversation to collaborators Sep 22, 2023