Running nested inside docker starts failing with Ubuntu Jammy: writing file `/sys/fs/cgroup/buildah-<random>/cgroup.procs`: Operation not supported
#14884
Comments
the issue could be that it started using cgroup v2. I think this is a dup of #12559. You need to make sure Podman is not running in the root cgroup.
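A quick way to check this, assuming a cgroup v2 unified hierarchy mounted at `/sys/fs/cgroup` (the paths below are the standard ones, not taken from this thread):

```sh
# inside the Docker container, before launching the nested podman
cat /proc/self/cgroup
# "0::/"          -> this process is in the root cgroup (the problematic setup)
# "0::/somegroup" -> already in a sub-cgroup

# list whatever is still sitting in the container's root cgroup
cat /sys/fs/cgroup/cgroup.procs
```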
@giuseppe sorry for being dense, my first rodeo with this particular bit of cgroup v2 -- is that what the commands at #12559 (comment) do?
Thanks. Could you please show me what your cgroup configuration looks like now inside the container?
We don't do any explicit setup of cgroups inside the container; it simply calls "podman" (behind a lot of substitution variables [1]). I can get the CI to report arbitrary things before it tries this; e.g. I'm running https://review.opendev.org/c/openstack/diskimage-builder/+/849274/2/diskimage_builder/elements/containerfile/root.d/08-containerfile now and will put results here when I get them. I'll also try to set up an ad-hoc manual replication environment tomorrow, which makes things a bit faster.

Another thing I could try is running the nested podman as root; this is a privileged container. Not sure if that is a bug or a feature? :)

[1] https://opendev.org/openstack/diskimage-builder/src/branch/master/diskimage_builder/elements/containerfile/root.d/08-containerfile#L91
Hopefully the following gives some clues (from inside the container that wants to run podman on the failing Jammy-based CI):
I don't see any sub-cgroup, that means all processes are running in the root cgroup. With such a setup, you first need to move the processes to a new sub-cgroup. Have you tried the fix suggested here:
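For context, the fix referenced there amounts to something like the following sketch, assuming a writable cgroup v2 mount at `/sys/fs/cgroup` inside the container (these are not the exact commands from that comment):

```sh
# create a sub-cgroup and move every process out of the container's root cgroup
mkdir -p /sys/fs/cgroup/init
for pid in $(cat /sys/fs/cgroup/cgroup.procs); do
    echo "$pid" > /sys/fs/cgroup/init/cgroup.procs || true   # ignore PIDs that exit mid-loop
done
# with the root cgroup empty, controllers can be enabled for child cgroups
echo "+cpu +io +memory +pids" > /sys/fs/cgroup/cgroup.subtree_control
```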
@giuseppe hrm, I guess I got the same thing as the reporter in #12559 (comment)
This is happening inside the container. Also, I tested running the nested podman as root with …
after you run that command, what is the content of … ?

I'll see if I can fix it in Podman and perhaps let it automatically create a sub-cgroup when running in a container, but before that, I'd need to know more about the entrypoint for your container. Do you run Podman directly, or something else?
opened a PR: #14904
thank you; i'll need a little time but should be able to test this |
Unfortunately I didn't capture the output after; I can if this is still relevant
For reference, this runs forked from a daemon process. Actually the Python daemon forks the diskimage-builder program, which then runs a shell-script, which then calls podman as one of its steps. So, no, it doesn't directly run podman :)
so I am afraid my PR won't be enough. You need to make sure these programs run in a separate sub-cgroup. Have you considered running systemd in the container?
Well, it has never come up before :) Would I be on the right path, in the forked shell-script that starts podman, making a cgroup and running podman with …
you need to run the container entrypoint itself in a new cgroup. There should not be any process left running in the root cgroup.
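As an illustration of that suggestion, a hypothetical wrapper entrypoint (not the project's actual one) could move itself into a fresh sub-cgroup before exec'ing the real command, so nothing stays behind in the container's root cgroup:

```sh
#!/bin/sh
# hypothetical entrypoint wrapper, sketched for a cgroup v2 container
set -e
if [ -f /sys/fs/cgroup/cgroup.controllers ]; then   # cgroup v2 unified hierarchy
    mkdir -p /sys/fs/cgroup/main
    echo $$ > /sys/fs/cgroup/main/cgroup.procs       # move this shell out of the root cgroup
fi
exec "$@"                                            # the real entrypoint inherits the sub-cgroup
```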
This is a squash of two changes that have unfortunately simultaneously broken the gate.

The functests are failing with

    sha256sum: bionic-server-cloudimg-amd64.squashfs.manifest: No such file or directory

I think what has happened here is that the SHA256 sums file being used has got a new entry "bionic-server-cloudimg-amd64.squashfs.manifest" which is showing up in a grep for "bionic-server-cloudimg-amd64.squashfs". sha256 then tries to also check this hash, and has started failing. To avoid this, add an EOL marker to the grep so it only matches the exact filename.

Change I7fb585bc5ccc52803eea107e76dddf5e9fde8646 updated the containerfile tests to Jammy and it seems that cgroups v2 prevents podman running inside docker [1]. While we investigate, move this testing back to focal.

[1] containers/podman#14884

Change-Id: I1af9f5599168aadc1e7fcdfae281935e6211a597
if podman is running in the root cgroup, it will create a new sub-cgroup and move itself there.

[NO NEW TESTS NEEDED] it needs nested podman

Closes: containers#14884

Signed-off-by: Giuseppe Scrivano <[email protected]>
@giuseppe I tried pulling master just to double check, but I still hit the same issue, so I think that giuseppe@e3419c0 doesn't fix this particular use case? So I wonder if this really is closed by that...

One thing from the prior comment #12559 (comment) is that …

So what I came up with is https://review.opendev.org/c/zuul/nodepool/+/849273/4/Dockerfile
This seems to work in our CI [1]. However, I have to convince myself and two project reviewers that this is not a terrible hack. From podman's perspective, is this pretty much what is required to ensure running under our daemon inside a docker container? Are there any other suggestions? Thanks

[1] https://zuul.opendev.org/t/openstack/build/eb5d6f2b2fe9448f8e0ae8cce6b500c6
not sure if terrible, but it is still a hack :) If you need such a complex configuration inside a container, maybe you should consider running systemd.
This is a squash of two changes that have unfortunately simultaneously broken the gate.

The functests are failing with

    sha256sum: bionic-server-cloudimg-amd64.squashfs.manifest: No such file or directory

I think what has happened here is that the SHA256 sums file being used has got a new entry "bionic-server-cloudimg-amd64.squashfs.manifest" which is showing up in a grep for "bionic-server-cloudimg-amd64.squashfs". sha256 then tries to also check this hash, and has started failing. To avoid this, add an EOL marker to the grep so it only matches the exact filename.

Change I7fb585bc5ccc52803eea107e76dddf5e9fde8646 updated the containerfile tests to Jammy and it seems that cgroups v2 prevents podman running inside docker [1]. While we investigate, move this testing back to focal.

[1] containers/podman#14884

Change-Id: I1af9f5599168aadc1e7fcdfae281935e6211a597
(cherry picked from commit 78d3895)
You may add …
Per the comments in containers/podman#14884 there is basically no way to run podman nested in the container in a cgroups v2 environment (e.g. Ubuntu Jammy) with the processes in the same context the container starts in.

One option is to run systemd in the container, which puts things in separate slices, etc. This is unappealing.

This takes what I think is the simplest approach, which is to check if we're under cgroups v2 and move everything into a new group before nodepool-builder starts.

The referenced change tests this by running the containerfile elements on Jammy.

Needed-By: https://review.opendev.org/c/openstack/diskimage-builder/+/849274

Change-Id: Ie663d01d77e17f560a92887cba1e2c86b421b24d
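A sketch of what that approach might look like, assuming an entrypoint script wraps nodepool-builder; the group name and exact invocation are illustrative rather than copied from the actual change:

```sh
# detect cgroup v2 and move every process into a new group before the builder starts
if [ -f /sys/fs/cgroup/cgroup.controllers ]; then
    mkdir -p /sys/fs/cgroup/nodepool
    for pid in $(cat /sys/fs/cgroup/cgroup.procs); do
        echo "$pid" > /sys/fs/cgroup/nodepool/cgroup.procs 2>/dev/null || true
    done
fi
exec nodepool-builder "$@"
```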
In our CI, we run an application (`nodepool`/`diskimage-builder`) in a container under Docker that then, nested inside this container, runs `podman` (we use it to extract the root image of containers that we then modify with `diskimage-builder`). Our CI recently updated from an Ubuntu Focal distribution to Ubuntu Jammy and this started failing.

The container the app runs in (under Docker) is Debian; it then runs podman `3.4.7+ds1-3+b1`. This has not changed.

When running on an Ubuntu Focal host, this works [1]. Our CI saves quite a bit of info about the host but perhaps the most interesting thing is kernel `5.4.0-121-generic`.

A change switched only the base OS to Ubuntu Jammy and it started failing to run podman with [2]. This runs kernel `5.15.0-40-generic`. Both of these are running docker `version=20.10.17`; i.e. afaict the only difference here is the host distribution.

#12559 feels similar?

[1] https://zuul.opendev.org/t/openstack/build/ffce49bb9ee04d3aa66d852792e4d747/logs
[2] https://zuul.opendev.org/t/openstack/build/53e3e8a9468b471896ec5be0718e4f02