podman system prune: error removing pod conmon cgroup: EBUSY #11946

Closed
edsantiago opened this issue Oct 12, 2021 · 8 comments · Fixed by containers/common#858
Labels
flakes Flakes from Continuous Integration locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@edsantiago
Member

This is not a new flake, it dates back to June or earlier. Possibly I didn't file it because we have a ton of other EBUSY flakes and I thought this was the same one. Symptom seems to be:

time="2021-10-12T16:12:29Z" level=error msg="Deleting pod XXX 
    cgroup /libpod_parent/XXX: 
    remove /sys/fs/cgroup/misc/libpod_parent/XXX: 
    remove /sys/fs/cgroup/misc/libpod_parent/XXX/conmon:
    remove /sys/fs/cgroup/misc/libpod_parent/XXX/conmon: device or resource busy"

(newlines added for legibility).

Podman prune [It] podman system prune - pod,container stopped

The fact that it only seems to happen on f33 suggests a cgroups v1 problem.

@edsantiago edsantiago added the flakes Flakes from Continuous Integration label Oct 12, 2021
@mheon
Member

mheon commented Oct 13, 2021

It's all Cgroupsv1 podman-in-container.

We have a protection against this: the parent cgroup's max processes is set to 0, which prevents cloning and therefore prevents the cleanup process from launching. This error is usually caused by the cleanup process still being active in the cgroup after the last container exits. Possibly the protection doesn't work in a container?

@edsantiago
Member Author

Another one, f33 root again

@edsantiago
Member Author

And another

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@edsantiago
Member Author

Still happening

Podman prune [It] podman system prune - pod,container stopped

@vrothberg
Member

vrothberg commented Dec 14, 2021

Quoting @mheon:

    It's all Cgroupsv1 podman-in-container.

    We have a protection against this (setting parent cgroup max processes to 0, to prevent cloning, to prevent the cleanup process from launching; this is usually caused by the cleanup process still being active in the cgroup after the last container exits). Possible this doesn't work in a container?

@giuseppe could you take a look at this one?

giuseppe added a commit to giuseppe/common that referenced this issue Dec 14, 2021
on a busy system, the conmon process could take longer to complete or
to be reaped by the parent, leaving the cgroup busy.  If the rmdir
fails with EBUSY, try again up to 5 seconds before reporting an
error.

Closes: containers/podman#11946

Signed-off-by: Giuseppe Scrivano <[email protected]>
@giuseppe
Member

PR here: containers/common#858

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 21, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 21, 2023