cgroup: rmdir the entire systemd scope #1146

Merged: 2 commits, Feb 20, 2023

giuseppe
Member

Commit 7ea7617 caused a regression on cgroup v1: some directories that are created manually are not cleaned up on container termination, causing a cgroup leak.

Fix it by deleting the entire systemd scope directory instead of deleting only the final cgroup.

Closes: #1144

Signed-off-by: Giuseppe Scrivano [email protected]
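To illustrate the fix described above, here is a minimal sketch of how a cgroup path can be cut back to its systemd scope directory before removal. The function name `scope_path_sketch` and the example paths are assumptions made for illustration; this is not crun's actual helper, only the same truncate-after-scope idea in standalone form.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not crun's implementation: return a newly
   allocated copy of PATH cut right after the first occurrence of SCOPE.
   If SCOPE is empty or does not occur in PATH, the copy is unchanged.
   The caller owns the result and must free() it.  */
char *
scope_path_sketch (const char *path, const char *scope)
{
  size_t len;
  char *copy, *tmp;

  assert (path != NULL && scope != NULL);

  len = strlen (path);
  copy = malloc (len + 1);
  if (copy == NULL)
    return NULL;
  memcpy (copy, path, len + 1);

  /* An empty needle matches at offset 0, which would blank the path.  */
  if (scope[0] == '\0')
    return copy;

  tmp = strstr (copy, scope);
  if (tmp != NULL)
    tmp[strlen (scope)] = '\0';

  return copy;
}
```

With a path such as `/machine.slice/libpod-abc.scope/container` and the scope `libpod-abc.scope`, the sketch yields `/machine.slice/libpod-abc.scope`, so removing that directory tree also covers sub-cgroups created below the scope.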

path_to_scope = xstrdup (cgroup_status->path);
tmp = strstr (path_to_scope, cgroup_status->scope);
if (tmp)
  tmp[strlen(cgroup_status->scope)] = '\0';
Collaborator

Suggested change
- tmp[strlen(cgroup_status->scope)] = '\0';
+ tmp[strlen (cgroup_status->scope)] = '\0';

@@ -2485,7 +2485,14 @@ libcrun_container_run_internal (libcrun_container_t *container, libcrun_context_
   return ret;

 fail:
-  return cleanup_watch (context, def, cgroup_status, pid, sync_socket, terminal_fd, err);
+  ret = cleanup_watch (context, def, cgroup_status, pid, sync_socket, terminal_fd, err);
Collaborator

Suggested change
ret = cleanup_watch (context, def, cgroup_status, pid, sync_socket, terminal_fd, err);
ret = cleanup_watch (context, def, cgroup_status, pid, sync_socket, terminal_fd, err);

Collaborator

@flouthoc flouthoc left a comment

PR LGTM, but clang-format has not been run.

@giuseppe giuseppe force-pushed the fix-cgroupv1-regression branch from 91533f3 to 51d3ff6 Compare February 20, 2023 09:05
commit 7ea7617 caused a regression on
cgroup v1, and some directories that are created manually are not
cleaned up on container termination causing a cgroup leak.

Fix it by deleting the entire systemd scope directory instead of
deleting only the final cgroup.

Closes: containers#1144

Signed-off-by: Giuseppe Scrivano <[email protected]>

make sure the cgroup is destroyed on errors.

Signed-off-by: Giuseppe Scrivano <[email protected]>
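The second commit can be pictured with the following sketch of an error path that keeps the result of the first cleanup step and still tears down the cgroup before returning. Every name here (`run_state`, `cleanup_watch_sketch`, `destroy_cgroup_sketch`, `run_sketch`) is a hypothetical stand-in, and ignoring the destroy result so that the original error is preserved is an assumption, not something the diff confirms.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state used only to observe which cleanup steps ran.  */
struct run_state
{
  int watch_cleaned;
  int cgroup_destroyed;
};

/* Stand-in for the first cleanup step; passes the error code through.  */
int
cleanup_watch_sketch (struct run_state *st, int err)
{
  st->watch_cleaned = 1;
  return err;
}

/* Stand-in for removing the container cgroup.  */
int
destroy_cgroup_sketch (struct run_state *st)
{
  st->cgroup_destroyed = 1;
  return 0;
}

int
run_sketch (struct run_state *st, int setup_result)
{
  int ret;

  assert (st != NULL);

  if (setup_result < 0)
    {
      ret = setup_result;
      goto fail;
    }
  return 0;

fail:
  /* Returning the cleanup result directly would skip the cgroup
     teardown, so store it first.  */
  ret = cleanup_watch_sketch (st, ret);

  /* Assumption: a failure to destroy the cgroup is ignored so that the
     original error is the one reported to the caller.  */
  (void) destroy_cgroup_sketch (st);

  return ret;
}
```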
@giuseppe giuseppe force-pushed the fix-cgroupv1-regression branch from 51d3ff6 to 6cdf51c Compare February 20, 2023 10:31
Member

rhatdan commented Feb 20, 2023

LGTM

@rhatdan rhatdan merged commit d253c3f into containers:main Feb 20, 2023
-  return destroy_cgroup_path (cgroup_status->path, mode, err);
+  path_to_scope = get_cgroup_scope_path (cgroup_status->path, cgroup_status->scope);
+
+  return destroy_cgroup_path (path_to_scope, mode, err);
Contributor

@rst0git rst0git Oct 25, 2024

@giuseppe This change breaks crun restore with cgroups v1:

$ podman pull ubi8
$ podman run -d ubi8 sleep 300
$ podman container checkpoint -l
$ podman container restore -l
Error: OCI runtime error: crun: CRIU restoring failed -52.  Please check CRIU logfile `/var/lib/containers/storage/overlay-containers/b10e1136e6cefcf97759deef7377036b6cefc0eacf9591976671712f4e3c4ac9/userdata/restore.log`
$ tail /var/lib/containers/storage/overlay-containers/b10e1136e6cefcf97759deef7377036b6cefc0eacf9591976671712f4e3c4ac9/userdata/restore.log
(00.000882) cg: 	Making controller dir .criu.cgyard.8cbQy1/freezer (freezer), type cgroup
(00.000900) cg: Created cgroup dir freezer/machine.slice/libpod-b10e1136e6cefcf97759deef7377036b6cefc0eacf9591976671712f4e3c4ac9.scope/container
(00.000904) cg: Restore special props
(00.000905) cg: 	Making controller dir .criu.cgyard.8cbQy1/cpuset (cpuset), type cgroup
(00.000929) cg: Created cgroup dir cpuset/machine.slice/libpod-b10e1136e6cefcf97759deef7377036b6cefc0eacf9591976671712f4e3c4ac9.scope/container
(00.000932) cg: Restore special props
(00.000933) cg: Restoring cgroup property value [0-15] to [cpuset/machine.slice/libpod-b10e1136e6cefcf97759deef7377036b6cefc0eacf9591976671712f4e3c4ac9.scope/container/cpuset.cpus]
(00.000939) Error (criu/cgroup.c:1490): cg: Failed writing 0-15 to cpuset/machine.slice/libpod-b10e1136e6cefcf97759deef7377036b6cefc0eacf9591976671712f4e3c4ac9.scope/container/cpuset.cpus: Permission denied
(00.000944) Error (criu/cgroup.c:1756): cg: Restoring cpuset.cpus special property failed
(00.000944) Error (criu/cgroup.c:1815): cg: Restoring special cpuset props failed!

The Permission denied error appears when writing to cpuset/machine.slice/libpod-b10e1136e6cefcf97759deef7377036b6cefc0eacf9591976671712f4e3c4ac9.scope/container/cpuset.cpus because the parent cpuset.cpus is empty.
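On cgroup v1, the cpuset controller only accepts a child `cpuset.cpus` value that is a subset of the parent's, which is why an empty parent rejects the write. A small diagnostic along these lines could confirm that condition before attempting a restore. The helper name `cpuset_file_is_empty` is hypothetical, and the file to check (the `cpuset.cpus` of the `.scope` directory under wherever the v1 cpuset hierarchy is mounted) is an assumption about the local setup.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical diagnostic helper: returns 1 if the file at PATH holds
   nothing but whitespace, 0 if it has other content, and -1 if it
   cannot be opened.  */
int
cpuset_file_is_empty (const char *path)
{
  FILE *f;
  int c;
  int empty = 1;

  assert (path != NULL);

  f = fopen (path, "r");
  if (f == NULL)
    return -1;

  while ((c = fgetc (f)) != EOF)
    {
      if (! isspace (c))
        {
          empty = 0;
          break;
        }
    }

  fclose (f);
  return empty;
}
```

A return value of 1 for the scope directory's `cpuset.cpus` would match the failure in the log: the `container` sub-cgroup cannot be given `0-15` while its parent has no CPUs assigned.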

Member Author

This change was made 2 years ago. How did it work until now? Was it always broken?

Contributor

@rst0git rst0git Oct 27, 2024

> How did it work until now? Was it always broken?

crun restore with cgroups v1 has been broken since version 1.8.1, but it works with cgroups v2. We have seen this error reported in checkpoint-restore/criu#2000, checkpoint-restore/criu#2091 and https://issues.redhat.com/browse/RHEL-32180

Member Author

We don't really care about cgroup v1 (see #1149); support for it is going away.

Given that cgroup v1 is also causing problems with #1589, I think I'll just open a PR to add a warning when cgroup v1 is used.

Successfully merging this pull request may close these issues.

cgroup v1 regression in crun 1.7 and above