Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[1.1] libctr/cgroups: don't take init's cgroup into account #3811

Merged

Conversation

haircommander
Copy link
Contributor

Sometimes, the init process is not in the root cgroup. This can be noted by GetInitPath, which already scrubs the path of init.scope.

This was encountered when trying to patch the Kubelet to handle systemd being in a separate cpuset from root (to allow load balance disabling for containers). At present, there's no way to have libcontainer or runc manage cgroups in a hierarchy outside of the one init is in (unless the path contains init.scope, which is limiting)

@haircommander
Copy link
Contributor Author

PTAL @kolyshkin @mrunalp

@kolyshkin
Copy link
Contributor

To reviewers: this is a backport of PR #3784 to release-1.1 branch. In comments to that PR, I tried to find out whether this change can break anything. TL;DR, my conclusion was (and still is) that this is not a breaking change. More to say, using PID 1's cgroup as a base is not correct when host pid ns is used, and thus it was removed a long time ago (but not for systemd cgroup driver).

Now, this PR is required for the case where systemd is placed into a non-root cgroup.

Such placement is needed in recent kernels, when we want some containers (cgroups) to have CPU load-balancing disabled (cpuset.sched_load_balance = 0). Unfortunately, this also requires all the cgroup parents (up to root cgroup) to also have LB disabled. So, in order to achieve this without disabling CPU load balancing systemd-wide, all other processes on the system (including systemd) are moved to a separate cgroup hierarchy.

kolyshkin
kolyshkin previously approved these changes Apr 4, 2023
Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

mrunalp
mrunalp previously approved these changes Apr 5, 2023
@kolyshkin kolyshkin added this to the 1.1.6 milestone Apr 5, 2023
@AkihiroSuda
Copy link
Member

DCO check is failing

@haircommander haircommander dismissed stale reviews from kolyshkin and mrunalp via 6cc1a3d April 6, 2023 13:29
@haircommander haircommander force-pushed the root-cgroup-no-init-1.1 branch from 43d85b3 to 6cc1a3d Compare April 6, 2023 13:29
@haircommander
Copy link
Contributor Author

fixed!

@haircommander haircommander force-pushed the root-cgroup-no-init-1.1 branch from 6cc1a3d to 15675a8 Compare April 6, 2023 22:47
@kolyshkin kolyshkin added area/systemd backport/1.1-pr A backport PR to release-1.1 labels Apr 6, 2023
kolyshkin
kolyshkin previously approved these changes Apr 6, 2023
Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still LGTM

mrunalp
mrunalp previously approved these changes Apr 6, 2023
@haircommander haircommander dismissed stale reviews from kolyshkin and mrunalp via 9b17bdc April 7, 2023 01:38
@haircommander haircommander force-pushed the root-cgroup-no-init-1.1 branch from 8a82a42 to 9b17bdc Compare April 7, 2023 01:38
Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still LGTM

@kolyshkin kolyshkin mentioned this pull request Apr 7, 2023
Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use git cherry-pick with -x

Sometimes, the init process is not in the root cgroup.
This can be noted by GetInitPath, which already scrubs the path of `init.scope`.

This was encountered when trying to patch the Kubelet to handle systemd being in a separate cpuset
from root (to allow load balance disabling for containers). At present, there's no way to have libcontainer or runc
manage cgroups in a hierarchy outside of the one init is in (unless the path contains `init.scope`, which is limiting)

Signed-off-by: Peter Hunt <[email protected]>
(cherry picked from commit 54e2021)
@haircommander haircommander force-pushed the root-cgroup-no-init-1.1 branch from 9b17bdc to 10cfd81 Compare April 7, 2023 13:41
@haircommander
Copy link
Contributor Author

Please use git cherry-pick with -x

done!

Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@mrunalp mrunalp merged commit 53333a5 into opencontainers:release-1.1 Apr 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/systemd backport/1.1-pr A backport PR to release-1.1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants