Skip to content
This repository has been archived by the owner on Mar 9, 2022. It is now read-only.

[WIP] Use format of the parent cgroup to determine whether to use cgroupfs or systemd cgroup driver #1071

Closed
wants to merge 4 commits into from

Conversation

filbranden
Copy link
Contributor

The parent cgroup will always be a systemd "slice" which can be detected by looking at a ".slice" suffix in the cgroup name. Use that information to drive whether to tell runc to use systemd or cgroupfs.

The advantage of this approach is that the full control of which cgroup driver to use is now in the CRI client. containerd-cri will simply match what the CRI client requested, and ask runc to do the same. This way it is possible to switch cgroup drivers by simply reconfiguring the CRI client (Kubelet.)

Note: There's a corner case for the root slice, which in theory would be "/" for both cases and auto-detection would not work. But the correct way to specify the root slice for the systemd case is "-.slice" (or "/-.slice", since we only consider the last component of the path), so that corner case actually needs to be handled in the CRI client and not here.

This still needs testing! But sending it along to get some early feedback on the idea itself.

There are a few follow up commits updating comments, documentation and other minor fixes.

It's unclear to me what can be done about the systemd_cgroup field of Runtime.Options, since that seems to be part of an external API, so not sure how easy it can be deprecated... Opinions?

/cc @Random-Liu @yujuhong

@k8s-ci-robot
Copy link

Hi @filbranden. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

The parent cgroup will always be a systemd "slice" which can be detected
by looking at a ".slice" suffix in the cgroup name. Use that information
to drive whether to tell `runc` to use systemd or cgroupfs.

The advantage of this approach is that the full control of which cgroup
driver to use is now in the CRI client. `containerd-cri` will simply
match what the CRI client requested, and ask `runc` to do the same. This
way it is possible to switch cgroup drivers by simply reconfiguring the
CRI client (Kubelet.)

Note: There's a corner case for the root slice, which in theory would be
"/" for both cases and auto-detection would not work. But the correct
way to specify the root slice for the systemd case is "-.slice" (or
"/-.slice", since we only consider the last component of the path), so
that corner case actually needs to be handled in the CRI client and not
here.

Tested: TODO!

Signed-off-by: Filipe Brandenburger <[email protected]>
Systemd slice names are always hierarchical, so the nested slices will
always have the name of their parent as a prefix.

Fix the example and test case to use valid slice names.

Signed-off-by: Filipe Brandenburger <[email protected]>
This field was deprecated in favor of the same one in Runtime.Options,
but we're no longer looking at that field either, so document it as
doubly deprecated now.

Deprecating the one in Runtime.Options is a bit more tricky, since it's
in a protobuf that's part of an official API. Need to figure out what to
do about that one.

Signed-off-by: Filipe Brandenburger <[email protected]>
The field is doubly deprecated and doesn't really do anything anymore,
so let's just stop documenting it altogether.

Signed-off-by: Filipe Brandenburger <[email protected]>
@mikebrow
Copy link
Member

mikebrow commented Mar 2, 2019

/ok-to-test

@Random-Liu
Copy link
Member

I've actually included this as part of #1070. :p

@Random-Liu
Copy link
Member

This has been covered by #1224.

Thanks for suggesting this @filbranden

Close this one. Feel free to reopen if you want to add more things. :)

@Random-Liu Random-Liu closed this Sep 12, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants