
Update the kernel configuration for cgroup v2 #36

Closed
KentaTada opened this issue Mar 28, 2023 · 20 comments · Fixed by #41

Comments

@KentaTada

Kubernetes 1.25 brings cgroup v2 to GA.
cgroup v2 needs some additional kernel configs.
For example, you need to enable CONFIG_CGROUP_BPF if you want to use the device controller.
I haven't yet investigated which configs Kubernetes actually needs, so I'm opening this issue as a starting point.
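For illustration, here is a minimal Go sketch of this kind of config check, assuming the host kernel was built with CONFIG_IKCONFIG_PROC so that /proc/config.gz exists; `hasKernelConfig` is a hypothetical helper, not the actual system-validators code:

```go
package main

import (
	"bufio"
	"compress/gzip"
	"fmt"
	"os"
	"strings"
)

// hasKernelConfig reports whether the running kernel was built with the
// given option enabled (=y) or as a module (=m). It reads /proc/config.gz,
// which is only present when the kernel has CONFIG_IKCONFIG_PROC.
func hasKernelConfig(option string) (bool, error) {
	f, err := os.Open("/proc/config.gz")
	if err != nil {
		return false, err
	}
	defer f.Close()

	gz, err := gzip.NewReader(f)
	if err != nil {
		return false, err
	}
	defer gz.Close()

	scanner := bufio.NewScanner(gz)
	for scanner.Scan() {
		line := scanner.Text()
		if strings.HasPrefix(line, option+"=y") || strings.HasPrefix(line, option+"=m") {
			return true, nil
		}
	}
	return false, scanner.Err()
}

func main() {
	ok, err := hasKernelConfig("CONFIG_CGROUP_BPF")
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not read kernel config:", err)
		os.Exit(1)
	}
	fmt.Println("CONFIG_CGROUP_BPF enabled:", ok)
}
```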

@pacoxu
Member

pacoxu commented Mar 29, 2023

#12 (comment) by @odinuge

CONFIG_CGROUP_BPF - Required for cgroupv2 (for controlling devices)

containerd/containerd#3799 (comment) by @AkihiroSuda

kernel >= 4.15 with CONFIG_CGROUP_DEVICE and CONFIG_CGROUP_BPF is required.

/cc @bobbypage @mrunalp
for cgroup v2 GA

kubernetes/minikube#6572
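For the kernel-version half of that requirement ("kernel >= 4.15"), a minimal sketch that parses /proc/sys/kernel/osrelease; this is an illustration only, not the check system-validators actually performs:

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// kernelAtLeast reports whether the running kernel's major.minor version is
// at least the given one, based on /proc/sys/kernel/osrelease
// (e.g. "5.15.0-91-generic").
func kernelAtLeast(major, minor int) (bool, error) {
	raw, err := os.ReadFile("/proc/sys/kernel/osrelease")
	if err != nil {
		return false, err
	}
	parts := strings.SplitN(strings.TrimSpace(string(raw)), ".", 3)
	if len(parts) < 2 {
		return false, fmt.Errorf("unexpected release string %q", raw)
	}
	maj, err := strconv.Atoi(parts[0])
	if err != nil {
		return false, err
	}
	// The minor component may carry a suffix (e.g. "15-rc1"); strip it.
	minStr := parts[1]
	if i := strings.IndexFunc(minStr, func(r rune) bool { return r < '0' || r > '9' }); i >= 0 {
		minStr = minStr[:i]
	}
	mnr, err := strconv.Atoi(minStr)
	if err != nil {
		return false, err
	}
	return maj > major || (maj == major && mnr >= minor), nil
}

func main() {
	ok, err := kernelAtLeast(4, 15)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("kernel >= 4.15:", ok)
}
```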

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 27, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 19, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot closed this as not planned (Won't fix, can't repro, duplicate, stale) Feb 18, 2024
@neolit123
Member

/remove-lifecycle rotten
@pacoxu @KentaTada do we still need such a change?

@neolit123 neolit123 reopened this Feb 18, 2024
@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Feb 18, 2024
@KentaTada
Author

Yes. We need to investigate and prepare the kernel configuration list for cgroup v2.
I'll look into it when I have time.

@neolit123
Member

thanks, @KentaTada

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 2, 2024
@KentaTada
Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 4, 2024
@pacoxu
Member

pacoxu commented Jul 24, 2024

The v1.31 KEP: kubernetes/enhancements#4569

I opened #37 for the kernel version.

  • For the KernelConfig part, I still need to check the list.

@neolit123
Member

Yes. We need to investigate and prepare the kernel configuration list for cgroup v2. I'll look into it when I have time.

@KentaTada
should we include this in the next system-validators release too?

@pacoxu recently added:

@KentaTada
Author

@KentaTada should we include this in the next system-validators release too?

Yes.
Although I haven't completely caught up with these commits yet, we should include the info about cgroup v2 before moving cgroup v1 support into maintenance mode.

BTW, we also need to update the list of kernel configs.
For example, CONFIG_CGROUP_FREEZER is not actually required for cgroup v2.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/init/Kconfig?h=v5.8#n1006
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/init/Kconfig?h=v6.10#n1103

@neolit123
Member

would you have time to send a PR, or give us a list of all the required changes we need to make?

@pacoxu
Member

pacoxu commented Aug 13, 2024

we also need to update the list of kernel configs.
For example, CONFIG_CGROUP_FREEZER is not actually required for cgroup v2.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/init/Kconfig?h=v5.8#n1006
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/init/Kconfig?h=v6.10#n1103

For cgroup v2, we use cgroup.freeze instead, which needs kernel 5.2+.
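A hedged sketch of what probing for the v2 freezer could look like. Note that the root cgroup does not expose cgroup.freeze, so the probe must look inside a child cgroup; the candidate paths below (init.scope, kubepods) are assumptions about the host's cgroup layout, not guarantees:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// supportsCgroupFreeze probes for the cgroup v2 freezer interface
// (cgroup.freeze, kernel 5.2+). The root cgroup does not expose the file,
// so we look in child cgroups; these candidates are layout assumptions
// (init.scope exists on systemd hosts, kubepods once kubelet has started).
func supportsCgroupFreeze() bool {
	candidates := []string{
		filepath.Join("/sys/fs/cgroup", "init.scope", "cgroup.freeze"),
		filepath.Join("/sys/fs/cgroup", "kubepods", "cgroup.freeze"),
	}
	for _, p := range candidates {
		if _, err := os.Stat(p); err == nil {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println("cgroup.freeze available:", supportsCgroupFreeze())
}
```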

@KentaTada
Author

I'm sorry, but I need to support my family this week.
In addition to that, I need to chair a session at KubeDay and at my local CNCF-related eBPF event in August.
https://sched.co/1eh9w
https://community.cncf.io/e/m4e8cg/

For cgroup v2, we use cgroup.freeze instead, which needs kernel 5.2+.

What I wanted to say is that we need to prepare a way to detect which cgroup v2 features are available.
Unlike v1, the kernel config alone cannot determine whether the required v2 features are enabled.
It may be possible if this validator checks for the existence of the cgroup.freeze file.
We also need to think about BPF-based interfaces like the device controller.
In addition to that, we first need to confirm which v2 features are currently needed for k8s.
This change should be made with caution, because k8s users all over the world may recompile their kernels based on the result of kubeadm.
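To make the feature-detection idea concrete, a minimal sketch that enumerates the controllers the kernel exposes on the unified hierarchy, assuming it is mounted at /sys/fs/cgroup. As noted above, the BPF-based device controller never appears in cgroup.controllers, so it would still need a separate probe:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// availableV2Controllers lists the controllers the kernel exposes at the
// root of the unified hierarchy. The v2 device controller is implemented
// with BPF programs and never shows up here, so it needs a separate probe.
func availableV2Controllers() (map[string]bool, error) {
	raw, err := os.ReadFile("/sys/fs/cgroup/cgroup.controllers")
	if err != nil {
		return nil, err
	}
	controllers := make(map[string]bool)
	for _, c := range strings.Fields(string(raw)) {
		controllers[c] = true
	}
	return controllers, nil
}

func main() {
	cs, err := availableV2Controllers()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Controllers Kubernetes is commonly interested in; adjust as needed.
	for _, want := range []string{"cpu", "cpuset", "memory", "io", "pids"} {
		fmt.Printf("%-8s available: %v\n", want, cs[want])
	}
}
```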

@neolit123
Member

I'm sorry, but I need to support my family this week.
In addition to that, I need to chair a session at KubeDay and at my local CNCF-related eBPF event in August.

1.31 was just released so this is planned for 1.32. we have a whole k8s release cycle to tackle the feature detection for cgroups v2.

What I wanted to say is that we need to prepare a way to detect which cgroup v2 features are available.
Unlike v1, the kernel config alone cannot determine whether the required v2 features are enabled.
It may be possible if this validator checks for the existence of the cgroup.freeze file.
We also need to think about BPF-based interfaces like the device controller.

i think having the cgroups v2 validation load additional files should be fine.

In addition to that, we first need to confirm which v2 features are currently needed for k8s.

@pacoxu maybe we need to ask SIG Node and bring them this thread?

This change should be made with caution, because k8s users all over the world may recompile their kernels based on the result of kubeadm.

completely agree.

@pacoxu
Member

pacoxu commented Aug 14, 2024

What I wanted to say is that we need to prepare a way to detect which cgroup v2 features are available.
Unlike v1, the kernel config alone cannot determine whether the required v2 features are enabled.
It may be possible if this validator checks for the existence of the cgroup.freeze file.
We also need to think about BPF-based interfaces like the device controller.

i think having the cgroups v2 validation load additional files should be fine.

In kubernetes/kubernetes#126595, we may use a cpu.stat check instead of a version check there.
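For example, such a file-based probe could look like the sketch below, assuming the unified hierarchy is mounted at /sys/fs/cgroup (the exact check in kubernetes/kubernetes#126595 may differ). cpu.stat was only added to the root cgroup in newer kernels (around v5.8), so its presence can stand in for a raw version comparison:

```go
package main

import (
	"fmt"
	"os"
)

// hasRootCPUStat probes for cpu.stat in the cgroup v2 root. The file was
// only added to the root cgroup in newer kernels (around v5.8), so its
// presence can stand in for an explicit kernel version comparison.
func hasRootCPUStat() bool {
	_, err := os.Stat("/sys/fs/cgroup/cpu.stat")
	return err == nil
}

func main() {
	fmt.Println("root cpu.stat present:", hasRootCPUStat())
}
```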

@neolit123
Member

neolit123 commented Sep 24, 2024

@KentaTada @pacoxu
can this be done for the 1.32 release cycle?

code freeze is 7th of Nov 2024.
https://github.com/kubernetes/sig-release/tree/master/releases/release-1.32

@pacoxu
Member

pacoxu commented Sep 24, 2024

I would like to give it a try in this release.
