Kernel command line parameter kernelCommandLine=systemd.unified_cgroup_hierarchy=1 results in creation of cgroup V1 and V2 hierarchy. It is prohibited now, #6662
Comments
Not true at all. The WSL root user has the same access as a normal Windows user. Go ahead and navigate to C:\Windows\System32 and try replacing one of the executables from within WSL2. It will fail. |
WSL does not use systemd so that setting is not being respected. |
@benhillis The WSL kernel may not support systemd; it is a separate module that can be supported by 3rd-party software. But the WSL kernel doesn't correctly create either cgroup V1 or V2 and fails with: |
If you enable systemd yourself through something like genie and set it up to boot with that running first, do you still experience the issue? |
@WSLUser After all upgrades to kernel 5.10.16.3-microsoft-standard-WSL2 and genie, the issue is still not solved. Cgroup management and systemd are tightly coupled, and the kernel parameter is called systemd.unified_cgroup_hierarchy by the Linux kernel authors. If the WSL kernel doesn't support systemd by itself, then I assume the parameter must be called simply unified_cgroup_hierarchy and should result in the creation of only the unified cgroup hierarchy without polluting other filesystems. Unfortunately, it doesn't work. I'm afraid that the entire kernelCommandLine property of the wslconfig file is ignored. I see in the of the same |
Ok, I'm going to ask you to do a couple things. First, set up systemd using https://github.com/shayne/wsl2-hacks and modify from script improvements shown in shayne/wsl2-hacks#7. Then compile the 5.10 WSL2 kernel using microsoft/WSL2-Linux-Kernel#245 for the config. Use https://wsl.dev/wsl2-kernel-zfs/ for steps. Once you restart your distro, do you still experience the original issue (docker unable to use cgroupsv2)? |
WSL doesn't run systemd, so expecting any of the systemd options to be honored will not work. |
I'm working in a different context: running the latest released Podman version on top of the Fedora 34 remix distro built by Whitewater Foundry. Systemd functionality is provided by systemd-genie and works very reliably, including cgroup management, user management, and session management for both root and rootless users. I'm testing both scenarios side by side because Podman provides almost equal functionality, with some minor restrictions for rootless users, mainly in networking and volume binding. I achieved almost full feature parity in both modes with one exception: when Podman detects a cgroup V1 hierarchy in rootless mode, it falls back to cgroupfs, because systemd doesn't allow the mixed, backward-compatibility mode and the systemd version used in Fedora 34 doesn't contain a converter. The systemd version 226 uses unified hierarchy by default. |
Your problem isn't with systemd not seeing the option, because it is passed through: your problem is that systemd isn't the first (and can't be, because of the above-all-distros namespace) init, so by the time systemd gets its hands on it, cgroups have long been initialized. More specifically, if you set the kernel command-line option
i.e., it looks like the Microsoft init is making use of v1 memory cgroups, so it doesn't look like you can get to a unified cgroups v2-only hierarchy unless and until that changes. |
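A quick way to see which layout a distro actually ended up with is to look at the filesystem type mounted on /sys/fs/cgroup. The sketch below is illustrative only: the function names are made up for this example, and it assumes GNU `stat`, which reports `cgroup2fs` for a unified mount and `tmpfs` when the v1 or hybrid tree (one directory per controller) is in place.

```shell
#!/bin/sh
# Map a filesystem type name (as printed by `stat -fc %T <path>`) to a cgroup layout.
classify_cgroup_fs() {
    case "$1" in
        cgroup2fs) echo "unified" ;;
        tmpfs)     echo "v1-or-hybrid" ;;
        *)         echo "unknown" ;;
    esac
}

# Report the layout of the given cgroup root (defaults to /sys/fs/cgroup).
cgroup_mode() {
    root="${1:-/sys/fs/cgroup}"
    classify_cgroup_fs "$(stat -fc %T "$root" 2>/dev/null)"
}
```

Running `cgroup_mode` inside the distro prints one of the three labels; on the setup described in this issue it would be expected to report the v1-or-hybrid case.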
@cerebrate I totally agree with you, because the mixed hierarchy is created regardless of genie usage. All error messages appear before the distro's banner message and the first systemd message. Some parts of the WSL kernel code were written 2 years ago, long before the adoption of the unified hierarchy by Linux distros and OCI runtimes. There were no real consumers for the unified hierarchy. The backward-compatibility mode was required by runc and by Docker Desktop for Windows, which was based on the old Docker CE version. Docker needs cgroup V2 starting from version 20.xx. |
@WSLUser After all upgrades to 5.10.16.3-microsoft-standard-WSL2 and genie 1.40, I'm still stuck with this issue: although the parameter systemd.unified_cgroup_hierarchy is passed and accepted by the WSL kernel, the kernel insists on creating and populating the cgroup V1 hierarchy and creating the unified one as well. I'm now passing the parameter unified_cgroup_hierarchy without systemd. ... With kernelCommandLine=cgroup_no_v1=all, no cgroup hierarchy is created, neither V1 nor V2. |
I would repurpose this issue to fix the proprietary init to support v2, as cerebrate has pointed out that this is the obstacle to getting cgroups v2 supported. Of course, it's also possible that systemd is eventually adopted instead of the init, but that's an already known feature request. |
The kernel doesn't do anything with the systemd.* command-line parameters, though, because they aren't kernel parameters. (As you can see, they don't show up in
Those parameters only do anything because they're passed on to the init(1) launched by the kernel at the top-non-containerized-level, and require that it be systemd to do anything. Which it isn't, so they don't. (Now, if someone had a lot of time on their hands, they could modify genie so that it pulled the initial kernel command line out of the That wouldn't solve this particular issue, since the cgroup hierarchy is already established by the time genie can start its containerized systemd, and the rest of the potential use cases are obscure enough that it's down on my dogwash-priority list. But hey, if anyone wants to implement it and PR me, they can go right ahead.) |
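To make the distinction concrete: the kernel hands unrecognised `key=value` words on to init, and it is init that has to look them up. The helper below is a hypothetical sketch (the name `cmdline_value` is not from genie or WSL) that extracts one such value from a command line string passed in as an argument.

```shell
#!/bin/sh
# Print the value of "key=value" found in a kernel command line string.
# Usage: cmdline_value <key> <command line>; returns 1 if the key is absent.
cmdline_value() {
    key="$1"
    found=1
    set -f    # disable pathname expansion while the string is word-split
    for word in $2; do
        case "$word" in
            "$key"=*)
                printf '%s\n' "${word#*=}"
                found=0
                break
                ;;
        esac
    done
    set +f
    return "$found"
}
```

For example, `cmdline_value systemd.unified_cgroup_hierarchy "$(cat /proc/cmdline)"` would print `1` if the option was passed through; reading /proc/cmdline here is an assumption of this example, since it is the standard place where the running kernel exposes its command line.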
After a few tries, I just got cgroup v2 working with the following steps:
dmesg will show some error since we set
ls /sys/fs/cgroup The cgroup v2 controllers should be correctly created by systemd.
Also check with docker info
|
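For scripted checks, the relevant line can be pulled out of the `docker info` text output. This is a minimal sketch that assumes the output contains a line of the form `Cgroup Version: 2`, as quoted elsewhere in this thread; the function name is made up for the example.

```shell
#!/bin/sh
# Read `docker info` text on stdin and print the reported cgroup version.
cgroup_version_from_info() {
    sed -n 's/^[[:space:]]*Cgroup Version:[[:space:]]*//p' | head -n 1
}
```

Typical use would be `docker info 2>/dev/null | cgroup_version_from_info`, which prints nothing if the line is missing.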
@lightmelodies Interesting. I have tried that my own self, but all I get is the crash at the end of kernel boot mentioned above (#6662 (comment)). Can I ask what Windows build you're on, and whether you're using a custom kernel? (And if so, please send .config file?) |
Oh, wait.
You put this in And, curiously enough, I can duplicate your results by just mounting the cgroup2 hierarchy over /sys/fs/cgroup before systemd starts. This doesn't disable cgroups v1 in the kernel (as you can confirm by firing up a second distribution and looking at its /sys/fs/cgroup) or stop its hierarchy from being created/used by earlier processes, but mounting the cgroup2 hierarchy over the hybrid cgroup hierarchy does convince the bottle-container systemd and its children that they should operate in unified mode, not hybrid mode. I'll leave it up to someone with more cgroups knowledge than me to say whether or not this is actually useful in non-cosmetic ways (or whether it solves @PavelSosin-320 's problem). I'm not averse to adding a "unified-cgroups" option to genie to enable this automagically, but I'd prefer to know if it's actually useful first. |
As a side note, in retrospect, having both wsl.config and .wslconfig existing with disparate functions seems like a bit of a naming oops, what? |
Sorry for the mistyping; just set kernelCommandLine in .wslconfig. I am still using Windows build 19402 with a custom 5.12 kernel, but the default 5.4.72 kernel also works. Maybe some change in the Insider builds added a check in the init process that fails when cgroup v1 is disabled. |
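For readers unsure where the setting goes: `kernelCommandLine` belongs in the `[wsl2]` section of `.wslconfig` in the Windows user profile, not in the per-distro `/etc/wsl.conf`. The sketch below only prints such a fragment; the helper name is invented for this example and nothing is written to disk.

```shell
#!/bin/sh
# Print a .wslconfig fragment that passes the given kernel command line.
print_wslconfig() {
    printf '[wsl2]\nkernelCommandLine = %s\n' "$1"
}

print_wslconfig "cgroup_no_v1=all"
```

After editing the real file, WSL has to be restarted (for example with `wsl --shutdown` from Windows) before the new command line takes effect.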
@lightmelodies Ah, right. Guess so, then, since on the current dev build cgroup_no_v1 reliably breaks in it with both the stock and my custom kernel. I am curious, though - if you don't set the kernelCommandLine, but you do mount the cgroup2 fs, does it behave any differently? That seems to get systemd et al. running in unified mode for me even without the kernel part. |
If I don't set the
docker info also shows Cgroup Version: 2, but with the following warnings, so I think it does not really work.
|
I tested the systemd boot process by executing systemctl daemon-reload and systemctl daemon-reexec and found that the result is absolutely stable, exactly like initialization of the distro using genie. Since daemon-reload and daemon-reexec don't involve systemd-genie, I don't see any reason to change anything in genie. |
I checked. There is exactly one difference between I am as eager as the next chap to see all these things made to work, but before we go making allegations, please remember that |
@lightmelodies Makes sense. Seems like there's not a lot of point in adding support for that way of doing things, then. Thanks for testing it for me. |
oh a rootless discussion 😄 I managed to have NERDctl fully working with ContainerD in rootless mode (writing the blog now); however, it works with cgroup v1 and this "hybrid" mode. Also, please note that I will try the PS: I'm switching to kernel 5.13, but for a "nice" rootless experience, you might want to jump to 5.11 at least, as it's where fuse-overlayfs is implemented (and 5.12 has the rootless mount capabilities). Looking forward to your tests 😄 |
So, after some testing, I could get cgroups V2 working somehow (see screenshot below with There are still manual steps to perform, but here is, in a nutshell, what I've done:
While this should work, the cgroup mount generates errors when we try to write inside it, so for podman I set the Hope this provides some additional hints |
The kernel doc says
So while we can manually umount cgroup v1 and then mount cgroup v2 to make systemd work in unified mode, no v2 controllers are available because they are already bound to v1. That's why docker shows such warnings. Unfortunately, I cannot find a way to disable v1 controllers dynamically without cgroup_no_v1=all. |
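The effect described here can be observed directly: a cgroup v2 mount lists its usable controllers in the `cgroup.controllers` file at its root, and that file is empty when every controller is still attached to a v1 hierarchy. The function below is an illustrative sketch with a made-up name; it takes the mount point as an argument.

```shell
#!/bin/sh
# Print the controllers available in a cgroup v2 mount (default /sys/fs/cgroup).
# Returns 1 if the directory does not look like a cgroup v2 mount.
v2_controllers() {
    file="${1:-/sys/fs/cgroup}/cgroup.controllers"
    if [ ! -r "$file" ]; then
        echo "not a cgroup v2 mount"
        return 1
    fi
    controllers=$(cat "$file")
    if [ -z "$controllers" ]; then
        echo "none (controllers are probably still bound to v1)"
    else
        echo "$controllers"
    fi
}
```

With cgroup_no_v1=all in effect, `v2_controllers` would be expected to list entries such as cpu, memory and pids; with only the manual remount it would be expected to report none.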
@nunix I use Arkane Systems' systemd-genie, which offers almost 100% of systemd functionality, including systemd-user, with only one dependency (Dotnet 5.0), and exists for all popular distros. So, on one hand, the home-brewed Didleddan's script hardly satisfies me, and on the other hand genie is able to support cgroup V2 via systemd-root and systemd-user. The problem is only in the kernel, which is called 5.10 but lacks the cgroup V2 module. Once, systemd had a feature to convert V1 to V2, but today, as you mentioned, the mix of versions is not functional. |
@hypeitnow I don't think you need to edit containers.conf, but the rest is okay. Though you might want to try genie first. If successful, then try distrod and my hack :) |
I did so, but unfortunately:
I will try my luck with the kernel release 5.15 from 20.08; they supposedly implemented the miscellaneous cgroup, so maybe it'll pass. EDIT: I did steps 1 to 4 after compiling my kernel in 5.15, and I can call it a success. 22:37:13 ❯ findmnt -o PROPAGATION /
The only thing left is to eliminate this irritating warning WARN[0000] "/" is not a shared mount, this could cause issues or missing mounts with rootless containers |
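The warning refers to the mount propagation of `/`, which is what the `findmnt -o PROPAGATION /` call above reports. As a hedged sketch, the helper below (name invented for this example) decides from that value whether the root mount still needs to be made shared.

```shell
#!/bin/sh
# Return 0 if the given PROPAGATION value does not include "shared".
needs_rshared() {
    case "$1" in
        *shared*) return 1 ;;
        *)        return 0 ;;
    esac
}
```

One possible use, assuming root privileges and that changing propagation is acceptable for the distro, is `needs_rshared "$(findmnt -no PROPAGATION /)" && mount --make-rshared /`. Whether this removes the warning in every WSL setup is not confirmed in this thread.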
With systemd now being officially supported, is there any plan to officially support cgroups v2? At least refactoring that piece of init code that's dependent on cgroups v1 memory controller would be very nice :) |
Well, looks like there's going to have to be a plan. From the release notes from the just-released systemd 252:
|
Looks like we can use
everything seems to be running happily in cgroups v2 only mode. Bar some irritating errors showing up in the |
I can also confirm it seems to be working in WSL 1.0.1.0. (It might not be relevant), but you might have to reboot after |
If I enable 'kernelCommandLine=cgroup_no_v1=all ' |
@ilan-schemoul you need to have wsl version 1.0.1.0. It's noted here:
Funny thing is the referenced issue is only from oct 9 and already had been noticed and "fixed", while this issue is discussing this for many months :P |
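Since the fix depends on the WSL version, a setup script may want to compare version strings before relying on cgroup_no_v1=all. The sketch below assumes a `sort` that supports `-V` (GNU coreutils and BusyBox both do); the function name is made up, and obtaining the installed version (for instance from `wsl.exe --version`) is left out because its output format is not shown in this thread.

```shell
#!/bin/sh
# Return 0 if version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

For example, `version_ge 1.0.3.0 1.0.1.0` succeeds, while `version_ge 0.70.4.0 1.0.1.0` fails.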
Thank you, updating to 1.0.1.0 solved my issue! |
I followed this #6662 (comment) and creating a dir in cgroup and going into it and then doing a very simple with root user : prints "bash: echo: write error: No such file or directory" even though the file exists |
I've added the line to /etc/fstab as suggested in #6662 (comment) by @cerebrate and after Tested both with 1.0.1.0 and 1.0.3.0, with systemd enabled and kernelCommandLine=cgroup_no_v1=all |
Can confirm the @cerebrate trick is working nicely here too on WSL 1.0.3.0 (Docker and Minikube are not complaining for now 👍). This is a step forward, having a native cgroups v2 support would be better ! |
Given the above note regarding the intent of the systemd maintainers, I'm confident it will be coming before too long. |
If systemd is enabled, the startup program changes from /init to /sbin/init, so it can be replaced with a wrapper:
kernelCommandLine="cgroup_no_v1=all systemd.unified_cgroup_hierarchy=1"
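#!/usr/bin/env bash
# Wrap /sbin/init so that cgroup2 is mounted on /sys/fs/cgroup before systemd starts.
rename=/sbin/init-origin
# Already installed: leave the existing wrapper alone.
if [ -f "$rename" ]; then
    exit 0
fi
# Keep the original init under a new name.
mv -f /sbin/init "$rename"
# Write the wrapper. Escaped dollars stay literal in the generated file; $rename expands now.
cat <<EOF | tee /sbin/init >/dev/null
#!/bin/sh
if [ \$$ -eq 1 ]; then
    umount -R /sys/fs/cgroup >/dev/null 2>&1
    if ! [ -d /sys/fs/cgroup ]; then
        mkdir -p /sys/fs/cgroup >/dev/null 2>&1
    fi
    mount -t cgroup2 -o rw,nosuid,nodev,noexec,relatime,nsdelegate cgroup2 /sys/fs/cgroup >/dev/null 2>&1
fi
exec "$rename" "\$@"
EOF
chmod +x /sbin/init
# Write a rollback script that restores the original init and removes itself.
cat <<EOF | tee /sbin/init-reset >/dev/null
#!/bin/sh
mv -f $rename /sbin/init && rm -f /sbin/init-reset
EOF
chmod +x /sbin/init-reset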
|
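The script above relies on how an unquoted here-document treats `$`: plain variables are expanded while the wrapper is being written, and backslash-escaped dollars are left for the wrapper to expand when it runs. The following stand-alone sketch shows that split with a harmless body; `render_wrapper` is a hypothetical helper and only prints to stdout.

```shell
#!/bin/sh
# Print a wrapper script for the given original-init path (illustration only).
render_wrapper() {
    orig="$1"
    cat <<EOF
#!/bin/sh
if [ \$\$ -eq 1 ]; then
    echo "running as PID 1"
fi
exec "$orig" "\$@"
EOF
}

render_wrapper /sbin/init-origin
```

In the printed result the path is already filled in, while the PID test and the argument list still contain literal dollar signs, so they are evaluated only when the wrapper itself is executed as PID 1.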
@pierre-primary Can you explain a bit about your script? how is it compared to just adding cgroup2 mount in /etc/fstab ? |
This approach ensures that the adjustment of cgroup2's mount points is completed before /sbin/init starts. /etc/fstab, by contrast, is handled by /sbin/init itself, which is not very reasonable in terms of ordering. But this only applies to WSL with systemd enabled. When systemd is enabled, the startup program changes to /sbin/init, which can then be replaced by a custom script. By default, /init under WSL is remounted every time, so /init cannot be replaced by a custom script. At the same time, this approach would change the process name. If there are any specific requirements, I can provide a more compatible script.
#!/bin/sh
find_shell() {
while [ $# -gt 0 ]; do
if shell=$(command -v "$1"); then
echo "$shell"
return 0
fi
shift
done
return 1
}
echo_cgroup2_mount() {
cat <<"EOF"
if [ $$ -eq 1 ]; then
umount -R /sys/fs/cgroup >/dev/null 2>&1
if ! [ -d /sys/fs/cgroup ]; then
mkdir -p /sys/fs/cgroup >/dev/null 2>&1
fi
mount -t cgroup2 -o rw,nosuid,nodev,noexec,relatime,nsdelegate cgroup2 /sys/fs/cgroup >/dev/null 2>&1
fi
EOF
}
target_file=/sbin/init
rename_file=/sbin/init-origin
reset_file=/sbin/init-reset
# Avoid changing it again
if [ -f $rename_file ]; then
exit 0
fi
# Rename the origin file
mv -f $target_file $rename_file
# If it is a symbolic link, get the real path
real_file=$(readlink $rename_file)
if [ -z "$real_file" ]; then
real_file=$rename_file
else
real_file=$(realpath $rename_file)
fi
# Create the replacement script
{
if shell=$(find_shell ash bash); then
echo "#!$shell"
echo_cgroup2_mount
echo 'exec -a "$0" "'"$real_file"'" "$@"'
else
echo "#!/bin/sh"
echo_cgroup2_mount
echo 'exec '"$real_file"' "$@"'
fi
} | tee $target_file >/dev/null
chmod +x $target_file
# Create a rollback script
cat <<EOF | tee $reset_file >/dev/null
#!/bin/sh
mv -f $rename_file $target_file && rm -f $reset_file
EOF
chmod +x $reset_file |
I suddenly realized that |
Confirming, WSL2, AlmaLinux9, with a custom kernel, adding |
Fixes issues when using systemd 256 microsoft/WSL#6662
Fixes issues when using systemd 256 Ref: microsoft/WSL#6662 Co-authored-by: Andarwinux
In the meantime, you could use the CONFIG_CMDLINE fixed wsl2 kernel https://github.com/Locietta/xanmod-kernel-WSL2?tab=readme-ov-file#usage |
Environment
Steps to reproduce
Only the cgroup V2 hierarchy is built because the "mixed" setup has been prohibited as a dead end. The recent runc (Docker 20.10) and crun switched to supporting cgroup V2. It is necessary for rootless user mode, so it is important for WSL users.
The conversion between mixed mode and cgroup V2 is not supported anymore for the reasons mentioned above.
WSL logs:
Expected behavior
Only the cgroup V2 hierarchy is created: /sys/fs/cgroup/unified/ and all controllers are put into the correct place.
Actual behavior
/sys/fs/cgroup is polluted with random content such as controllers and a systemd folder:
ls /sys/fs/cgroup
blkio cpu,cpuacct cpuset freezer memory net_cls,net_prio perf_event rdma unified
cpu cpuacct devices hugetlb net_cls net_prio pids systemd
Please correct this to allow upgrading Docker and Podman to the recent releases and working as a rootless user. This is also a security issue because the WSL root user has unlimited access to the Windows Program Files and ProgramData directories, i.e. can inject any malicious executable into Windows and run it as MyVirus.exe.