Reducing CPU period fails for subsystems if existing parent has quota>0 with systemd driver #3084
Comments
Note that systemd >= 242 is required to set systemd I see a few ways to fix this:
Do you mean 239?
Testing that it fails without the fix. Signed-off-by: Kir Kolyshkin <[email protected]>
I think systemd is relevant, but for different reasons than those in items 1 and 2 in my earlier comment. It seems that if the systemd driver is used, systemd sets the period to 1000000 (i.e. 10x of the default period of In any case, items 1 and 2 are not entirely correct, as with systemd we do not set "quota" and "period" separately, but a combined value Item 3 gives a good description of what is happening. A fix is on the way (it took me a long time to code the test case).
Thanks so much! This is Red Hat Enterprise Linux, which means systemd is a lower version but contains a lot of cherry-picked fixes backported into it. I see the same behavior in RHEL7 (systemd 219) and RHEL8 (systemd 239). Maybe that is the root cause: It doesn't look like there's an easy way to detect whether
In fact, it's not absolutely required to set a period via systemd, as we set it via the fallback fs driver anyway, so I am dropping the idea of using more sophisticated methods of figuring out whether systemd supports the properties that we want to set. I have the fix almost ready; the only problem is the test case, in particular #3090 (comment). Perhaps you can help with that @hk-vmg ?
Sorry for the delay, and thank you so much for the progress. I'll see if I can add anything, but I see you've likely got most or all of the way there.
Sometimes setting CPU quota period fails when a new period is lower, and a parent cgroup has CPU quota limit set.

This happens as in cgroup v1 the quota and the period can not be set together (this is fixed in v2), and since the period is being set first, new_limit = old_quota/new_period may be higher than the parent cgroup limit.

The fix is to retry setting the period after the quota, to cover all possible scenarios.

Tested via runc integration tests. Before the commit, it fails:

    root@ubu2004:~/git/runc# RUNC=`pwd`/../crun/crun.before bats -f "pod cgroup" -t tests/integration/update.bats
    1..1
    not ok 1 update cpu period in a pod cgroup with pod limit set
    # (in test file tests/integration/update.bats, line 424)
    # `[ "$status" -eq 0 ]' failed
    # crun.before spec (status=0):
    #
    # crun.before run -d --console-socket /tmp/bats-run-30428-dYkMDC/runc.4FdCtn/tty/sock test_update (status=0):
    #
    # crun.before update --cpu-quota 600000 test_update (status=1):
    # writing file `cpu.cfs_quota_us`: Invalid argument
    # crun.before update --cpu-period 10000 --cpu-quota 3000 test_update (status=1):
    # writing file `cpu.cfs_period_us`: Invalid argument

With the fix, the test passes.

Originally reported for runc in opencontainers/runc#3084

Signed-off-by: Kir Kolyshkin <[email protected]>
When setting a lower CFS CPU period, creating a runc container in Kubernetes using the systemd driver fails with:
write /sys/fs/cgroup/cpu,cpuacct/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod<HASH>.slice/cri-containerd-<HASH>.scope/cpu.cfs_period_us: invalid argument: unknown
The cgroupfs driver works fine.
It appears that order matters when both the period and quota files are changed. I believe the root cause is that the one-at-a-time updates to the child files cause the parent limits to be exceeded. This is seen in Kubernetes: if all containers in a pod have limits, then the parent "pod" slice has a quota set, and the child "container" settings can't be updated.
https://github.com/opencontainers/runc/blob/master/libcontainer/cgroups/fs/cpu.go#L74
Tested with the latest master branch of runc. systemd is version 219, but I believe I have all the required OS/systemd patches:
@sureshvis