Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runc update: skip devices #2994

Merged
merged 1 commit into from
Jun 4, 2021

Conversation

kolyshkin
Copy link
Contributor

@kolyshkin kolyshkin commented Jun 3, 2021

Since runc update CLI is not able to modify devices (and it is not clear
how to implement that properly), let's set SkipDevices=true,
so that a cgroup controller won't try to update devices cgroup.

This helps use cases when some other device management (NVIDIA GPUs)
applies its configuration on top of what runc does.

Make sure we do not save SkipDevices into state.json.

This is an alternative to #2988.

Proposed changelog entry

  • runc update no longer updates cgroup devices configuration.

The runc update CLI is not able to modify devices, so let's set SkipDevices
(so that a cgroup controller won't try to update devices cgroup).

This helps use cases when some other device management (NVIDIA GPUs)
applies its configuration on top of what runc does.

Make sure we do not save SkipDevices into state.json.

Signed-off-by: Kir Kolyshkin <[email protected]>
Copy link
Member

@cyphar cyphar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I think dropping SkipDevices from the state is a good idea regardless of the original issue (and I prefer this to the other PR).

@cyphar
Copy link
Member

cyphar commented Jun 4, 2021

Though it should be noted this pretty much invalidates the runc update test I added in #2951 -- now runc update no longer triggers the cgroupv2 devices re-apply. So maybe it'd be better to remove the test since it's a complete no-op now (we can add a Go-based integration test later).

@kolyshkin
Copy link
Contributor Author

Yes, I am going to remove the test and maybe try to add a unit test.

I'd rather do it separately from this PR though. Let's discuss it in there: #3000

@kolyshkin kolyshkin merged commit 2f8e8e9 into opencontainers:master Jun 4, 2021
vasiliy-ul referenced this pull request in kubevirt/kubevirt Jan 3, 2024
CPU manager periodically updates cgroup settings via the container
runtime interface. Runc prior to version v1.0.0 drops all 'custom'
cgroup device rules on 'update' and that causes a race with live
migration when block volumes are hotplugged. Try to setup the test in a
way so that the VMI is migrated to a node without CPU manager.

Signed-off-by: Vasiliy Ulyanov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants