-
Notifications
You must be signed in to change notification settings - Fork 460
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
userns: Skip tests if the host doesn't support idmap mounts #1489
Conversation
Welcome @rata! |
@@ -876,6 +879,11 @@ var _ = framework.KubeDescribe("Security Context", func() { | |||
if !supportsUserNamespaces { | |||
Skip("no runtime handler found which supports user namespaces") | |||
} | |||
|
|||
pathIDMap := rootfsPath(resp.GetInfo()) | |||
if err := supportsIDMap(pathIDMap); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why don't we use the runtime handler features feature? shouldn't containerd not report the feature is available if it's not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The runtime handler features reports if the OCI runtime knows about idmap mounts and how to handle them, this is done at compile time. The thing is, OCI spec mandates that unrecognized lines are ignored. So, an old OCI runtime that doesn't know about idmap mounts, will just ignore the lines "uidMapping" and so on the mounts.
The end result if that happens is horrible, because there is no mapping applied and then you use the volume with the hostUID (the ID on the host the container is mapped to) and then create files. The hostUID changes every time you create the pod, so the volume will be with garbage in the UID/GID of files.
This is why we added to the runtime-spec the way to report in features if idmap mounts are supported: opencontainers/runtime-spec#1219
That is to avoid that situation: you just don't give idmap mounts to runtime that don't know about it (that report no support for idmap mounts). Because they will ignore it and create the aforementioned mess.
So once you know the runtime will handle it properly (using the features you mentioned), you can pass it to the runtime. But idmap mount support is filesystem-dependant. That means, each filesystem used needs to support it. And more filesystems are adding support every kernel release.
So you send the request to the OCI runtime to use idmap mounts on some volumes, but the runtime can return an error if the given filesystem doesn't support idmap mounts on the kernel it is running.
In particular in AlmaLinux 8 what is happening is: runc is new enough to handle them correctly, so we can pass a config.json with idmap mounts to that runc version. So we do. But the kernel is old and doesn't support the syscalls needed (so it doesn't support idmap mounts), so runc returns an error when trying to set it up.
What this PR does is just testing for idmap mounts supports and, if it is not supported, it skips the test.
tl; dr: the features command is useful to know if the OCI runtime knows about idmap mounts at compile time (see more info on the problem it solves here: opencontainers/runtime-spec#1219). However, we can run runc in filesystems that don't support idmap mounts (one fs can support it but another don't), and runc will return an error in those cases. We just skip the tests instead, to avoid the expected error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the shorter answer is: containerd can only report it will be used or fail, but not ignored (as it was before if containerd/runc didn't know about idmap mounts and userns).
But each file-system used by mounts on the container needs to support idmap mounts. So it can't report yes/no without knowing which pod (e.g. if the pod users NFS, then it will fail, but if only uses ext4 and tmpfs, it will work on lot of kernels).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what file system are we trying to create a mount here? if runc thinks alma support idmapped mounts I'd expect all the mounts we create to work. like, we control what environments we run these tests in. We could make sure they're not nfs but instead ext4 or tmpfs (whatever the alma kernel supports at that time). What file systems are these tests running on that are failing, but runc is saying it's supporteD?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@haircommander I'm not sure what you are thinking. I'm definitely missing something, but I'm not sure what.
runc can't do that, that is really not possible. For several reasons, one is that the spec mandates to not do that:
The Features structure is irrelevant to the actual availability of the features in the host operating system. Hence, the content of the Features structure SHOULD be determined on the compilation time of the runtime, not on the execution time.
(from https://github.com/opencontainers/runtime-spec/blob/main/features.md)
And this to be at compilation time is really reasonable, for several reasons.
Another reason is that the syscalls are not supported in AlmaLinux 8 (opentree, mount_setattr and move_mount).
As I said in the first message: the features thing is only useful to see if the runtime will not ignore the idmap setting and create a mess. It's okay to return an error if some fs is not supported, the kernel is too old or something. That is what we want. We just need to skip tests on setups that will fail.
Am I missing something?
critest is used in projects like containerd, that test against older distros (like AlmaLinux 8). In those distros, CI will fail when we upgrade to runc 1.2.0. With runc 1.1 those test don't fail because runc doesn't support idmap mounts and the tests are skipped in that case. But with runc 1.2.0-rc.2, that supports idmap mounts, the tests are not skipped but fail on distros with older kernels that don't support idmap mounts. This commit just tries to detect if the path used for the container rootfs supports idmap mounts. To do that it uses the Status() message from CRI with verbose param set to true. It parses the output that containerd sets (it's quite unspecified that field), and otherwise fallbacks to "/var/lib" as the path to test idmap mounts support. Signed-off-by: Rodrigo Campos <[email protected]>
160798c
to
d929d7a
Compare
Sascha suggested to run this only once. Let's cache the answer from the runtime and move the tests that need idmap mounts on the host to `When("Host idmap mount support is needed"`. While we split the tests in that way, let's just query idmap mount support for the tests that need it, using the cache. Signed-off-by: Rodrigo Campos <[email protected]>
/hold Feel free to lift the hold once we got more review or you think this is good to be merged. |
3b082e9
to
2db7481
Compare
The test fail was unrelated to this, some github issue:
I pushed again due to that. @saschagrunert thanks a lot! I polished the code and fixed some edge cases now. I've also verified it passes all the tests in the containerd CI too(https://github.com/containerd/containerd/actions/runs/9859447900/job/27223414193?pr=9313) :) |
containerd creates a userns and inside there, it runs the critest tool. However, in that setup, the length of containerd's userns is not the whole UID space. Let's verify that the length of the userns inside the pod, when we created it with NamespaceMode_NODE (IOW, when not using a new userns for the pod) is the same as outside the pod. This works fine when contained itself runs inside a userns. Signed-off-by: Rodrigo Campos <[email protected]>
/unhold |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: rata, saschagrunert The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
I created |
I'm testing containerd with runc 1.2.0-rc.2 and I found this issue. This (and other fixes that belong to the containerd repo) make the CI happy again.
It will be great if we can include this in a 1.30.1 crit-tools release, so we can update it in the containerd CI. We need to prepare for the runc 1.2.0 final release, so we can just update whenever it is released :).
The containerd PR is this: containerd/containerd#9313. I did a hack (use my fork) to test the fixes indeed fix everything on containerd CI. It does, see here: https://github.com/containerd/containerd/actions/runs/9808474525/job/27084476978?pr=9313.
Without this commit, it fails on AlmaLinux 8 saying the syscall is not implemented: https://github.com/containerd/containerd/actions/runs/9797348691/job/27053863516?pr=9313#step:9:203
Is it possible to release cri-tools 1.30.1 soon with this change?
cc @saschagrunert
critest is used in projects like containerd, that test against older distros (like AlmaLinux 8). In those distros, CI will fail when we upgrade to runc 1.2.0.
With runc 1.1 those test don't fail because runc doesn't support idmap mounts and the tests are skipped in that case. But with runc 1.2.0-rc.2, that supports idmap mounts, the tests are not skipped but fail on distros with older kernels that don't support idmap mounts.
This commit just tries to detect if the path used for the container rootfs supports idmap mounts. To do that it uses the Status() message from CRI with verbose param set to true. It parses the output that containerd sets (it's quite unspecified that field), and otherwise fallbacks to "/var/lib" as the path to test idmap mounts support.
What type of PR is this?
/kind bug
/kind failing-test
What this PR does / why we need it:
We need it to fix the CI in containerd, once we update runc to 1.2.
It might be needed here too, once runc 1.2 is used here
Which issue(s) this PR fixes:
None
Special notes for your reviewer:
Does this PR introduce a user-facing change?