Group mapping in rootless #13090
Because those group IDs are not mapped inside the container's user namespace. Any ID not mapped into the user namespace is reported as 65534 (nobody).
Hi @rhatdan, I'm a bit confused. When I run it, I get:

HGROUPS
558749,558749,2001

Do I need to create group1 and group2 inside the container? Thanks
No, did you run your container with the annotation?
Now, in a different terminal, run:
Yes, that's what I did:

$ podman run -it --rm --userns=keep-id --annotation run.oci.keep_original_groups=1 docker.io/library/bash
bash-5.1$ id
uid=2001(test) gid=2001(xxxxxxx) groups=65534(nobody),65534(nobody),2001(test)

$ podman ps
CONTAINER ID  IMAGE                          COMMAND  CREATED        STATUS            PORTS  NAMES
070992ab2903  docker.io/library/bash:latest  bash     5 seconds ago  Up 5 seconds ago         quirky_nas

$ podman top 070992ab2903 hgroups
HGROUPS
427677,427677,2001
That looks correct, although podman top might have a bug here, since it printed the first leaked group twice. @vrothberg PTAL; it looks like we might have a bug in podman top.
@giuseppe PTAL, I am not sure we are leaking groups in podman 4.0.
It seems to work for me. What groups do you have on the host? Can you check:
Just to make sure I understand what's happening: if I run podman in rootless mode and add the annotation,
The Linux kernel maps GIDs that are not part of the user namespace mapping to the overflow GID.
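That translation can be sketched as a lookup over the namespace's gid_map entries (each line is "inside outside count"). The following is an illustrative simulation of the behavior described above, not kernel code, and the example mapping is invented:

```python
# Sketch: simulate how the kernel translates a host (outer) GID through a
# user namespace's gid_map.  Each entry covers a contiguous range; any GID
# not covered by an entry is reported as the overflow GID (65534 by
# default, see /proc/sys/kernel/overflowgid).
OVERFLOW_GID = 65534  # kernel default; configurable on the host

def map_gid(host_gid, gid_map):
    """gid_map: list of (inside_start, outside_start, count) tuples."""
    for inside, outside, count in gid_map:
        if outside <= host_gid < outside + count:
            return inside + (host_gid - outside)
    return OVERFLOW_GID

# A keep-id style mapping: the user's own GID 2001 is mapped 1:1, plus a
# subordinate range taken from /etc/subgid (numbers are illustrative).
gid_map = [(2001, 2001, 1), (1, 100000, 999)]

print(map_gid(2001, gid_map))  # the user's own group: mapped 1:1
print(map_gid(1001, gid_map))  # host group outside the mapping: overflow
```

This is why the supplementary groups above show up as 65534(nobody): they fall outside every mapped range.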
Yes, an example would be:
What could the solutions be for also mapping GIDs that are not part of the user namespace mapping?
In my case:

$ grep ^Groups /proc/2410/status
Groups: 1001 2000 2001
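That Groups line from /proc/<pid>/status is the host-side view of the supplementary groups. A minimal sketch of pulling it out programmatically (the sample text is invented to match the output above):

```python
# Sketch: parse the supplementary group list of a process from its
# /proc/<pid>/status contents, the same information that
# "grep ^Groups /proc/<pid>/status" shows.
def parse_groups(status_text):
    for line in status_text.splitlines():
        if line.startswith("Groups:"):
            return [int(g) for g in line.split()[1:]]
    return []

sample = "Name:\tbash\nGroups:\t1001 2000 2001 \n"
print(parse_groups(sample))  # [1001, 2000, 2001]
```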
So I think we would have to find a way to "leak" the host process' groups (e.g., into an environment variable).
Is the HOSTS_GROUPS available inside the container, or just to podman top?
It does not exist yet, but I would leak it before re-execing into Podman's user namespace.
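A minimal sketch of that idea: capture the host-side group list before re-execing, and hand it to the re-exec'd process via the environment. The variable name _CONTAINERS_HOST_GROUPS is invented here for illustration; it is not an existing Podman variable:

```python
# Sketch: leak the host process' supplementary groups to a re-exec'd
# child via an environment variable, before the child joins the user
# namespace.  _CONTAINERS_HOST_GROUPS is a hypothetical name.
import os
import subprocess
import sys

def reexec_with_host_groups():
    host_groups = ",".join(str(g) for g in sorted(os.getgroups()))
    env = dict(os.environ, _CONTAINERS_HOST_GROUPS=host_groups)
    # The child (standing in for Podman inside the user namespace) can
    # still read the pre-join group list from its environment.
    out = subprocess.run(
        [sys.executable, "-c",
         "import os; print(os.environ['_CONTAINERS_HOST_GROUPS'])"],
        env=env, capture_output=True, text=True, check=True)
    return out.stdout.strip()

print(reexec_with_host_groups())
```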
Right, I think you could set this in the user namespace by default then.
@giuseppe, should we always leak this to the user namespace, or only when running top? We could force this to happen in rootless.c.
Would that work, though? We are injecting the groups of the current process, but we should read the container process' groups instead.
Yes, this kind of sucks. Is there a way to first look at the process outside of the user namespace, and then enter the user namespace to continue into the PID namespace?
I am still looking into whether we can leak it.
The only way so far seems to be doing it in two steps: do not join the user namespace directly, read this information from the host, then re-exec a helper process to read everything else. It looks like a corner case though; is it even worth supporting?
I agree. It looks like a substantial massaging of the code for a corner case.
Can you elaborate on what you mean by "marking"?
Just convert the overflow ID to something clearer, like "Not Mapped" or something people can understand more easily.
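As a sketch of that presentation fix (illustrative only, not Podman's actual code), a renderer could special-case the overflow GID when it is not the primary group:

```python
# Sketch: render a group list the way "podman top" might, replacing the
# overflow GID with a clearer "Not Mapped" marker instead of a
# misleading "65534(nobody)" entry.
OVERFLOW_GID = 65534

def render_groups(gids, primary_gid):
    labels = []
    for gid in gids:
        if gid == OVERFLOW_GID and gid != primary_gid:
            labels.append("Not Mapped")
        else:
            labels.append(str(gid))
    return ",".join(labels)

print(render_groups([65534, 65534, 2001], primary_gid=2001))
# Not Mapped,Not Mapped,2001
```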
Well, that is the issue: everyone who has hit this is already complaining about seeing the nobody entries.

$ podman run --group-add=keep-groups alpine groups

Couldn't we just leak in a list of groups via an environment variable on podman top, and then substitute the nobody entries with IDs from the list other than the primary group?
A friendly reminder that this issue had no activity for 30 days. |
@vrothberg @giuseppe Let's talk about this at the Watercooler tomorrow.
@rhatdan, what would that env variable look like? Wouldn't we need to inject the entire mapping? That would make me nervous for security reasons.
It could be the output of the command above. The problem I see is that this information may be different from what the container process is actually using. It is rarely changed, but if that happens, it is going to be difficult to find out what happened and why.
Well, podman top returns the wrong information now. The issue is that we cannot get the actual GIDs of the leaked FDs. If we just leaked the FDs in as the current list, and we found a matching list of nobody entries, we would be 99% sure that they are the leaked FDs.
Actually, I think we would need to record the output of grep ^Groups /proc/self/status into the container info, so we could record which groups were leaked. Then podman top could look this information up when it sees multiple nobody groups in /etc/group.
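A rough sketch of that bookkeeping, under the assumption that the leaked groups keep their original order (the function name and data shapes are invented for illustration):

```python
# Sketch: container info records the host process' groups at creation
# time; "podman top" then substitutes recorded GIDs for overflow entries
# it cannot resolve.  Assumes leaked groups preserve their host order.
OVERFLOW_GID = 65534

def resolve_hgroups(seen_gids, recorded_host_groups):
    resolved, leaked = [], iter(recorded_host_groups)
    for gid in seen_gids:
        if gid == OVERFLOW_GID:
            # Take the next recorded host group; fall back to the raw
            # overflow GID if the recorded list runs out.
            resolved.append(next(leaked, gid))
        else:
            resolved.append(gid)
    return resolved

# podman top saw 65534,65534,2001; container info recorded 1001 and 2000.
print(resolve_hgroups([65534, 65534, 2001], [1001, 2000]))
```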
Update

Apparently what I want is called rootless ID-mapped mounts, and it is not supported yet in the kernel due to security concerns in the design. My "solution" here is a proposal for (1) a permission system for rootless ID-mapped mounts and (2) an idea of not only mapping "container uids to high uids at the host". However, I guess the following applies:
Thanks anyway for reading. And apologies for probably wasting your time; I'm learning.

Context

When using rootless containers, for instance with podman, podman creates a user namespace following the settings defined in /etc/subuid and /etc/subgid. These settings allow users and groups in the user namespace (inside the container) to be mapped to a reserved range (if done correctly, a range unique to each user) in the host/parent namespace. This correspondence lets us create files with different user/group ownership inside the namespace without colliding with any other user in the host namespace. Specifically, UID 0 and GID 0 in the user namespace are mapped to the default user ID and default group ID, so it is easy for processes in the user namespace to make files owned by the parent user: just assign them to root inside the user namespace.

Problem

I do not know of an easy way to configure the opposite: I would like to map groups in the host to a reserved range inside the namespace (you have called this "group leaking"). For instance, if I have an "engineering" group in my host system, e.g. with GID 1000, as system administrator I would like the default user namespaces in rootless podman to see mounted host files belonging to the "engineering" group (and ideally not other random files in the container) as belonging to the "engineering" group inside the user namespace as well.

Solution?

I believe it would make sense to have a configuration file for this. This list could be given in the following format:
For instance:
This would automatically map, for all users in the engineering group (as given by the last field), the engineering group (first field). It would be convenient for rootless containers that are expected to access directories mounted as volumes owned by secondary groups. podman (via crun) can now use this. Besides leaking the groups into the namespace, podman could additionally append the leaked groups to the container's group database.

Final words

If that's already doable with some setting and I have missed it, I apologize. I would appreciate your feedback. I am not sure I can contribute to this, since it is far from my field of knowledge, but I'd certainly love to use this feature. Thank you for your time reading this and your work on podman.
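For context, the proposal builds on the existing name:start:count format used by /etc/subuid and /etc/subgid. A minimal parser sketch for that format (illustrative, not shadow-utils code; the sample entries are invented):

```python
# Sketch: parse /etc/subgid-style entries ("name:start:count"), the
# existing mechanism that subordinate GID ranges are configured with.
def parse_subgid(text):
    ranges = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        name, start, count = line.split(":")
        ranges.setdefault(name, []).append((int(start), int(count)))
    return ranges

sample = "alice:100000:65536\nbob:165536:65536\n"
print(parse_subgid(sample))
```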
Adding my use case here. Developers are using rootless containers. Having a straightforward way to map at least some host groups into the container would be useful. We've worked around this for now and mark a subset of tests as unsupported in the container configurations that lack the requisite group mappings. This isn't new; there are some tests we can't run in containers at all (like tests which themselves use namespace isolation to test things).
This won't work even if we solve the issue above. A group will still show as "nobody". For a rootless user, you need to make sure these additional GIDs are added through the subordinate GID configuration.
It would be great if user groups could be added to the new user namespace via newgidmap, but I guess the risk is that DAC_OVERRIDE might allow users to modify group files.
/kind bug

Description

User group mappings are not kept when using:

--annotation run.oci.keep_original_groups=1

On the host:

When I run the container:

I'm not sure I understand why my group1 and group2 are mapped to nobody.

Steps to reproduce the issue:

1. Create a user.
2. Create 2 groups and add them to the user.
3. Run a container with --userns=keep-id and with the annotation run.oci.keep_original_groups=1, and check what your groups are. They should be mapped as on your host.

Describe the results you received:

Describe the results you expected:

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

Output of podman info --debug:

Package info (e.g. output of rpm -q podman or apt list podman):

I use the Fedora CoreOS AWS AMI.

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

Run on the AWS Fedora CoreOS official image.