Better support for rootless containers #636
Support for rootless RStudio for podman has been discussed before, e.g. rocker-org/rocker#202, #108, #346, and I think some users may have gotten it working? (@agila5 , or @hute37 may have pointers?) Rootless RStudio sessions are also possible through singularity, which is actually even documented https://rocker-project.org/use/singularity.html (yay!, though I think it needs updating?) @zeehio do you have a working rootless image using the approach you outline here? This looks like only a change to the init_userconf.sh script, which by itself I don't think will work (the default command on the rstudio-based Dockerfiles is not changed in this PR, which is anyway, rootless support is certainly an important theme and we welcome discussion, at very least we could probably add a docs page on the topic as well! |
@zeehio Out of curiosity: What environment is the use case here: A multi-tenant Linux server with rootless docker/podman and possibly X11 installed? |
Well in my case no X11 but yes. Ubuntu server with rootless podman |
IMHO only one user should be allowed to deploy applications on a server – but that is a matter of opinion. The deployed application should be multi-tenant and as much isolated as possible. Yes: user namespaces; No: mounting of volumes (= Not even rootless docker/podman for other users). |
In our statistical labs, we need to provide all students with a user-customizable R environment without admin (root) access, based on Azure Ubuntu workstations. Using containers was a natural choice, so we decided to base our standard project on Rocker images. But Docker wasn't an option because of (too) many security issues in our context. Singularity, which is used in HPC environments, could be a good alternative. The alternative we chose was Red Hat "podman", which provides a service-less, "root-less" runtime, "sandboxed" in the user's home, without admin access, and able to share mounted directories. Using standard RStudio in rocker images poses a problem: "by default, the RStudio login requires a process user change to an unprivileged user in container". This is seen as a security requirement in a host-based server installation, but in a podman "root-less" environment it not only brings no security gain, it also causes subuid/subgid mapping issues in volume sharing between host and container. Our approach was to "patch" the rstudio config in order to allow "root" as a valid RStudio user, running privileged (uid:0, gid:0) in the container. For details: Also consider The login in rstudio still poses some issues anyway:
We need to enable alternative group membership in order to access data in shared volumes. This is not a complete solution. It works only in R/Rscript but not after RStudio login, after which alternative group membership is lost anyway. Now we are busy trying to enable NVIDIA CUDA support in the container. Rootless container support in Kubernetes is another thing. It was discussed here: For podman don't miss Dan Walsh's Blog: __ As a security note, be careful about storing your ~/.ssh private keys if github/gitlab enabled ...
@hute37 You might be interested in my CUDA-enabled JupyterLab R docker stack. |
The Dockerfile I used:
where init_userconf.sh is the modified file in this PR. I created the container with:
And it worked. Here is my podman info if you want:
It works! |
Thanks @zeehio , this is very nice. Would you be interested in drafting an Like you say, this is a nice approach as it is mostly an opt-in extension and so doesn't look likely to create any breaking changes. @eitsupi @noamross @eddelbuettel any thoughts or concerns here?
(not directly related, but I think the other context we often get inquiries about rootless deploy is for kubernetes setups, though not strictly necessary. I guess there is some open question about documenting rootless concepts generally vs podman-specific)
Thank you for working on this! To be honest, I am not sure if this works properly as I do not normally use rootless Docker, Podman, etc. |
Some notes ... The solution of patching
enables To avoid warnings, it is possible to add also:
In documentation, these podman options could be reported:
For CUDA support, I follow these docs: Podman invocation options example:
|
@cboettig I will work on the documentation. With respect to my first post, I said:
I'm happy to say I have found the way. 👍
Short answer
On I will update my pull request accordingly. RUNROOTLESS still can be defined by the user, but now we have a default detection method that should "just work".
Long explanation
I found a way to detect if a process is running inside a user namespace or not. When we are not mapping any namespace there is a 1-1 uid map for all 32-bit possible user ids. On my ubuntu system:
With podman running as root we are not mapping ids:
If we are under any namespace, we won't be mapping everything linearly so we won't find a Rootless podman:
According to podman's source code both the runc and crun container runtimes use this to check if we are on a rootless container or not https://github.com/containers/podman/blob/d05a9807928c1deada0f0294cd912825a0487832/pkg/rootless/rootless_linux.go#L235. |
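The detection described above can be sketched as a small shell function. This is only an illustration of the idea, not the script from the pull request: `is_rootless` is a hypothetical helper name, and the optional file argument exists only so the check can be tried against a saved copy of a uid map.

```shell
#!/bin/sh
# is_rootless [MAP_FILE]  (hypothetical helper, not part of the PR)
# Prints "false" when the uid map contains the full-range length 4294967295,
# which is what the initial (non-namespaced) mapping shows, and "true" otherwise.
is_rootless() {
    map_file="${1:-/proc/self/uid_map}"
    if grep -q 4294967295 "$map_file"; then
        echo "false"
    else
        echo "true"
    fi
}

# Possible use inside an init script (assumption: RUNROOTLESS defaults to "auto"):
#   [ "${RUNROOTLESS:-auto}" = "auto" ] && RUNROOTLESS=$(is_rootless)
```

A user-supplied RUNROOTLESS value would still take precedence, since the function is only consulted when the variable is left at "auto".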
Hi @hute37, You were saying:
I have a solution, but it comes with a bit of a longer explanation: The way this "keep-groups" works is that it gives the process that runs when the container starts the groups of the "podman run" process (your user groups). However, since the additional group IDs are not mapped into the namespace where the container processes run, the processes inside the container do not have a mapped GID that they can use to name the groups they belong to, and that's why you see "nobody" or "nogroup" if you ask for their IDs. When we run R or an Rscript directly, the process already has the groups set and it does not need to refer to the IDs in any way, so it works. However, when we start a new process through the RStudio web interface we can't set those groups directly because inside the container they do not even have a group ID. We are not even able to Our only option to work around this is to ask a sysadmin to subordinate the GID of the group we want to see to our users. This subordination happens through the What are those We are going to tell the linux kernel that we want to allow our users
With this file, alice can impersonate 65536 group ids starting from 100000 and up to 165535 (required to run containers in general) and impersonate one additional group with id 2000. For bob we see the same, except that the range of group IDs does not overlap with alice's. Since the group id 2000 is now subordinated to them, podman will automatically map the group id to the container, and the container will be able to give an internal id to it. With a mapped gid we can run usermod inside the container, so when the root user logs in through the web interface he/she will have that group set. And everything will work. Everything? Almost! We can't control how podman assigns that internal id, so it usually is GID 1 and it usually has an ugly name like But if we don't mind about this ugly name, everything works.
TLDR
With the userconf script in this pull request, you can keep groups inside rstudio as well, as long as you delegate the group id you want at the host, editing If your
You need to append one line per user. And you may need to This allows them to impersonate that group in the containers. I believe it is a smaller security risk than giving them I'm trying to improve the group naming and numbering inside the container at: where we may get more feedback |
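As a rough way to check the delegation described above, the sketch below tests whether a given GID falls inside any range listed for a user. It assumes the standard name:start:count line format of /etc/subgid; `gid_is_delegated` is a hypothetical helper, and the alice/bob entries in the comment are illustrative values that follow the description in this comment, not a copy of the original file.

```shell
#!/bin/sh
# gid_is_delegated USER GID [FILE]  (hypothetical helper)
# Succeeds (exit status 0) when GID lies inside one of USER's ranges in FILE.
# Assumes lines of the form name:start:count, e.g.
#   alice:100000:65536
#   alice:2000:1
#   bob:165536:65536
gid_is_delegated() {
    user="$1"
    gid="$2"
    file="${3:-/etc/subgid}"
    awk -F: -v u="$user" -v g="$gid" '
        # A range covers start .. start + count - 1
        $1 == u && (g + 0) >= ($2 + 0) && (g + 0) < ($2 + $3) { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$file"
}

# Example: gid_is_delegated alice 2000 && echo "gid 2000 can be mapped for alice"
```

If the check fails for the shared group, the group will most likely still show up as "nobody"/"nogroup" inside the container.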
About "group mapping" ... I agree with
In our context, I found a pair of use-cases when group sharing was a requirement. Unable to address this requirement in RStudio configuration, we applied external workarounds:
Volumes
In general, native podman volume management probably is the best option here. I'm not an expert, but I think this is the standard approach in Kubernetes environments. Maybe With bind volumes there are also some mount options that support uid/gid mapping (I haven't tried them). For default podman group membership mapping:
RStudio "pseudo-login"
RStudio comes in two flavors: "Community" and "Pro", where most differences are in the full user environment setup after RStudio login. In the community version, the rsession process does not support a true (PAM-compliant) login process, but implements a "basic" change in process credentials. Cfr. discussion here: This "pseudo-login" process probably does not support
Security
While disabling RStudio login could help in this context, I think that is not a valid solution. One important point in security is that "root" escalation is not the only sensitive target. Your work is a "target":
While this is true also in "rootless" podman environment, to enable unauthorized "sudo" on privileged ("docker" group membership) docker environments can compromise host machine security. |
Could you format like this? (Please check the CI log)
--- /github/workspace/scripts/init_userconf.sh.orig
+++ /github/workspace/scripts/init_userconf.sh
@@ -13,7 +13,7 @@
RUNROOTLESS=${RUNROOTLESS:=auto}
if [ "${RUNROOTLESS}" = "auto" ]; then
- RUNROOTLESS=$(grep 4294967295 /proc/self/uid_map > /dev/null && echo "false" || echo "true")
+ RUNROOTLESS=$(grep 4294967295 /proc/self/uid_map >/dev/null && echo "false" || echo "true")
fi
USERHOME="/home/${USER}"
Once this is fixed, may I merge this?
I fixed that linting you mentioned and I replaced my (hardcoded) overflow GID with the actual overflow GID reported by the kernel, as it should be. Meanwhile I'm trying to provide a pull request to podman providing the features discussed in their thread, which will make the configuration of additional groups straightforward in rootless settings with rstudio. The pull request for the webpage is accurate, but I hope to eventually be able to simplify it a lot |
Thanks for updates! I would like to merge this for now. |
Merge this for now.
Please let me know if you encounter any problems, thanks.
Tagging @cboettig since he suggested that I write something here. I have a solution for running the container rootless: - rocker-org/rocker-versioned2#636 There are many advantages to rootless containers, the main one being security. The main caveat with rootless containers appears when we want to map additional groups to the container (for instance when we have an additional group that owns a "shared_data" directory we want to access). In that case, we still need to learn quite a bit about id mapping. I've done my best to explain how things work and to provide a step-by-step guide in this pull request. Hopefully this will eventually be simplified. It may be that I have overlooked something - containers/podman#18333 I guess we can wait some days to see how the issue evolves. It may be that I've missed something and my solution is overly complicated, or that some feature needs to land in podman to simplify additional group management. English is not my primary language. I would appreciate feedback or changes in wording. Besides, I've been writing this for too long. I may need to take some time to get some perspective and re-read it, but I believe it is worth a first read. --------- Co-authored-by: eitsupi <[email protected]>
@zeehio @hute37 You may be interested in my multi-arch ( |
@zeehio My CUDA-enabled JupyterLab docker stacks now support Docker/Podman rootless mode. E.g.
Create an empty home directory:
mkdir "${PWD}/jupyterlab-root"
Use the following command to run the container as
podman run -it --rm \
-p 8888:8888 \
-u root \
-v "${PWD}/jupyterlab-root":/home/root \
-e NB_USER=root \
-e NB_UID=0 \
-e NB_GID=0 \
glcr.b-data.ch/jupyterlab/r/verse start-notebook.sh --allow-root
❗ The home directory of
@zeehio I would appreciate feedback from you.
It looks good to me. I would however remove the Unfortunately my testing machine for rootless podman happens on a VM with a rather small disk on an older laptop, and when I tried to pull your image the disk got full. I hope you can test it on your end. |
@zeehio Thank you for the feedback.
For convenience reasons, there is e.g. |
Short summary
Running containers under rootless docker/podman lets us:
For backwards compatibility, this pull request assumes users still don't run rootless docker by default, so we require users running rootless containers to define -e RUNROOTLESS=true, at least until a method is devised to detect from within the container whether it is running rootless.
In this pull request, I modify the userconf script so that, if we are running rootless (-e RUNROOTLESS=true), we do not create users or change sudoers or do any of those things. With this change, we can run rstudio server under an unprivileged user without issues.
Visit localhost:10000 and log in using root and helloworld.
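The opt-in behaviour described here could look roughly like the sketch below. It is a simplified illustration, not the actual userconf script: `user_setup_action` is a hypothetical helper that only reports which branch would be taken, and the default of "false" reflects the backwards-compatible behaviour described in this summary.

```shell
#!/bin/sh
# user_setup_action VALUE  (hypothetical helper)
# Reports which branch a userconf-style script would take for a RUNROOTLESS value.
user_setup_action() {
    if [ "${1:-false}" = "true" ]; then
        # Rootless: keep the container's root user, do not create users or edit sudoers.
        echo "skip"
    else
        # Rootful (the default): create the unprivileged user and configure sudoers as before.
        echo "configure"
    fi
}

# Unset RUNROOTLESS is treated as "false" so existing setups keep working.
user_setup_action "${RUNROOTLESS:-false}"
```

Keeping the rootless path behind an explicit variable means images behave exactly as before unless the user asks otherwise.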
Thanks for considering merging this.
Alternative
As an alternative solution we could have different images: Something like rocker/rstudio-rootless or rocker-rootless/rstudio.
But I think it is easier to just merge this.
Historic background
Here is some longer story of why things are the way they are, to bring some context to all this user mess that exists in docker images. It may not be fully accurate, but good enough
Stage 1: We are root
sudo (e.g. sudo docker run...).
This situation is problematic because:
People want to use docker so much that a docker user group is created, and all users in that docker group do not need to type sudo to run docker anymore, although effectively it is as if they did. The risk is still there, only hidden.
Stage 2: Images drop root privileges
Docker is a complex piece of software that relies on namespaces and control groups, features of the linux kernel that are under heavy development.
Therefore the fastest and easiest solution to address some of these issues comes from the image builders. The container starts running as root, but image builders following best-practices drop those permissions as soon as possible and read environment variables set when the container is created so users can choose the user id they would like to be used to create files, avoiding the file permission mess.
This depends on the good-will of the image builder, but "works".
This adds a lot of complexity, because images now have to consider multiple users and permissions (allow root inside the container for apt-get install, use sudoers...)
Stage 3: Run with docker run --user
Docker allows specifying the user the container will run as. The Docker daemon runs as root, but the container is started with the given user id, and that user in the container typically does not have root privileges anymore.
Docker here avoids file permission issues, but at some cost. Since now the image starting scripts do not have root access in the container, allowing for apt-get inside the container becomes far more tricky.
To my knowledge, this option does not get a lot of adoption in rocker and jupyter notebook images.
Stage 4: User namespaces with docker run --userns
The linux kernel gains support for user namespaces.
Basically we can map user ids in the container to a range of user ids in the host. Depending on how this is used, this can lead to files created by the image not being owned by root anymore, but by a super high user ID.
To my knowledge, this option does not get a lot of adoption in rocker and jupyter notebooks.
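To make the id mapping concrete, here is a minimal sketch that translates a container user id into the corresponding host user id using a map in the /proc/self/uid_map format (three columns: first id inside the namespace, first id outside, length of the range). `host_uid` is a hypothetical helper, and the sample mapping in the comment (host uid 1000 plus a subordinate range starting at 100000) is an assumed, typical rootless layout rather than output from a real system.

```shell
#!/bin/sh
# host_uid CONTAINER_UID [MAP_FILE]  (hypothetical helper)
# Prints the host uid that CONTAINER_UID maps to; fails if the id is unmapped.
# Assumed sample map for a rootless container started by a user with uid 1000:
#          0       1000          1
#          1     100000      65536
host_uid() {
    awk -v id="$1" '
        # Each line: inside_start outside_start length
        (id + 0) >= ($1 + 0) && (id + 0) < ($1 + $3) {
            print $2 + (id - $1)
            found = 1
            exit
        }
        END { if (!found) exit 1 }
    ' "${2:-/proc/self/uid_map}"
}

# Example: host_uid 0 shows which host user the container's root really is.
```

With the assumed map, files written by uid 5 inside the container would appear on the host as owned by uid 100004, which is the "super high user ID" effect mentioned above.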
Stage 5: Rootless docker
Enough namespace and cgroup solutions exist in the kernel for the docker daemon to be able to run containers without root permissions.
Running the docker daemon as root is a security risk, and it is also an auditability issue, so this becomes an actual better-designed solution to the permission problem. Now each user can run their own docker daemon. Since the docker daemon runs as the user, it does not have any special permissions, so it cannot damage the host.
Therefore we do not need images to drop privileges or follow any best practices to be responsible, since they are not allowed to do any damage by default anymore. Things can be simple again! However we must still be backwards compatible with those who have or want to use docker as root.
Hi podman! Podman was designed to run rootless by default, and maybe even was able to do so before docker (I don't really know). podman does not even need a daemon, although podman supports a daemon to increase compatibility with docker.
How does this work? Using user namespaces in a more transparent way:
--volume) without caring for permissions or user ids
root from within the container, but from the host it appears to run as alice. If the entrypoint scripts do not do anything weird with the users, the entrypoint and commands that run afterwards run as well as root/alice. Files created on the mounted volume appear to be owned by alice. If alice tries to mount a volume she can't write on (e.g. --volume /sbin:/somewhere) the root/alice user in the container won't be able to write on it, because alice does not have permissions to do that.
That's all we want and need!
But... backwards compatibility!
Until someone finds a convention to determine if a container is running under rootless docker, we, image builders, can't tell if we should drop privileges or simply use them.
So, I would like to ask alice to set an environment variable to tell me if she is running rootless, so I can just use the root/alice user without a care for setting up users and permissions and sudoers. I'll have to ask alice to use an environment variable for now...