feat: run all services as unprivileged containers #505

regisb · 2021-10-04T23:13:00Z

With this change, containers are no longer run as "root" but as unprivileged
users. This is necessary in some environments, notably some Kubernetes
clusters.

To make this possible, we need to manually fix bind-mounted volumes in
docker-compose. This is pretty much equivalent to the behaviour in Kubernetes,
where permissions are fixed at runtime if the volume owner is incorrect. Thus,
we have a consistent behaviour between docker-compose and Kubernetes.

We achieve this by bind-mounting some repos inside "*-permissions" services.
These services run as root user on docker-compose and will fix the required
permissions, as per build/permissions/setowner.sh These services simply do not
run on Kubernetes, where we don't rely on bind-mounted volumes. There, we make
use of Kubernete's built-in volume ownership feature.

With this change, we get rid of the "openedx-dev" Docker image, in the sense
that it no longer has its own Dockerfile. Instead, the dev image is now simply
a different target in the multi-layer openedx Docker image. This makes it much
faster to build the openedx-dev image.

Because we declare the APP_USER_ID in the dev/docker-compose.yml file, we need
to pass the user ID from the host there. The only way to achieve that is with a
tutor config variable. The downside of this approach is that the
dev/docker-compose.yml file is no longer portable from one machine to the next.
We consider that this is not such a big issue, as it affects the development
environment only.

We take this opportunity to replace the base image of the "forum" image. There
is now no need to re-install ruby inside the image. The total image size is
only decreased by 10%, but re-building the image is faster.

In order to run the smtp service as non-root, we switch from namshi/smtp to
devture/exim-relay. This change should be backward-compatible.

Note that the nginx container remains privileged. We could switch to
nginxinc/nginx-unprivileged, but it's probably not worth the effort, as we are
considering to get rid of the nginx container altogether.

Close #323.

regisb · 2021-10-04T23:13:51Z

It should be noted that this PR was heavily influenced by this other one, by @menardorama #364

thomas-skillup · 2021-10-08T19:22:14Z

Hello @regisb, sorry for the delayed response (busy week!). This looks great. I am adding a couple of remarks about potential alternative implementations/some food for thought.

The combined Dockerfile

I think the combined Dockerfile with --target is definitely the best solution. No need for explicit caching logic this way, great thinking.

Volume permissions

As for the permissions service, I'm curious: in the case of bind-mounted directories during development, does this permissions change affect the files as they appear to the host? And could this be an issue?

I think what I may be unsure about is what, exactly, is the desired behavior for a development use-case. Here are the two basic scenarios that come to mind:

The user in the container should match the UID of the host user.
The user in the container should have a UID which doesn't exist on this host, but this UID should be given access to mounted files.

In the first case, this amounts to the container user having the same host permissions as the host user. This might be fine in development. Or, it might be undesirable (ideally an issue in a container could never be destructive to the host machine's files outside of a mounted, writable volume).

In the second case (which, as I understand it, is the implementation used in this PR), the problem with unwanted permissions is solved. However there is a new problem: the host user probably shouldn't see any (significant) change to the permissions of mounted files. For instance, if I bind-mount the sources of a python dependency, I should be able to continue editing/using the files as usual on host, while also granting access to the container's user.

To make permissions changes insignificant to the host user, it may be worth considering the use of file access control lists rather than directly modifying the ownership bits. For instance, setowner.sh could be implemented as something like the following:

#!/bin/bash
set -e
setfacl -R u:"$1":rwX "${@:2}"

The container user will then have the same permissions as the host user, and the host user's access will not be modified, AFAIK (the rwX permission is ultimately unioned with those of the default/host user).

The permission service itself

I also wanted to add that it is possible to do this without adding an additional service (though not strictly necessary--actually, it might be detrimental for k8s purposes, but I'll mention it anyway).

This can be accomplished by modifying the Dockerfile to switch back to the root user at the end for the sake of executing the entrypoint script. The entrypoint script could perform the same logic as the permissions service would. After performing these changes, the entrypoint could then run the normal command without root privileges by first becoming the user again using gosu.

I'm looking forward to this PR being merged!

Edit: rwx should have been rwX.

regisb · 2021-10-11T08:04:13Z

Hey @thomas-skillup! I'm not sure I understand all of your comments. If we need to discuss this further I suggest we schedule a call. In particular, I am not familiar at all with setfacl, so I would appreciate more in-depth explanations.

Let me just emphasize that we need to address the following scenarios:

We must be able to run the "lms" and "cms" services in unprivileged mode, with user != 0, both locally and on Kubernetes. Thus, gosu and sudo are out of the question.
In development mode, the "lms" and "cms" containers must have write access to the shared written folders (/openedx/media and /openedx/data).
In development, when bind-mounting edx-platform, the repo file permissions must not be modified.

regisb · 2021-10-14T11:17:24Z

For the record, this PR was modified to merge the changes in the "nightly" branch: that's because we don't want to merge this breaking change in the current release branch.

thomas-skillup · 2021-10-15T18:11:59Z

Hi @regisb,

I agree that the permissions service makes sense as a clean way to avoid behavior differences between k8s and Docker Compose.

If the gosu approach were used, it would need to be in a separate build stage in which USER root is reestablished and a different ENTRYPOINT script is used to become the correct user on the fly. The stage would be targeted only when building a development image for Docker Compose use locally. Production/k8s images would use a different target stage. But it’s probably just unneeded complexity.

My concern with the chmod approach for permissions management just for the case that a developer wants to access a directory from the host while it is simultaneously bound as a container volume. If both the host and guest users are i.e. UID 1000, this isn’t a problem. But if not, attempting to work with the mounted files from the host machine likely won’t work, since the file permissions will have been changed by the new service, and the host user may no longer have access.

ACLs let us keep the existing file permissions on such a mounted directory (so that the host user never loses access), while separately granting “extended” permissions which allow the guest user to also have the correct permissions. This way, even if the host and guest UIDs don’t match, both will still have the correct permissions.

Alternatively, it may work to just define a canonical group number used by the guest user. Developers could add themselves to the group of the same GID on the host machine so they retain access to bound directories even after the owner is changed. We may have to fiddle with /etc/passwd (or something like that) to get that working.

I’d be happy to schedule a call for further discussion! Ultimately, this is just a suggestion for solving a possible issue (which does have other workarounds). So the solution as it stands could be left as-is.

regisb · 2021-10-21T10:13:30Z

My concern with the chmod approach for permissions management just for the case that a developer wants to access a directory from the host while it is simultaneously bound as a container volume. If both the host and guest users are i.e. UID 1000, this isn’t a problem. But if not, attempting to work with the mounted files from the host machine likely won’t work, since the file permissions will have been changed by the new service, and the host user may no longer have access.

The point of this PR is precisely to make sure that the user ID in the openedx-dev container matches the one on the host. So we shouldn't have a problem here.

ACLs let us keep the existing file permissions on such a mounted directory (so that the host user never loses access), while separately granting “extended” permissions which allow the guest user to also have the correct permissions. This way, even if the host and guest UIDs don’t match, both will still have the correct permissions.

I played around with setfacl, which I didn't know about before. Actually, I had no idea that there was such a thing as Access Control Lists on UNIX. I must say that it looks very promising. I am only concerned by the support of ACLs across operating systems. At this point, I think I need to make further investigations.

With this change, containers are no longer run as "root" but as unprivileged users. This is necessary in some environments, notably some Kubernetes clusters. To make this possible, we need to manually fix bind-mounted volumes in docker-compose. This is pretty much equivalent to the behaviour in Kubernetes, where permissions are fixed at runtime if the volume owner is incorrect. Thus, we have a consistent behaviour between docker-compose and Kubernetes. We achieve this by bind-mounting some repos inside "*-permissions" services. These services run as root user on docker-compose and will fix the required permissions, as per build/permissions/setowner.sh These services simply do not run on Kubernetes, where we don't rely on bind-mounted volumes. There, we make use of Kubernete's built-in volume ownership feature. With this change, we get rid of the "openedx-dev" Docker image, in the sense that it no longer has its own Dockerfile. Instead, the dev image is now simply a different target in the multi-layer openedx Docker image. This makes it much faster to build the openedx-dev image. Because we declare the APP_USER_ID in the dev/docker-compose.yml file, we need to pass the user ID from the host there. The only way to achieve that is with a tutor config variable. The downside of this approach is that the dev/docker-compose.yml file is no longer portable from one machine to the next. We consider that this is not such a big issue, as it affects the development environment only. We take this opportunity to replace the base image of the "forum" image. There is now no need to re-install ruby inside the image. The total image size is only decreased by 10%, but re-building the image is faster. In order to run the smtp service as non-root, we switch from namshi/smtp to devture/exim-relay. This change should be backward-compatible. Note that the nginx container remains privileged. We could switch to nginxinc/nginx-unprivileged, but it's probably not worth the effort, as we are considering to get rid of the nginx container altogether. Close #323.

regisb · 2021-10-25T14:25:53Z

I'd like to merge this PR in nightly such that we can start using it in Maple. If necessary, we can always switch to setfacl later.

regisb mentioned this pull request Oct 4, 2021

Creating ‘openedx’ user takes too long when building openedx-dev image #323

Closed

regisb force-pushed the regisb/no-root branch from 4ea73d4 to ca4377a Compare October 5, 2021 08:15

regisb changed the base branch from master to nightly October 14, 2021 11:09

regisb force-pushed the regisb/no-root branch from ca4377a to 0233ce5 Compare October 14, 2021 11:15

regisb force-pushed the regisb/no-root branch from 0233ce5 to cae6ffa Compare October 25, 2021 14:20

regisb merged commit f9402f7 into nightly Oct 25, 2021

regisb deleted the regisb/no-root branch October 25, 2021 14:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: run all services as unprivileged containers #505

feat: run all services as unprivileged containers #505

regisb commented Oct 4, 2021

regisb commented Oct 4, 2021

thomas-skillup commented Oct 8, 2021 •

edited

Loading

regisb commented Oct 11, 2021

regisb commented Oct 14, 2021

thomas-skillup commented Oct 15, 2021 •

edited

Loading

regisb commented Oct 21, 2021

regisb commented Oct 25, 2021

feat: run all services as unprivileged containers #505

feat: run all services as unprivileged containers #505

Conversation

regisb commented Oct 4, 2021

regisb commented Oct 4, 2021

thomas-skillup commented Oct 8, 2021 • edited Loading

regisb commented Oct 11, 2021

regisb commented Oct 14, 2021

thomas-skillup commented Oct 15, 2021 • edited Loading

regisb commented Oct 21, 2021

regisb commented Oct 25, 2021

thomas-skillup commented Oct 8, 2021 •

edited

Loading

thomas-skillup commented Oct 15, 2021 •

edited

Loading