Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: run all services as unprivileged containers #505

Merged
merged 1 commit into from
Oct 25, 2021
Merged

Conversation

regisb
Copy link
Contributor

@regisb regisb commented Oct 4, 2021

With this change, containers are no longer run as "root" but as unprivileged
users. This is necessary in some environments, notably some Kubernetes
clusters.

To make this possible, we need to manually fix bind-mounted volumes in
docker-compose. This is pretty much equivalent to the behaviour in Kubernetes,
where permissions are fixed at runtime if the volume owner is incorrect. Thus,
we have a consistent behaviour between docker-compose and Kubernetes.

We achieve this by bind-mounting some repos inside "*-permissions" services.
These services run as root user on docker-compose and will fix the required
permissions, as per build/permissions/setowner.sh These services simply do not
run on Kubernetes, where we don't rely on bind-mounted volumes. There, we make
use of Kubernete's built-in volume ownership feature.

With this change, we get rid of the "openedx-dev" Docker image, in the sense
that it no longer has its own Dockerfile. Instead, the dev image is now simply
a different target in the multi-layer openedx Docker image. This makes it much
faster to build the openedx-dev image.

Because we declare the APP_USER_ID in the dev/docker-compose.yml file, we need
to pass the user ID from the host there. The only way to achieve that is with a
tutor config variable. The downside of this approach is that the
dev/docker-compose.yml file is no longer portable from one machine to the next.
We consider that this is not such a big issue, as it affects the development
environment only.

We take this opportunity to replace the base image of the "forum" image. There
is now no need to re-install ruby inside the image. The total image size is
only decreased by 10%, but re-building the image is faster.

In order to run the smtp service as non-root, we switch from namshi/smtp to
devture/exim-relay. This change should be backward-compatible.

Note that the nginx container remains privileged. We could switch to
nginxinc/nginx-unprivileged, but it's probably not worth the effort, as we are
considering to get rid of the nginx container altogether.

Close #323.

@regisb
Copy link
Contributor Author

regisb commented Oct 4, 2021

It should be noted that this PR was heavily influenced by this other one, by @menardorama #364

@thomas-skillup
Copy link

thomas-skillup commented Oct 8, 2021

Hello @regisb, sorry for the delayed response (busy week!). This looks great. I am adding a couple of remarks about potential alternative implementations/some food for thought.

The combined Dockerfile

I think the combined Dockerfile with --target is definitely the best solution. No need for explicit caching logic this way, great thinking.

Volume permissions

As for the permissions service, I'm curious: in the case of bind-mounted directories during development, does this permissions change affect the files as they appear to the host? And could this be an issue?

I think what I may be unsure about is what, exactly, is the desired behavior for a development use-case. Here are the two basic scenarios that come to mind:

  • The user in the container should match the UID of the host user.
  • The user in the container should have a UID which doesn't exist on this host, but this UID should be given access to mounted files.

In the first case, this amounts to the container user having the same host permissions as the host user. This might be fine in development. Or, it might be undesirable (ideally an issue in a container could never be destructive to the host machine's files outside of a mounted, writable volume).

In the second case (which, as I understand it, is the implementation used in this PR), the problem with unwanted permissions is solved. However there is a new problem: the host user probably shouldn't see any (significant) change to the permissions of mounted files. For instance, if I bind-mount the sources of a python dependency, I should be able to continue editing/using the files as usual on host, while also granting access to the container's user.

To make permissions changes insignificant to the host user, it may be worth considering the use of file access control lists rather than directly modifying the ownership bits. For instance, setowner.sh could be implemented as something like the following:

#!/bin/bash
set -e
setfacl -R u:"$1":rwX "${@:2}"

The container user will then have the same permissions as the host user, and the host user's access will not be modified, AFAIK (the rwX permission is ultimately unioned with those of the default/host user).

The permission service itself

I also wanted to add that it is possible to do this without adding an additional service (though not strictly necessary--actually, it might be detrimental for k8s purposes, but I'll mention it anyway).

This can be accomplished by modifying the Dockerfile to switch back to the root user at the end for the sake of executing the entrypoint script. The entrypoint script could perform the same logic as the permissions service would. After performing these changes, the entrypoint could then run the normal command without root privileges by first becoming the user again using gosu.

I'm looking forward to this PR being merged!

Edit: rwx should have been rwX.

@regisb
Copy link
Contributor Author

regisb commented Oct 11, 2021

Hey @thomas-skillup! I'm not sure I understand all of your comments. If we need to discuss this further I suggest we schedule a call. In particular, I am not familiar at all with setfacl, so I would appreciate more in-depth explanations.

Let me just emphasize that we need to address the following scenarios:

  1. We must be able to run the "lms" and "cms" services in unprivileged mode, with user != 0, both locally and on Kubernetes. Thus, gosu and sudo are out of the question.
  2. In development mode, the "lms" and "cms" containers must have write access to the shared written folders (/openedx/media and /openedx/data).
  3. In development, when bind-mounting edx-platform, the repo file permissions must not be modified.

@regisb regisb changed the base branch from master to nightly October 14, 2021 11:09
@regisb
Copy link
Contributor Author

regisb commented Oct 14, 2021

For the record, this PR was modified to merge the changes in the "nightly" branch: that's because we don't want to merge this breaking change in the current release branch.

@thomas-skillup
Copy link

thomas-skillup commented Oct 15, 2021

Hi @regisb,

I agree that the permissions service makes sense as a clean way to avoid behavior differences between k8s and Docker Compose.

If the gosu approach were used, it would need to be in a separate build stage in which USER root is reestablished and a different ENTRYPOINT script is used to become the correct user on the fly. The stage would be targeted only when building a development image for Docker Compose use locally. Production/k8s images would use a different target stage. But it’s probably just unneeded complexity.

My concern with the chmod approach for permissions management just for the case that a developer wants to access a directory from the host while it is simultaneously bound as a container volume. If both the host and guest users are i.e. UID 1000, this isn’t a problem. But if not, attempting to work with the mounted files from the host machine likely won’t work, since the file permissions will have been changed by the new service, and the host user may no longer have access.

ACLs let us keep the existing file permissions on such a mounted directory (so that the host user never loses access), while separately granting “extended” permissions which allow the guest user to also have the correct permissions. This way, even if the host and guest UIDs don’t match, both will still have the correct permissions.

Alternatively, it may work to just define a canonical group number used by the guest user. Developers could add themselves to the group of the same GID on the host machine so they retain access to bound directories even after the owner is changed. We may have to fiddle with /etc/passwd (or something like that) to get that working.

I’d be happy to schedule a call for further discussion! Ultimately, this is just a suggestion for solving a possible issue (which does have other workarounds). So the solution as it stands could be left as-is.

@regisb
Copy link
Contributor Author

regisb commented Oct 21, 2021

My concern with the chmod approach for permissions management just for the case that a developer wants to access a directory from the host while it is simultaneously bound as a container volume. If both the host and guest users are i.e. UID 1000, this isn’t a problem. But if not, attempting to work with the mounted files from the host machine likely won’t work, since the file permissions will have been changed by the new service, and the host user may no longer have access.

The point of this PR is precisely to make sure that the user ID in the openedx-dev container matches the one on the host. So we shouldn't have a problem here.

ACLs let us keep the existing file permissions on such a mounted directory (so that the host user never loses access), while separately granting “extended” permissions which allow the guest user to also have the correct permissions. This way, even if the host and guest UIDs don’t match, both will still have the correct permissions.

I played around with setfacl, which I didn't know about before. Actually, I had no idea that there was such a thing as Access Control Lists on UNIX. I must say that it looks very promising. I am only concerned by the support of ACLs across operating systems. At this point, I think I need to make further investigations.

With this change, containers are no longer run as "root" but as unprivileged
users. This is necessary in some environments, notably some Kubernetes
clusters.

To make this possible, we need to manually fix bind-mounted volumes in
docker-compose. This is pretty much equivalent to the behaviour in Kubernetes,
where permissions are fixed at runtime if the volume owner is incorrect. Thus,
we have a consistent behaviour between docker-compose and Kubernetes.

We achieve this by bind-mounting some repos inside "*-permissions" services.
These services run as root user on docker-compose and will fix the required
permissions, as per build/permissions/setowner.sh These services simply do not
run on Kubernetes, where we don't rely on bind-mounted volumes. There, we make
use of Kubernete's built-in volume ownership feature.

With this change, we get rid of the "openedx-dev" Docker image, in the sense
that it no longer has its own Dockerfile. Instead, the dev image is now simply
a different target in the multi-layer openedx Docker image. This makes it much
faster to build the openedx-dev image.

Because we declare the APP_USER_ID in the dev/docker-compose.yml file, we need
to pass the user ID from the host there. The only way to achieve that is with a
tutor config variable. The downside of this approach is that the
dev/docker-compose.yml file is no longer portable from one machine to the next.
We consider that this is not such a big issue, as it affects the development
environment only.

We take this opportunity to replace the base image of the "forum" image. There
is now no need to re-install ruby inside the image. The total image size is
only decreased by 10%, but re-building the image is faster.

In order to run the smtp service as non-root, we switch from namshi/smtp to
devture/exim-relay. This change should be backward-compatible.

Note that the nginx container remains privileged. We could switch to
nginxinc/nginx-unprivileged, but it's probably not worth the effort, as we are
considering to get rid of the nginx container altogether.

Close #323.
@regisb
Copy link
Contributor Author

regisb commented Oct 25, 2021

I'd like to merge this PR in nightly such that we can start using it in Maple. If necessary, we can always switch to setfacl later.

@regisb regisb merged commit f9402f7 into nightly Oct 25, 2021
@regisb regisb deleted the regisb/no-root branch October 25, 2021 14:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Creating ‘openedx’ user takes too long when building openedx-dev image
2 participants