Creating ‘openedx’ user takes too long when building openedx-dev image #323

Closed
shashikiranraifox opened this issue May 11, 2020 · 20 comments · Fixed by #505
Labels
enhancement Enhancements will be processed by decreasing priority

Comments

@shashikiranraifox

As per the documentation, I ran the following command to build the openedx-dev image:

tutor images build openedx-dev

OS Version is: 18.04.4 LTS
Tutor Version is: 3.12.3
I left the system running overnight (almost 12 hours), yet the command did not complete. It was stuck running create-user.sh. Below is the log.
(I have used a workaround as specified here, but this is a stopgap arrangement.)

Building image docker.io/overhangio/openedx-dev:3.12.3
docker build -t docker.io/overhangio/openedx-dev:3.12.3 /home/machine/.local/share/tutor/env/build/openedx-dev --build-arg USERID=1000
Sending build context to Docker daemon  5.632kB
Step 1/11 : FROM docker.io/overhangio/openedx:3.12.3
 ---> 4e83d402b431
Step 2/11 : MAINTAINER Overhang.io <[email protected]>
 ---> Using cache
 ---> ab86cbe8487e
Step 3/11 : RUN apt update &&     apt install -y vim telnet     && rm -rf /var/lib/apt/lists/*
 ---> Using cache
 ---> 6cdbaa38645e
Step 4/11 : RUN pip install -r requirements/edx/development.txt
 ---> Using cache
 ---> 39e71c2bff12
Step 5/11 : RUN pip install ipdb==0.12.2 ipython==5.8.0
 ---> Using cache
 ---> 014cec4bd517
Step 6/11 : RUN rm -r /openedx/staticfiles &&     mkdir /openedx/staticfiles &&     openedx-assets webpack --env=dev
 ---> Using cache
 ---> 752e39197dfd
Step 7/11 : COPY ./bin /openedx/bin
 ---> Using cache
 ---> ccb0d4bf5d2a
Step 8/11 : RUN chmod a+x /openedx/bin/*
 ---> Using cache
 ---> 8fd152d4cbca
Step 9/11 : ARG USERID=1000
 ---> Using cache
 ---> dc04a982062b
Step 10/11 : RUN create-user.sh $USERID


 ---> Running in e6215f4a433c
Creating 'openedx' user with id 1000

@bandirsen

Any solution yet? This happens to me too, with the latest version of tutor.
I did a little investigation of the build container by running the following commands on the host:

  1. "docker exec -it build-container-name bash"
  2. At the prompt, I ran "less /etc/passwd".
    The openedx user with user ID 1000 had been created in the build container, but the build process still hung.
    I suspect that chown-ing the /openedx folder in create-user.sh is the problem: the /openedx folder in the overhangio/openedx:3.12.6 image contains a huge number of files, including Node.js modules, and chown-ing that folder is time-consuming, especially when done inside a Docker container.
    Maybe "USER and Layers Discussion" (moby/moby#30110) can be a guide to solving this problem.
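To gauge whether the recursive chown is a plausible culprit, it can help to count how many filesystem entries it would have to visit. A minimal sketch, assuming a POSIX shell with find available in the container; `count_entries` is a hypothetical helper name, not part of Tutor:

```shell
#!/bin/sh
# Hypothetical helper (not part of Tutor): count every file and directory
# under a path, i.e. the number of entries a recursive chown must visit.
count_entries() {
    find "$1" | wc -l | tr -d ' '
}

# Inside the build container one might run: count_entries /openedx
# The default below is the current directory so the sketch runs anywhere.
count_entries "${1:-.}"
```

A count in the hundreds of thousands would be consistent with the suspicion above, though it does not prove it.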

In the meantime, I bypass the openedx user creation by commenting out the following lines in ~/.local/share/tutor/env/build/openedx-dev/Dockerfile:

# Configure new user
#ARG USERID=1000
#RUN create-user.sh $USERID

The openedx-dev image then built successfully, with the drawback that it runs as the root user rather than the openedx user.

@shashikiranraifox
Author

I don't have any solutions yet for the drawback you have specified in the end.

@blakenguyen97

Any solution yet? Having the same problem :(

@ToddLichty
Contributor

I just had this happen to me on version 10.0.11. Commenting out the lines in Dockerfile as @shashikiranraifox outlined worked for me.

regisb added the enhancement label Jul 25, 2020
@regisb
Contributor

regisb commented Jul 25, 2020

I am currently investigating this, but it's not easy for me to replicate the issue, so it would be great if I could get some help from the community. In particular, I would like to know if replacing the chown statement in create-user.sh with a COPY --from={{ DOCKER_IMAGE_OPENEDX }} --chown=${USERID}:${USERID} /openedx /openedx statement in the Dockerfile would help.

Could someone please try this out? In case of trouble I can provide a more detailed example in a separate branch.

@ToddLichty
Contributor

I would be willing to give it a shot. Can you give me a bit more detail on what you would like me to try?

@regisb
Contributor

regisb commented Jul 26, 2020

@ToddLichty You should install tutor from the "regisb/openedx-dev-chown" branch of the source repo. Then run:

tutor config save
tutor images build openedx-dev

@ToddLichty
Contributor

ToddLichty commented Jul 26, 2020

@regisb That worked. There was a 5 minute pause on COPY --from=development-root --chown=${USERID}:${USERID} /openedx / but it did complete.

However, I then tried to run the dev lms just to make sure it worked and it has been stuck at:

The lms service will be available at http://local.overhang.io:8000
docker-compose -f /home/todd/.local/share/tutor/env/local/docker-compose.yml -f /home/todd/.local/share/tutor/env/dev/docker-compose.yml --project-name tutor_dev run --rm --service-ports lms
Starting tutor_dev_mysql_1         ... done
Starting tutor_dev_memcached_1     ... done
Starting tutor_dev_elasticsearch_1 ... done
Starting tutor_dev_smtp_1          ... done
Starting tutor_dev_mongodb_1       ... done
Starting tutor_dev_rabbitmq_1      ... done
Starting tutor_dev_forum_1         ... done
Setting file permissions for user openedx...

for the past 15 minutes.

@regisb
Contributor

regisb commented Jul 27, 2020

@ToddLichty please paste here the output of the following command:

time tutor dev run --no-deps --entrypoint="bash -e -c" lms 'find /openedx -not -path "/openedx/edx-platform/*" -not -user openedx > /dev/null'
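For readers unsure what this command measures: it walks /openedx, skips the edx-platform checkout, and lists whatever is not yet owned by the openedx user, while `time` reports how long the walk takes. The same predicates can be tried on any directory. A sketch, assuming GNU find; `not_owned_by` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print entries under DIR, skipping DIR/SKIP/*,
# that are NOT owned by the given user (name or numeric UID).
# Mirrors the find predicates used in the command above.
not_owned_by() {
    dir=$1 skip=$2 user=$3
    find "$dir" -not -path "$dir/$skip/*" -not -user "$user"
}

# Example: count entries under the current directory (ignoring ./.git/*)
# that are not owned by the invoking user.
not_owned_by . .git "$(id -u)" | wc -l
```

An empty listing means ownership is already correct for everything outside the skipped subtree.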

@ToddLichty
Contributor

ToddLichty commented Jul 27, 2020 via email

@mrtndwrd
Contributor

mrtndwrd commented Oct 14, 2020

I was running into this issue as well. I just "hotfixed" my current dockerfile by putting this line after the create-user line:

COPY --from=docker.io/overhangio/openedx:10.2.2 --chown=${USERID}:${USERID} /openedx /openedx

I uncommented the chown line in the create-user.sh script and ran tutor images build openedx-dev (after having built openedx locally as well, not sure if that makes a difference).

This worked for me and the lms starts without problems.

Unfortunately I don't know why I started having this problem: I'm reasonably sure that a week ago I could build the same version Dockerfile without problems.

@mrtndwrd
Contributor

I just remembered something I changed between not having (or not noticing) this problem and having it: I moved my /var/lib/docker folder from an SSD to a spinning hard disk. @shashikiranraifox are you also running this on an HDD? That might help make the problem reproducible.
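On Linux, whether a block device reports itself as a spinning disk can be read from sysfs, which may help others check for the same pattern. A sketch, assuming the standard `/sys/block/<device>/queue/rotational` file (1 for rotational, 0 otherwise); `disk_kind` is a hypothetical helper and the device name `sda` is only an example:

```shell
#!/bin/sh
# Hypothetical helper: classify a device from the contents of its
# sysfs "rotational" flag file (1 = spinning disk, 0 = SSD/NVMe).
disk_kind() {
    flag_file=$1
    if [ ! -r "$flag_file" ]; then
        echo unknown
    elif [ "$(cat "$flag_file")" = "1" ]; then
        echo hdd
    else
        echo ssd
    fi
}

# Example device; substitute the one backing /var/lib/docker
# (for instance as reported by `df /var/lib/docker`).
disk_kind /sys/block/sda/queue/rotational
```

Virtual machines and some storage controllers may report this flag inaccurately, so treat the result as a hint.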

@shadinaif

shadinaif commented Jan 4, 2021

Hi @regisb, I believe there is nothing we can do about it; it's kernel-related. See docker/for-linux#388.

@regisb
Contributor

regisb commented Jan 4, 2021

@shadinaif Well, we could COPY --chown ... instead of RUN chown -R .... But it will take some effort to get it to work properly.
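The difference between the two approaches can be sketched as follows. This is an illustration only, not the Dockerfile that Tutor ships: the stage layout and the `useradd` line are assumptions, and the image tag is the one from the build log earlier in this thread. The script merely writes the sketch to a file so it can be inspected or passed to `docker build -f`.

```shell
#!/bin/sh
# Write an illustrative Dockerfile (NOT Tutor's actual one) contrasting
# a recursive chown in a RUN step with ownership set at COPY time.
out=${1:-${TMPDIR:-/tmp}/Dockerfile.chown-sketch}
cat > "$out" <<'EOF'
FROM docker.io/overhangio/openedx:3.12.3 AS base

FROM base AS dev
ARG USERID=1000
# Assumed user-creation step; the real create-user.sh may differ.
RUN useradd --uid ${USERID} --create-home openedx

# Slow variant discussed in this issue: rewrites every file in a new layer.
# RUN chown -R openedx:openedx /openedx

# Proposed variant: ownership is applied while the files are copied.
COPY --from=base --chown=${USERID}:${USERID} /openedx /openedx
EOF
echo "wrote $out"
```

The COPY variant still writes a full copy of /openedx into a new layer, so it is not free; it only avoids the separate recursive ownership pass.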

@Muntasir00

Quoting @ToddLichty: "I just had this happen to me on version 10.0.11. Commenting out the lines in Dockerfile as @shashikiranraifox outlined worked for me."
How do I do this? I have the same issue.

@thomas-skillup

Hi, I have been experimenting with a rework of Tutor's Dockerfiles to take full advantage of BuildKit optimizations, and alleviate a few Docker image pain-points, in this Tutor fork.

Included in the rework is a solution to this issue. I was also previously seeing hours-long build times for the 'openedx-dev' image.

To get around this particular problem, I modified the dev image such that it no longer extends the 'openedx' (local) image. Instead, /openedx, /opt/pyenv, and /usr/local/bin/dockerize are copied from the local image using COPY --chown=openedx:openedx ... into a fresh image which is initialized with an openedx user.

Quite a few other optimizations, fixes, and nice-to-haves are also included:

  • The openedx user in the dev image doesn't require a chroot and is granted passwordless sudo privileges.
    • Login time is now nearly instant.
  • The production stage of the local image no longer has to re-pip-install the Edx Platform's local.in requirements file since it reuses the .egg-info directories from the python-requirements stage.
    • This makes the final image smaller (no pip cache or unnecessary packages included).
    • More importantly, in combination with moving custom local requirement installations (private.txt) to the end of the stage, this allows users to override even direct, internal platform dependencies using private.txt (they won't be overwritten by this instruction anymore).
  • Layers in the 'openedx-dev' image are optimized for maximum reuse potential from the 'openedx' image using --cache-from.
  • Invocations of apt are changed to apt-get so that CLI stability is guaranteed.
  • Instructions which specify requirements (like apt-get ... or pip install ...) have each requirement moved to its own line to make VCS easier to work with.
  • Stages are more clearly demarcated with descriptive header comments.

I also made some CLI changes which handle automatically sharing layer caches between images.

Performance-wise, I have noticed significantly faster build times for both the 'openedx' and 'openedx-dev' images. Because of the problem described in this issue, combined with build cache optimizations, the speedup for the 'openedx-dev' image is extreme. I haven't performed a proper benchmark, but the build times went from well over 1 hour to less than 30 minutes for me. The combined size of the two images should also be smaller.

The most relevant files are:

https://github.com/SkillUpTech/tutor/blob/docker-buildkit/tutor/templates/build/openedx/Dockerfile
https://github.com/SkillUpTech/tutor/blob/docker-buildkit/tutor/templates/build/openedx-dev/Dockerfile

The changes are still somewhat of a work in progress and have not been thoroughly tested. However, I would be happy to work on rolling these into a PR/TEP. Is this something that the Tutor project would be interested in, @regisb?

@regisb
Contributor

regisb commented Sep 30, 2021

Hi @thomas-skillup! It's really funny that you post this now, as I am currently working on pretty much the same thing. My goal is twofold:

  1. to be able to run Docker containers in unprivileged mode, as described in this blog post: https://discuss.overhang.io/t/running-tutor-containers-in-unprivileged-mode/1970
  2. to resolve the long build time described in this issue.

Your comment includes many good ideas (COPY, local.in, sudo), and some less so (multiple apt-get). For now, I propose that we stick to the issue at stake here, and address other questions later. Since I am working precisely on this right now, I'll make sure to tag you when I open my PR.

I have one question: with your development image, at runtime, you do go through chroot via /openedx/dev_bin/docker-entrypoint.sh, right? Does the "Setting file permissions for user openedx..." step take long in your environment?

@thomas-skillup

thomas-skillup commented Sep 30, 2021

Hello again @regisb!

About unprivileged containers (and sudo)

Interesting about running the containers unprivileged--I didn't have this in mind for the same reason, but I already have this implemented in the 'dev-user' stage of the openedx-dev Dockerfile here.

The 'dev-user' stage could simply be moved into the production ('openedx') Dockerfile so that we start operating as the openedx user sooner (and so that it is also used in the production image). We would, of course, probably want to rename the stage to 'user' in that case.

sudo is configured in this same stage, and has worked without issue in my (limited) testing.

About local.in

For eliminating the duplicate installation of requirements in local.in, it was a simple matter of changing the line in the final stage of the 'openedx' Dockerfile which read

COPY --from=code /openedx/edx-platform /openedx/edx-platform

to

COPY --from=python-requirements /openedx/edx-platform /openedx/edx-platform

The reason being that the .egg-info folders were generated in the 'python-requirements' stage, but were subsequently left behind.

About the apt-get concern

I'm not sure if I was entirely clear about what I meant here. I didn't mean to imply that I wanted to create a catastrophic quantity of layers, like one per dependency. I was referring purely to the formatting of the Dockerfile.

To illustrate, here's an example. Say you have a line that looks like:

RUN apt-get update && apt-get install -y wget curl llvm && rm -rf /var/lib/apt/lists/*

Now, if you have two different developers working on separate features, and one would like to modify/remove/pin the wget dependency, and the other would like to modify/remove/pin llvm, you have to deal with a merge conflict. So a more ideal formatting of the line would be:

RUN apt-get update && apt-get install -y \
    wget \
    curl \
    llvm \
&& rm -rf /var/lib/apt/lists/*

This could easily be modified, if needed, to e.g.

RUN apt-get update && apt-get install -y \
    wget=1.8.7 \
    curl \
    llvm=9.7.1 \
&& rm -rf /var/lib/apt/lists/*

Unless, of course, you're referring to the first stage of my 'openedx-dev' Dockerfile implementation.

The only reason for having a few RUNs in this stage is that this instruction in the 'openedx-dev' Dockerfile is identical to this instruction in the 'openedx' Dockerfile. Or, at least, I meant it to be (oops--I see they are actually slightly different at time of writing--an easy fix).

The same logic applies to this 'openedx-dev' instruction and this 'openedx' instruction.

Because of this, there's no need to regenerate these two layers while building the development image. They can just be reused from the production image.

If this is undesirable for some reason, the three RUNs in the 'base' phase of my 'openedx-dev' Dockerfile could simply be combined--no functionality would be affected.

About the entrypoint and chroot

The "Setting file permissions for user openedx..." step has taken no time flat for me, as the permissions changes are already in place, at least for the most part.

As it stands I'm still going through the /openedx/dev_bin/docker-entrypoint.sh script, yes. There shouldn't actually be any need for the chroot at all though--this part could just be removed.

About file permissions

EDIT: Take these remarks with a grain of salt; I think several of my assumptions here were incorrect.

I'm a bit out of my element here, but I was under the impression that this wouldn't be much of an issue. I think Openshift should automatically mount the files with the correct permissions for the user running in the container (I could very well be wrong, though).

For Docker Compose, you could perform the bind-mounts as a host user with the same UID as the container user. Since I kept the UID parameterized, the image could easily be built such that the UID matches that of the host user.
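Matching the UID at build time could look like the following sketch. The `--build-arg USERID=...` flag appears in the build log at the top of this thread; the image tag and build context path used here are placeholders, and `build_cmd` is a hypothetical helper that only prints the command.

```shell
#!/bin/sh
# Print (rather than run) a docker build command whose USERID build
# argument matches the invoking host user. Tag and context are placeholders.
build_cmd() {
    tag=$1 context=$2
    printf 'docker build -t %s --build-arg USERID=%s %s\n' \
        "$tag" "$(id -u)" "$context"
}

build_cmd example/openedx-dev:latest ./build/openedx-dev
```

Files created by the container user in a bind-mounted directory would then carry the same numeric owner as the host user who built the image.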

If the host user is root though, this might be an issue. But it might still work just fine (with the caveat that the container is now, by definition, running privileged).

Or, as you mentioned, there could be a separate job for handling permissions on mounted directories, either via a direct chmod or using file access control lists. This might have the effect of ballooning the size of the containers' writable layer, though (again, I could very well be wrong here).

EDIT:

I think clever usage of groups would be the better alternative here.

The mounted volumes should have some group ID which can be determined at the convenience of the host. The user inside of the container can then be added to this group by some means at run time.

The end result is that the user in the container has access to all of the files in the system, either as owner or as a member of the correct group. And no mass-modification of ownership bits is needed at all--just the one groupmod invocation, perhaps somehow included in the entrypoint script, does the entire job.

If the UID of the container user itself needs to be dynamic, we would just change the COPY commands to specify a unique group, too. Then the user would be created and added to both the group used for the container layers and the group in the bind-mounts at some point prior to executing the service.
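Whether the group idea would apply to a given volume can be checked from inside a container before changing anything. A sketch, assuming GNU coreutils `stat` on Linux; `has_group_access` is a hypothetical helper, not something in Tutor:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the current user belongs to the group
# that owns the given path. Compared by numeric GID, since group names
# may differ between host and container.
has_group_access() {
    gid=$(stat -c '%g' "$1") || return 2
    for g in $(id -G); do
        [ "$g" = "$gid" ] && return 0
    done
    return 1
}

if has_group_access "${1:-.}"; then
    echo "current user is in the owning group"
else
    echo "a group with the volume's GID would need to be added first"
fi
```

This only tests membership; the files would still need group read/write bits for the access to be useful.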

Or maybe none of these approaches are correct ¯\_(ツ)_/¯. Just throwing out ideas.

regisb self-assigned this Oct 4, 2021
regisb added a commit that referenced this issue Oct 4, 2021
With this change, containers are no longer run as "root" but as unprivileged
users. This is necessary in some environments, notably some Kubernetes
clusters.

To make this possible, we need to manually fix bind-mounted volumes in
docker-compose. This is pretty much equivalent to the behaviour in Kubernetes,
where permissions are fixed at runtime if the volume owner is incorrect. Thus,
we have a consistent behaviour between docker-compose and Kubernetes.

We achieve this by bind-mounting some repos inside "*-permissions" services.
These services run as root user on docker-compose and will fix the required
permissions, as per build/permissions/setowner.sh. These services simply do not
run on Kubernetes, where we don't rely on bind-mounted volumes. There, we make
use of Kubernetes' built-in volume ownership feature.

With this change, we get rid of the "openedx-dev" Docker image, in the sense
that it no longer has its own Dockerfile. Instead, the dev image is now simply
a different target in the multi-stage openedx Docker image. This makes it much
faster to build the openedx-dev image.

Because we declare the APP_USER_ID in the dev/docker-compose.yml file, we need
to pass the user ID from the host there. The only way to achieve that is with a
tutor config variable. The downside of this approach is that the
dev/docker-compose.yml file is no longer portable from one machine to the next.
We consider that this is not such a big issue, as it affects the development
environment only.

We take this opportunity to replace the base image of the "forum" image. There
is now no need to re-install ruby inside the image. The total image size is
only decreased by 10%, but re-building the image is faster.

Close #323.
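
The commit message refers to build/permissions/setowner.sh without showing it. The idea of fixing ownership at runtime only when it is actually wrong can be sketched like this. It is an assumption about the approach, not the contents of the real script; `fix_owner` is a hypothetical name, and GNU `stat` is assumed:

```shell
#!/bin/sh
# Illustrative only (NOT Tutor's setowner.sh): chown a volume recursively
# only when its top-level owner differs from the wanted UID, so that an
# already-correct volume costs a single stat call.
fix_owner() {
    want_uid=$1 path=$2
    have_uid=$(stat -c '%u' "$path") || return 2
    if [ "$have_uid" = "$want_uid" ]; then
        echo "$path already owned by $want_uid"
    else
        echo "changing owner of $path from $have_uid to $want_uid"
        chown -R "$want_uid" "$path"
    fi
}

# Demo on a throwaway directory, which the current user already owns.
demo=$(mktemp -d)
fix_owner "$(id -u)" "$demo"
rmdir "$demo"
```

Checking only the top-level owner is a shortcut: it keeps restarts fast but would miss a volume whose root is correct while nested files are not.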
regisb added a commit that referenced this issue Oct 4, 2021
In order to run the smtp service as non-root, we switch from namshi/smtp to
devture/exim-relay. This change should be backward-compatible.

Note that the nginx container remains privileged. We could switch to
nginxinc/nginx-unprivileged, but it's probably not worth the effort, as we are
considering to get rid of the nginx container altogether.

Close #323.
@regisb
Contributor

regisb commented Oct 4, 2021

@thomas-skillup please have a look at the following PR: #505 I'm curious to hear what you think.
In the end, I did not resort to sudo, because it was unnecessary. I also did not take the time to reformat apt statements -- please feel free to do so in a later PR.

regisb added a commit that referenced this issue Oct 5, 2021
regisb added a commit that referenced this issue Oct 14, 2021
regisb added a commit that referenced this issue Oct 25, 2021
regisb added a commit that referenced this issue Oct 25, 2021
@regisb
Contributor

regisb commented Oct 25, 2021

This should be resolved in the nightly branch, and I expect that it will also be resolved once Maple is released (Dec. 9th). Feel free to reopen if you think otherwise.

And thanks to everyone for your participation!

regisb closed this as completed Oct 25, 2021
regisb added a commit that referenced this issue Dec 20, 2021
regisb added a commit that referenced this issue Dec 20, 2021
9 participants