Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement run-hooks as a separate script #1979

Merged
merged 4 commits into from
Aug 24, 2023

Conversation

mathbunnyru
Copy link
Member

@mathbunnyru mathbunnyru commented Aug 24, 2023

Describe your changes

Issue ticket if applicable

Work on: #1478

Checklist (especially for first-time contributors)

  • I have performed a self-review of my code
  • If it is a core feature, I have added thorough tests
  • I will try not to use force-push to make the review process easier for reviewers
  • I have updated the documentation for significant changes

@mathbunnyru
Copy link
Member Author

@benz0li take a look, please.

@benz0li
Copy link
Contributor

benz0li commented Aug 24, 2023

@benz0li take a look, please.

@mathbunnyru Looks good!

Tested (on macOS using Docker Desktop) with a local build including your modifications:

Create home directory

docker run --rm \
  -v "${PWD}/jupyterlab-benz0li":/dummy \
  alpine chown 1000:100 /dummy

Run container

docker run -it --rm \
  -p 8888:8888 \
  -u root \
  -v "${PWD}/jupyterlab-benz0li":/home/benz0li \
  -e NB_USER=benz0li \
  -e NB_UID=$(id -u) \
  -e NB_GID=$(id -g) \
  -e CHOWN_HOME=yes \
  -e CHOWN_HOME_OPTS='-R' \
  jupyterlab/qgis/base:3.32.2
Entered start.sh with args: jupyter lab
Running hooks in: /usr/local/bin/start-notebook.d as uid: 0 gid: 0
Sourcing shell script: /usr/local/bin/start-notebook.d/populate.sh
Done running hooks in: /usr/local/bin/start-notebook.d
Updated the jovyan user:
- username: jovyan       -> benz0li
- home dir: /home/jovyan -> /home/benz0li
Update benz0li's UID:GID to 501:20
userdel: group benz0li not removed because it is not the primary group of user benz0li.
useradd warning: benz0li's uid 501 outside of the UID_MIN 1000 and UID_MAX 60000 range.
Populating home dir /home/benz0li...
Success!
Changing working directory to /home/benz0li/
Ensuring /home/benz0li is owned by 501:20 (chown options: -R)
Running hooks in: /usr/local/bin/before-notebook.d as uid: 0 gid: 0
Sourcing shell script: /usr/local/bin/before-notebook.d/init.sh
Sourcing shell script: /usr/local/bin/before-notebook.d/limits.sh
Done running hooks in: /usr/local/bin/before-notebook.d
Running as benz0li: jupyter lab
[...]

@mathbunnyru
Copy link
Member Author

@benz0li thanks 👍

I think it would be nice if your images somehow were able to update scripts like this, so we have fewer differences and these differences are easier to find. I don't know if it's easy for you or not though.

@benz0li
Copy link
Contributor

benz0li commented Aug 24, 2023

Very few additional modifications to yours:

diff --git a/base/scripts/usr/local/bin/start.sh.orig b/base/scripts/usr/local/bin/start.sh
index d8b97bd..eb62fc7 100755
--- a/base/scripts/usr/local/bin/start.sh.orig
+++ b/base/scripts/usr/local/bin/start.sh
@@ -53,6 +53,8 @@ if [ "$(id -u)" == 0 ] ; then
     # - CHOWN_HOME: a boolean ("1" or "yes") to chown the user's home folder
     # - CHOWN_EXTRA: a comma separated list of paths to chown
     # - CHOWN_HOME_OPTS / CHOWN_EXTRA_OPTS: arguments to the chown commands
+    # - CHMOD_HOME: a boolean ("1" or "yes") to chmod the user's home folder
+    # - CHMOD_HOME_MODE: mode argument to the chmod command
 
     # Refit the jovyan user to the desired the user (NB_USER)
     if id jovyan &> /dev/null ; then
@@ -75,7 +77,7 @@ if [ "$(id -u)" == 0 ] ; then
         fi
         # Recreate the desired user as we want it
         userdel "${NB_USER}"
-        useradd --no-log-init --home "/home/${NB_USER}" --uid "${NB_UID}" --gid "${NB_GID}" --groups 100 "${NB_USER}"
+        useradd --no-log-init --home "/home/${NB_USER}" --shell "$(which zsh)" --uid "${NB_UID}" --gid "${NB_GID}" --groups 100 "${NB_USER}"
     fi
 
     # Move or symlink the jovyan home directory to the desired users home
@@ -97,12 +99,22 @@ if [ "$(id -u)" == 0 ] ; then
                     exit 1
                 fi
             fi
+        # The home directory could be bind mounted. Populate it if it is empty
+        elif [[ "$(ls -A "/home/${NB_USER}" 2> /dev/null)" == "" ]]; then
+            _log "Populating home dir /home/${NB_USER}..."
+            if cp -a /home/jovyan/. "/home/${NB_USER}/"; then
+                _log "Success!"
+            else
+                _log "Failed to copy data from /home/jovyan to /home/${NB_USER}!"
+                exit 1
+            fi
         fi
         # Ensure the current working directory is updated to the new path
         if [[ "${PWD}/" == "/home/jovyan/"* ]]; then
             new_wd="/home/${NB_USER}/${PWD:13}"
             _log "Changing working directory to ${new_wd}"
             cd "${new_wd}"
+            export CODE_WORKDIR=/home/${NB_USER}/projects
         fi
     fi
 
@@ -120,6 +132,10 @@ if [ "$(id -u)" == 0 ] ; then
             chown ${CHOWN_EXTRA_OPTS} "${NB_UID}:${NB_GID}" "${extra_dir}"
         done
     fi
+    # Optionally change the mode of the user's home folder
+    if [[ "${CHMOD_HOME}" == "1" || "${CHMOD_HOME}" == "yes" ]]; then
+        chmod "${CHMOD_HOME_MODE:-755}" "/home/${NB_USER}"
+    fi
 
     # Update potentially outdated environment variables since image build
     export XDG_CACHE_HOME="/home/${NB_USER}/.cache"
@@ -140,6 +156,7 @@ if [ "$(id -u)" == 0 ] ; then
     unset_explicit_env_vars
     _log "Running as ${NB_USER}:" "${cmd[@]}"
     exec sudo --preserve-env --set-home --user "${NB_USER}" \
+        LD_LIBRARY_PATH="${LD_LIBRARY_PATH:-}" \
         PATH="${PATH}" \
         PYTHONPATH="${PYTHONPATH:-}" \
         "${cmd[@]}"
@@ -196,7 +213,7 @@ else
             # We cannot use "sed --in-place" since sed tries to create a temp file in
             # /etc/ and we may not have write access. Apply sed on our own temp file:
             sed --expression="s/^jovyan:/nayvoj:/" /etc/passwd > /tmp/passwd
-            echo "${NB_USER}:x:$(id -u):$(id -g):,,,:/home/jovyan:/bin/bash" >> /tmp/passwd
+            echo "${NB_USER}:x:$(id -u):$(id -g):,,,:/home/jovyan:$(which zsh)" >> /tmp/passwd
             cat /tmp/passwd > /etc/passwd
             rm /tmp/passwd
 

@benz0li
Copy link
Contributor

benz0li commented Aug 24, 2023

I think the only part1 that is relevant comes after The home directory could be bind mounted. Populate it if it is empty.
ℹ️ If the directory is not empty, the script fails.

I have no intention of syncing any files or folders with a pre-populated directory between the host and container.

Footnotes

  1. All other parts were specifically introduced for my images.

@benz0li
Copy link
Contributor

benz0li commented Aug 24, 2023

And the populate.sh script is required at /usr/local/bin/start-notebook.d in case someone bind mounts to /home/jovyan.

@benz0li
Copy link
Contributor

benz0li commented Aug 24, 2023

@mathbunnyru
Copy link
Member Author

mathbunnyru commented Aug 24, 2023

I think this PR is ready to go, so I will merge it.

@benz0li I have a few questions though if you don't mind:

  1. Populate it if it is empty is a bit different from what I would like to do (copy from backup without rewrites).
    Is it a matter of personal choice or do you see some issues with the approach I would like to implement?
  2. Why do you need the duplication of the restore step - both in start.sh from /home/jovyan/ and in populate.sh from backup-dir? I mean, why can't you always restore for the backup-dir?

And thank you for :

  1. Reminding me about specifying --shell for useradd. I recently changed useradd to use long options and noticed this useradd lacks specifying shell (the one in Dockerfile does it correctly).
  2. Showing an issue with LD_LIBRARY_PATH - it definitely makes sense, thanks! I think people might install some software on top of our image which requires changing LD_LIBRARY_PATH and then if they use another username, it won't work.

@mathbunnyru mathbunnyru merged commit 74bbd0b into jupyter:main Aug 24, 2023
@benz0li
Copy link
Contributor

benz0li commented Aug 25, 2023

  1. Populate it if it is empty is a bit different from what I would like to do (copy from backup without rewrites).
    Is it a matter of personal choice or do you see some issues with the approach I would like to implement?

This is more of a personal choice. My point of view: Only the original content [from the image] ensures that the container works as intended.

  1. Why do you need the duplication of the restore step - both in start.sh from /home/jovyan/ and in populate.sh from backup-dir? I mean, why can't you always restore for the backup-dir?
  1. Populating /home/jovyan/ in populate.sh with the start-notebook.d hook
    • Use case: docker run -it --rm -p 8888:8888 -v "${PWD}/<empty-dir>":/home/jovyan jupyter/base-notebook
      This ensures that /home/jovyan/ has the original content before
      # If the container started as the root user, then we have permission to refit
      # the jovyan user, and ensure file permissions, grant sudo rights, and such
      # things before we run the command passed to start.sh as the desired user
      # (NB_USER).
      #
      if [ "$(id -u)" == 0 ] ; then
      i.e. anything else.
  2. Populating /home/${NB_USER}/ in start.sh

ℹ️ This way start.sh treats a bind mount similar to how Docker treats a volume:

If you start a container which creates a new volume, and the container has files or directories in the directory to be mounted such as /app/, Docker copies the directory's contents into the volume. The container then mounts and uses the volume, and other containers which use the volume also have access to the pre-populated content.

Volumes | Docker Docs > [...] > Populate a volume using a container

👉 This happens only at the first run and only if the volume is empty.

@benz0li
Copy link
Contributor

benz0li commented Aug 25, 2023

@mathbunnyru elif [[ "$(ls -A "/home/${NB_USER}" 2> /dev/null)" == "" ]]; then also ensures that the home directory remains untouched on subsequent runs.

@benz0li
Copy link
Contributor

benz0li commented Aug 25, 2023

Very simple, entirely sane and the outcome easily predictable.

I might extend to

# The home directory could be bind mounted. Populate it if it is empty
elif [[ "$(ls -A "/home/${NB_USER}" 2> /dev/null)" == "" ]]; then
    _log "Populating home dir /home/${NB_USER}..."
    if cp -a /home/jovyan/. "/home/${NB_USER}/"; then
        _log "Success!"
    else
        _log "ERROR: Failed to copy data from /home/jovyan to /home/${NB_USER}!"
        _log "       The home directory must be empty at the first run!"
        exit 1
    fi
fi

i.e. improving the error message on failure. Also for populate.sh.

@benz0li
Copy link
Contributor

benz0li commented Aug 25, 2023

Cross reference: b-data/jupyterlab-python-docker-stack#1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants