Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mysterious permissions errors when running test_run.sh twice #59

Closed
chrisvanrun opened this issue Oct 18, 2024 · 21 comments · Fixed by #62
Closed

Mysterious permissions errors when running test_run.sh twice #59

chrisvanrun opened this issue Oct 18, 2024 · 21 comments · Fixed by #62
Assignees

Comments

@chrisvanrun
Copy link
Collaborator

chrisvanrun commented Oct 18, 2024

During the workshop run, @koopmant ran into the problem that running the test script twice resulted in permission errors.

The test_run.sh has a secondary docker run, introduced to specifically target these permissions. Root user v.s. build created user. Hence, it is unclear why this occurred. It's being investigated.

@koopmant
Copy link

Tried again after a fresh reinstall of my system. I'm on Ubuntu 24.04.1 LTS.
Steps I took:

  • install docker desktop
  • clone repo
  • run the test script

Initial run works. Second run stops after:

=+= Cleaning up any earlier output

Removed the -f argument from chmod. Now we see:

chmod: changing permissions of '.../test/output': Operation not permitted

Tried adding my user to the docker group to run docker as non-root and see if that fixes it
sudo usermod -aG docker $USER

Did not fix it…

Also tried taking ownership using chown, but get the same permission issue.

@chrisvanrun
Copy link
Collaborator Author

Ok. So it fails on this part somewhere:

if [ -d "$OUTPUT_DIR" ]; then
  # Ensure permissions are setup correctly
  # This allows for the Docker user to write to this location
  rm -rf "${OUTPUT_DIR}"/*
  chmod -f o+rwx "$OUTPUT_DIR"
else
  mkdir --mode=o+rwx "$OUTPUT_DIR"
fi

What are the permissions/ownership of the output directory right when you clone the repo, and what are they after the first run? ls -alh IIRC

@chrisvanrun
Copy link
Collaborator Author

Oh and while we are at it, what does id -u and id -g return at your command line?

@koopmant
Copy link

koopmant commented Oct 18, 2024

Oh and while we are at it, what does id -u and id -g return at your command line?

both return 1000

output of id command:

uid=1000(thomas) gid=1000(thomas) groups=1000(thomas),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),100(users),114(lpadmin)

@koopmant
Copy link

koopmant commented Oct 18, 2024

After clone I'm not sure, will check
after mkdir I'm the owner (obviously, but I checked regardless)

drwxrwxrwx 2 thomas thomas 4096 Oct 18 14:37 output

after a successfull run, the owner is 100999:

drwxrwxrwx 2 100999 100999 4.0K Oct 18 14:38 output

@chrisvanrun
Copy link
Collaborator Author

Thanks for testing. Huh, 100999 is odd! Given your id is 1000, I would expect that this (at the end of the test_run.sh) fixes things:

docker run --rm \
    --quiet \
    --env HOST_UID=`id -u` \
    --env HOST_GID=`id -g` \
    --volume "$OUTPUT_DIR":/output \
    alpine:latest \
    /bin/sh -c 'chown -R ${HOST_UID}:${HOST_GID} /output'

@koopmant
Copy link

Well... When I remove that section, it actually works just fine?
If I don´t run that section, the owner of the files in output/ is 100998 btw. But the owner of the output folder itself is still me.

@koopmant
Copy link

koopmant commented Oct 18, 2024

I think it has something to do with this; but I don't fully understand it yet
https://docs.docker.com/engine/security/userns-remap/

https://docs.docker.com/desktop/faqs/linuxfaqs/#how-do-i-enable-file-sharing

In this scenario if a shared file is chowned inside a Docker Desktop container owned by a user with a UID of 1000, it shows up on the host as owned by a user with a UID of 100999. This has the unfortunate side effect of preventing easy access to such a file on the host. The problem is resolved by creating a group with the new GID and adding our user to it, or by setting a recursive ACL (see setfacl(1)) for folders shared with the Docker Desktop VM.

@koopmant koopmant changed the title Mysterious permissions errors when running test_user.sh twice Mysterious permissions errors when running test_run.sh twice Oct 18, 2024
@chrisvanrun
Copy link
Collaborator Author

Current suspect: with set -e enabled, if a docker run has an error, the permission setter is not run. Should catch it in an exit so it always runs.

@koopmant
Copy link

disabling set -e does not change anything on my side. permission issue persists.

@chrisvanrun
Copy link
Collaborator Author

@koopmant could you try a test_run.sh with this:

#!/usr/bin/env bash

# Stop at first error
set -e

SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
DOCKER_TAG="example-algorithm"
DOCKER_NOOP_VOLUME="${DOCKER_TAG}-volume"

INPUT_DIR="${SCRIPT_DIR}/test/input"
OUTPUT_DIR="${SCRIPT_DIR}/test/output"


cleanup() {
    echo "=+=  Cleaning up ..."
    # Ensure permissions are set correctly on the output
    # This allows the host user (e.g. you) to access and handle these files
    docker run --rm \
      --quiet \
      --env HOST_UID=`id -u` \
      --env HOST_GID=`id -g` \
      --volume "$OUTPUT_DIR":/output \
      alpine:latest \
      /bin/sh -c 'chown -R ${HOST_UID}:${HOST_GID} /output'
}

trap cleanup EXIT

if [ -d "$OUTPUT_DIR" ]; then
  # Ensure permissions are setup correctly
  # This allows for the Docker user to write to this location
  cleanup

  rm -rf "${OUTPUT_DIR}"/*
  chmod -f o+rwx "$OUTPUT_DIR"
else
  mkdir -m o+rwx "$OUTPUT_DIR"
fi


echo "=+= (Re)build the container"
docker build "$SCRIPT_DIR" \
  --platform=linux/amd64 \
  --tag $DOCKER_TAG 2>&1


echo "=+= Doing a forward pass"
## Note the extra arguments that are passed here:
# '--network none'
#    entails there is no internet connection
# '--volume <NAME>:/tmp'
#   is added because on Grand Challenge this directory cannot be used to store permanent files
docker volume create "$DOCKER_NOOP_VOLUME" > /dev/null

docker run --rm \
    --platform=linux/amd64 \
    --network none \
    --volume "$INPUT_DIR":/input:ro \
    --volume "$OUTPUT_DIR":/output \
    --volume "$DOCKER_NOOP_VOLUME":/tmp \
    $DOCKER_TAG

docker volume rm "$DOCKER_NOOP_VOLUME" > /dev/null

echo "=+= Wrote results to ${OUTPUT_DIR}"

echo "=+= Save this image for uploading via ./save.sh"

@koopmant
Copy link

output on second run:

=+= Cleaning up ...
=+= Cleaning up ...

@chrisvanrun
Copy link
Collaborator Author

Was that all the output? Not sure if it worked from your comment. Sorry.

@koopmant
Copy link

koopmant commented Oct 18, 2024

Sorry, that was not very clear. Yes, that's all the output; so it's not working. It runs cleanup first, then runs into the permission error again during chmod (leading to immediate exit) and then runs cleanup again on exit.

@chrisvanrun
Copy link
Collaborator Author

No problem! @amickan also ran into a similar looking problem. Hower above solution worked for her. Now I think it's possible that was from a different origin (the set -e origin). Yours likely stems from a slightly different ACL setup on Ubuntu 24.

Albeit if you temp fix it via sudo, forcing ownership to your own user, does it return running the test_run.sh twice?

I am hesitant to add calls like this, because it really kicks the interoperability of the script down a few nodges.

# Set the ACL recursively (-R) for the specified user and group
setfacl -R -m u:$USER:rwx -m g:$GROUP:rwx $DIRECTORY

# Set default ACL for future files and directories (so new files inherit the ACL)
setfacl -R -d -m u:$USER:rwx -m g:$GROUP:rwx $DIRECTORY

@koopmant
Copy link

Albeit if you temp fix it via sudo, forcing ownership to your own user, does it return running the test_run.sh twice?

Sorry, I don't quite understand what you mean.

FWIW, if you change this line:
/bin/sh -c 'chown -R ${HOST_UID}:${HOST_GID} /output'
to
/bin/sh -c 'chown -R ${HOST_UID}:${HOST_GID} /output/*'
the issue of running the script again is fixed. On my system, the ownership is then still on another user id, so I cannot open the file without providing root authentication; but this should still work on other systems.

the line
chmod o+rwx "$OUTPUT_DIR"
is then unneccessary, because the directory then never changed ownership. (But it doesn´t raise an issue.)

@chrisvanrun
Copy link
Collaborator Author

Sorry, let me clarify: if you fix ownership once. Does the proposed test_run.sh then keep working, even if ran multiple times?

The initial chmod is meant to ensure that the Docker's internal user can actually write to the output directory in the first place. The second chown is to ensure that the host user has full access to the files and is allowed to change any and all permission on it.

@koopmant
Copy link

Sorry, let me clarify: if you fix ownership once. Does the proposed test_run.sh then keep working, even if ran multiple times?

Yes, that sort of works. The chown attempted in the last docker command then fails with a permission error. So I remain the owner of the directory and the rest of the script is successfull. Still, that means the current user does not become owner of the files in the output directory. But at least the script runs and creates output.

@chrisvanrun
Copy link
Collaborator Author

@koopmant , let's discuss this during out meet-meeting. =)

@chrisvanrun
Copy link
Collaborator Author

Hint for fix, there seems to be a new mapping in some Linux distros (or new Docker Engine, dunno) that maps internal docker uids to system ones.

docker run --rm --env HOST_UID=1000 --env HOST_GID=1000 --volume /home/thomas/code/rse-workshop-2024/evaluation-methods/test/output:/output --entrypoint /bin/sh example-evaluation -c "chmod o+rwX /output/*"

@chrisvanrun
Copy link
Collaborator Author

See fix in: DIAGNijmegen/rse-workshop-2024@5cae844

Should be replicated to forge.

@chrisvanrun chrisvanrun assigned chrisvanrun and unassigned koopmant Oct 28, 2024
chrisvanrun added a commit that referenced this issue Oct 31, 2024
* Fix some pointers to resources

* Fix permissions setting on test_runs #59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants