Minor Docker fixes & podman "rootless" container support (see comments) #2722

Closed
wants to merge 14 commits into from
26 changes: 11 additions & 15 deletions docker/Dockerfile
@@ -4,7 +4,7 @@ ARG PYTHON_VERSION=3.9
##################
## base image ##
##################
FROM python:${PYTHON_VERSION}-slim AS python-base
FROM docker.io/python:${PYTHON_VERSION}-slim AS python-base

LABEL org.opencontainers.image.authors="[email protected]"

@@ -13,10 +13,7 @@ RUN rm -f /etc/apt/apt.conf.d/docker-clean \
&& echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' >/etc/apt/apt.conf.d/keep-cache

# Install necessary packages
RUN \
--mount=type=cache,target=/var/cache/apt,sharing=locked \
--mount=type=cache,target=/var/lib/apt,sharing=locked \
apt-get update \
RUN apt-get update \
Contributor

Why did you remove the build cache?

This question of course also applies to every other removal of the build cache; I'm just not adding the same comment x times 😅

Contributor Author

See the notes above and in f70fb02: it breaks podman, unfortunately.

&& apt-get install -y \
--no-install-recommends \
libgl1-mesa-glx=20.3.* \
@@ -34,17 +31,17 @@ ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
# don't fall back to legacy build system
ENV PIP_USE_PEP517=1
# set env for next stages
ENV APPDIR=${APPDIR}
ENV APPNAME=${APPNAME}

#######################
## build pyproject ##
#######################
FROM python-base AS pyproject-builder

# Install dependencies
RUN \
--mount=type=cache,target=/var/cache/apt,sharing=locked \
--mount=type=cache,target=/var/lib/apt,sharing=locked \
apt-get update \
RUN apt-get update \
&& apt-get install -y \
--no-install-recommends \
build-essential=12.9 \
@@ -57,18 +54,16 @@ ENV PIP_CACHE_DIR ${PIP_CACHE_DIR}
RUN mkdir -p ${PIP_CACHE_DIR}

# create virtual environment
RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
python3 -m venv "${APPNAME}" \
RUN python3 -m venv "${APPNAME}" \
--upgrade-deps

# copy sources
COPY --link . .
Contributor

why do you remove the --link?

Contributor Author

See f70fb02. Breaks podman. I hate it too :(

COPY . .

# install pyproject.toml
ARG PIP_EXTRA_INDEX_URL
ENV PIP_EXTRA_INDEX_URL ${PIP_EXTRA_INDEX_URL}
RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
"${APPNAME}/bin/pip" install .
RUN "${APPNAME}/bin/pip" install .

# build patchmatch
RUN python3 -c "from patchmatch import patch_match"
@@ -84,7 +79,8 @@ RUN useradd \
--no-log-init \
-m \
-U \
"${UNAME}"
"${UNAME}" \
-u 1000
Contributor

Why do we need to define the UID? Isn't it enough to have a user group created with the same name as the user?

Contributor Author

Not for podman. The way podman works with rootless containers is that it uses another uid that is distinct from the uid of the user running the container on the host. In this case, the uid/gid of 1000:1000 in the container is actually something like 100999:100999 in the mounts/volumes. Or at least that's what you WANT it to be. So I needed to explicitly set the uid when the account is created so that the container can run as that user correctly. (I could be wrong and there might be another way to do this, but if there is I don't know it. I could run invokeai in the container as the "root" user rather than as appuser, at which point there would no longer be file access issues, but this seems like a bad idea.)
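
To make the mapping concrete, here is a minimal sketch of the arithmetic, assuming the usual rootless layout where container uid 0 maps to the invoking host user and higher container uids map into that user's subordinate uid range. The function name and the range start of 100000 are illustrative assumptions; the real range comes from /etc/subuid on the host.

```shell
# Hypothetical helper: translate a container uid into the host uid it
# would have under the assumed rootless mapping.
#   $1 = uid inside the container
#   $2 = uid of the host user running the container
#   $3 = first uid of that user's subordinate range (assumed, see /etc/subuid)
map_container_uid() {
  if [ "$1" -eq 0 ]; then
    # container root is the invoking host user
    echo "$2"
  else
    # container uid N (N >= 1) lands at subuid_start + N - 1
    echo $(( $3 + $1 - 1 ))
  fi
}

# appuser (1000) in the container, host user 1000, subuid range from 100000
map_container_uid 1000 1000 100000   # prints 100999
```

With these assumed numbers the result matches the 100999 mentioned above; a different subuid range on the host would give a different number.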

Contributor

I just built the current main branch locally and executed some commands:

./docker/build.sh
Activated virtual environment: /Users/mauwii/git/mauwii/InvokeAI/.venv
You are using these values:

Dockerfile:		./Dockerfile
index-url:		https://download.pytorch.org/whl/cpu
Volumename:		invokeai_data
Platform:		linux/arm64
Container Registry:	ghcr.io
Container Repository:	mauwii/invokeai
Container Tag:		main-cpu
Container Flavor:	cpu
Container Image:	ghcr.io/mauwii/invokeai:main-cpu

Volume already exists

[+] Building 178.5s (23/23) FINISHED
 => [internal] load build definition from Dockerfile                                                                                                                                 0.0s
 => => transferring dockerfile: 2.77kB                                                                                                                                               0.0s
 => [internal] load .dockerignore                                                                                                                                                    0.0s
 => => transferring context: 35B                                                                                                                                                     0.0s
 => resolve image config for docker.io/docker/dockerfile:1                                                                                                                           2.3s
 => [auth] docker/dockerfile:pull token for registry-1.docker.io                                                                                                                     0.0s
 => docker-image://docker.io/docker/dockerfile:1@sha256:39b85bbfa7536a5feceb7372a0817649ecb2724562a38360f4d6a7782a409b14                                                             3.8s
 => => resolve docker.io/docker/dockerfile:1@sha256:39b85bbfa7536a5feceb7372a0817649ecb2724562a38360f4d6a7782a409b14                                                                 0.0s
 => => sha256:39b85bbfa7536a5feceb7372a0817649ecb2724562a38360f4d6a7782a409b14 8.40kB / 8.40kB                                                                                       0.0s
 => => sha256:7f44e51970d0422c2cbff3b20b6b5ef861f6244c396a06e1a96f7aa4fa83a4e6 482B / 482B                                                                                           0.0s
 => => sha256:a28edb2041b8f23c38382d8be273f0239f51ff1f510f98bccc8d0e7f42249e97 2.90kB / 2.90kB                                                                                       0.0s
 => => sha256:9d0cd65540a143ce38aa0be7c5e9efeed30d3580d03667f107cd76354f2bee65 10.82MB / 10.82MB                                                                                     3.1s
 => => extracting sha256:9d0cd65540a143ce38aa0be7c5e9efeed30d3580d03667f107cd76354f2bee65                                                                                            0.6s
 => [internal] load .dockerignore                                                                                                                                                    0.0s
 => [internal] load build definition from Dockerfile                                                                                                                                 0.0s
 => [internal] load metadata for docker.io/library/python:3.9-slim                                                                                                                   0.0s
 => [internal] load build context                                                                                                                                                    0.1s
 => => transferring context: 3.36MB                                                                                                                                                  0.0s
 => [python-base 1/4] FROM docker.io/library/python:3.9-slim                                                                                                                         0.0s
 => CACHED [python-base 2/4] RUN rm -f /etc/apt/apt.conf.d/docker-clean   && echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' >/etc/apt/apt.conf.d/keep-cache               0.0s
 => CACHED [python-base 3/4] RUN   --mount=type=cache,target=/var/cache/apt,sharing=locked   --mount=type=cache,target=/var/lib/apt,sharing=locked   apt-get update   && apt-get in  0.0s
 => CACHED [python-base 4/4] WORKDIR /usr/src                                                                                                                                        0.0s
 => CACHED [pyproject-builder 1/6] RUN   --mount=type=cache,target=/var/cache/apt,sharing=locked   --mount=type=cache,target=/var/lib/apt,sharing=locked   apt-get update   && apt-  0.0s
 => CACHED [pyproject-builder 2/6] RUN mkdir -p /var/cache/buildkit/pip                                                                                                              0.0s
 => CACHED [pyproject-builder 3/6] RUN --mount=type=cache,target=/var/cache/buildkit/pip,sharing=locked   python3 -m venv "InvokeAI"   --upgrade-deps                                0.0s
 => [pyproject-builder 4/6] COPY --link . .                                                                                                                                          0.1s
 => [pyproject-builder 5/6] RUN --mount=type=cache,target=/var/cache/buildkit/pip,sharing=locked   "InvokeAI/bin/pip" install .                                                    148.7s
 => [pyproject-builder 6/6] RUN python3 -c "from patchmatch import patch_match"                                                                                                      3.9s
 => CACHED [runtime 1/3] RUN useradd   --no-log-init   -m   -U   "appuser"                                                                                                           0.0s
 => CACHED [runtime 2/3] RUN mkdir -p "/data"   && chown -R "appuser" "/data"                                                                                                        0.0s
 => [runtime 3/3] COPY --chown=appuser --from=pyproject-builder /usr/src/InvokeAI InvokeAI                                                                                          11.0s
 => exporting to image                                                                                                                                                               4.8s
 => => exporting layers                                                                                                                                                              4.7s
 => => writing image sha256:d274038b0dd470a06f4bcfb8da22fb1fbe071c73ca947d96ef82c5e346dbf62b                                                                                         0.0s
 => => naming to ghcr.io/mauwii/invokeai:main-cpu                                                                                                                                    0.0s

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
 ~/git/mauwii/InvokeAI   main ±  docker run --rm --interactive --tty --entrypoint=/bin/bash ghcr.io/mauwii/invokeai:main-cpu
appuser@299ed35c86f9:/usr/src$ id -u
1000
appuser@299ed35c86f9:/usr/src$ id -g
1000
appuser@299ed35c86f9:/usr/src$ whoami
appuser
appuser@299ed35c86f9:/usr/src$ apt-get update
Reading package lists... Done
E: List directory /var/lib/apt/lists/partial is missing. - Acquire (13: Permission denied)
appuser@299ed35c86f9:/usr/src$ sudo apt-get update
bash: sudo: command not found
appuser@299ed35c86f9:/usr/src$
  • no sudo for building the container
  • appuser already has uid 1000
  • no sudo inside the container

Contributor Author

Inside the container the uid/gid is 1000, same as with podman. But if this is rootless docker, try to touch /data/outputs/testfile inside the container, then jump out and look at the uid/gid of the file in ./outputs. With podman, it's something other than 1000, like some big number. I believe it's the same with rootless docker, which is why you'd use newuidmap and newgidmap. I don't know much about rootless docker, though.
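
A rough way to do the host-side half of that check, assuming GNU coreutils `stat` (the `-c` format flag is not available in macOS/BSD `stat`). The helper name `owner_ids` is made up for illustration, and a temporary file stands in for ./outputs/testfile here.

```shell
# Hypothetical helper: print the numeric owner of a path as "uid:gid".
# Assumes GNU stat; BSD/macOS stat uses a different format flag.
owner_ids() {
  stat -c '%u:%g' "$1"
}

# Stand-in for ./outputs/testfile, which would be created by running
# `touch /data/outputs/testfile` inside the container.
testfile=$(mktemp)
owner_ids "$testfile"
rm -f "$testfile"
```

Comparing that output with `id -u` and `id -g` on the host shows whether the file ended up owned by the host user or by a remapped id.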

Contributor Author

Also, just out of curiosity, are you running docker run from user 1000/1000? What happens if you run it rootlessly from 1001/1001? Will that affect the file ownership of files you create?

Contributor

Since /data/outputs is mounted as a bind mount, on my local file system the file ownership is set to my current user (501:20 / mauwii:staff), while in the container the files are mounted with ownership set to 1000:1000 / appuser:appuser.

Contributor Author

Huh... that's very different from rootless podman, where multiple users running in the container would each have their own uid/gid. If you added a second user, 1001:1001, in the container and created a file, would it still appear as 501:20 outside the container?

Contributor

Why should it be different from creating a new user with 1000:1000?

Member

@fat-tire is correct: if the user in the container is 1000, then in general the files it creates on the bind-mounted volume will also be owned by uid=1000. Is it possible that docker on mac changes ownership to the current user for convenience? If so, that's not standard or generally expected behaviour.


# create volume directory
ARG VOLUME_DIR=/data
26 changes: 22 additions & 4 deletions docker/build.sh
@@ -22,6 +22,7 @@ DOCKERFILE=${INVOKE_DOCKERFILE:-./Dockerfile}

# print the settings
echo -e "You are using these values:\n"
echo -e "Container engine:\t${CONTAINER_ENGINE}"
echo -e "Dockerfile:\t\t${DOCKERFILE}"
echo -e "index-url:\t\t${PIP_EXTRA_INDEX_URL:-none}"
echo -e "Volumename:\t\t${VOLUMENAME}"
@@ -32,20 +33,37 @@ echo -e "Container Tag:\t\t${CONTAINER_TAG}"
echo -e "Container Flavor:\t${CONTAINER_FLAVOR}"
echo -e "Container Image:\t${CONTAINER_IMAGE}\n"

# Create outputs directory if it does not exist
[[ -d ./outputs ]] || mkdir ./outputs

# Create docker volume
if [[ -n "$(docker volume ls -f name="${VOLUMENAME}" -q)" ]]; then
if [[ -n "$(${CONTAINER_ENGINE} volume ls -f name="${VOLUMENAME}" -q)" ]]; then # Note: newer versions of podman: podman volume exists ${VOLUMENAME}
echo -e "Volume already exists\n"
else
echo -n "creating docker volume "
docker volume create "${VOLUMENAME}"
echo -n "creating ${CONTAINER_ENGINE} volume "
"${CONTAINER_ENGINE}" volume create "${VOLUMENAME}"
fi

# Build Container
DOCKER_BUILDKIT=1 docker build \
DOCKER_BUILDKIT=1 "${CONTAINER_ENGINE}" build \
--platform="${PLATFORM:-linux/amd64}" \
--tag="${CONTAINER_IMAGE:-invokeai}" \
${CONTAINER_FLAVOR:+--build-arg="CONTAINER_FLAVOR=${CONTAINER_FLAVOR}"} \
${PIP_EXTRA_INDEX_URL:+--build-arg="PIP_EXTRA_INDEX_URL=${PIP_EXTRA_INDEX_URL}"} \
${PYTHON_VERSION:+--build-arg="PYTHON_VERSION=${PYTHON_VERSION}"} \
${PIP_PACKAGE:+--build-arg="PIP_PACKAGE=${PIP_PACKAGE}"} \
--file="${DOCKERFILE}" \
..

# Podman only: set ownership for user 1000:1000 (appuser) the right way
if [[ ${CONTAINER_ENGINE} == "podman" ]] ; then
echo Setting ownership for container\'s appuser on /data and /data/outputs
podman run \
--mount type=volume,src="${VOLUMENAME}",target=/data \
--user root --entrypoint "/bin/chown" "${CONTAINER_IMAGE:-invokeai}" \
-R 1000:1000 /data
podman run \
--mount type=bind,source="$(pwd)"/outputs,target=/data/outputs \
--user root --entrypoint "/bin/chown" "${CONTAINER_IMAGE:-invokeai}" \
-R 1000:1000 /data/outputs
fi
20 changes: 17 additions & 3 deletions docker/env.sh
@@ -2,6 +2,12 @@

# This file is used to set environment variables for the build.sh and run.sh scripts.

# docker is the default container engine, but should work with "podman" (rootless) too!
CONTAINER_ENGINE=${CONTAINER_ENGINE:-docker}

# use python v3.9 by default
PYTHON_VERSION=${PYTHON_VERSION:-3.9}

# Try to detect the container flavor if no PIP_EXTRA_INDEX_URL got specified
if [[ -z "$PIP_EXTRA_INDEX_URL" ]]; then

@@ -13,10 +19,10 @@ if [[ -z "$PIP_EXTRA_INDEX_URL" ]]; then
fi

# Decide which container flavor to build if not specified
if [[ -z "$CONTAINER_FLAVOR" ]] && python -c "import torch" &>/dev/null; then
if [[ -z "$CONTAINER_FLAVOR" ]] && python3 -c "import torch" &>/dev/null; then
# Check for CUDA and ROCm
CUDA_AVAILABLE=$(python -c "import torch;print(torch.cuda.is_available())")
ROCM_AVAILABLE=$(python -c "import torch;print(torch.version.hip is not None)")
CUDA_AVAILABLE=$(python3 -c "import torch;print(torch.cuda.is_available())")
ROCM_AVAILABLE=$(python3 -c "import torch;print(torch.version.hip is not None)")
if [[ "${CUDA_AVAILABLE}" == "True" ]]; then
CONTAINER_FLAVOR="cuda"
elif [[ "${ROCM_AVAILABLE}" == "True" ]]; then
@@ -41,6 +47,14 @@ REPOSITORY_NAME="${REPOSITORY_NAME-$(basename "$(git rev-parse --show-toplevel)"
REPOSITORY_NAME="${REPOSITORY_NAME,,}"
VOLUMENAME="${VOLUMENAME-"${REPOSITORY_NAME}_data"}"
ARCH="${ARCH-$(uname -m)}"
if [ $ARCH == "aarch64" ]
then
ARCH=arm64
fi
if [ $ARCH == "x86_64" ]
Contributor

I had no problems with either aarch64 or x86_64 (tested on an M1 and an i7)

Contributor Author

Not sure why, but with either of those arch names the repository could not be found until I changed them to the ones displayed on the docker hub site. Also, on podman you have to specify docker.io, fwiw.
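
For reference, the renaming boils down to translating `uname -m` output into the architecture names used in image platform strings. A minimal sketch; the function name `normalize_arch` is hypothetical, and only the two names discussed here are mapped:

```shell
# Hypothetical helper: map a `uname -m` style name to the architecture
# name used in platform strings such as linux/arm64 or linux/amd64.
normalize_arch() {
  case "$1" in
    aarch64) echo "arm64" ;;
    x86_64)  echo "amd64" ;;
    *)       echo "$1" ;;   # anything else is passed through unchanged
  esac
}

ARCH="$(normalize_arch "$(uname -m)")"
echo "linux/${ARCH}"
```

A `case` statement keeps the mapping in one place and avoids unquoted `[ $ARCH == ... ]` tests.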

Contributor

The missing docker.io in the base image tag is totally my fault, and the Dockerfile is much "cleaner" with the registry prepended to the base-image tag 😅

Contributor Author

heh sounds good

then
ARCH=amd64
fi
PLATFORM="${PLATFORM-linux/${ARCH}}"
INVOKEAI_BRANCH="${INVOKEAI_BRANCH-$(git branch --show)}"
CONTAINER_REGISTRY="${CONTAINER_REGISTRY-"ghcr.io"}"
16 changes: 10 additions & 6 deletions docker/run.sh
@@ -8,27 +8,31 @@ cd "$SCRIPTDIR" || exit 1

source ./env.sh

# Create outputs directory if it does not exist
[[ -d ./outputs ]] || mkdir ./outputs

echo -e "You are using these values:\n"
echo -e "Container engine:\t${CONTAINER_ENGINE}"
echo -e "Volumename:\t${VOLUMENAME}"
echo -e "Invokeai_tag:\t${CONTAINER_IMAGE}"
echo -e "local Models:\t${MODELSPATH:-unset}\n"

docker run \
if [[ "${CONTAINER_ENGINE}" == "podman" ]]; then
PODMAN_ARGS="--user=appuser:appuser"
unset PLATFORM #causes problems
Contributor

If PLATFORM causes problems, then maybe podman is not buildkit compatible?

Contributor Author

It looks like podman is still behind when it comes to buildkit. The good news is that podman 4.1.1 has caching mount support, for example. The bad news is that debian 11 includes 3.0.1, and even Ubuntu 22.10 (the latest release) uses podman version 3.4.4. The author of that article even suggests using actual buildkit with podman, then says wait, never mind, it doesn't work very well.

(Since security is a primary concern for you, I recommend trying podman: the container is run and managed by a local user, and the container's user is ALSO a local user. So even if someone breaks out of the container's local user to root, and then breaks out of the container entirely, they're STILL constrained within a user process.)
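
Given those version numbers, a script could check the installed podman before relying on cache mounts. A minimal sketch, assuming GNU `sort -V` is available; `version_ge` is a hypothetical helper, and the 4.1.1 threshold is simply the version quoted above, not something verified here:

```shell
# Hypothetical helper: succeed if version $1 is >= version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# The three versions mentioned in the comment above.
for v in 3.0.1 3.4.4 4.1.1; do
  if version_ge "$v" 4.1.1; then
    echo "podman $v: cache mounts expected to work"
  else
    echo "podman $v: cache mounts not expected to work"
  fi
done
```

In a real script the version would come from the installed podman rather than a fixed list.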

Contributor

I run docker rootless, and the user in the container runtime doesn't have root permissions either 🙈

Please check whether your problems are resolved when pulling the built image from https://docs.docker.com/engine/security/rootless/, which would be much better than removing all those features from the Dockerfile 😅

Contributor Author

Okay I'll try it. I'm also adding a commit with everything discussed so far that I can change and not break the build.

fi

"${CONTAINER_ENGINE}" run \
--interactive \
--tty \
--rm \
--platform="${PLATFORM}" \
${PLATFORM+--platform="${PLATFORM}"} \
--name="${REPOSITORY_NAME,,}" \
--hostname="${REPOSITORY_NAME,,}" \
--mount=source="${VOLUMENAME}",target=/data \
Contributor Author

Is "mount=source=" valid syntax? Podman was confused by it.

Contributor

#2722 (comment)

Was working there, never used Podman 😅

Contributor Author

okay I looked in the documentation and didn't see it... maybe type defaults to volume and the second = is the equivalent of a comma..

Contributor

docker run -d \
  --name=nginxtest \
  --mount source=nginx-vol,destination=/usr/share/nginx/html,readonly \
  nginx:latest

So replace the equals sign in --mount= with a space and you get the same form the docker docs are referring to. It can be changed if necessary.

--mount type=volume,src="${VOLUMENAME}",target=/data \
--mount type=bind,source="$(pwd)"/outputs,target=/data/outputs \
${MODELSPATH:+--mount="type=bind,source=${MODELSPATH},target=/data/models"} \
${HUGGING_FACE_HUB_TOKEN:+--env="HUGGING_FACE_HUB_TOKEN=${HUGGING_FACE_HUB_TOKEN}"} \
--publish=9090:9090 \
--cap-add=sys_nice \
${PODMAN_ARGS:+"${PODMAN_ARGS}"} \
${GPU_FLAGS:+--gpus="${GPU_FLAGS}"} \
"${CONTAINER_IMAGE}" ${@:+$@}
