Minor Docker fixes & podman "rootless" container support (see comments) #2722
Changes from all commits
9512a0a
69c137c
0f4e8e5
614fbd1
c45d7e8
5ecd379
15abc55
6fe0df7
4c1d152
0272285
4611cf1
f70fb02
faf7f83
3253894
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,7 +4,7 @@ ARG PYTHON_VERSION=3.9 | |
################## | ||
## base image ## | ||
################## | ||
FROM python:${PYTHON_VERSION}-slim AS python-base | ||
FROM docker.io/python:${PYTHON_VERSION}-slim AS python-base | ||
|
||
LABEL org.opencontainers.image.authors="[email protected]" | ||
|
||
|
@@ -13,10 +13,7 @@ RUN rm -f /etc/apt/apt.conf.d/docker-clean \ | |
&& echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' >/etc/apt/apt.conf.d/keep-cache | ||
|
||
# Install necessary packages | ||
RUN \ | ||
--mount=type=cache,target=/var/cache/apt,sharing=locked \ | ||
--mount=type=cache,target=/var/lib/apt,sharing=locked \ | ||
apt-get update \ | ||
RUN apt-get update \ | ||
&& apt-get install -y \ | ||
--no-install-recommends \ | ||
libgl1-mesa-glx=20.3.* \ | ||
|
@@ -34,17 +31,17 @@ ENV PYTHONDONTWRITEBYTECODE 1 | |
ENV PYTHONUNBUFFERED 1 | ||
# don't fall back to legacy build system | ||
ENV PIP_USE_PEP517=1 | ||
# set env for next stages | ||
ENV APPDIR=${APPDIR} | ||
ENV APPNAME=${APPNAME} | ||
|
||
####################### | ||
## build pyproject ## | ||
####################### | ||
FROM python-base AS pyproject-builder | ||
|
||
# Install dependencies | ||
RUN \ | ||
--mount=type=cache,target=/var/cache/apt,sharing=locked \ | ||
--mount=type=cache,target=/var/lib/apt,sharing=locked \ | ||
apt-get update \ | ||
RUN apt-get update \ | ||
&& apt-get install -y \ | ||
--no-install-recommends \ | ||
build-essential=12.9 \ | ||
|
@@ -57,18 +54,16 @@ ENV PIP_CACHE_DIR ${PIP_CACHE_DIR} | |
RUN mkdir -p ${PIP_CACHE_DIR} | ||
|
||
# create virtual environment | ||
RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \ | ||
python3 -m venv "${APPNAME}" \ | ||
RUN python3 -m venv "${APPNAME}" \ | ||
--upgrade-deps | ||
|
||
# copy sources | ||
COPY --link . . | ||
Comment: why do you remove the --link?
Comment: See f70fb02. Breaks podman. I hate it too :(
||
COPY . . | ||
|
||
# install pyproject.toml | ||
ARG PIP_EXTRA_INDEX_URL | ||
ENV PIP_EXTRA_INDEX_URL ${PIP_EXTRA_INDEX_URL} | ||
RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \ | ||
"${APPNAME}/bin/pip" install . | ||
RUN "${APPNAME}/bin/pip" install . | ||
|
||
# build patchmatch | ||
RUN python3 -c "from patchmatch import patch_match" | ||
|
@@ -84,7 +79,8 @@ RUN useradd \ | |
--no-log-init \ | ||
-m \ | ||
-U \ | ||
"${UNAME}" | ||
"${UNAME}" \ | ||
-u 1000 | ||
Comment: Why do we need to define the UID? Isn't it enough to have a user group created with the same name as the user?
Comment: Not for podman. The way podman works with rootless containers is that it uses another uid that is distinct from the uid of the user running the container on the host. In this case, the uid/gid of 1000:1000 in the container is actually something like 100999:100999 in the mounts/volumes. Or at least that's what you WANT it to be. So I needed to set it explicitly when the account is created so that the user can be run correctly. (I could be wrong and there might be another way to do this, but if there is I don't know it. I could run invokeai in the container as the "root" user rather than as appuser, at which point there would no longer be file access issues, but this seems like a bad idea.)
Comment: I just built the current main branch locally and executed some commands:
./docker/build.sh
Activated virtual environment: /Users/mauwii/git/mauwii/InvokeAI/.venv
You are using these values:
Dockerfile: ./Dockerfile
index-url: https://download.pytorch.org/whl/cpu
Volumename: invokeai_data
Platform: linux/arm64
Container Registry: ghcr.io
Container Repository: mauwii/invokeai
Container Tag: main-cpu
Container Flavor: cpu
Container Image: ghcr.io/mauwii/invokeai:main-cpu
Volume already exists
[+] Building 178.5s (23/23) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 2.77kB 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 35B 0.0s
=> resolve image config for docker.io/docker/dockerfile:1 2.3s
=> [auth] docker/dockerfile:pull token for registry-1.docker.io 0.0s
=> docker-image://docker.io/docker/dockerfile:1@sha256:39b85bbfa7536a5feceb7372a0817649ecb2724562a38360f4d6a7782a409b14 3.8s
=> => resolve docker.io/docker/dockerfile:1@sha256:39b85bbfa7536a5feceb7372a0817649ecb2724562a38360f4d6a7782a409b14 0.0s
=> => sha256:39b85bbfa7536a5feceb7372a0817649ecb2724562a38360f4d6a7782a409b14 8.40kB / 8.40kB 0.0s
=> => sha256:7f44e51970d0422c2cbff3b20b6b5ef861f6244c396a06e1a96f7aa4fa83a4e6 482B / 482B 0.0s
=> => sha256:a28edb2041b8f23c38382d8be273f0239f51ff1f510f98bccc8d0e7f42249e97 2.90kB / 2.90kB 0.0s
=> => sha256:9d0cd65540a143ce38aa0be7c5e9efeed30d3580d03667f107cd76354f2bee65 10.82MB / 10.82MB 3.1s
=> => extracting sha256:9d0cd65540a143ce38aa0be7c5e9efeed30d3580d03667f107cd76354f2bee65 0.6s
=> [internal] load .dockerignore 0.0s
=> [internal] load build definition from Dockerfile 0.0s
=> [internal] load metadata for docker.io/library/python:3.9-slim 0.0s
=> [internal] load build context 0.1s
=> => transferring context: 3.36MB 0.0s
=> [python-base 1/4] FROM docker.io/library/python:3.9-slim 0.0s
=> CACHED [python-base 2/4] RUN rm -f /etc/apt/apt.conf.d/docker-clean && echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' >/etc/apt/apt.conf.d/keep-cache 0.0s
=> CACHED [python-base 3/4] RUN --mount=type=cache,target=/var/cache/apt,sharing=locked --mount=type=cache,target=/var/lib/apt,sharing=locked apt-get update && apt-get in 0.0s
=> CACHED [python-base 4/4] WORKDIR /usr/src 0.0s
=> CACHED [pyproject-builder 1/6] RUN --mount=type=cache,target=/var/cache/apt,sharing=locked --mount=type=cache,target=/var/lib/apt,sharing=locked apt-get update && apt- 0.0s
=> CACHED [pyproject-builder 2/6] RUN mkdir -p /var/cache/buildkit/pip 0.0s
=> CACHED [pyproject-builder 3/6] RUN --mount=type=cache,target=/var/cache/buildkit/pip,sharing=locked python3 -m venv "InvokeAI" --upgrade-deps 0.0s
=> [pyproject-builder 4/6] COPY --link . . 0.1s
=> [pyproject-builder 5/6] RUN --mount=type=cache,target=/var/cache/buildkit/pip,sharing=locked "InvokeAI/bin/pip" install . 148.7s
=> [pyproject-builder 6/6] RUN python3 -c "from patchmatch import patch_match" 3.9s
=> CACHED [runtime 1/3] RUN useradd --no-log-init -m -U "appuser" 0.0s
=> CACHED [runtime 2/3] RUN mkdir -p "/data" && chown -R "appuser" "/data" 0.0s
=> [runtime 3/3] COPY --chown=appuser --from=pyproject-builder /usr/src/InvokeAI InvokeAI 11.0s
=> exporting to image 4.8s
=> => exporting layers 4.7s
=> => writing image sha256:d274038b0dd470a06f4bcfb8da22fb1fbe071c73ca947d96ef82c5e346dbf62b 0.0s
=> => naming to ghcr.io/mauwii/invokeai:main-cpu 0.0s
Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
~/git/mauwii/InvokeAI main ± docker run --rm --interactive --tty --entrypoint=/bin/bash ghcr.io/mauwii/invokeai:main-cpu
appuser@299ed35c86f9:/usr/src$ id -u
1000
appuser@299ed35c86f9:/usr/src$ id -g
1000
appuser@299ed35c86f9:/usr/src$ whoami
appuser
appuser@299ed35c86f9:/usr/src$ apt-get update
Reading package lists... Done
E: List directory /var/lib/apt/lists/partial is missing. - Acquire (13: Permission denied)
appuser@299ed35c86f9:/usr/src$ sudo apt-get update
bash: sudo: command not found
appuser@299ed35c86f9:/usr/src$
Comment: Inside the container the uid and gid are 1000, same as with podman. But if this is rootless docker, try to
Comment: Also, just out of curiosity, are you running docker run from user 1000/1000? What happens if you run it rootlessly from 1001/1001? Will that affect the ownership of files you create?
Comment: Since /data/outputs is mounted as a bind mount, on my local file system the file permissions are set for my current user (501:20 / mauwii:staff), while in the container they are mounted with permissions set to 1000:1000 / appuser:appuser.
Comment: Huh... that's very different from rootless podman, where multiple users running in the container would each have their own uid/gid. If you added a second user, 1001:1001, in the container and created a file, would it still appear as 501:20 outside the container?
Comment: Why should it be different from creating a new user with 1000:1000?
Comment: @fat-tire is correct: if the user in the container is 1000, then in general the files it creates on the bind-mounted volume will also be owned by uid=1000.
Is it possible that docker on mac changes ownership to the current user for convenience? If so, that's not standard or generally expected behaviour.
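The uid arithmetic described in this thread (1000 in the container showing up as something like 100999 on the host) can be sketched in shell. This is only an illustration under stated assumptions: the function name `map_container_uid` is hypothetical, the subordinate range is assumed to start at 100000 (the real value comes from the host's /etc/subuid), and the mapping assumed is the common rootless default where container uid 0 maps to the invoking user and container uids 1..N map onto the subordinate range.

```shell
#!/bin/sh
# Sketch: compute the host uid that a container uid maps to in a rootless setup.
# Assumptions (not taken from the PR): container uid 0 maps to the invoking
# host user, and container uids 1..N map onto the subordinate uid range that
# starts at subuid_start.
map_container_uid() {
    container_uid=$1   # uid inside the container, e.g. 1000
    host_uid=$2        # uid of the user running the container on the host
    subuid_start=$3    # first subordinate uid (hypothetical value: 100000)

    if [ "$container_uid" -eq 0 ]; then
        # "root" in the container is the unprivileged host user
        echo "$host_uid"
    else
        # container uid 1 is the first subordinate uid, uid 2 the second, ...
        echo $((subuid_start + container_uid - 1))
    fi
}

map_container_uid 1000 1000 100000   # prints 100999
```

With these assumptions, files written by appuser (1000) on a bind mount would be owned by 100999 on the host, which matches the number quoted above.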
||
|
||
# create volume directory | ||
ARG VOLUME_DIR=/data | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,6 +2,12 @@ | |
|
||
# This file is used to set environment variables for the build.sh and run.sh scripts. | ||
|
||
# docker is the default container engine, but should work with "podman" (rootless) too! | ||
CONTAINER_ENGINE=${CONTAINER_ENGINE:-docker} | ||
|
||
# use python v3.9 by default | ||
PYTHON_VERSION=${PYTHON_VERSION:-3.9} | ||
|
||
# Try to detect the container flavor if no PIP_EXTRA_INDEX_URL got specified | ||
if [[ -z "$PIP_EXTRA_INDEX_URL" ]]; then | ||
|
||
|
@@ -13,10 +19,10 @@ if [[ -z "$PIP_EXTRA_INDEX_URL" ]]; then | |
fi | ||
|
||
# Decide which container flavor to build if not specified | ||
if [[ -z "$CONTAINER_FLAVOR" ]] && python -c "import torch" &>/dev/null; then | ||
if [[ -z "$CONTAINER_FLAVOR" ]] && python3 -c "import torch" &>/dev/null; then | ||
# Check for CUDA and ROCm | ||
CUDA_AVAILABLE=$(python -c "import torch;print(torch.cuda.is_available())") | ||
ROCM_AVAILABLE=$(python -c "import torch;print(torch.version.hip is not None)") | ||
CUDA_AVAILABLE=$(python3 -c "import torch;print(torch.cuda.is_available())") | ||
ROCM_AVAILABLE=$(python3 -c "import torch;print(torch.version.hip is not None)") | ||
if [[ "${CUDA_AVAILABLE}" == "True" ]]; then | ||
CONTAINER_FLAVOR="cuda" | ||
elif [[ "${ROCM_AVAILABLE}" == "True" ]]; then | ||
|
@@ -41,6 +47,14 @@ REPOSITORY_NAME="${REPOSITORY_NAME-$(basename "$(git rev-parse --show-toplevel)" | |
REPOSITORY_NAME="${REPOSITORY_NAME,,}" | ||
VOLUMENAME="${VOLUMENAME-"${REPOSITORY_NAME}_data"}" | ||
ARCH="${ARCH-$(uname -m)}" | ||
if [ $ARCH == "aarch64" ] | ||
then | ||
ARCH=arm64 | ||
fi | ||
if [ $ARCH == "x86_64" ] | ||
Comment: I had no problems with either aarch64 or x86_64 (tested on an M1 and an i7).
Comment: Not sure why, but both could not find the repository when I tried those arch names, until I changed them to the ones displayed on the Docker Hub site. Also, on podman you have to specify docker.io, fwiw.
Comment: The missing docker.io in the base image tag is totally my fault, and the Dockerfile is much "cleaner" with the registry prepended to the base-image tag 😅
Comment: heh sounds good
||
then | ||
ARCH=amd64 | ||
fi | ||
PLATFORM="${PLATFORM-linux/${ARCH}}" | ||
INVOKEAI_BRANCH="${INVOKEAI_BRANCH-$(git branch --show)}" | ||
CONTAINER_REGISTRY="${CONTAINER_REGISTRY-"ghcr.io"}" | ||
|
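The two `if` blocks in this hunk translate `uname -m` names into the architecture names used in platform strings. A minimal sketch of the same mapping as a single `case` statement follows; the function name `normalize_arch` is hypothetical and is not part of the PR.

```shell
#!/bin/sh
# Sketch: map `uname -m` output to the names used in platform strings.
# Unknown values are passed through unchanged.
normalize_arch() {
    case "$1" in
        aarch64) echo "arm64" ;;
        x86_64)  echo "amd64" ;;
        *)       echo "$1" ;;
    esac
}

# Same defaulting as env.sh: values already set by the caller take precedence.
ARCH="$(normalize_arch "${ARCH-$(uname -m)}")"
PLATFORM="${PLATFORM-linux/${ARCH}}"
echo "Platform: ${PLATFORM}"
```

Quoting `"$1"` inside `case` also avoids the word-splitting risk of an unquoted `$ARCH` in a `[ ... ]` test.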
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,27 +8,31 @@ cd "$SCRIPTDIR" || exit 1 | |
|
||
source ./env.sh | ||
|
||
# Create outputs directory if it does not exist | ||
[[ -d ./outputs ]] || mkdir ./outputs | ||
|
||
echo -e "You are using these values:\n" | ||
echo -e "Container engine:\t${CONTAINER_ENGINE}" | ||
echo -e "Volumename:\t${VOLUMENAME}" | ||
echo -e "Invokeai_tag:\t${CONTAINER_IMAGE}" | ||
echo -e "local Models:\t${MODELSPATH:-unset}\n" | ||
|
||
docker run \ | ||
if [[ "${CONTAINER_ENGINE}" == "podman" ]]; then | ||
PODMAN_ARGS="--user=appuser:appuser" | ||
unset PLATFORM #causes problems | ||
Comment: If PLATFORM causes problems, then maybe podman is not buildkit compatible?
Comment: It looks like podman is still behind when it comes to buildkit. The good news is that podman 4.1.1 has caching mount support, for example. The bad news is that Debian 11 includes 3.0.1 and even Ubuntu 22.10 (the latest release) uses podman version 3.4.4. The author of that article even suggests using actual buildkit with podman, then says wait, never mind, it doesn't work very well. (Since you have security as a primary concern, I recommend trying podman, as the container is run and managed by a local user, and the container's user is ALSO a local user. So even if someone breaks out of the container's local user to the root user and then breaks out of the container entirely, they're STILL constrained within a user process.)
Comment: I run docker rootless, and the user in the container runtime also does not have root permissions 🙈 Please check whether your problems are resolved when pulling the built image from https://docs.docker.com/engine/security/rootless/ which would be much better than removing all those features from the Dockerfile 😅
Comment: Okay, I'll try it. I'm also adding a commit with everything discussed so far that I can change without breaking the build.
||
fi | ||
|
||
"${CONTAINER_ENGINE}" run \ | ||
--interactive \ | ||
--tty \ | ||
--rm \ | ||
--platform="${PLATFORM}" \ | ||
${PLATFORM+--platform="${PLATFORM}"} \ | ||
--name="${REPOSITORY_NAME,,}" \ | ||
--hostname="${REPOSITORY_NAME,,}" \ | ||
--mount=source="${VOLUMENAME}",target=/data \ | ||
Comment: "mount=source=": is that valid syntax? Podman was confused by it.
Comment: It was working there; I have never used Podman 😅
Comment: Okay, I looked in the documentation and didn't see it... maybe type defaults to volume and the second = is the equivalent of a comma.
Comment: So replace the equal sign in
||
--mount type=volume,src="${VOLUMENAME}",target=/data \ | ||
--mount type=bind,source="$(pwd)"/outputs,target=/data/outputs \ | ||
${MODELSPATH:+--mount="type=bind,source=${MODELSPATH},target=/data/models"} \ | ||
${HUGGING_FACE_HUB_TOKEN:+--env="HUGGING_FACE_HUB_TOKEN=${HUGGING_FACE_HUB_TOKEN}"} \ | ||
--publish=9090:9090 \ | ||
--cap-add=sys_nice \ | ||
${PODMAN_ARGS:+"${PODMAN_ARGS}"} \ | ||
${GPU_FLAGS:+--gpus="${GPU_FLAGS}"} \ | ||
"${CONTAINER_IMAGE}" ${@:+$@} | ||
|
||
|
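The run script relies on shell parameter expansion to pass optional flags: `${PLATFORM+...}` for the platform (which is why `unset PLATFORM` drops the flag for podman) and `${MODELSPATH:+...}` for the models mount. The sketch below only illustrates the difference between the `+` and `:+` forms; the two helper function names are hypothetical.

```shell
#!/bin/sh
# Sketch: ${VAR+word} expands to word whenever VAR is set, even to an empty
# string; ${VAR:+word} additionally requires VAR to be non-empty.
platform_flag_plus()  { printf '%s\n' "${PLATFORM+--platform=${PLATFORM}}"; }
platform_flag_colon() { printf '%s\n' "${PLATFORM:+--platform=${PLATFORM}}"; }

PLATFORM="linux/arm64"
platform_flag_plus    # prints --platform=linux/arm64

PLATFORM=""
platform_flag_plus    # prints --platform=   (set but empty still emits the flag)
platform_flag_colon   # prints an empty line

unset PLATFORM
platform_flag_plus    # prints an empty line (flag omitted entirely)
```

If an empty PLATFORM should also suppress the flag, the `:+` form is the safer choice; with the `+` form the variable has to be unset, as the podman branch of the script does.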
Comment: Why do you remove the build cache?
This question of course also applies to all further removals of the build cache, but I'm not adding the same comment x times 😅
Comment: See the notes above and in f70fb02; it breaks podman, unfortunately.
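One possible middle ground, instead of removing the cache mounts for everyone, would be to check the podman version before relying on them. The sketch below is only an illustration: the helper names `version_ge` and `supports_cache_mounts` are hypothetical, the 4.1.1 threshold is the figure quoted in the review thread rather than something verified here, and versions are assumed to be plain three-part numbers such as 3.4.4.

```shell
#!/bin/sh
# Sketch: compare dotted version numbers with POSIX sort.
# Returns 0 when version $1 >= version $2 (three-part numeric versions only).
version_ge() {
    lowest="$(printf '%s\n%s\n' "$1" "$2" \
        | sort -t. -k1,1n -k2,2n -k3,3n \
        | head -n 1)"
    [ "$lowest" = "$2" ]
}

# Assumption: 4.1.1 is the first podman version with cache mount support,
# as mentioned in the discussion above.
supports_cache_mounts() {
    version_ge "$1" "4.1.1"
}

# The caller is expected to supply the version string, for example by parsing
# the output of `podman --version`.
for v in 3.0.1 3.4.4 4.1.1; do
    if supports_cache_mounts "$v"; then
        echo "podman $v: cache mounts assumed available"
    else
        echo "podman $v: fall back to plain RUN"
    fi
done
```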