Fix long container startup times #763

Merged
merged 1 commit into from
Aug 19, 2024
11 changes: 10 additions & 1 deletion training/ilab-wrapper/ilab
@@ -49,7 +49,8 @@ check_insights

# Template values replaced by container build
CONTAINER_DEVICE="__REPLACE_CONTAINER_DEVICE__"
-IMAGE_NAME="__REPLACE_IMAGE_NAME__"
+SOURCE_IMAGE="__REPLACE_IMAGE_NAME__"
+IMAGE_NAME="localhost/instructlab:__REPLACE_IMAGE_TAG__"

ENTRYPOINT="ilab"
PARAMS=("$@")
@@ -144,4 +145,12 @@ PODMAN_COMMAND=("sudo" "--preserve-env=$PRESERVE_ENV" "podman" "run" "--rm" "-it
"--env" "HF_TOKEN"
"${IMAGE_NAME}")

+sudo podman image exists "$IMAGE_NAME"
+if [ "$?" != "0" ]; then
+echo "Initializing ilab container..."
+id=$(sudo podman create "$SOURCE_IMAGE")
+sudo podman commit "$id" "$IMAGE_NAME"
Contributor comment:

This is creating a new layer in /var/lib/containers that references the base layers, but those layers will go away across an upgrade when the base ilab image changes, right?

IOW won't this break across upgrades?

n1hility (Member, Author) commented on Aug 19, 2024

@cgwalters As long as we update our tag (1.1 today -> 1.2 tomorrow), this should be OK: it will then just create a new container at that point. We do need to add code to clean up old images, though (1.2 will need to delete the old 1.1 container).

+sudo podman rm "$id"
+fi

exec "${PODMAN_COMMAND[@]}" "${PARAMS[@]}"
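
The cleanup mentioned in the review thread above is not part of this change. A minimal sketch of what it might look like, assuming cached images always live under `localhost/instructlab:<tag>` as in this diff; the helper name `stale_ilab_images` is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of this PR: print cached wrapper images whose
# tag differs from the one currently in use. Reads "repo:tag" lines on stdin.
stale_ilab_images() {
    local current="$1" ref
    while IFS= read -r ref; do
        case "$ref" in
            "$current") ;;                                  # image in use: keep it
            localhost/instructlab:*) printf '%s\n' "$ref" ;; # older cached copy
        esac
    done
}

# Assumed usage inside the wrapper (needs podman and GNU xargs):
#   sudo podman images --format '{{.Repository}}:{{.Tag}}' \
#       | stale_ilab_images "$IMAGE_NAME" \
#       | xargs -r sudo podman rmi
```

Keeping the filter separate from the `podman rmi` call means the selection logic can be checked without touching container storage.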
4 changes: 3 additions & 1 deletion training/nvidia-bootc/Containerfile
@@ -173,10 +173,12 @@ RUN chmod +x /usr/bin/ilab
ARG INSTRUCTLAB_IMAGE="quay.io/ai-lab/instructlab-nvidia:latest"
ARG INSTRUCTLAB_IMAGE_PULL_SECRET="instructlab-nvidia-pull"

-RUN for i in /usr/bin/ilab*; do \
+RUN export INSTRUCTLAB_TAG=$(echo ${INSTRUCTLAB_IMAGE} | cut -f 2 -d ':') && \
+for i in /usr/bin/ilab*; do \
sed -i 's/__REPLACE_TRAIN_DEVICE__/cuda/' $i; \
sed -i 's/__REPLACE_CONTAINER_DEVICE__/nvidia.com\/gpu=all/' $i; \
sed -i "s%__REPLACE_IMAGE_NAME__%${INSTRUCTLAB_IMAGE}%" $i; \
+sed -i "s%__REPLACE_IMAGE_TAG__%${INSTRUCTLAB_TAG}%" $i; \
done

# Added for running as an OCI Container to prevent Overlay on Overlay issues.
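
The tag extraction in the Containerfile takes the second `:`-separated field, which works for the default `quay.io/ai-lab/instructlab-nvidia:latest`. For a reference whose registry includes a port, the second field would be the port and path rather than the tag. The sketch below shows one possible alternative using shell parameter expansion; `image_tag` is a hypothetical name, and falling back to `latest` for untagged references is an assumption:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of this PR: extract the tag from an image
# reference, tolerating registry ports and digests.
image_tag() {
    local ref="${1%%@*}"      # drop any "@sha256:..." digest suffix
    local last="${ref##*/}"   # last path component, e.g. "instructlab-nvidia:latest"
    case "$last" in
        *:*) printf '%s\n' "${last##*:}" ;;  # text after the final colon
        *)   printf 'latest\n' ;;            # assumption: untagged means "latest"
    esac
}

image_tag "quay.io/ai-lab/instructlab-nvidia:latest"
image_tag "localhost:5000/instructlab:1.2"
```

Only the last path component is inspected, so a colon in the registry host does not affect the result.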
11 changes: 10 additions & 1 deletion training/nvidia-bootc/duplicated/ilab-wrapper/ilab
@@ -49,7 +49,8 @@ check_insights

# Template values replaced by container build
CONTAINER_DEVICE="__REPLACE_CONTAINER_DEVICE__"
-IMAGE_NAME="__REPLACE_IMAGE_NAME__"
+SOURCE_IMAGE="__REPLACE_IMAGE_NAME__"
+IMAGE_NAME="localhost/instructlab:__REPLACE_IMAGE_TAG__"

ENTRYPOINT="ilab"
PARAMS=("$@")
@@ -144,4 +145,12 @@ PODMAN_COMMAND=("sudo" "--preserve-env=$PRESERVE_ENV" "podman" "run" "--rm" "-it
"--env" "HF_TOKEN"
"${IMAGE_NAME}")

+sudo podman image exists "$IMAGE_NAME"
+if [ "$?" != "0" ]; then
+echo "Initializing ilab container..."
+id=$(sudo podman create "$SOURCE_IMAGE")
+sudo podman commit "$id" "$IMAGE_NAME"
+sudo podman rm "$id"
+fi

exec "${PODMAN_COMMAND[@]}" "${PARAMS[@]}"
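
The one-time initialization added to both wrapper copies could also be written as a function that tests the exit status directly and removes the temporary container even when the commit fails. This is only a sketch of that variant, not the merged code; the `PODMAN` array and the name `ensure_local_image` are hypothetical, introduced so the logic can be exercised with a stub instead of real `sudo podman`:

```shell
#!/usr/bin/env bash
# Hypothetical refactoring of the init step in this PR. PODMAN is an
# assumption: an array holding the command prefix, reassignable in tests.
PODMAN=(sudo podman)

ensure_local_image() {
    local source="$1" target="$2" id status=0
    # Branch on the exit status directly instead of inspecting $? afterwards.
    if "${PODMAN[@]}" image exists "$target"; then
        return 0
    fi
    echo "Initializing ilab container..."
    id=$("${PODMAN[@]}" create "$source") || return 1
    "${PODMAN[@]}" commit "$id" "$target" || status=$?
    # Remove the temporary container whether or not the commit succeeded.
    "${PODMAN[@]}" rm "$id" >/dev/null
    return "$status"
}

# Assumed usage in the wrapper, before the final exec:
#   ensure_local_image "$SOURCE_IMAGE" "$IMAGE_NAME" || exit 1
```

Returning the commit status lets the caller stop before `podman run` is attempted against an image that was never created.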