
GPU support in Docker, other Docker-related updates #1655

Closed
lukaboljevic wants to merge 1 commit

Conversation


lukaboljevic commented Feb 28, 2024

The main contribution of this PR is a new Dockerfile and a docker compose file for running PrivateGPT on a GPU in Docker. The command to run is simply docker compose -f docker-compose-gpu.yaml up --build. This should address issues like #1652, #1597 and #1405.
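In compose, GPU access is typically requested through a device reservation. A minimal sketch of what such a docker-compose-gpu.yaml could look like (the service name, port and profile here are illustrative, not necessarily what the PR uses):

docker-compose-gpu.yaml (sketch)

services:
  private-gpt-gpu:
    build:
      dockerfile: Dockerfile.local.gpu
    ports:
      - "8001:8001"
    environment:
      PGPT_PROFILES: docker
    deploy:
      resources:
        reservations:
          devices:
            # Hand all NVIDIA GPUs on the host to the container
            - driver: nvidia
              count: all
              capabilities: [gpu]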

The PR also proposes some changes based on #1428:

  • Added a max-workers setting for the Poetry installer
  • Added an entrypoint.sh script. Here, the user can specify the model, tokenizer, prompt style and embedding model through environment variables, and those are then downloaded automatically using the setup script (a sketch of such a script is shown after this list).
  • Chose the tokenizer and prompt style in settings-docker.yaml, and updated the default Mistral model from v0.1 to v0.2
  • Removed USER worker, as it seems to have caused a segfault on Mac (I can't test this unfortunately, as I don't have a Mac)
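
A minimal sketch of what such an entrypoint script could look like (the exact environment variables and setup invocation are assumptions for illustration, not necessarily the ones used in this PR):

entrypoint.sh (sketch)

#!/bin/sh
set -e

# Settings profile(s) to load, e.g. settings-docker.yaml
export PGPT_PROFILES="${PGPT_PROFILES:-docker}"

# Download the configured LLM, tokenizer, prompt style and embedding model
# before starting the server, reusing the existing setup script
if [ -f scripts/setup ]; then
    .venv/bin/python scripts/setup
fi

# Start PrivateGPT
exec .venv/bin/python -m private_gpt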

I feel like having the entrypoint.sh script and simply running docker compose up is much simpler and more transparent than the current (and not very well documented) approach of running docker compose run --rm --entrypoint="bash -c '[ -f scripts/setup ] && scripts/setup'" private-gpt. The script also lets the user choose the model, tokenizer, prompt style and embedding model more easily and directly. The current approach involves creating a new settings-FOO.yaml file and including FOO as a new profile in the corresponding docker compose file (sketched below). This isn't bad at all; it just takes a while to find in the documentation, and may require a few attempts and some browsing through issues like #1579 and #1573 before getting it to run.
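
For reference, the profile mechanism itself boils down to a single environment variable in the compose file (a sketch; foo is a placeholder profile name):

environment:
  # Loads settings-docker.yaml and settings-foo.yaml on top of settings.yaml
  PGPT_PROFILES: docker,foo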

Whatever your stance is on the entrypoint script, the documentation needs to be updated to tell the user as directly as possible what the correct way to run and configure PrivateGPT in Docker is. One shouldn't have to dig deep through the documentation and issues to find something that already exists and works. I would really like to hear your opinion on this, so we can discuss exactly how to update it.

In any case, I'm open to any comments and suggestions, and I hope you find this PR useful.

Edit: I ran make test and make check - 30 tests passed with 23 warnings, while make check fixed some files which I did not edit.

Edit (08.03.2024): Closed in favour of #1690.


neofob commented Mar 3, 2024

@lukaboljevic :
This is great! You beat me to it. I had planned to work on this this weekend, but you saved me some time. :)

What I would do is similar, except for a minor change: I would install virtualenv in the base (builder) stage, install the dependencies and so on, and then copy the virtualenv directory to the app stage in the Dockerfile. Anyhow, what you are doing is similar. I'll check out your PR and let you know how it goes.
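
A minimal sketch of that builder pattern (the base image and paths here are illustrative, not taken from this PR):

Dockerfile (sketch)

# Builder stage: create the virtualenv and install dependencies into it
FROM python:3.11-slim as builder
WORKDIR /home/worker/app
RUN pip install poetry
ENV POETRY_VIRTUALENVS_IN_PROJECT=true
COPY pyproject.toml poetry.lock ./
RUN poetry install

# App stage: copy only the ready-made virtualenv, keeping build tools out of the final image
FROM python:3.11-slim as app
WORKDIR /home/worker/app
COPY --from=builder /home/worker/app/.venv .venv
COPY private_gpt/ private_gpt
ENTRYPOINT [".venv/bin/python", "-m", "private_gpt"]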


neofob commented Mar 3, 2024

@lukaboljevic :

  • Line 74 in Dockerfile.local.gpu should be
    ENV PYTHONPATH="$PYTHONPATH:/home/worker/app/private_gpt/"

  • docker build works

  • docker-compose up works


lukaboljevic (author) commented Mar 4, 2024

@lukaboljevic :

  • Line 74 in Dockerfile.local.gpu should be
    ENV PYTHONPATH="$PYTHONPATH:/home/worker/app/private_gpt/"
  • docker build works
  • docker-compose up works

I agree with you - it should be this way, as this is the correct path to the private_gpt folder. This line is present in both Dockerfile.local and Dockerfile.local.gpu, so I tested your suggestion in both, and everything works without issues. However, for some reason it seems to work even with just ENV PYTHONPATH="$PYTHONPATH:/private_gpt/" - this is what the current Dockerfile.local on the main branch has, which is why I didn't pay too much attention to that line until now.

I will wait for input from @imartinez before updating, but yes, I agree with you. I'm also glad docker build and docker-compose up work for you.

dpedwards commented:

To access an NVIDIA GPU from a Docker container, the following Dockerfile also works for me on both Linux and Windows after fully installing the CUDA Toolkit:

Dockerfile.cuda

# Use a specific version of the nvidia base image
FROM nvidia/cuda:12.2.2-devel-ubuntu22.04 as base

ENV DEBIAN_FRONTEND="noninteractive"
ENV TZ="Europe/Ljubljana"

# Minimize the number of RUN commands to reduce the number of layers
RUN apt-get update && apt-get install -y software-properties-common && \
    add-apt-repository ppa:deadsnakes/ppa && \
    apt-get update && \
    apt-get install -y python3.11 python3.11-venv python3-pip && \
    ln -sf /usr/bin/python3.11 /usr/bin/python3 && \
    python3 --version && \
    apt-get install -y libopenblas-dev ninja-build build-essential pkg-config wget gcc && \
    pip install pipx && \
    python3 -m pipx ensurepath && \
    pipx install poetry && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

ENV PATH="/root/.local/bin:$PATH"
ENV POETRY_VIRTUALENVS_IN_PROJECT=true

############################################
FROM base as dependencies
############################################

WORKDIR /home/worker/app
COPY pyproject.toml poetry.lock ./

RUN poetry install --extras "ui llms-llama-cpp embeddings-huggingface vector-stores-qdrant" && \
    CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python

############################################
FROM base as app
############################################

ENV PYTHONUNBUFFERED=1
ENV PORT=8080
EXPOSE 8080

RUN useradd -m worker
USER worker
WORKDIR /home/worker/app

RUN mkdir -p local_data models
COPY --chown=worker --from=dependencies /home/worker/app/.venv/ .venv
COPY --chown=worker private_gpt/ private_gpt
COPY --chown=worker fern/ fern
COPY --chown=worker *.yaml *.md ./

ENTRYPOINT [".venv/bin/python", "-m", "private_gpt"]

  1. Build the Docker image:
    docker build -f Dockerfile.cuda -t YOUR_DOCKER_IMAGE_NAME:YOUR_DOCKER_IMAGE_TAG .
    Example:
    docker build -f Dockerfile.cuda -t rag-cuda:latest .

  2. Run the Docker container:
    docker run -it --gpus all -v "YOUR_HOST_MODEL_PATH:/home/worker/app/models" -v "YOUR_HOST_LOCAL_DATA_PATH:/home/worker/app/local_data" -p 8080:8080 YOUR_DOCKER_IMAGE_NAME:YOUR_DOCKER_IMAGE_TAG
    Example:
    docker run -it --gpus all -v "/home/ubuntu/development/private-gpt-api/models:/home/worker/app/models" -v "/home/ubuntu/development/private-gpt-api/local_data:/home/worker/app/local_data" -p 8080:8080 rag-cuda:latest

[Screenshot: private-gpt-api Docker CUDA run]


neofob commented Apr 24, 2024

@dpedwards: With a recent change in llama-cpp-python (after 0.2.58, IIRC), you need to use the flag -DLLAMA_CUDA=on instead of -DLLAMA_CUBLAS=on to get CUDA support.
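
Under that assumption, the dependencies stage above would build llama-cpp-python with something like this (a sketch of the changed line only):

# For llama-cpp-python newer than roughly 0.2.58, LLAMA_CUBLAS was replaced by LLAMA_CUDA
RUN CMAKE_ARGS='-DLLAMA_CUDA=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python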
