Update docs

Josh-XT committed Oct 2, 2023
1 parent feb36fc commit a69d906
Showing 5 changed files with 26 additions and 60 deletions.
53 changes: 25 additions & 28 deletions README.md
@@ -1,8 +1,12 @@
# llamacpp-server in Docker with OpenAI Style Endpoints

This llamacpp server comes equipped with the OpenAI style endpoints that most software is familiar with. will allow you to start it with a `MODEL_URL` defined in the `.env` file instead of needing to manually go to Hugging Face and download the model on the server.
This llamacpp server comes equipped with the OpenAI style endpoints that most software is familiar with. It will allow you to start it with a `MODEL_URL` defined in the `.env` file instead of needing to manually go to Hugging Face and download the model on the server.

This is the default `.env` file, modify it to your needs:
TheBloke sticks to the same naming convention for his models, so you can just use the model repository name, like `TheBloke/Mistral-7B-OpenOrca-GGUF`, and the server will automatically download the model from Hugging Face. If a model repository does not follow that format, you can use the full download URL for the model, like `https://huggingface.co/TheBloke/Mistral-7B-OpenOrca-GGUF/resolve/main/mistral.7b.q5_k_s.gguf`, and the quantized model will be downloaded from Hugging Face.

## Environment Set Up

Create a `.env` file if one does not exist and modify it to your needs. This is the default `.env` file included when cloning the repository:

```env
MODEL_URL=TheBloke/Mistral-7B-OpenOrca-GGUF
@@ -17,51 +21,44 @@ UVICORN_WORKERS=2
LLAMACPP_API_KEY=
```

TheBloke sticks to the same naming convention for his models, so you can just use the model name and it will automatically download the model from Hugging Face. If the model repositories are not in the format he uses, you can use the full URL to the model of the download link.
## CPU Only

## Clone the repository
Run with docker:

```bash
git clone https://github.com/Josh-XT/llamacpp-server
cd llamacpp-server
docker pull joshxt/llamacpp-server:full
docker run -d --name llamacpp-server --env-file .env -p 8091:8091 joshxt/llamacpp-server:full
```
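To confirm the container started and is downloading the model, you can follow its logs with a standard Docker command (the exact output depends on the model configured in `.env`):

```bash
# Follow the container logs to watch the model download and server startup
docker logs -f llamacpp-server
```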

Modify the `.env` file if desired before proceeding.

### NVIDIA GPU

If running without an NVIDIA GPU, you can start the server with:
Or with docker-compose:

```bash
docker-compose -f docker-compose-cuda.yml pull
docker-compose -f docker-compose-cuda.yml up
git clone https://github.com/Josh-XT/llamacpp-server
cd llamacpp-server
docker-compose pull
docker-compose up
```
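With the compose setup, downloaded models are persisted to the `./models` directory on the host (via the `volumes` mapping shown in the compose files below), so they are reused across restarts. A quick way to see what has been downloaded:

```bash
# List the quantized model files persisted on the host
ls -lh ./models
```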

Or if you only want the OpenAPI Style endpoints exposed:

```bash
docker-compose -f docker-compose-cuda-openai.yml pull
docker-compose -f docker-compose-cuda-openai.yml up
```
## NVIDIA GPU

### CPU Only
If you're using an NVIDIA GPU, you can use the CUDA version of the server.

If you are not running on an NVIDIA GPU, you can start the server with:
Run with docker:

```bash
docker-compose pull
docker-compose up
docker pull joshxt/llamacpp-server:full-cuda
docker run -d --name llamacpp-server --env-file .env -p 8091:8091 --gpus all joshxt/llamacpp-server:full-cuda
```
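Note that `--gpus all` requires the NVIDIA Container Toolkit to be installed on the host. As a quick sanity check, you can try running `nvidia-smi` inside the container; this assumes the CUDA image ships `nvidia-smi`, which is not confirmed by this repository:

```bash
# Check GPU visibility from inside the container (assumes nvidia-smi is present in the image)
docker exec llamacpp-server nvidia-smi
```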

Or if you only want the OpenAPI Style endpoints exposed:
Or with docker-compose:

```bash
docker-compose -f docker-compose-openai.yml pull
docker-compose -f docker-compose-openai.yml up
git clone https://github.com/Josh-XT/llamacpp-server
cd llamacpp-server
docker-compose -f docker-compose-cuda.yml pull
docker-compose -f docker-compose-cuda.yml up
```

The llamacpp server API is available at `http://localhost:8090` by default. The [documentation for the API is available here.](https://github.com/ggerganov/llama.cpp/tree/master/examples/server#api-endpoints)
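As a minimal sketch based on the upstream llama.cpp server documentation linked above (see those docs for the authoritative parameter list), a completion request against the native API looks like the following. Note that the `docker run` and compose commands in this commit only publish port 8091, so port 8090 must be exposed separately for this to be reachable from the host:

```bash
# Minimal completion request against the native llama.cpp server API (port 8090 must be published)
curl http://localhost:8090/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Building a website can be done in 10 simple steps:", "n_predict": 128}'
```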

## OpenAI Style Endpoint Usage

OpenAI Style endpoints are available at `http://localhost:8091/` by default. Documentation can be accessed at that URL when the server is running.
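As an illustration, a request against the conventional OpenAI-style chat completions route might look like the sketch below. The exact paths, the expected `model` value, and whether an `Authorization` header is needed (it should only matter if `LLAMACPP_API_KEY` is set in `.env`) are assumptions here; check the interactive documentation at the URL above for the authoritative details:

```bash
# Hypothetical chat completion request against the OpenAI-style endpoint;
# the route, model name, and auth header are assumptions -- verify against the server docs
curl http://localhost:8091/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${LLAMACPP_API_KEY}" \
  -d '{
    "model": "Mistral-7B-OpenOrca",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```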
18 changes: 0 additions & 18 deletions docker-compose-cuda-openai.yml

This file was deleted.

1 change: 0 additions & 1 deletion docker-compose-cuda.yml
@@ -6,7 +6,6 @@ services:
env_file: .env
restart: unless-stopped
ports:
- "8090:8090"
- "8091:8091"
volumes:
- ./models:/app/models
11 changes: 0 additions & 11 deletions docker-compose-openai.yml

This file was deleted.

3 changes: 1 addition & 2 deletions docker-compose.yml
@@ -6,7 +6,6 @@ services:
env_file: .env
restart: unless-stopped
ports:
- "8090:8090"
- "8091:8091"
volumes:
- ./models:/app/models
- ./models:/app/models
