Merge pull request #835 from nerdalert/bash-bugs
Gregory-Pereira authored Dec 12, 2024
2 parents 7fbdf03 + 28525a4 commit 8d2b5b9
Showing 1 changed file with 11 additions and 11 deletions.
model_servers/llamacpp_python/README.md (22 changes: 11 additions & 11 deletions)
````diff
@@ -110,25 +110,25 @@ To deploy the LLM server you must specify a volume mount `-v` where your models
 podman run --rm -it \
     -p 8001:8001 \
     -v Local/path/to/locallm/models:/locallm/models:ro \
-    -e MODEL_PATH=models/granite-7b-lab-Q4_K_M.gguf
-    -e HOST=0.0.0.0
-    -e PORT=8001
-    -e MODEL_CHAT_FORMAT=openchat
-    llamacpp_python \
+    -e MODEL_PATH=models/granite-7b-lab-Q4_K_M.gguf \
+    -e HOST=0.0.0.0 \
+    -e PORT=8001 \
+    -e MODEL_CHAT_FORMAT=openchat \
+    llamacpp_python
 ```

 or with Cuda image

 ```bash
 podman run --rm -it \
-    --device nvidia.com/gpu=all
+    --device nvidia.com/gpu=all \
     -p 8001:8001 \
     -v Local/path/to/locallm/models:/locallm/models:ro \
-    -e MODEL_PATH=models/granite-7b-lab-Q4_K_M.gguf
-    -e HOST=0.0.0.0
-    -e PORT=8001
-    -e MODEL_CHAT_FORMAT=openchat
-    llamacpp_python \
+    -e MODEL_PATH=models/granite-7b-lab-Q4_K_M.gguf \
+    -e HOST=0.0.0.0 \
+    -e PORT=8001 \
+    -e MODEL_CHAT_FORMAT=openchat \
+    llamacpp_python
 ```
 ### Multiple Model Service:
````
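The bug this diff fixes is a shell line-continuation detail: without a trailing `\`, the shell ends the `podman run` command at the newline, so the remaining `-e` options and the image name are executed as separate (failing) commands instead of being passed as arguments. A minimal sketch of the behavior, using `echo` in place of `podman` so it runs anywhere:

```shell
# With trailing backslashes, the shell joins the lines into one command,
# so echo receives all three words as arguments:
joined=$(echo one \
  two \
  three)
echo "joined: $joined"   # prints: joined: one two three

# Had the backslash after "two" been missing, "three" would have been
# run as its own command, and echo would have received only "one two".
```

The same rule explains why `llamacpp_python \` on the *last* line of the old command was also wrong: a trailing backslash there continues the command onto the next line of the README, swallowing whatever follows.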