
setup ollama service in aipc docker compose #1008

Merged
merged 1 commit on Oct 23, 2024
99 changes: 0 additions & 99 deletions ChatQnA/docker_compose/intel/cpu/aipc/README.md
@@ -2,105 +2,6 @@

This document outlines the deployment process for a ChatQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on AIPC. The steps include Docker image creation, container deployment via Docker Compose, and service execution to integrate microservices such as `embedding`, `retriever`, `rerank`, and `llm`.

## Prerequisites

We use [Ollama](https://ollama.com/) as our LLM service for AIPC.

Please follow the instructions below to set up Ollama on your PC. This sets up the entrypoint needed for Ollama to work with the ChatQnA examples.

### Set Up Ollama LLM Service

#### Install Ollama Service

Install Ollama service with one command:

```
curl -fsSL https://ollama.com/install.sh | sh
```
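
As a quick sanity check after the install script completes, you can confirm the Ollama CLI is on your PATH (a generic check, not part of the original steps):

```bash
# Verify the Ollama CLI is installed and print its version
ollama --version
```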

#### Set Ollama Service Configuration

The Ollama service configuration file is `/etc/systemd/system/ollama.service`. Edit the file to set the `OLLAMA_HOST` environment variable.
Replace **<host_ip>** with your host IPv4 address (please use the externally reachable IP). For example, if the host_ip is 10.132.x.y, then set `Environment="OLLAMA_HOST=10.132.x.y:11434"`.

```
Environment="OLLAMA_HOST=<host_ip>:11434"
```
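
Alternatively, the same setting can be applied through a systemd override instead of editing the installed unit file directly; this is standard systemd usage rather than an Ollama-specific requirement:

```bash
# Open an override file for ollama.service and add the Environment line there
sudo systemctl edit ollama.service
# Contents to place in the override, under the [Service] section:
#   [Service]
#   Environment="OLLAMA_HOST=<host_ip>:11434"
```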

#### Set https_proxy environment for Ollama

If your system accesses the network through a proxy, add `https_proxy` to the Ollama service configuration file:

```
Environment="https_proxy=Your_HTTPS_Proxy"
```

#### Restart Ollama services

```
sudo systemctl daemon-reload
sudo systemctl restart ollama.service
```
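
If the service does not come back up after the restart, the recent logs can be inspected with standard systemd tooling (not specific to this guide):

```bash
# Show the last 50 log lines for the Ollama service
journalctl -u ollama.service -n 50 --no-pager
```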

#### Check the service started

```
netstat -tuln | grep 11434
```

The output should be similar to the following:

```
tcp 0 0 10.132.x.y:11434 0.0.0.0:* LISTEN
```
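
On systems where `netstat` is not available, `ss` provides an equivalent check (shown here as an assumed alternative, not part of the original steps):

```bash
# List listening TCP/UDP sockets and filter for the Ollama port
ss -tuln | grep 11434
```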

#### Pull Ollama LLM model

Run the following commands to download the LLM model. The <host_ip> is the one set in [Ollama Service Configuration](#Set-Ollama-Service-Configuration).

```
export host_ip=<host_ip>
export OLLAMA_HOST=http://${host_ip}:11434
ollama pull llama3.2
```

After downloading the model, you can list the available models with `ollama list`.

The output should be similar to the following:

```
NAME               ID              SIZE      MODIFIED
llama3.2:latest    a80c4f17acd5    2.0 GB    2 minutes ago
```

### Consume Ollama LLM Service

Access the Ollama service to verify that it is functioning correctly.

```bash
curl http://${host_ip}:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama3.2",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
```

The output is similar to the following:

```
{"id":"chatcmpl-4","object":"chat.completion","created":1729232496,"model":"llama3.2","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":"How can I assist you today? Are you looking for information, answers to a question, or just need someone to chat with? I'm here to help in any way I can."},"finish_reason":"stop"}],"usage":{"prompt_tokens":33,"completion_tokens":38,"total_tokens":71}}
```
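
As an additional optional check, you can also query Ollama's native generate endpoint; this uses the same host and model as above, and the prompt is just an illustrative example:

```bash
# Query the native Ollama API (non-streaming) with a one-off prompt
curl http://${host_ip}:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Hello!",
  "stream": false
}'
```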

## 🚀 Build Docker Images

First of all, you need to build the Docker images locally and install the required Python package.
18 changes: 18 additions & 0 deletions ChatQnA/docker_compose/intel/cpu/aipc/compose.yaml
@@ -72,6 +72,21 @@ services:
      HF_HUB_DISABLE_PROGRESS_BARS: 1
      HF_HUB_ENABLE_HF_TRANSFER: 0
    command: --model-id ${RERANK_MODEL_ID} --auto-truncate
  ollama-service:
    image: ollama/ollama
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama
    entrypoint: ["bash", "-c"]
    command: ["ollama serve & sleep 10 && ollama run ${OLLAMA_MODEL} & wait"]
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      OLLAMA_MODEL: ${OLLAMA_MODEL}

  chatqna-aipc-backend-server:
    image: ${REGISTRY:-opea}/chatqna:${TAG:-latest}
    container_name: chatqna-aipc-backend-server
@@ -134,6 +149,9 @@ services:
    ipc: host
    restart: always

volumes:
  ollama:

networks:
  default:
    driver: bridge
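
With the Ollama service defined in the compose file, a minimal way to exercise it on its own might look like the sketch below. The model name is only an example; `OLLAMA_MODEL` can also be set in a `.env` file next to `compose.yaml`:

```bash
# Assumed usage sketch based on the compose file above
export OLLAMA_MODEL=llama3.2            # model the container will pull and serve
docker compose up -d ollama-service     # start only the Ollama service
curl http://localhost:11434/api/tags    # confirm the API responds and list local models
```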