Add kubernetes support for VisualQnA (opea-project#578)
* Add kubernetes support for VisualQnA

Signed-off-by: lvliang-intel <[email protected]>

* update gmc file

Signed-off-by: lvliang-intel <[email protected]>

* update pic

Signed-off-by: lvliang-intel <[email protected]>

---------

Signed-off-by: lvliang-intel <[email protected]>
lvliang-intel authored Aug 13, 2024
1 parent 80e3e2a commit 4f7fc39
Showing 9 changed files with 784 additions and 7 deletions.
2 changes: 1 addition & 1 deletion VisualQnA/docker/gaudi/README.md
@@ -116,7 +116,7 @@ curl http://${host_ip}:8888/v1/visualqna -H "Content-Type: application/json" -d
{
"type": "image_url",
"image_url": {
-"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
+"url": "https://www.ilankelman.org/stopsigns/australia.jpg"
}
}
]
13 changes: 9 additions & 4 deletions VisualQnA/docker/xeon/README.md
@@ -68,15 +68,20 @@ docker build --no-cache -t opea/visualqna-ui:latest --build-arg https_proxy=$htt
cd ../../../..
```

-### 4. Pull TGI image
+### 4. Build TGI Xeon Image

+Since the official TGI image does not yet support llava-next on CPU, build the image from `Dockerfile_intel`.

```bash
-docker pull ghcr.io/huggingface/text-generation-inference:2.2.0
+git clone https://github.com/huggingface/text-generation-inference
+cd text-generation-inference/
+docker build -t opea/llava-tgi-xeon:latest --build-arg PLATFORM=cpu --build-arg http_proxy=${http_proxy} --build-arg https_proxy=${https_proxy} . -f Dockerfile_intel
+cd ../
```

Then run `docker images`; you should see the following 4 Docker images:

-1. `ghcr.io/huggingface/text-generation-inference:2.2.0`
+1. `opea/llava-tgi-xeon:latest`
2. `opea/lvm-tgi:latest`
3. `opea/visualqna:latest`
4. `opea/visualqna-ui:latest`
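
The check against the image list above can be scripted. The following is a minimal sketch, not part of the OPEA repositories: `missing_images` is a hypothetical helper that reads `repository:tag` lines on standard input and prints every expected image that is absent.

```bash
# Sketch only: missing_images is a hypothetical helper name.
# Reads "repository:tag" lines on stdin and prints each expected image
# (passed as arguments) that does not appear in that list.
missing_images() {
  local have expected
  have="$(cat)"
  for expected in "$@"; do
    # -x: match the whole line, -F: treat the image name as a fixed string
    printf '%s\n' "$have" | grep -qxF -- "$expected" || echo "$expected"
  done
}

# Usage (requires Docker):
#   docker images --format '{{.Repository}}:{{.Tag}}' | missing_images \
#     opea/llava-tgi-xeon:latest opea/lvm-tgi:latest \
#     opea/visualqna:latest opea/visualqna-ui:latest
```

If the command prints nothing, all four images are available locally.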
@@ -152,7 +157,7 @@ curl http://${host_ip}:8888/v1/visualqna -H "Content-Type: application/json" -d
{
"type": "image_url",
"image_url": {
-"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
+"url": "https://www.ilankelman.org/stopsigns/australia.jpg"
}
}
]
4 changes: 2 additions & 2 deletions VisualQnA/docker/xeon/compose.yaml
@@ -6,7 +6,7 @@ version: "3.8"

services:
llava-tgi-service:
-image: ghcr.io/huggingface/text-generation-inference:2.2.0
+image: opea/llava-tgi-xeon:latest
container_name: tgi-llava-xeon-server
ports:
- "9399:80"
@@ -19,7 +19,7 @@ services:
https_proxy: ${https_proxy}
HF_HUB_DISABLE_PROGRESS_BARS: 1
HF_HUB_ENABLE_HF_TRANSFER: 0
-command: --model-id ${LVM_MODEL_ID}
+command: --model-id ${LVM_MODEL_ID} --max-input-length 4096 --max-total-tokens 8192 --cuda-graphs 0
lvm-tgi:
image: opea/lvm-tgi:latest
container_name: lvm-tgi-server
57 changes: 57 additions & 0 deletions VisualQnA/kubernetes/README.md
@@ -0,0 +1,57 @@
# Deploy VisualQnA in a Kubernetes Cluster

This document outlines the deployment process for a Visual Question Answering (VisualQnA) application that utilizes the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice components on Intel Xeon servers and Gaudi machines.

If GMC (the GenAI Microservices Connector) is not yet installed in your Kubernetes cluster, install it by following the "Getting Started" section of [GMC Install](https://github.com/opea-project/GenAIInfra/tree/main/microservices-connector#readme). Images will soon be published to Docker Hub, after which no local builds will be required, further simplifying installation.

If you have only Intel Xeon machines, use the `visualqna_xeon.yaml` file; if you have a Gaudi cluster, use `visualqna_gaudi.yaml`.
The example below uses Xeon.

## Deploy the VisualQnA application

1. Create the desired namespace if it does not already exist and deploy the application
```bash
# Kubernetes namespace names must be lowercase RFC 1123 labels
export APP_NAMESPACE=visualqna
kubectl create ns $APP_NAMESPACE
sed -i "s|namespace: visualqna|namespace: $APP_NAMESPACE|g" ./visualqna_xeon.yaml
kubectl apply -f ./visualqna_xeon.yaml
```

2. Check if the application is up and ready
```bash
kubectl get pods -n $APP_NAMESPACE
```

3. Deploy a client pod for testing
```bash
kubectl create deployment client-test -n $APP_NAMESPACE --image=python:3.8.13 -- sleep infinity
```

4. Check that client pod is ready
```bash
kubectl get pods -n $APP_NAMESPACE
```

5. Send request to application
```bash
export CLIENT_POD=$(kubectl get pod -n $APP_NAMESPACE -l app=client-test -o jsonpath='{.items..metadata.name}')
export accessUrl=$(kubectl get gmc -n $APP_NAMESPACE -o jsonpath="{.items[?(@.metadata.name=='visualqna')].status.accessUrl}")
kubectl exec "$CLIENT_POD" -n $APP_NAMESPACE -- curl $accessUrl -X POST -d '{"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "What'\''s in this image?"
},
{
"type": "image_url",
"image_url": {
"url": "https://www.ilankelman.org/stopsigns/australia.jpg"
}
}
]
}
],
"max_tokens": 128}' -H 'Content-Type: application/json' > "${LOG_PATH:-.}/gmc_visualqna.log"
```
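
Steps 2 and 4 above check pod status by hand with `kubectl get pods`. As an alternative, `kubectl wait` can block until the pods report Ready. The following is a minimal sketch under stated assumptions: `is_valid_namespace` and `wait_for_pods` are hypothetical helper names, and the 300-second default timeout is an arbitrary choice, not a value taken from the project.

```bash
# Sketch only: helper names and the default timeout are assumptions.

# Return 0 if $1 is a valid namespace name (RFC 1123 label: lowercase
# alphanumerics and '-', alphanumeric at both ends, at most 63 characters).
is_valid_namespace() {
  [ "${#1}" -le 63 ] &&
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]([-a-z0-9]*[a-z0-9])?$'
}

# Block until every pod in the namespace is Ready, or the timeout expires.
wait_for_pods() {
  local ns="$1" timeout="${2:-300s}"
  if ! is_valid_namespace "$ns"; then
    echo "invalid namespace: $ns" >&2
    return 2
  fi
  kubectl wait --for=condition=Ready pod --all -n "$ns" --timeout="$timeout"
}

# Usage (requires cluster access):
#   wait_for_pods "$APP_NAMESPACE" 600s
```

Validating the name first gives an early, local error for values such as `CT`, which the API server would reject.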
51 changes: 51 additions & 0 deletions VisualQnA/kubernetes/manifests/README.md
@@ -0,0 +1,51 @@
# Deploy VisualQnA in Kubernetes Cluster

> [!NOTE]
> You can customize `LVM_MODEL_ID` if needed.
> Make sure the directory `/mnt/opea-models` exists on the node where the visualqna workload runs; it stores the cached model. Otherwise, edit `visualqna.yaml` so that `model-volume` points to a directory that does exist on that node.

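
The directory requirement in the note can be handled with a small helper run on the node itself. This is a minimal sketch: `ensure_model_dir` is a hypothetical name, and creating a directory under `/mnt` may require elevated privileges on your system.

```bash
# Sketch only: ensure_model_dir is a hypothetical helper name.
# Creates the model cache directory if it is missing and reports what it did.
ensure_model_dir() {
  local dir="${1:-/mnt/opea-models}"
  if [ -d "$dir" ]; then
    echo "exists: $dir"
  else
    mkdir -p "$dir" && echo "created: $dir"
  fi
}

# Usage (run on the node that hosts the workload):
#   ensure_model_dir /mnt/opea-models
```
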
## Deploy On Xeon

```
cd GenAIExamples/VisualQnA/kubernetes/manifests/xeon
kubectl apply -f visualqna.yaml
```

## Deploy On Gaudi

```
cd GenAIExamples/VisualQnA/kubernetes/manifests/gaudi
kubectl apply -f visualqna.yaml
```

## Verify Services

To verify the installation, run the command `kubectl get pod` to make sure all pods are running.

Then run the command `kubectl port-forward svc/visualqna 8888:8888` to expose the visualqna service for access.

Open another terminal and run the following command to verify that the service is working:

```console
curl http://localhost:8888/v1/visualqna \
-H 'Content-Type: application/json' \
-d '{"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "What'\''s in this image?"
},
{
"type": "image_url",
"image_url": {
"url": "https://www.ilankelman.org/stopsigns/australia.jpg"
}
}
]
}
],
"max_tokens": 128}'
```
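
The request body above embeds an apostrophe inside a single-quoted string, which forces the `'\''` escape. One way to avoid that is to build the JSON in a function. This is a minimal sketch: `build_payload` is a hypothetical helper, and it assumes the question and URL contain no double quotes or backslashes, since it performs no JSON escaping.

```bash
# Sketch only: build_payload is a hypothetical helper name.
# Assumes the inputs contain no double quotes or backslashes (no JSON escaping).
build_payload() {
  local question="$1" image_url="$2" max_tokens="${3:-128}"
  printf '{"messages": [{"role": "user", "content": [{"type": "text", "text": "%s"}, {"type": "image_url", "image_url": {"url": "%s"}}]}], "max_tokens": %d}\n' \
    "$question" "$image_url" "$max_tokens"
}

# Usage (requires the port-forward from the previous step):
#   curl http://localhost:8888/v1/visualqna \
#     -H 'Content-Type: application/json' \
#     -d "$(build_payload "What's in this image?" "https://www.ilankelman.org/stopsigns/australia.jpg")"
```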