
Commit

Update READMEs (#430)
* update readme gaudi part & add tei-gaudi params

Signed-off-by: letonghan <[email protected]>

* modify supported habana driver version

Signed-off-by: letonghan <[email protected]>

* update env set part

Signed-off-by: letonghan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add example for no_proxy

Signed-off-by: letonghan <[email protected]>

* add an example of public ip

Signed-off-by: letonghan <[email protected]>

---------

Signed-off-by: letonghan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
letonghan and pre-commit-ci[bot] authored Jul 23, 2024
1 parent 2f9397e commit 8ad7f36
Showing 13 changed files with 129 additions and 53 deletions.
17 changes: 14 additions & 3 deletions ChatQnA/README.md
@@ -45,7 +45,9 @@ To set up environment variables for deploying ChatQnA services, follow these steps:
1. Set the required environment variables:

```bash
# Example: host_ip="192.168.1.1"
export host_ip="External_Public_IP"
# Example: no_proxy="localhost,127.0.0.1,192.168.1.1"
export no_proxy="Your_No_Proxy"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```
@@ -59,20 +61,29 @@ export https_proxy="Your_HTTPs_Proxy"

3. Set up other environment variables:

> Notice: choose only <b>one</b> of the commands below, according to your hardware. Otherwise the port numbers may be set incorrectly.
```bash
bash ./docker/set_env.sh
# on Gaudi
source ./docker/gaudi/set_env.sh
# on Xeon
source ./docker/xeon/set_env.sh
# on Nvidia GPU
source ./docker/gpu/set_env.sh
```
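
As a quick sanity check, you can derive `host_ip` automatically and confirm a few of the sourced values. This is a minimal sketch: it assumes `hostname -I` is available (as on most Linux distributions) and uses variable names defined in the `set_env.sh` scripts.

```bash
# Derive the host's primary IP (assumption: `hostname -I` exists on your distro)
export host_ip=$(hostname -I | awk '{print $1}')

# After sourcing the set_env.sh that matches your hardware, spot-check the results
echo "host_ip:          ${host_ip}"
echo "TEI endpoint:     ${TEI_EMBEDDING_ENDPOINT}"
echo "Backend endpoint: ${BACKEND_SERVICE_ENDPOINT}"
```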

## Deploy ChatQnA on Gaudi

If your version of `Habana Driver` < 1.16.0 (check with `hl-smi`), run the following command directly to start ChatQnA services. Please find corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).
Please find the corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).

```bash
cd GenAIExamples/ChatQnA/docker/gaudi/
docker compose -f docker_compose.yaml up -d
```

If your version of `Habana Driver` >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
> Notice: currently only <b>Habana Driver 1.16.x</b> is supported for Gaudi.
Please refer to the [Gaudi Guide](./docker/gaudi/README.md) to build Docker images from source.
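
To confirm which driver is installed before deploying, query `hl-smi` (it ships with the Habana driver stack; this assumes it is on your `PATH`):

```bash
hl-smi | grep -i "driver version"
```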

## Deploy ChatQnA on Xeon

2 changes: 2 additions & 0 deletions ChatQnA/docker/gaudi/docker_compose.yaml
@@ -44,6 +44,8 @@ services:
HABANA_VISIBLE_DEVICES: all
OMPI_MCA_btl_vader_single_copy_mechanism: none
MAX_WARMUP_SEQUENCE_LENGTH: 512
INIT_HCCL_ON_ACQUIRE: 0
ENABLE_EXPERIMENTAL_FLAGS: true
command: --model-id ${EMBEDDING_MODEL_ID}
embedding:
image: opea/embedding-tei:latest
File renamed without changes.
16 changes: 6 additions & 10 deletions ChatQnA/docker/gpu/README.md
@@ -135,21 +135,17 @@ curl http://${host_ip}:6000/v1/embeddings \

3. Retriever Microservice

To consume the retriever microservice, you need to generate a mock embedding vector of length 768 in Python script:
To consume the retriever microservice, you need to generate a mock embedding vector with a Python script. The length of the embedding vector
is determined by the embedding model.
Here we use the model `EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"`, whose vector size is 768.

```python
import random

embedding = [random.uniform(-1, 1) for _ in range(768)]
print(embedding)
```

Then substitute your mock embedding vector for the `${your_embedding}` in the following `curl` command:
Check the vector dimension of your embedding model and set the dimension of `your_embedding` to match it.

```bash
your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://${host_ip}:7000/v1/retrieval \
-X POST \
-d '{"text":"test", "embedding":${your_embedding}}' \
-d "{\"text\":\"test\",\"embedding\":${your_embedding}}" \
-H 'Content-Type: application/json'
```
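
If you are unsure of the dimension, one option is to query the TEI embedding service directly and measure the length of the vector it returns. This is a sketch that assumes the service is already running on port 8090, as configured in `set_env.sh`:

```bash
curl -s http://${host_ip}:8090/embed \
  -X POST \
  -d '{"inputs":"dimension probe"}' \
  -H 'Content-Type: application/json' |
  python3 -c "import sys, json; print(len(json.load(sys.stdin)[0]))"
```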

23 changes: 23 additions & 0 deletions ChatQnA/docker/gpu/set_env.sh
@@ -0,0 +1,23 @@
#!/usr/bin/env bash

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0


export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
export RERANK_MODEL_ID="BAAI/bge-reranker-base"
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:8090"
export TEI_RERANKING_ENDPOINT="http://${host_ip}:8808"
export TGI_LLM_ENDPOINT="http://${host_ip}:8008"
export REDIS_URL="redis://${host_ip}:6379"
export INDEX_NAME="rag-redis"
export MEGA_SERVICE_HOST_IP=${host_ip}
export EMBEDDING_SERVICE_HOST_IP=${host_ip}
export RETRIEVER_SERVICE_HOST_IP=${host_ip}
export RERANK_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/chatqna"
export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep"
export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6008/v1/dataprep/get_file"
export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6009/v1/dataprep/delete_file"
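
Note that this script only references `host_ip` and does not set it; export `host_ip` first, and use `source` (not `bash`) so the variables persist in your current shell. For example, from the repository root:

```bash
export host_ip="192.168.1.1" # replace with your external public IP
source ChatQnA/docker/gpu/set_env.sh
```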
40 changes: 28 additions & 12 deletions ChatQnA/docker/xeon/README.md
@@ -226,21 +226,19 @@ curl http://${host_ip}:6000/v1/embeddings\
-H 'Content-Type: application/json'
```

3. Retriever Microservice
To validate the retriever microservice, you need to generate a mock embedding vector of length 768 in Python script:
3. Retriever Microservice

```Python
import random
embedding = [random.uniform(-1, 1) for _ in range(768)]
print(embedding)
```
To consume the retriever microservice, you need to generate a mock embedding vector with a Python script. The length of the embedding vector
is determined by the embedding model.
Here we use the model `EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"`, whose vector size is 768.

Then substitute your mock embedding vector for the `${your_embedding}` in the following cURL command:
Check the vector dimension of your embedding model and set the dimension of `your_embedding` to match it.

```bash
your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://${host_ip}:7000/v1/retrieval \
-X POST \
-d '{"text":"What is the revenue of Nike in 2023?","embedding":"'"${your_embedding}"'"}' \
-d "{\"text\":\"test\",\"embedding\":${your_embedding}}" \
-H 'Content-Type: application/json'
```
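
To sanity-check the result, you can count the documents the retriever returns. The sketch below assumes the response is JSON with a `retrieved_docs` field; adjust the key if your version of the microservice returns a different schema:

```bash
curl -s http://${host_ip}:7000/v1/retrieval \
  -X POST \
  -d "{\"text\":\"test\",\"embedding\":${your_embedding}}" \
  -H 'Content-Type: application/json' |
  python3 -c "import sys, json; print(len(json.load(sys.stdin).get('retrieved_docs', [])))"
```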

@@ -369,12 +367,30 @@ To access the frontend, open the following URL in your browser: http://{host_ip}
- "80:5173"
```
## 🚀 Launch the Conversational UI (react)
## 🚀 Launch the Conversational UI (Optional)
To access the Conversational UI (React-based) frontend, modify the UI service in the `docker_compose.yaml` file. Replace the `chaqna-xeon-ui-server` service with the `chatqna-xeon-conversation-ui-server` service as per the config below:

To access the Conversational UI frontend, open the following URL in your browser: http://{host_ip}:5174. By default, the UI runs on port 80 internally. If you prefer to use a different host port to access the frontend, you can modify the port mapping in the `docker_compose.yaml` file as shown below:
```yaml
  chaqna-xeon-conversation-ui-server:
    image: opea/chatqna-conversation-ui:latest
    container_name: chatqna-xeon-conversation-ui-server
    environment:
      - no_proxy=${no_proxy}
      - https_proxy=${https_proxy}
      - http_proxy=${http_proxy}
    ports:
      - "5174:80"
    depends_on:
      - chaqna-xeon-backend-server
    ipc: host
    restart: always
```
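
After swapping in the service definition above, recreate the stack so the change takes effect:

```bash
docker compose -f docker_compose.yaml up -d
```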

Once the services are up, open the following URL in your browser: http://{host_ip}:5174. By default, the UI runs on port 80 internally. If you prefer to use a different host port to access the frontend, you can modify the port mapping in the `docker_compose.yaml` file as shown below:

```yaml
  chaqna-xeon-conversation-ui-server:
    image: opea/chatqna-conversation-ui:latest
    ...
    ports:
13 changes: 0 additions & 13 deletions ChatQnA/docker/xeon/docker_compose.yaml
@@ -189,19 +189,6 @@ services:
- DELETE_FILE=${DATAPREP_DELETE_FILE_ENDPOINT}
ipc: host
restart: always
chaqna-xeon-conversation-ui-server:
image: opea/chatqna-conversation-ui:latest
container_name: chatqna-xeon-conversation-ui-server
environment:
- no_proxy=${no_proxy}
- https_proxy=${https_proxy}
- http_proxy=${http_proxy}
ports:
- 5174:80
depends_on:
- chaqna-xeon-backend-server
ipc: host
restart: always

networks:
default:
23 changes: 23 additions & 0 deletions ChatQnA/docker/xeon/set_env.sh
@@ -0,0 +1,23 @@
#!/usr/bin/env bash

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0


export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
export RERANK_MODEL_ID="BAAI/bge-reranker-base"
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:6006"
export TEI_RERANKING_ENDPOINT="http://${host_ip}:8808"
export TGI_LLM_ENDPOINT="http://${host_ip}:9009"
export REDIS_URL="redis://${host_ip}:6379"
export INDEX_NAME="rag-redis"
export MEGA_SERVICE_HOST_IP=${host_ip}
export EMBEDDING_SERVICE_HOST_IP=${host_ip}
export RETRIEVER_SERVICE_HOST_IP=${host_ip}
export RERANK_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/chatqna"
export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep"
export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6008/v1/dataprep/get_file"
export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6009/v1/dataprep/delete_file"
12 changes: 8 additions & 4 deletions CodeGen/README.md
@@ -39,36 +39,40 @@ To set up environment variables for deploying CodeGen services, follow these steps:
1. Set the required environment variables:

```bash
# Example: host_ip="192.168.1.1"
export host_ip="External_Public_IP"
# Example: no_proxy="localhost,127.0.0.1,192.168.1.1"
export no_proxy="Your_No_Proxy"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```

2. If you are in a proxy environment, also set the proxy-related environment variables:

```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```

3. Set up other environment variables:

```bash
bash ./docker/set_env.sh
source ./docker/set_env.sh
```

## Deploy CodeGen using Docker

### Deploy CodeGen on Gaudi

- If your version of `Habana Driver` < 1.16.0 (check with `hl-smi`), run the following command directly to start ChatQnA services. Please find corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).
Please find the corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).

```bash
cd GenAIExamples/CodeGen/docker/gaudi
docker compose -f docker_compose.yaml up -d
```

- If your version of `Habana Driver` >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
> Notice: currently only <b>Habana Driver 1.16.x</b> is supported for Gaudi.
Please refer to the [Gaudi Guide](./docker/gaudi/README.md) to build Docker images from source.
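
Once the containers are launched, a quick status check helps catch startup failures early; `docker compose ps` and the service logs are the first places to look:

```bash
cd GenAIExamples/CodeGen/docker/gaudi
docker compose -f docker_compose.yaml ps
# Inspect recent output from all services; add a service name to narrow it down
docker compose -f docker_compose.yaml logs --tail=100
```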

### Deploy CodeGen on Xeon

12 changes: 8 additions & 4 deletions CodeTrans/README.md
@@ -29,36 +29,40 @@ To set up environment variables for deploying Code Translation services, follow these steps:
1. Set the required environment variables:

```bash
# Example: host_ip="192.168.1.1"
export host_ip="External_Public_IP"
# Example: no_proxy="localhost,127.0.0.1,192.168.1.1"
export no_proxy="Your_No_Proxy"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```

2. If you are in a proxy environment, also set the proxy-related environment variables:

```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```

3. Set up other environment variables:

```bash
bash ./docker/set_env.sh
source ./docker/set_env.sh
```

## Deploy with Docker

### Deploy Code Translation on Gaudi

- If your version of `Habana Driver` < 1.16.0 (check with `hl-smi`), run the following command directly to start Code Translation services. Please find corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).
Please find the corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).

```bash
cd GenAIExamples/CodeTrans/docker/gaudi
docker compose -f docker_compose.yaml up -d
```

- If your version of `Habana Driver` >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
> Notice: currently only <b>Habana Driver 1.16.x</b> is supported for Gaudi.
Please refer to the [Gaudi Guide](./docker/gaudi/README.md) to build Docker images from source.

### Deploy Code Translation on Xeon

12 changes: 8 additions & 4 deletions DocSum/README.md
@@ -32,36 +32,40 @@ To set up environment variables for deploying Document Summarization services, follow these steps:
1. Set the required environment variables:

```bash
# Example: host_ip="192.168.1.1"
export host_ip="External_Public_IP"
# Example: no_proxy="localhost,127.0.0.1,192.168.1.1"
export no_proxy="Your_No_Proxy"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```

2. If you are in a proxy environment, also set the proxy-related environment variables:

```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```

3. Set up other environment variables:

```bash
bash ./docker/set_env.sh
source ./docker/set_env.sh
```

## Deploy using Docker

### Deploy on Gaudi

If your version of `Habana Driver` < 1.16.0 (check with `hl-smi`), run the following command directly to start DocSum services. Please find corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).
Please find the corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).

```bash
cd GenAIExamples/DocSum/docker/gaudi/
docker compose -f docker_compose.yaml up -d
```

If your version of `Habana Driver` >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
> Notice: currently only <b>Habana Driver 1.16.x</b> is supported for Gaudi.
Please refer to the [Gaudi Guide](./docker/gaudi/README.md) to build Docker images from source.
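
Before exercising the pipeline, you can verify that the TGI service is ready. TGI exposes a `/health` route that returns HTTP 200 once the model is loaded; the port below is an assumption, so substitute the one published in your `docker_compose.yaml`:

```bash
# Assumption: TGI is published on port 8008; adjust to your compose file
curl -sf http://${host_ip}:8008/health && echo "TGI is ready"
```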

### Deploy on Xeon

10 changes: 7 additions & 3 deletions SearchQnA/README.md
@@ -41,24 +41,26 @@ To set up environment variables for deploying SearchQnA services, follow these steps:
1. Set the required environment variables:

```bash
# Example: host_ip="192.168.1.1"
export host_ip="External_Public_IP"
# Example: no_proxy="localhost,127.0.0.1,192.168.1.1"
export no_proxy="Your_No_Proxy"
export GOOGLE_CSE_ID="Your_CSE_ID"
export GOOGLE_API_KEY="Your_Google_API_Key"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```

2. If you are in a proxy environment, also set the proxy-related environment variables:

```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```

3. Set up other environment variables:

```bash
bash ./docker/set_env.sh
source ./docker/set_env.sh
```
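
With the credentials from step 1 in place, you can verify them before deploying by probing the Google Custom Search JSON API directly; a valid key/CSE pair returns JSON search results rather than an error object:

```bash
curl -s "https://www.googleapis.com/customsearch/v1?key=${GOOGLE_API_KEY}&cx=${GOOGLE_CSE_ID}&q=test" | head -c 300
```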

## Deploy SearchQnA on Gaudi
@@ -70,7 +72,9 @@ cd GenAIExamples/SearchQnA/docker/gaudi/
docker compose up -d
```

If your version of `Habana Driver` >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
> Notice: currently only <b>Habana Driver 1.16.x</b> is supported for Gaudi.
Please refer to the [Gaudi Guide](./docker/gaudi/README.md) to build Docker images from source.

## Deploy SearchQnA on Xeon

2 changes: 2 additions & 0 deletions SearchQnA/docker/gaudi/compose.yaml
@@ -23,6 +23,8 @@ services:
HABANA_VISIBLE_DEVICES: all
OMPI_MCA_btl_vader_single_copy_mechanism: none
MAX_WARMUP_SEQUENCE_LENGTH: 512
INIT_HCCL_ON_ACQUIRE: 0
ENABLE_EXPERIMENTAL_FLAGS: true
command: --model-id ${EMBEDDING_MODEL_ID} --auto-truncate
embedding:
image: opea/embedding-tei:latest
