diff --git a/ChatQnA/README.md b/ChatQnA/README.md
index 549fea8fb8..59359b7216 100644
--- a/ChatQnA/README.md
+++ b/ChatQnA/README.md
@@ -45,7 +45,9 @@ To set up environment variables for deploying ChatQnA services, follow these ste
1. Set the required environment variables:
```bash
+# Example: host_ip="192.168.1.1"
export host_ip="External_Public_IP"
+# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
export no_proxy="Your_No_Proxy"
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```
@@ -59,20 +61,29 @@ export https_proxy="Your_HTTPs_Proxy"
3. Set up other environment variables:
+> Notice: run only the one command below that matches your hardware; otherwise the port numbers may be set incorrectly. Use `source` rather than `bash` so the exported variables persist in your current shell.
+
```bash
-bash ./docker/set_env.sh
+# on Gaudi
+source ./docker/gaudi/set_env.sh
+# on Xeon
+source ./docker/xeon/set_env.sh
+# on Nvidia GPU
+source ./docker/gpu/set_env.sh
```
## Deploy ChatQnA on Gaudi
-If your version of `Habana Driver` < 1.16.0 (check with `hl-smi`), run the following command directly to start ChatQnA services. Please find corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).
+Please find the corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).
```bash
cd GenAIExamples/ChatQnA/docker/gaudi/
docker compose -f docker_compose.yaml up -d
```
-If your version of `Habana Driver` >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
+> Notice: Currently only Habana Driver 1.16.x is supported for Gaudi.
+
+Please refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
## Deploy ChatQnA on Xeon
diff --git a/ChatQnA/docker/gaudi/docker_compose.yaml b/ChatQnA/docker/gaudi/docker_compose.yaml
index d9de34e230..b8a420ec21 100644
--- a/ChatQnA/docker/gaudi/docker_compose.yaml
+++ b/ChatQnA/docker/gaudi/docker_compose.yaml
@@ -44,6 +44,8 @@ services:
HABANA_VISIBLE_DEVICES: all
OMPI_MCA_btl_vader_single_copy_mechanism: none
MAX_WARMUP_SEQUENCE_LENGTH: 512
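+ # Habana runtime settings: skip HCCL initialization when the device is acquired and allow experimental Habana flags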
+ INIT_HCCL_ON_ACQUIRE: 0
+ ENABLE_EXPERIMENTAL_FLAGS: true
command: --model-id ${EMBEDDING_MODEL_ID}
embedding:
image: opea/embedding-tei:latest
diff --git a/ChatQnA/docker/set_env.sh b/ChatQnA/docker/gaudi/set_env.sh
similarity index 100%
rename from ChatQnA/docker/set_env.sh
rename to ChatQnA/docker/gaudi/set_env.sh
diff --git a/ChatQnA/docker/gpu/README.md b/ChatQnA/docker/gpu/README.md
index d00a7ce143..78f8c5d17d 100644
--- a/ChatQnA/docker/gpu/README.md
+++ b/ChatQnA/docker/gpu/README.md
@@ -135,21 +135,17 @@ curl http://${host_ip}:6000/v1/embeddings \
3. Retriever Microservice
-To consume the retriever microservice, you need to generate a mock embedding vector of length 768 in Python script:
+To consume the retriever microservice, you need to generate a mock embedding vector with a Python script. The length of the embedding
+vector is determined by the embedding model.
+Here we use the model `EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"`, whose vector size is 768.
-```python
-import random
-
-embedding = [random.uniform(-1, 1) for _ in range(768)]
-print(embedding)
-```
-
-Then substitute your mock embedding vector for the `${your_embedding}` in the following `curl` command:
+Check the vector dimension of your embedding model and set the dimension of `your_embedding` to match it.
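+If you are unsure of the dimension, you can read it from the model configuration (a quick check, assuming the `transformers` package is installed locally):
+
+```bash
+python -c "from transformers import AutoConfig; print(AutoConfig.from_pretrained('BAAI/bge-base-en-v1.5').hidden_size)"
+```
+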
```bash
+your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://${host_ip}:7000/v1/retrieval \
-X POST \
- -d '{"text":"test", "embedding":${your_embedding}}' \
+ -d "{\"text\":\"test\",\"embedding\":${your_embedding}}" \
-H 'Content-Type: application/json'
```
diff --git a/ChatQnA/docker/gpu/set_env.sh b/ChatQnA/docker/gpu/set_env.sh
new file mode 100644
index 0000000000..e024bd8a64
--- /dev/null
+++ b/ChatQnA/docker/gpu/set_env.sh
@@ -0,0 +1,23 @@
+#!/usr/bin/env bash
+
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+
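+# NOTE: host_ip must be exported before sourcing this script (see the README).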
+export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
+export RERANK_MODEL_ID="BAAI/bge-reranker-base"
+export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
+export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:8090"
+export TEI_RERANKING_ENDPOINT="http://${host_ip}:8808"
+export TGI_LLM_ENDPOINT="http://${host_ip}:8008"
+export REDIS_URL="redis://${host_ip}:6379"
+export INDEX_NAME="rag-redis"
+export MEGA_SERVICE_HOST_IP=${host_ip}
+export EMBEDDING_SERVICE_HOST_IP=${host_ip}
+export RETRIEVER_SERVICE_HOST_IP=${host_ip}
+export RERANK_SERVICE_HOST_IP=${host_ip}
+export LLM_SERVICE_HOST_IP=${host_ip}
+export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/chatqna"
+export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep"
+export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6008/v1/dataprep/get_file"
+export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6009/v1/dataprep/delete_file"
diff --git a/ChatQnA/docker/xeon/README.md b/ChatQnA/docker/xeon/README.md
index c74070bc0e..97f77ae434 100644
--- a/ChatQnA/docker/xeon/README.md
+++ b/ChatQnA/docker/xeon/README.md
@@ -226,21 +226,19 @@ curl http://${host_ip}:6000/v1/embeddings\
-H 'Content-Type: application/json'
```
-3. Retriever Microservice
- To validate the retriever microservice, you need to generate a mock embedding vector of length 768 in Python script:
+3. Retriever Microservice
-```Python
-import random
-embedding = [random.uniform(-1, 1) for _ in range(768)]
-print(embedding)
-```
+To consume the retriever microservice, you need to generate a mock embedding vector with a Python script. The length of the embedding
+vector is determined by the embedding model.
+Here we use the model `EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"`, whose vector size is 768.
-Then substitute your mock embedding vector for the `${your_embedding}` in the following cURL command:
+Check the vector dimension of your embedding model and set the dimension of `your_embedding` to match it.
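+If you are unsure of the dimension, you can read it from the model configuration (a quick check, assuming the `transformers` package is installed locally):
+
+```bash
+python -c "from transformers import AutoConfig; print(AutoConfig.from_pretrained('BAAI/bge-base-en-v1.5').hidden_size)"
+```
+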
```bash
+your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://${host_ip}:7000/v1/retrieval \
-X POST \
- -d '{"text":"What is the revenue of Nike in 2023?","embedding":"'"${your_embedding}"'"}' \
+ -d "{\"text\":\"test\",\"embedding\":${your_embedding}}" \
-H 'Content-Type: application/json'
```
@@ -369,12 +367,30 @@ To access the frontend, open the following URL in your browser: http://{host_ip}
- "80:5173"
```
-## 🚀 Launch the Conversational UI (react)
+## 🚀 Launch the Conversational UI (Optional)
+
+To access the Conversational UI (React-based) frontend, modify the UI service in the `docker_compose.yaml` file. Replace the `chaqna-xeon-ui-server` service with the `chaqna-xeon-conversation-ui-server` service as per the config below:
-To access the Conversational UI frontend, open the following URL in your browser: http://{host_ip}:5174. By default, the UI runs on port 80 internally. If you prefer to use a different host port to access the frontend, you can modify the port mapping in the `docker_compose.yaml` file as shown below:
+```yaml
+chaqna-xeon-conversation-ui-server:
+ image: opea/chatqna-conversation-ui:latest
+ container_name: chatqna-xeon-conversation-ui-server
+ environment:
+ - no_proxy=${no_proxy}
+ - https_proxy=${https_proxy}
+ - http_proxy=${http_proxy}
+ ports:
+ - "5174:80"
+ depends_on:
+ - chaqna-xeon-backend-server
+ ipc: host
+ restart: always
+```
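+
+After updating `docker_compose.yaml`, recreate the UI service so the change takes effect (run from the directory containing the compose file):
+
+```bash
+docker compose -f docker_compose.yaml up -d chaqna-xeon-conversation-ui-server
+```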
+
+Once the services are up, open the following URL in your browser: http://{host_ip}:5174. By default, the UI runs on port 80 internally. If you prefer to use a different host port to access the frontend, you can modify the port mapping in the `docker_compose.yaml` file as shown below:
```yaml
chaqna-xeon-conversation-ui-server:
image: opea/chatqna-conversation-ui:latest
...
ports:
diff --git a/ChatQnA/docker/xeon/docker_compose.yaml b/ChatQnA/docker/xeon/docker_compose.yaml
index 10c7b5d652..71818f85f3 100644
--- a/ChatQnA/docker/xeon/docker_compose.yaml
+++ b/ChatQnA/docker/xeon/docker_compose.yaml
@@ -189,19 +189,6 @@ services:
- DELETE_FILE=${DATAPREP_DELETE_FILE_ENDPOINT}
ipc: host
restart: always
- chaqna-xeon-conversation-ui-server:
- image: opea/chatqna-conversation-ui:latest
- container_name: chatqna-xeon-conversation-ui-server
- environment:
- - no_proxy=${no_proxy}
- - https_proxy=${https_proxy}
- - http_proxy=${http_proxy}
- ports:
- - 5174:80
- depends_on:
- - chaqna-xeon-backend-server
- ipc: host
- restart: always
networks:
default:
diff --git a/ChatQnA/docker/xeon/set_env.sh b/ChatQnA/docker/xeon/set_env.sh
new file mode 100644
index 0000000000..888b8f3070
--- /dev/null
+++ b/ChatQnA/docker/xeon/set_env.sh
@@ -0,0 +1,23 @@
+#!/usr/bin/env bash
+
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+
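+# NOTE: host_ip must be exported before sourcing this script (see the README).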
+export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
+export RERANK_MODEL_ID="BAAI/bge-reranker-base"
+export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
+export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:6006"
+export TEI_RERANKING_ENDPOINT="http://${host_ip}:8808"
+export TGI_LLM_ENDPOINT="http://${host_ip}:9009"
+export REDIS_URL="redis://${host_ip}:6379"
+export INDEX_NAME="rag-redis"
+export MEGA_SERVICE_HOST_IP=${host_ip}
+export EMBEDDING_SERVICE_HOST_IP=${host_ip}
+export RETRIEVER_SERVICE_HOST_IP=${host_ip}
+export RERANK_SERVICE_HOST_IP=${host_ip}
+export LLM_SERVICE_HOST_IP=${host_ip}
+export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/chatqna"
+export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep"
+export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6008/v1/dataprep/get_file"
+export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6009/v1/dataprep/delete_file"
diff --git a/CodeGen/README.md b/CodeGen/README.md
index 1fe8304e83..879073bea8 100644
--- a/CodeGen/README.md
+++ b/CodeGen/README.md
@@ -39,8 +39,11 @@ To set up environment variables for deploying ChatQnA services, follow these ste
1. Set the required environment variables:
```bash
+# Example: host_ip="192.168.1.1"
export host_ip="External_Public_IP"
+# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
export no_proxy="Your_No_Proxy"
+export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```
2. If you are in a proxy environment, also set the proxy-related environment variables:
@@ -48,27 +51,28 @@ export no_proxy="Your_No_Proxy"
```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
-export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```
3. Set up other environment variables:
```bash
-bash ./docker/set_env.sh
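+# Use 'source' (not 'bash') so the exported variables persist in the current shell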
+source ./docker/set_env.sh
```
## Deploy CodeGen using Docker
### Deploy CodeGen on Gaudi
-- If your version of `Habana Driver` < 1.16.0 (check with `hl-smi`), run the following command directly to start ChatQnA services. Please find corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).
+Please find the corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).
```bash
cd GenAIExamples/CodeGen/docker/gaudi
docker compose -f docker_compose.yaml up -d
```
-- If your version of `Habana Driver` >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
+> Notice: Currently only Habana Driver 1.16.x is supported for Gaudi.
+
+Please refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
### Deploy CodeGen on Xeon
diff --git a/CodeTrans/README.md b/CodeTrans/README.md
index 9813fe1acd..10a324ec88 100644
--- a/CodeTrans/README.md
+++ b/CodeTrans/README.md
@@ -29,8 +29,11 @@ To set up environment variables for deploying Code Translation services, follow
1. Set the required environment variables:
```bash
+# Example: host_ip="192.168.1.1"
export host_ip="External_Public_IP"
+# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
export no_proxy="Your_No_Proxy"
+export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```
2. If you are in a proxy environment, also set the proxy-related environment variables:
@@ -38,27 +41,28 @@ export no_proxy="Your_No_Proxy"
```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
-export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```
3. Set up other environment variables:
```bash
-bash ./docker/set_env.sh
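+# Use 'source' (not 'bash') so the exported variables persist in the current shell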
+source ./docker/set_env.sh
```
## Deploy with Docker
### Deploy Code Translation on Gaudi
-- If your version of `Habana Driver` < 1.16.0 (check with `hl-smi`), run the following command directly to start Code Translation services. Please find corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).
+Please find the corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).
```bash
cd GenAIExamples/CodeTrans/docker/gaudi
docker compose -f docker_compose.yaml up -d
```
-- If your version of `Habana Driver` >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
+> Notice: Currently only Habana Driver 1.16.x is supported for Gaudi.
+
+Please refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
### Deploy Code Translation on Xeon
diff --git a/DocSum/README.md b/DocSum/README.md
index e8bff49c9b..ef116e9e53 100644
--- a/DocSum/README.md
+++ b/DocSum/README.md
@@ -32,8 +32,11 @@ To set up environment variables for deploying Document Summarization services, f
1. Set the required environment variables:
```bash
+# Example: host_ip="192.168.1.1"
export host_ip="External_Public_IP"
+# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
export no_proxy="Your_No_Proxy"
+export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```
2. If you are in a proxy environment, also set the proxy-related environment variables:
@@ -41,27 +44,28 @@ export no_proxy="Your_No_Proxy"
```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
-export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```
3. Set up other environment variables:
```bash
-bash ./docker/set_env.sh
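+# Use 'source' (not 'bash') so the exported variables persist in the current shell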
+source ./docker/set_env.sh
```
## Deploy using Docker
### Deploy on Gaudi
-If your version of `Habana Driver` < 1.16.0 (check with `hl-smi`), run the following command directly to start DocSum services. Please find corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).
+Please find the corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).
```bash
cd GenAIExamples/DocSum/docker/gaudi/
docker compose -f docker_compose.yaml up -d
```
-If your version of `Habana Driver` >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
+> Notice: Currently only Habana Driver 1.16.x is supported for Gaudi.
+
+Please refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
### Deploy on Xeon
diff --git a/SearchQnA/README.md b/SearchQnA/README.md
index 9498905d4c..989d32fa32 100644
--- a/SearchQnA/README.md
+++ b/SearchQnA/README.md
@@ -41,10 +41,13 @@ To set up environment variables for deploying SearchQnA services, follow these s
1. Set the required environment variables:
```bash
+# Example: host_ip="192.168.1.1"
export host_ip="External_Public_IP"
+# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
export no_proxy="Your_No_Proxy"
export GOOGLE_CSE_ID="Your_CSE_ID"
export GOOGLE_API_KEY="Your_Google_API_Key"
+export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```
2. If you are in a proxy environment, also set the proxy-related environment variables:
@@ -52,13 +55,12 @@ export GOOGLE_API_KEY="Your_Google_API_Key"
```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
-export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```
3. Set up other environment variables:
```bash
-bash ./docker/set_env.sh
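+# Use 'source' (not 'bash') so the exported variables persist in the current shell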
+source ./docker/set_env.sh
```
## Deploy SearchQnA on Gaudi
@@ -70,7 +72,9 @@ cd GenAIExamples/SearchQnA/docker/gaudi/
docker compose up -d
```
-If your version of `Habana Driver` >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
+> Notice: Currently only Habana Driver 1.16.x is supported for Gaudi.
+
+Please refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
## Deploy SearchQnA on Xeon
diff --git a/SearchQnA/docker/gaudi/compose.yaml b/SearchQnA/docker/gaudi/compose.yaml
index 5de690cc6c..b7198e3637 100644
--- a/SearchQnA/docker/gaudi/compose.yaml
+++ b/SearchQnA/docker/gaudi/compose.yaml
@@ -23,6 +23,8 @@ services:
HABANA_VISIBLE_DEVICES: all
OMPI_MCA_btl_vader_single_copy_mechanism: none
MAX_WARMUP_SEQUENCE_LENGTH: 512
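+ # Habana runtime settings: skip HCCL initialization when the device is acquired and allow experimental Habana flags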
+ INIT_HCCL_ON_ACQUIRE: 0
+ ENABLE_EXPERIMENTAL_FLAGS: true
command: --model-id ${EMBEDDING_MODEL_ID} --auto-truncate
embedding:
image: opea/embedding-tei:latest