diff --git a/ChatQnA/README.md b/ChatQnA/README.md
index 549fea8fb8..59359b7216 100644
--- a/ChatQnA/README.md
+++ b/ChatQnA/README.md
@@ -45,7 +45,9 @@ To set up environment variables for deploying ChatQnA services, follow these ste
 1. Set the required environment variables:

 ```bash
+# Example: host_ip="192.168.1.1"
 export host_ip="External_Public_IP"
+# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
 export no_proxy="Your_No_Proxy"
 export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
 ```
@@ -59,20 +61,29 @@ export https_proxy="Your_HTTPs_Proxy"
 3. Set up other environment variables:

+> Note: Run only the one command below that matches your hardware; otherwise, the port numbers may be set incorrectly.
+
 ```bash
-bash ./docker/set_env.sh
+# on Gaudi
+source ./docker/gaudi/set_env.sh
+# on Xeon
+source ./docker/xeon/set_env.sh
+# on NVIDIA GPU
+source ./docker/gpu/set_env.sh
 ```

 ## Deploy ChatQnA on Gaudi

-If your version of `Habana Driver` < 1.16.0 (check with `hl-smi`), run the following command directly to start ChatQnA services. Please find corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).
+Please find the corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).

 ```bash
 cd GenAIExamples/ChatQnA/docker/gaudi/
 docker compose -f docker_compose.yaml up -d
 ```

-If your version of `Habana Driver` >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
+> Note: Currently, only Habana Driver 1.16.x is supported for Gaudi.
+
+Please refer to the [Gaudi Guide](./docker/gaudi/README.md) to build Docker images from source.

 ## Deploy ChatQnA on Xeon
diff --git a/ChatQnA/docker/gaudi/docker_compose.yaml b/ChatQnA/docker/gaudi/docker_compose.yaml
index d9de34e230..b8a420ec21 100644
--- a/ChatQnA/docker/gaudi/docker_compose.yaml
+++ b/ChatQnA/docker/gaudi/docker_compose.yaml
@@ -44,6 +44,8 @@ services:
       HABANA_VISIBLE_DEVICES: all
       OMPI_MCA_btl_vader_single_copy_mechanism: none
       MAX_WARMUP_SEQUENCE_LENGTH: 512
+      INIT_HCCL_ON_ACQUIRE: 0
+      ENABLE_EXPERIMENTAL_FLAGS: true
     command: --model-id ${EMBEDDING_MODEL_ID}
   embedding:
     image: opea/embedding-tei:latest
diff --git a/ChatQnA/docker/set_env.sh b/ChatQnA/docker/gaudi/set_env.sh
similarity index 100%
rename from ChatQnA/docker/set_env.sh
rename to ChatQnA/docker/gaudi/set_env.sh
diff --git a/ChatQnA/docker/gpu/README.md b/ChatQnA/docker/gpu/README.md
index d00a7ce143..78f8c5d17d 100644
--- a/ChatQnA/docker/gpu/README.md
+++ b/ChatQnA/docker/gpu/README.md
@@ -135,21 +135,17 @@ curl http://${host_ip}:6000/v1/embeddings \
 3. Retriever Microservice

-To consume the retriever microservice, you need to generate a mock embedding vector of length 768 in Python script:
+To consume the retriever microservice, you need to generate a mock embedding vector with a Python script. The length of the embedding vector
+is determined by the embedding model.
+Here we use the model `EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"`, whose vector size is 768.

-```python
-import random
-
-embedding = [random.uniform(-1, 1) for _ in range(768)]
-print(embedding)
-```
-
-Then substitute your mock embedding vector for the `${your_embedding}` in the following `curl` command:
+Check the vector dimension of your embedding model and set the `your_embedding` dimension to match it.

 ```bash
+your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
 curl http://${host_ip}:7000/v1/retrieval \
   -X POST \
-  -d '{"text":"test", "embedding":${your_embedding}}' \
+  -d "{\"text\":\"test\",\"embedding\":${your_embedding}}" \
   -H 'Content-Type: application/json'
 ```
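If you would rather derive the dimension than hard-code 768, a minimal sketch like the following sizes the mock vector from a live response of the embedding microservice. This is an assumption-laden sketch: it presumes the embedding service validated in the previous step is reachable on `${host_ip}:6000` and returns a JSON body containing an `embedding` array; adjust the port and field name to your deployment.

```bash
# Sketch: size the mock vector from a real embedding instead of hard-coding 768.
# Assumes the embedding microservice from the previous step is reachable on
# ${host_ip}:6000 and returns a JSON body with an "embedding" array.
dim=$(curl -s http://${host_ip}:6000/v1/embeddings \
  -X POST -d '{"text":"hello"}' -H 'Content-Type: application/json' |
  python -c "import sys, json; print(len(json.load(sys.stdin)['embedding']))")
your_embedding=$(python -c "import random; print([random.uniform(-1, 1) for _ in range(${dim})])")
```

This keeps the retrieval request consistent with whichever `EMBEDDING_MODEL_ID` is actually deployed.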
diff --git a/ChatQnA/docker/gpu/set_env.sh b/ChatQnA/docker/gpu/set_env.sh
new file mode 100644
index 0000000000..e024bd8a64
--- /dev/null
+++ b/ChatQnA/docker/gpu/set_env.sh
@@ -0,0 +1,23 @@
+#!/usr/bin/env bash
+
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+
+export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
+export RERANK_MODEL_ID="BAAI/bge-reranker-base"
+export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
+export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:8090"
+export TEI_RERANKING_ENDPOINT="http://${host_ip}:8808"
+export TGI_LLM_ENDPOINT="http://${host_ip}:8008"
+export REDIS_URL="redis://${host_ip}:6379"
+export INDEX_NAME="rag-redis"
+export MEGA_SERVICE_HOST_IP=${host_ip}
+export EMBEDDING_SERVICE_HOST_IP=${host_ip}
+export RETRIEVER_SERVICE_HOST_IP=${host_ip}
+export RERANK_SERVICE_HOST_IP=${host_ip}
+export LLM_SERVICE_HOST_IP=${host_ip}
+export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/chatqna"
+export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep"
+export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6008/v1/dataprep/get_file"
+export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6009/v1/dataprep/delete_file"
diff --git a/ChatQnA/docker/xeon/README.md b/ChatQnA/docker/xeon/README.md
index c74070bc0e..97f77ae434 100644
--- a/ChatQnA/docker/xeon/README.md
+++ b/ChatQnA/docker/xeon/README.md
@@ -226,21 +226,19 @@ curl http://${host_ip}:6000/v1/embeddings\
   -H 'Content-Type: application/json'
 ```

-3. Retriever Microservice
-   To validate the retriever microservice, you need to generate a mock embedding vector of length 768 in Python script:
+3. Retriever Microservice

-```Python
-import random
-embedding = [random.uniform(-1, 1) for _ in range(768)]
-print(embedding)
-```
+To consume the retriever microservice, you need to generate a mock embedding vector with a Python script. The length of the embedding vector
+is determined by the embedding model.
+Here we use the model `EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"`, whose vector size is 768.

-Then substitute your mock embedding vector for the `${your_embedding}` in the following cURL command:
+Check the vector dimension of your embedding model and set the `your_embedding` dimension to match it.

 ```bash
+your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
 curl http://${host_ip}:7000/v1/retrieval \
   -X POST \
-  -d '{"text":"What is the revenue of Nike in 2023?","embedding":"'"${your_embedding}"'"}' \
+  -d "{\"text\":\"test\",\"embedding\":${your_embedding}}" \
   -H 'Content-Type: application/json'
 ```

@@ -369,12 +367,30 @@ To access the frontend, open the following URL in your browser: http://{host_ip}
     - "80:5173"
   ```

-## 🚀 Launch the Conversational UI (react)
+## 🚀 Launch the Conversational UI (Optional)
+
+To use the Conversational UI (React-based) frontend, modify the UI service in the `docker_compose.yaml` file. Replace the `chaqna-xeon-ui-server` service with the `chaqna-xeon-conversation-ui-server` service as per the config below:

-To access the Conversational UI frontend, open the following URL in your browser: http://{host_ip}:5174. By default, the UI runs on port 80 internally. If you prefer to use a different host port to access the frontend, you can modify the port mapping in the `docker_compose.yaml` file as shown below:
+```yaml
+chaqna-xeon-conversation-ui-server:
+  image: opea/chatqna-conversation-ui:latest
+  container_name: chatqna-xeon-conversation-ui-server
+  environment:
+    - no_proxy=${no_proxy}
+    - https_proxy=${https_proxy}
+    - http_proxy=${http_proxy}
+  ports:
+    - "5174:80"
+  depends_on:
+    - chaqna-xeon-backend-server
+  ipc: host
+  restart: always
+```
+
+Once the services are up, open the following URL in your browser: http://{host_ip}:5174. By default, the UI runs on port 80 internally. If you prefer to use a different host port to access the frontend, you can modify the port mapping in the `docker_compose.yaml` file as shown below:

 ```yaml
   chaqna-xeon-conversation-ui-server:
     image: opea/chatqna-conversation-ui:latest
     ...
     ports:
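Once the conversational UI service above is swapped in and started, a quick check that the container is running and the page is being served might look like the sketch below, assuming the default `5174:80` host-to-container port mapping from the config.

```bash
# Sketch: verify the conversational UI container is up and serving.
# Assumes the default "5174:80" port mapping shown in the config above.
docker ps --filter "name=conversation-ui" --format "{{.Names}}: {{.Status}}"
curl -s -o /dev/null -w "%{http_code}\n" http://${host_ip}:5174
```

A `200` response code indicates the frontend is reachable on the mapped host port.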
diff --git a/ChatQnA/docker/xeon/docker_compose.yaml b/ChatQnA/docker/xeon/docker_compose.yaml
index 10c7b5d652..71818f85f3 100644
--- a/ChatQnA/docker/xeon/docker_compose.yaml
+++ b/ChatQnA/docker/xeon/docker_compose.yaml
@@ -189,19 +189,6 @@ services:
       - DELETE_FILE=${DATAPREP_DELETE_FILE_ENDPOINT}
     ipc: host
     restart: always
-  chaqna-xeon-conversation-ui-server:
-    image: opea/chatqna-conversation-ui:latest
-    container_name: chatqna-xeon-conversation-ui-server
-    environment:
-      - no_proxy=${no_proxy}
-      - https_proxy=${https_proxy}
-      - http_proxy=${http_proxy}
-    ports:
-      - 5174:80
-    depends_on:
-      - chaqna-xeon-backend-server
-    ipc: host
-    restart: always

 networks:
   default:
diff --git a/ChatQnA/docker/xeon/set_env.sh b/ChatQnA/docker/xeon/set_env.sh
new file mode 100644
index 0000000000..888b8f3070
--- /dev/null
+++ b/ChatQnA/docker/xeon/set_env.sh
@@ -0,0 +1,23 @@
+#!/usr/bin/env bash
+
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+
+export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
+export RERANK_MODEL_ID="BAAI/bge-reranker-base"
+export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
+export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:6006"
+export TEI_RERANKING_ENDPOINT="http://${host_ip}:8808"
+export TGI_LLM_ENDPOINT="http://${host_ip}:9009"
+export REDIS_URL="redis://${host_ip}:6379"
+export INDEX_NAME="rag-redis"
+export MEGA_SERVICE_HOST_IP=${host_ip}
+export EMBEDDING_SERVICE_HOST_IP=${host_ip}
+export RETRIEVER_SERVICE_HOST_IP=${host_ip}
+export RERANK_SERVICE_HOST_IP=${host_ip}
+export LLM_SERVICE_HOST_IP=${host_ip}
+export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/chatqna"
+export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep"
+export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6008/v1/dataprep/get_file"
+export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6009/v1/dataprep/delete_file"
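Every value in these scripts expands `${host_ip}` at the moment the script runs, so `host_ip` must be exported beforehand, and the script must be sourced rather than executed so the exports persist in the current shell. A minimal sanity check, using the Xeon script above with a placeholder address:

```bash
# Sketch: export host_ip first, then source (do not execute) the script so the
# variables land in the current shell. 192.168.1.1 is a placeholder address.
export host_ip="192.168.1.1"
source ./docker/xeon/set_env.sh
echo "$TEI_EMBEDDING_ENDPOINT"    # expected: http://192.168.1.1:6006
echo "$BACKEND_SERVICE_ENDPOINT"  # expected: http://192.168.1.1:8888/v1/chatqna
```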
diff --git a/CodeGen/README.md b/CodeGen/README.md
index 1fe8304e83..879073bea8 100644
--- a/CodeGen/README.md
+++ b/CodeGen/README.md
@@ -39,8 +39,11 @@ To set up environment variables for deploying ChatQnA services, follow these ste
 1. Set the required environment variables:

 ```bash
+# Example: host_ip="192.168.1.1"
 export host_ip="External_Public_IP"
+# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
 export no_proxy="Your_No_Proxy"
+export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
 ```

 2. If you are in a proxy environment, also set the proxy-related environment variables:

@@ -48,27 +51,28 @@ export no_proxy="Your_No_Proxy"
 ```bash
 export http_proxy="Your_HTTP_Proxy"
 export https_proxy="Your_HTTPs_Proxy"
-export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
 ```

 3. Set up other environment variables:

 ```bash
-bash ./docker/set_env.sh
+source ./docker/set_env.sh
 ```

 ## Deploy CodeGen using Docker

 ### Deploy CodeGen on Gaudi

-- If your version of `Habana Driver` < 1.16.0 (check with `hl-smi`), run the following command directly to start ChatQnA services. Please find corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).
+Please find the corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).

 ```bash
 cd GenAIExamples/CodeGen/docker/gaudi
 docker compose -f docker_compose.yaml up -d
 ```

-- If your version of `Habana Driver` >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
+> Note: Currently, only Habana Driver 1.16.x is supported for Gaudi.
+
+Please refer to the [Gaudi Guide](./docker/gaudi/README.md) to build Docker images from source.

 ### Deploy CodeGen on Xeon
diff --git a/CodeTrans/README.md b/CodeTrans/README.md
index 9813fe1acd..10a324ec88 100644
--- a/CodeTrans/README.md
+++ b/CodeTrans/README.md
@@ -29,8 +29,11 @@ To set up environment variables for deploying Code Translation services, follow
 1. Set the required environment variables:

 ```bash
+# Example: host_ip="192.168.1.1"
 export host_ip="External_Public_IP"
+# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
 export no_proxy="Your_No_Proxy"
+export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
 ```

 2. If you are in a proxy environment, also set the proxy-related environment variables:

@@ -38,27 +41,28 @@ export no_proxy="Your_No_Proxy"
 ```bash
 export http_proxy="Your_HTTP_Proxy"
 export https_proxy="Your_HTTPs_Proxy"
-export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
 ```

 3. Set up other environment variables:

 ```bash
-bash ./docker/set_env.sh
+source ./docker/set_env.sh
 ```

 ## Deploy with Docker

 ### Deploy Code Translation on Gaudi

-- If your version of `Habana Driver` < 1.16.0 (check with `hl-smi`), run the following command directly to start Code Translation services. Please find corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).
+Please find the corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).

 ```bash
 cd GenAIExamples/CodeTrans/docker/gaudi
 docker compose -f docker_compose.yaml up -d
 ```

-- If your version of `Habana Driver` >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
+> Note: Currently, only Habana Driver 1.16.x is supported for Gaudi.
+
+Please refer to the [Gaudi Guide](./docker/gaudi/README.md) to build Docker images from source.

 ### Deploy Code Translation on Xeon
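The driver notices above replace the earlier `hl-smi` version branching. To confirm what a Gaudi node is actually running before deploying, a check along these lines should work, assuming `hl-smi` is installed as part of the Gaudi software stack (as the removed text implied):

```bash
# Sketch: report the installed Habana driver version on a Gaudi host.
# Assumes hl-smi ships with the Gaudi software stack.
hl-smi | grep -i "driver version"
```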
diff --git a/DocSum/README.md b/DocSum/README.md
index e8bff49c9b..ef116e9e53 100644
--- a/DocSum/README.md
+++ b/DocSum/README.md
@@ -32,8 +32,11 @@ To set up environment variables for deploying Document Summarization services, f
 1. Set the required environment variables:

 ```bash
+# Example: host_ip="192.168.1.1"
 export host_ip="External_Public_IP"
+# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
 export no_proxy="Your_No_Proxy"
+export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
 ```

 2. If you are in a proxy environment, also set the proxy-related environment variables:

@@ -41,27 +44,28 @@ export no_proxy="Your_No_Proxy"
 ```bash
 export http_proxy="Your_HTTP_Proxy"
 export https_proxy="Your_HTTPs_Proxy"
-export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
 ```

 3. Set up other environment variables:

 ```bash
-bash ./docker/set_env.sh
+source ./docker/set_env.sh
 ```

 ## Deploy using Docker

 ### Deploy on Gaudi

-If your version of `Habana Driver` < 1.16.0 (check with `hl-smi`), run the following command directly to start DocSum services. Please find corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).
+Please find the corresponding [docker_compose.yaml](./docker/gaudi/docker_compose.yaml).

 ```bash
 cd GenAIExamples/DocSum/docker/gaudi/
 docker compose -f docker_compose.yaml up -d
 ```

-If your version of `Habana Driver` >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
+> Note: Currently, only Habana Driver 1.16.x is supported for Gaudi.
+
+Please refer to the [Gaudi Guide](./docker/gaudi/README.md) to build Docker images from source.

 ### Deploy on Xeon
diff --git a/SearchQnA/README.md b/SearchQnA/README.md
index 9498905d4c..989d32fa32 100644
--- a/SearchQnA/README.md
+++ b/SearchQnA/README.md
@@ -41,10 +41,13 @@ To set up environment variables for deploying SearchQnA services, follow these s
 1. Set the required environment variables:

 ```bash
+# Example: host_ip="192.168.1.1"
 export host_ip="External_Public_IP"
+# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
 export no_proxy="Your_No_Proxy"
 export GOOGLE_CSE_ID="Your_CSE_ID"
 export GOOGLE_API_KEY="Your_Google_API_Key"
+export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
 ```

 2. If you are in a proxy environment, also set the proxy-related environment variables:

@@ -52,13 +55,12 @@ export GOOGLE_API_KEY="Your_Google_API_Key"
 ```bash
 export http_proxy="Your_HTTP_Proxy"
 export https_proxy="Your_HTTPs_Proxy"
-export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
 ```

 3. Set up other environment variables:

 ```bash
-bash ./docker/set_env.sh
+source ./docker/set_env.sh
 ```

 ## Deploy SearchQnA on Gaudi

@@ -70,7 +72,9 @@ cd GenAIExamples/SearchQnA/docker/gaudi/
 docker compose up -d
 ```

-If your version of `Habana Driver` >= 1.16.0, refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source.
+> Note: Currently, only Habana Driver 1.16.x is supported for Gaudi.
+
+Please refer to the [Gaudi Guide](./docker/gaudi/README.md) to build Docker images from source.

 ## Deploy SearchQnA on Xeon
diff --git a/SearchQnA/docker/gaudi/compose.yaml b/SearchQnA/docker/gaudi/compose.yaml
index 5de690cc6c..b7198e3637 100644
--- a/SearchQnA/docker/gaudi/compose.yaml
+++ b/SearchQnA/docker/gaudi/compose.yaml
@@ -23,6 +23,8 @@ services:
       HABANA_VISIBLE_DEVICES: all
       OMPI_MCA_btl_vader_single_copy_mechanism: none
       MAX_WARMUP_SEQUENCE_LENGTH: 512
+      INIT_HCCL_ON_ACQUIRE: 0
+      ENABLE_EXPERIMENTAL_FLAGS: true
     command: --model-id ${EMBEDDING_MODEL_ID} --auto-truncate
   embedding:
     image: opea/embedding-tei:latest
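To confirm that the two new TEI environment flags above actually reach the container, one option is to render the merged Compose configuration before starting the stack. A sketch, assuming it is run from the directory containing the edited compose file and that the file name follows the SearchQnA layout above:

```bash
# Sketch: render the merged Compose config and check that the new TEI flags
# (INIT_HCCL_ON_ACQUIRE, ENABLE_EXPERIMENTAL_FLAGS) appear under the service
# environment. Run from the directory containing the edited compose file.
docker compose -f compose.yaml config | grep -E "INIT_HCCL_ON_ACQUIRE|ENABLE_EXPERIMENTAL_FLAGS"
```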