diff --git a/ChatQnA/benchmark/README.md b/ChatQnA/benchmark/README.md index 52c7764b8..b666e8ce4 100644 --- a/ChatQnA/benchmark/README.md +++ b/ChatQnA/benchmark/README.md @@ -4,7 +4,7 @@ This folder contains a collection of Kubernetes manifest files for deploying the By following this guide, you can run benchmarks on your deployment and share the results with the OPEA community. -# Purpose +## Purpose We aim to run these benchmarks and share them with the OPEA community for three primary reasons: @@ -12,7 +12,7 @@ We aim to run these benchmarks and share them with the OPEA community for three - To establish a baseline for validating optimization solutions across different implementations, providing clear guidance on which methods are most effective for your use case. - To inspire the community to build upon our benchmarks, allowing us to better quantify new solutions in conjunction with current leading llms, serving frameworks etc. -# Metrics +## Metrics The benchmark will report the below metrics, including: @@ -27,9 +27,9 @@ The benchmark will report the below metrics, including: Results will be displayed in the terminal and saved as CSV file named `1_stats.csv` for easy export to spreadsheets. -# Getting Started +## Getting Started -## Prerequisites +### Prerequisites - Install Kubernetes by following [this guide](https://github.com/opea-project/docs/blob/main/guide/installation/k8s_install/k8s_install_kubespray.md). @@ -38,7 +38,7 @@ Results will be displayed in the terminal and saved as CSV file named `1_stats.c - Install Python 3.8+ on the master node for running the stress tool. - Ensure all nodes have a local /mnt/models folder, which will be mounted by the pods. -## Kubernetes Cluster Example +### Kubernetes Cluster Example ```bash $ kubectl get nodes @@ -49,7 +49,7 @@ k8s-work2 Ready 35d v1.29.6 k8s-work3 Ready 35d v1.29.6 ``` -## Manifest preparation +### Manifest preparation We have created the [BKC manifest](https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA/benchmark) for single node, two nodes and four nodes K8s cluster. In order to apply, we need to check out and configure some values. @@ -75,7 +75,7 @@ find . -name '*.yaml' -type f -exec sed -i "s#\$(EMBEDDING_MODEL_ID)#${EMBEDDING find . -name '*.yaml' -type f -exec sed -i "s#\$(RERANK_MODEL_ID)#${RERANK_MODEL_ID}#g" {} \; ``` -## Benchmark tool preparation +### Benchmark tool preparation The test uses the [benchmark tool](https://github.com/opea-project/GenAIEval/tree/main/evals/benchmark) to do performance test. We need to set up benchmark tool at the master node of Kubernetes which is k8s-master. @@ -88,7 +88,7 @@ source stress_venv/bin/activate pip install -r requirements.txt ``` -## Test Configurations +### Test Configurations Workload configuration: @@ -119,11 +119,11 @@ Number of test requests for different scheduled node number: More detailed configuration can be found in configuration file [benchmark.yaml](./benchmark.yaml). -## Test Steps +### Test Steps -### Single node test +#### Single node test -#### 1. Preparation +##### 1. Preparation We add label to 1 Kubernetes node to make sure all pods are scheduled to this node: @@ -131,7 +131,7 @@ We add label to 1 Kubernetes node to make sure all pods are scheduled to this no kubectl label nodes k8s-worker1 node-type=chatqna-opea ``` -#### 2. Install ChatQnA +##### 2. 
Install ChatQnA Go to [BKC manifest](https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA/benchmark/tuned/with_rerank/single_gaudi) and apply to K8s. @@ -141,9 +141,9 @@ cd GenAIExamples/ChatQnA/benchmark/tuned/with_rerank/single_gaudi kubectl apply -f . ``` -#### 3. Run tests +##### 3. Run tests -##### 3.1 Upload Retrieval File +###### 3.1 Upload Retrieval File Before running tests, upload a specified file to make sure the llm input have the token length of 1k. @@ -174,7 +174,7 @@ curl -X POST "http://${cluster_ip}:6007/v1/dataprep" \ -F "files=@./upload_file_no_rerank.txt" ``` -##### 3.2 Run Benchmark Test +###### 3.2 Run Benchmark Test We copy the configuration file [benchmark.yaml](./benchmark.yaml) to `GenAIEval/evals/benchmark/benchmark.yaml` and config `test_suite_config.user_queries` and `test_suite_config.test_output_dir`. @@ -191,11 +191,11 @@ cd GenAIEval/evals/benchmark python benchmark.py ``` -#### 4. Data collection +##### 4. Data collection All the test results will come to this folder `/home/sdp/benchmark_output/node_1` configured by the environment variable `TEST_OUTPUT_DIR` in previous steps. -#### 5. Clean up +##### 5. Clean up ```bash # on k8s-master node @@ -204,9 +204,9 @@ kubectl delete -f . kubectl label nodes k8s-worker1 node-type- ``` -### Two node test +#### Two node test -#### 1. Preparation +##### 1. Preparation We add label to 2 Kubernetes node to make sure all pods are scheduled to this node: @@ -214,7 +214,7 @@ We add label to 2 Kubernetes node to make sure all pods are scheduled to this no kubectl label nodes k8s-worker1 k8s-worker2 node-type=chatqna-opea ``` -#### 2. Install ChatQnA +##### 2. Install ChatQnA Go to [BKC manifest](https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA/benchmark/tuned/with_rerank/two_gaudi) and apply to K8s. @@ -224,7 +224,7 @@ cd GenAIExamples/ChatQnA/benchmark/tuned/with_rerank/two_gaudi kubectl apply -f . ``` -#### 3. Run tests +##### 3. Run tests We copy the configuration file [benchmark.yaml](./benchmark.yaml) to `GenAIEval/evals/benchmark/benchmark.yaml` and config `test_suite_config.user_queries` and `test_suite_config.test_output_dir`. @@ -241,11 +241,11 @@ cd GenAIEval/evals/benchmark python benchmark.py ``` -#### 4. Data collection +##### 4. Data collection All the test results will come to this folder `/home/sdp/benchmark_output/node_2` configured by the environment variable `TEST_OUTPUT_DIR` in previous steps. -#### 5. Clean up +##### 5. Clean up ```bash # on k8s-master node @@ -253,9 +253,9 @@ kubectl delete -f . kubectl label nodes k8s-worker1 k8s-worker2 node-type- ``` -### Four node test +#### Four node test -#### 1. Preparation +##### 1. Preparation We add label to 4 Kubernetes node to make sure all pods are scheduled to this node: @@ -263,7 +263,7 @@ We add label to 4 Kubernetes node to make sure all pods are scheduled to this no kubectl label nodes k8s-master k8s-worker1 k8s-worker2 k8s-worker3 node-type=chatqna-opea ``` -#### 2. Install ChatQnA +##### 2. Install ChatQnA Go to [BKC manifest](https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA/benchmark/tuned/with_rerank/four_gaudi) and apply to K8s. @@ -273,7 +273,7 @@ cd GenAIExamples/ChatQnA/benchmark/tuned/with_rerank/four_gaudi kubectl apply -f . ``` -#### 3. Run tests +##### 3. Run tests We copy the configuration file [benchmark.yaml](./benchmark.yaml) to `GenAIEval/evals/benchmark/benchmark.yaml` and config `test_suite_config.user_queries` and `test_suite_config.test_output_dir`. 
@@ -290,11 +290,11 @@ cd GenAIEval/evals/benchmark python benchmark.py ``` -#### 4. Data collection +##### 4. Data collection All the test results will come to this folder `/home/sdp/benchmark_output/node_4` configured by the environment variable `TEST_OUTPUT_DIR` in previous steps. -#### 5. Clean up +##### 5. Clean up ```bash # on k8s-master node diff --git a/ChatQnA/docker/aipc/README.md b/ChatQnA/docker/aipc/README.md index 25a12dd16..3d4bfebb8 100644 --- a/ChatQnA/docker/aipc/README.md +++ b/ChatQnA/docker/aipc/README.md @@ -173,97 +173,97 @@ OLLAMA_HOST=${host_ip}:11434 ollama run $OLLAMA_MODEL 1. TEI Embedding Service -```bash -curl ${host_ip}:6006/embed \ - -X POST \ - -d '{"inputs":"What is Deep Learning?"}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl ${host_ip}:6006/embed \ + -X POST \ + -d '{"inputs":"What is Deep Learning?"}' \ + -H 'Content-Type: application/json' + ``` 2. Embedding Microservice -```bash -curl http://${host_ip}:6000/v1/embeddings\ - -X POST \ - -d '{"text":"hello"}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:6000/v1/embeddings\ + -X POST \ + -d '{"text":"hello"}' \ + -H 'Content-Type: application/json' + ``` 3. Retriever Microservice To validate the retriever microservice, you need to generate a mock embedding vector of length 768 in Python script: -```bash -export your_embedding=$(python3 -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)") -curl http://${host_ip}:7000/v1/retrieval \ - -X POST \ - -d '{"text":"What is the revenue of Nike in 2023?","embedding":"'"${your_embedding}"'"}' \ - -H 'Content-Type: application/json' -``` + ```bash + export your_embedding=$(python3 -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)") + curl http://${host_ip}:7000/v1/retrieval \ + -X POST \ + -d '{"text":"What is the revenue of Nike in 2023?","embedding":"'"${your_embedding}"'"}' \ + -H 'Content-Type: application/json' + ``` 4. TEI Reranking Service -```bash -curl http://${host_ip}:8808/rerank \ - -X POST \ - -d '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:8808/rerank \ + -X POST \ + -d '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \ + -H 'Content-Type: application/json' + ``` 5. Reranking Microservice -```bash -curl http://${host_ip}:8000/v1/reranking\ - -X POST \ - -d '{"initial_query":"What is Deep Learning?", "retrieved_docs": [{"text":"Deep Learning is not..."}, {"text":"Deep learning is..."}]}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:8000/v1/reranking\ + -X POST \ + -d '{"initial_query":"What is Deep Learning?", "retrieved_docs": [{"text":"Deep Learning is not..."}, {"text":"Deep learning is..."}]}' \ + -H 'Content-Type: application/json' + ``` 6. Ollama Service -```bash -curl http://${host_ip}:11434/api/generate -d '{"model": "llama3", "prompt":"What is Deep Learning?"}' -``` + ```bash + curl http://${host_ip}:11434/api/generate -d '{"model": "llama3", "prompt":"What is Deep Learning?"}' + ``` 7. 
LLM Microservice -```bash -curl http://${host_ip}:9000/v1/chat/completions\ - -X POST \ - -d '{"query":"What is Deep Learning?","max_new_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:9000/v1/chat/completions\ + -X POST \ + -d '{"query":"What is Deep Learning?","max_new_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \ + -H 'Content-Type: application/json' + ``` 8. MegaService -```bash -curl http://${host_ip}:8888/v1/chatqna -H "Content-Type: application/json" -d '{ - "messages": "What is the revenue of Nike in 2023?", "model": "'"${OLLAMA_MODEL}"'" - }' -``` + ```bash + curl http://${host_ip}:8888/v1/chatqna -H "Content-Type: application/json" -d '{ + "messages": "What is the revenue of Nike in 2023?", "model": "'"${OLLAMA_MODEL}"'" + }' + ``` 9. Dataprep Microservice(Optional) -If you want to update the default knowledge base, you can use the following commands: + If you want to update the default knowledge base, you can use the following commands: -Update Knowledge Base via Local File Upload: + Update Knowledge Base via Local File Upload: -```bash -curl -X POST "http://${host_ip}:6007/v1/dataprep" \ - -H "Content-Type: multipart/form-data" \ - -F "files=@./nke-10k-2023.pdf" -``` + ```bash + curl -X POST "http://${host_ip}:6007/v1/dataprep" \ + -H "Content-Type: multipart/form-data" \ + -F "files=@./nke-10k-2023.pdf" + ``` -This command updates a knowledge base by uploading a local file for processing. Update the file path according to your environment. + This command updates a knowledge base by uploading a local file for processing. Update the file path according to your environment. -Add Knowledge Base via HTTP Links: + Add Knowledge Base via HTTP Links: -```bash -curl -X POST "http://${host_ip}:6007/v1/dataprep" \ - -H "Content-Type: multipart/form-data" \ - -F 'link_list=["https://opea.dev"]' -``` + ```bash + curl -X POST "http://${host_ip}:6007/v1/dataprep" \ + -H "Content-Type: multipart/form-data" \ + -F 'link_list=["https://opea.dev"]' + ``` -This command updates a knowledge base by submitting a list of HTTP links for processing. + This command updates a knowledge base by submitting a list of HTTP links for processing. ## 🚀 Launch the UI diff --git a/ChatQnA/docker/gaudi/README.md b/ChatQnA/docker/gaudi/README.md index 0d39b9297..2b49a33f4 100644 --- a/ChatQnA/docker/gaudi/README.md +++ b/ChatQnA/docker/gaudi/README.md @@ -157,22 +157,22 @@ cd ../../.. Then run the command `docker images`, you will have the following 8 Docker Images: -1. `opea/embedding-tei:latest` -2. `opea/retriever-redis:latest` -3. `opea/reranking-tei:latest` -4. `opea/llm-tgi:latest` or `opea/llm-vllm:latest` or `opea/llm-vllm-ray:latest` -5. `opea/tei-gaudi:latest` -6. `opea/dataprep-redis:latest` -7. `opea/chatqna:latest` or `opea/chatqna-guardrails:latest` or `opea/chatqna-without-rerank:latest` -8. `opea/chatqna-ui:latest` +- `opea/embedding-tei:latest` +- `opea/retriever-redis:latest` +- `opea/reranking-tei:latest` +- `opea/llm-tgi:latest` or `opea/llm-vllm:latest` or `opea/llm-vllm-ray:latest` +- `opea/tei-gaudi:latest` +- `opea/dataprep-redis:latest` +- `opea/chatqna:latest` or `opea/chatqna-guardrails:latest` or `opea/chatqna-without-rerank:latest` +- `opea/chatqna-ui:latest` If Conversation React UI is built, you will find one more image: -9. 
`opea/chatqna-conversation-ui:latest` +- `opea/chatqna-conversation-ui:latest` If Guardrails docker image is built, you will find one more image: -10. `opea/guardrails-tgi:latest` +- `opea/guardrails-tgi:latest` ## 🚀 Start MicroServices and MegaService @@ -274,190 +274,190 @@ For validation details, please refer to [how-to-validate_service](./how_to_valid 1. TEI Embedding Service -```bash -curl ${host_ip}:8090/embed \ - -X POST \ - -d '{"inputs":"What is Deep Learning?"}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl ${host_ip}:8090/embed \ + -X POST \ + -d '{"inputs":"What is Deep Learning?"}' \ + -H 'Content-Type: application/json' + ``` 2. Embedding Microservice -```bash -curl http://${host_ip}:6000/v1/embeddings \ - -X POST \ - -d '{"text":"hello"}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:6000/v1/embeddings \ + -X POST \ + -d '{"text":"hello"}' \ + -H 'Content-Type: application/json' + ``` 3. Retriever Microservice -To consume the retriever microservice, you need to generate a mock embedding vector by Python script. The length of embedding vector -is determined by the embedding model. -Here we use the model `EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"`, which vector size is 768. + To consume the retriever microservice, you need to generate a mock embedding vector by Python script. The length of embedding vector + is determined by the embedding model. + Here we use the model `EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"`, which vector size is 768. -Check the vecotor dimension of your embedding model, set `your_embedding` dimension equals to it. + Check the vecotor dimension of your embedding model, set `your_embedding` dimension equals to it. -```bash -export your_embedding=$(python3 -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)") -curl http://${host_ip}:7000/v1/retrieval \ - -X POST \ - -d "{\"text\":\"test\",\"embedding\":${your_embedding}}" \ - -H 'Content-Type: application/json' -``` + ```bash + export your_embedding=$(python3 -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)") + curl http://${host_ip}:7000/v1/retrieval \ + -X POST \ + -d "{\"text\":\"test\",\"embedding\":${your_embedding}}" \ + -H 'Content-Type: application/json' + ``` 4. TEI Reranking Service -> Skip for ChatQnA without Rerank pipeline + > Skip for ChatQnA without Rerank pipeline -```bash -curl http://${host_ip}:8808/rerank \ - -X POST \ - -d '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:8808/rerank \ + -X POST \ + -d '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \ + -H 'Content-Type: application/json' + ``` 5. Reranking Microservice -> Skip for ChatQnA without Rerank pipeline + > Skip for ChatQnA without Rerank pipeline -```bash -curl http://${host_ip}:8000/v1/reranking \ - -X POST \ - -d '{"initial_query":"What is Deep Learning?", "retrieved_docs": [{"text":"Deep Learning is not..."}, {"text":"Deep learning is..."}]}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:8000/v1/reranking \ + -X POST \ + -d '{"initial_query":"What is Deep Learning?", "retrieved_docs": [{"text":"Deep Learning is not..."}, {"text":"Deep learning is..."}]}' \ + -H 'Content-Type: application/json' + ``` 6. 
LLM backend Service -In first startup, this service will take more time to download the model files. After it's finished, the service will be ready. + In first startup, this service will take more time to download the model files. After it's finished, the service will be ready. -Try the command below to check whether the LLM serving is ready. + Try the command below to check whether the LLM serving is ready. -```bash -docker logs ${CONTAINER_ID} | grep Connected -``` + ```bash + docker logs ${CONTAINER_ID} | grep Connected + ``` -If the service is ready, you will get the response like below. + If the service is ready, you will get the response like below. -```log -2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected -``` + ``` + 2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected + ``` -Then try the `cURL` command below to validate services. + Then try the `cURL` command below to validate services. -```bash -#TGI Service -curl http://${host_ip}:8005/generate \ - -X POST \ - -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":64, "do_sample": true}}' \ - -H 'Content-Type: application/json' -``` + ```bash + #TGI Service + curl http://${host_ip}:8005/generate \ + -X POST \ + -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":64, "do_sample": true}}' \ + -H 'Content-Type: application/json' + ``` -```bash -#vLLM Service -curl http://${host_ip}:8007/v1/completions \ - -H "Content-Type: application/json" \ - -d '{ - "model": "${LLM_MODEL_ID}", - "prompt": "What is Deep Learning?", - "max_tokens": 32, - "temperature": 0 - }' -``` + ```bash + #vLLM Service + curl http://${host_ip}:8007/v1/completions \ + -H "Content-Type: application/json" \ + -d '{ + "model": "${LLM_MODEL_ID}", + "prompt": "What is Deep Learning?", + "max_tokens": 32, + "temperature": 0 + }' + ``` -```bash -#vLLM-on-Ray Service -curl http://${host_ip}:8006/v1/chat/completions \ - -H "Content-Type: application/json" \ - -d '{"model": "${LLM_MODEL_ID}", "messages": [{"role": "user", "content": "What is Deep Learning?"}]}' -``` + ```bash + #vLLM-on-Ray Service + curl http://${host_ip}:8006/v1/chat/completions \ + -H "Content-Type: application/json" \ + -d '{"model": "${LLM_MODEL_ID}", "messages": [{"role": "user", "content": "What is Deep Learning?"}]}' + ``` 7. LLM Microservice -```bash -curl http://${host_ip}:9000/v1/chat/completions \ - -X POST \ - -d '{"query":"What is Deep Learning?","max_new_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:9000/v1/chat/completions \ + -X POST \ + -d '{"query":"What is Deep Learning?","max_new_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \ + -H 'Content-Type: application/json' + ``` 8. MegaService -```bash -curl http://${host_ip}:8888/v1/chatqna -H "Content-Type: application/json" -d '{ - "messages": "What is the revenue of Nike in 2023?" - }' -``` + ```bash + curl http://${host_ip}:8888/v1/chatqna -H "Content-Type: application/json" -d '{ + "messages": "What is the revenue of Nike in 2023?" + }' + ``` 9. 
Dataprep Microservice(Optional) -If you want to update the default knowledge base, you can use the following commands: + If you want to update the default knowledge base, you can use the following commands: -Update Knowledge Base via Local File Upload: + Update Knowledge Base via Local File Upload: -```bash -curl -X POST "http://${host_ip}:6007/v1/dataprep" \ - -H "Content-Type: multipart/form-data" \ - -F "files=@./nke-10k-2023.pdf" -``` + ```bash + curl -X POST "http://${host_ip}:6007/v1/dataprep" \ + -H "Content-Type: multipart/form-data" \ + -F "files=@./nke-10k-2023.pdf" + ``` -This command updates a knowledge base by uploading a local file for processing. Update the file path according to your environment. + This command updates a knowledge base by uploading a local file for processing. Update the file path according to your environment. -Add Knowledge Base via HTTP Links: + Add Knowledge Base via HTTP Links: -```bash -curl -X POST "http://${host_ip}:6007/v1/dataprep" \ - -H "Content-Type: multipart/form-data" \ - -F 'link_list=["https://opea.dev"]' -``` + ```bash + curl -X POST "http://${host_ip}:6007/v1/dataprep" \ + -H "Content-Type: multipart/form-data" \ + -F 'link_list=["https://opea.dev"]' + ``` -This command updates a knowledge base by submitting a list of HTTP links for processing. + This command updates a knowledge base by submitting a list of HTTP links for processing. -Also, you are able to get the file/link list that you uploaded: + Also, you are able to get the file/link list that you uploaded: -```bash -curl -X POST "http://${host_ip}:6007/v1/dataprep/get_file" \ - -H "Content-Type: application/json" -``` + ```bash + curl -X POST "http://${host_ip}:6007/v1/dataprep/get_file" \ + -H "Content-Type: application/json" + ``` -Then you will get the response JSON like this. Notice that the returned `name`/`id` of the uploaded link is `https://xxx.txt`. - -```json -[ - { - "name": "nke-10k-2023.pdf", - "id": "nke-10k-2023.pdf", - "type": "File", - "parent": "" - }, - { - "name": "https://opea.dev.txt", - "id": "https://opea.dev.txt", - "type": "File", - "parent": "" - } -] -``` + Then you will get the response JSON like this. Notice that the returned `name`/`id` of the uploaded link is `https://xxx.txt`. 
+ + ```json + [ + { + "name": "nke-10k-2023.pdf", + "id": "nke-10k-2023.pdf", + "type": "File", + "parent": "" + }, + { + "name": "https://opea.dev.txt", + "id": "https://opea.dev.txt", + "type": "File", + "parent": "" + } + ] + ``` -To delete the file/link you uploaded: + To delete the file/link you uploaded: -```bash -# delete link -curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ - -d '{"file_path": "https://opea.dev.txt"}' \ - -H "Content-Type: application/json" - -# delete file -curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ - -d '{"file_path": "nke-10k-2023.pdf"}' \ - -H "Content-Type: application/json" - -# delete all uploaded files and links -curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ - -d '{"file_path": "all"}' \ - -H "Content-Type: application/json" -``` + ```bash + # delete link + curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ + -d '{"file_path": "https://opea.dev.txt"}' \ + -H "Content-Type: application/json" + + # delete file + curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ + -d '{"file_path": "nke-10k-2023.pdf"}' \ + -H "Content-Type: application/json" + + # delete all uploaded files and links + curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ + -d '{"file_path": "all"}' \ + -H "Content-Type: application/json" + ``` 10. Guardrails (Optional) diff --git a/ChatQnA/docker/gaudi/how_to_validate_service.md b/ChatQnA/docker/gaudi/how_to_validate_service.md index 0e58491eb..fb039fad9 100644 --- a/ChatQnA/docker/gaudi/how_to_validate_service.md +++ b/ChatQnA/docker/gaudi/how_to_validate_service.md @@ -78,7 +78,7 @@ Check the log of container by: View the logs of `ghcr.io/huggingface/tgi-gaudi:1.2.1` -#docker logs 05c40b636239 -t +`docker logs 05c40b636239 -t` ``` ... diff --git a/ChatQnA/docker/gpu/README.md b/ChatQnA/docker/gpu/README.md index 022cfdfae..eee076bf8 100644 --- a/ChatQnA/docker/gpu/README.md +++ b/ChatQnA/docker/gpu/README.md @@ -140,147 +140,147 @@ docker compose up -d 1. TEI Embedding Service -```bash -curl ${host_ip}:8090/embed \ - -X POST \ - -d '{"inputs":"What is Deep Learning?"}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl ${host_ip}:8090/embed \ + -X POST \ + -d '{"inputs":"What is Deep Learning?"}' \ + -H 'Content-Type: application/json' + ``` 2. Embedding Microservice -```bash -curl http://${host_ip}:6000/v1/embeddings \ - -X POST \ - -d '{"text":"hello"}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:6000/v1/embeddings \ + -X POST \ + -d '{"text":"hello"}' \ + -H 'Content-Type: application/json' + ``` 3. Retriever Microservice -To consume the retriever microservice, you need to generate a mock embedding vector by Python script. The length of embedding vector -is determined by the embedding model. -Here we use the model `EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"`, which vector size is 768. + To consume the retriever microservice, you need to generate a mock embedding vector by Python script. The length of embedding vector + is determined by the embedding model. + Here we use the model `EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"`, which vector size is 768. -Check the vecotor dimension of your embedding model, set `your_embedding` dimension equals to it. + Check the vecotor dimension of your embedding model, set `your_embedding` dimension equals to it. 
-```bash -export your_embedding=$(python3 -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)") -curl http://${host_ip}:7000/v1/retrieval \ - -X POST \ - -d "{\"text\":\"test\",\"embedding\":${your_embedding}}" \ - -H 'Content-Type: application/json' -``` + ```bash + export your_embedding=$(python3 -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)") + curl http://${host_ip}:7000/v1/retrieval \ + -X POST \ + -d "{\"text\":\"test\",\"embedding\":${your_embedding}}" \ + -H 'Content-Type: application/json' + ``` 4. TEI Reranking Service -```bash -curl http://${host_ip}:8808/rerank \ - -X POST \ - -d '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:8808/rerank \ + -X POST \ + -d '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \ + -H 'Content-Type: application/json' + ``` 5. Reranking Microservice -```bash -curl http://${host_ip}:8000/v1/reranking \ - -X POST \ - -d '{"initial_query":"What is Deep Learning?", "retrieved_docs": [{"text":"Deep Learning is not..."}, {"text":"Deep learning is..."}]}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:8000/v1/reranking \ + -X POST \ + -d '{"initial_query":"What is Deep Learning?", "retrieved_docs": [{"text":"Deep Learning is not..."}, {"text":"Deep learning is..."}]}' \ + -H 'Content-Type: application/json' + ``` 6. TGI Service -In first startup, this service will take more time to download the model files. After it's finished, the service will be ready. + In first startup, this service will take more time to download the model files. After it's finished, the service will be ready. -Try the command below to check whether the TGI service is ready. + Try the command below to check whether the TGI service is ready. -```bash -docker logs ${CONTAINER_ID} | grep Connected -``` + ```bash + docker logs ${CONTAINER_ID} | grep Connected + ``` -If the service is ready, you will get the response like below. + If the service is ready, you will get the response like below. -```log -2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected -``` + ``` + 2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected + ``` -Then try the `cURL` command below to validate TGI. + Then try the `cURL` command below to validate TGI. -```bash -curl http://${host_ip}:8008/generate \ - -X POST \ - -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":64, "do_sample": true}}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:8008/generate \ + -X POST \ + -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":64, "do_sample": true}}' \ + -H 'Content-Type: application/json' + ``` 7. 
LLM Microservice -```bash -curl http://${host_ip}:9000/v1/chat/completions \ - -X POST \ - -d '{"query":"What is Deep Learning?","max_new_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:9000/v1/chat/completions \ + -X POST \ + -d '{"query":"What is Deep Learning?","max_new_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \ + -H 'Content-Type: application/json' + ``` 8. MegaService -```bash -curl http://${host_ip}:8888/v1/chatqna -H "Content-Type: application/json" -d '{ - "messages": "What is the revenue of Nike in 2023?" - }' -``` + ```bash + curl http://${host_ip}:8888/v1/chatqna -H "Content-Type: application/json" -d '{ + "messages": "What is the revenue of Nike in 2023?" + }' + ``` 9. Dataprep Microservice(Optional) -If you want to update the default knowledge base, you can use the following commands: + If you want to update the default knowledge base, you can use the following commands: -Update Knowledge Base via Local File Upload: + Update Knowledge Base via Local File Upload: -```bash -curl -X POST "http://${host_ip}:6007/v1/dataprep" \ - -H "Content-Type: multipart/form-data" \ - -F "files=@./nke-10k-2023.pdf" -``` + ```bash + curl -X POST "http://${host_ip}:6007/v1/dataprep" \ + -H "Content-Type: multipart/form-data" \ + -F "files=@./nke-10k-2023.pdf" + ``` -This command updates a knowledge base by uploading a local file for processing. Update the file path according to your environment. + This command updates a knowledge base by uploading a local file for processing. Update the file path according to your environment. -Add Knowledge Base via HTTP Links: + Add Knowledge Base via HTTP Links: -```bash -curl -X POST "http://${host_ip}:6007/v1/dataprep" \ - -H "Content-Type: multipart/form-data" \ - -F 'link_list=["https://opea.dev"]' -``` + ```bash + curl -X POST "http://${host_ip}:6007/v1/dataprep" \ + -H "Content-Type: multipart/form-data" \ + -F 'link_list=["https://opea.dev"]' + ``` -This command updates a knowledge base by submitting a list of HTTP links for processing. + This command updates a knowledge base by submitting a list of HTTP links for processing. 
-Also, you are able to get the file list that you uploaded: + Also, you are able to get the file list that you uploaded: -```bash -curl -X POST "http://${host_ip}:6007/v1/dataprep/get_file" \ - -H "Content-Type: application/json" -``` + ```bash + curl -X POST "http://${host_ip}:6007/v1/dataprep/get_file" \ + -H "Content-Type: application/json" + ``` -To delete the file/link you uploaded: + To delete the file/link you uploaded: -```bash -# delete link -curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ - -d '{"file_path": "https://opea.dev"}' \ - -H "Content-Type: application/json" - -# delete file -curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ - -d '{"file_path": "nke-10k-2023.pdf"}' \ - -H "Content-Type: application/json" - -# delete all uploaded files and links -curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ - -d '{"file_path": "all"}' \ - -H "Content-Type: application/json" -``` + ```bash + # delete link + curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ + -d '{"file_path": "https://opea.dev"}' \ + -H "Content-Type: application/json" + + # delete file + curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ + -d '{"file_path": "nke-10k-2023.pdf"}' \ + -H "Content-Type: application/json" + + # delete all uploaded files and links + curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ + -d '{"file_path": "all"}' \ + -H "Content-Type: application/json" + ``` ## 🚀 Launch the UI diff --git a/ChatQnA/docker/xeon/README.md b/ChatQnA/docker/xeon/README.md index c736c674a..a28128da0 100644 --- a/ChatQnA/docker/xeon/README.md +++ b/ChatQnA/docker/xeon/README.md @@ -269,182 +269,182 @@ docker compose -f compose_vllm.yaml up -d 1. TEI Embedding Service -```bash -curl ${host_ip}:6006/embed \ - -X POST \ - -d '{"inputs":"What is Deep Learning?"}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl ${host_ip}:6006/embed \ + -X POST \ + -d '{"inputs":"What is Deep Learning?"}' \ + -H 'Content-Type: application/json' + ``` 2. Embedding Microservice -```bash -curl http://${host_ip}:6000/v1/embeddings\ - -X POST \ - -d '{"text":"hello"}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:6000/v1/embeddings\ + -X POST \ + -d '{"text":"hello"}' \ + -H 'Content-Type: application/json' + ``` 3. Retriever Microservice -To consume the retriever microservice, you need to generate a mock embedding vector by Python script. The length of embedding vector -is determined by the embedding model. -Here we use the model `EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"`, which vector size is 768. + To consume the retriever microservice, you need to generate a mock embedding vector by Python script. The length of embedding vector + is determined by the embedding model. + Here we use the model `EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"`, which vector size is 768. -Check the vecotor dimension of your embedding model, set `your_embedding` dimension equals to it. + Check the vector dimension of your embedding model, set `your_embedding` dimension equals to it. 
-```bash -export your_embedding=$(python3 -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)") -curl http://${host_ip}:7000/v1/retrieval \ - -X POST \ - -d "{\"text\":\"test\",\"embedding\":${your_embedding}}" \ - -H 'Content-Type: application/json' -``` + ```bash + export your_embedding=$(python3 -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)") + curl http://${host_ip}:7000/v1/retrieval \ + -X POST \ + -d "{\"text\":\"test\",\"embedding\":${your_embedding}}" \ + -H 'Content-Type: application/json' + ``` 4. TEI Reranking Service -> Skip for ChatQnA without Rerank pipeline + > Skip for ChatQnA without Rerank pipeline -```bash -curl http://${host_ip}:8808/rerank \ - -X POST \ - -d '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:8808/rerank \ + -X POST \ + -d '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \ + -H 'Content-Type: application/json' + ``` 5. Reranking Microservice -> Skip for ChatQnA without Rerank pipeline + > Skip for ChatQnA without Rerank pipeline -```bash -curl http://${host_ip}:8000/v1/reranking\ - -X POST \ - -d '{"initial_query":"What is Deep Learning?", "retrieved_docs": [{"text":"Deep Learning is not..."}, {"text":"Deep learning is..."}]}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:8000/v1/reranking\ + -X POST \ + -d '{"initial_query":"What is Deep Learning?", "retrieved_docs": [{"text":"Deep Learning is not..."}, {"text":"Deep learning is..."}]}' \ + -H 'Content-Type: application/json' + ``` 6. LLM backend Service -In first startup, this service will take more time to download the model files. After it's finished, the service will be ready. + In first startup, this service will take more time to download the model files. After it's finished, the service will be ready. -Try the command below to check whether the LLM serving is ready. + Try the command below to check whether the LLM serving is ready. -```bash -docker logs ${CONTAINER_ID} | grep Connected -``` + ```bash + docker logs ${CONTAINER_ID} | grep Connected + ``` -If the service is ready, you will get the response like below. + If the service is ready, you will get the response like below. -```log -2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected -``` + ``` + 2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected + ``` -Then try the `cURL` command below to validate services. + Then try the `cURL` command below to validate services. 
-```bash -# TGI service -curl http://${host_ip}:9009/generate \ - -X POST \ - -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \ - -H 'Content-Type: application/json' -``` + ```bash + # TGI service + curl http://${host_ip}:9009/generate \ + -X POST \ + -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \ + -H 'Content-Type: application/json' + ``` -```bash -# vLLM Service -curl http://${host_ip}:9009/v1/completions \ - -H "Content-Type: application/json" \ - -d '{"model": "Intel/neural-chat-7b-v3-3", "prompt": "What is Deep Learning?", "max_tokens": 32, "temperature": 0}' -``` + ```bash + # vLLM Service + curl http://${host_ip}:9009/v1/completions \ + -H "Content-Type: application/json" \ + -d '{"model": "Intel/neural-chat-7b-v3-3", "prompt": "What is Deep Learning?", "max_tokens": 32, "temperature": 0}' + ``` 7. LLM Microservice -This service depends on above LLM backend service startup. It will be ready after long time, to wait for them being ready in first startup. + This service depends on above LLM backend service startup. It will be ready after long time, to wait for them being ready in first startup. -```bash -curl http://${host_ip}:9000/v1/chat/completions\ - -X POST \ - -d '{"query":"What is Deep Learning?","max_new_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:9000/v1/chat/completions\ + -X POST \ + -d '{"query":"What is Deep Learning?","max_new_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \ + -H 'Content-Type: application/json' + ``` 8. MegaService -```bash -curl http://${host_ip}:8888/v1/chatqna -H "Content-Type: application/json" -d '{ - "messages": "What is the revenue of Nike in 2023?" - }' -``` + ```bash + curl http://${host_ip}:8888/v1/chatqna -H "Content-Type: application/json" -d '{ + "messages": "What is the revenue of Nike in 2023?" + }' + ``` 9. Dataprep Microservice(Optional) -If you want to update the default knowledge base, you can use the following commands: + If you want to update the default knowledge base, you can use the following commands: -Update Knowledge Base via Local File [nke-10k-2023.pdf](https://github.com/opea-project/GenAIComps/blob/main/comps/retrievers/langchain/redis/data/nke-10k-2023.pdf) Upload: + Update Knowledge Base via Local File [nke-10k-2023.pdf](https://github.com/opea-project/GenAIComps/blob/main/comps/retrievers/langchain/redis/data/nke-10k-2023.pdf) Upload: -```bash -curl -X POST "http://${host_ip}:6007/v1/dataprep" \ - -H "Content-Type: multipart/form-data" \ - -F "files=@./nke-10k-2023.pdf" -``` + ```bash + curl -X POST "http://${host_ip}:6007/v1/dataprep" \ + -H "Content-Type: multipart/form-data" \ + -F "files=@./nke-10k-2023.pdf" + ``` -This command updates a knowledge base by uploading a local file for processing. Update the file path according to your environment. + This command updates a knowledge base by uploading a local file for processing. Update the file path according to your environment. 
-Add Knowledge Base via HTTP Links: + Add Knowledge Base via HTTP Links: -```bash -curl -X POST "http://${host_ip}:6007/v1/dataprep" \ - -H "Content-Type: multipart/form-data" \ - -F 'link_list=["https://opea.dev"]' -``` + ```bash + curl -X POST "http://${host_ip}:6007/v1/dataprep" \ + -H "Content-Type: multipart/form-data" \ + -F 'link_list=["https://opea.dev"]' + ``` -This command updates a knowledge base by submitting a list of HTTP links for processing. + This command updates a knowledge base by submitting a list of HTTP links for processing. -Also, you are able to get the file list that you uploaded: + Also, you are able to get the file list that you uploaded: -```bash -curl -X POST "http://${host_ip}:6007/v1/dataprep/get_file" \ - -H "Content-Type: application/json" -``` + ```bash + curl -X POST "http://${host_ip}:6007/v1/dataprep/get_file" \ + -H "Content-Type: application/json" + ``` -Then you will get the response JSON like this. Notice that the returned `name`/`id` of the uploaded link is `https://xxx.txt`. - -```json -[ - { - "name": "nke-10k-2023.pdf", - "id": "nke-10k-2023.pdf", - "type": "File", - "parent": "" - }, - { - "name": "https://opea.dev.txt", - "id": "https://opea.dev.txt", - "type": "File", - "parent": "" - } -] -``` + Then you will get the response JSON like this. Notice that the returned `name`/`id` of the uploaded link is `https://xxx.txt`. + + ```json + [ + { + "name": "nke-10k-2023.pdf", + "id": "nke-10k-2023.pdf", + "type": "File", + "parent": "" + }, + { + "name": "https://opea.dev.txt", + "id": "https://opea.dev.txt", + "type": "File", + "parent": "" + } + ] + ``` -To delete the file/link you uploaded: + To delete the file/link you uploaded: -The `file_path` here should be the `id` get from `/v1/dataprep/get_file` API. + The `file_path` here should be the `id` get from `/v1/dataprep/get_file` API. -```bash -# delete link -curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ - -d '{"file_path": "https://opea.dev.txt"}' \ - -H "Content-Type: application/json" - -# delete file -curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ - -d '{"file_path": "nke-10k-2023.pdf"}' \ - -H "Content-Type: application/json" - -# delete all uploaded files and links -curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ - -d '{"file_path": "all"}' \ - -H "Content-Type: application/json" -``` + ```bash + # delete link + curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ + -d '{"file_path": "https://opea.dev.txt"}' \ + -H "Content-Type: application/json" + + # delete file + curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ + -d '{"file_path": "nke-10k-2023.pdf"}' \ + -H "Content-Type: application/json" + + # delete all uploaded files and links + curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ + -d '{"file_path": "all"}' \ + -H "Content-Type: application/json" + ``` ## 🚀 Launch the UI diff --git a/ChatQnA/docker/xeon/README_qdrant.md b/ChatQnA/docker/xeon/README_qdrant.md index a03b563b2..d8f0c9de6 100644 --- a/ChatQnA/docker/xeon/README_qdrant.md +++ b/ChatQnA/docker/xeon/README_qdrant.md @@ -224,119 +224,119 @@ docker compose -f compose_qdrant.yaml up -d 1. TEI Embedding Service -```bash -curl ${host_ip}:6040/embed \ - -X POST \ - -d '{"inputs":"What is Deep Learning?"}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl ${host_ip}:6040/embed \ + -X POST \ + -d '{"inputs":"What is Deep Learning?"}' \ + -H 'Content-Type: application/json' + ``` 2. 
Embedding Microservice -```bash -curl http://${host_ip}:6044/v1/embeddings\ - -X POST \ - -d '{"text":"hello"}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:6044/v1/embeddings\ + -X POST \ + -d '{"text":"hello"}' \ + -H 'Content-Type: application/json' + ``` 3. Retriever Microservice -To consume the retriever microservice, you need to generate a mock embedding vector by Python script. The length of embedding vector -is determined by the embedding model. -Here we use the model `EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"`, which vector size is 768. + To consume the retriever microservice, you need to generate a mock embedding vector by Python script. The length of embedding vector + is determined by the embedding model. + Here we use the model `EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"`, which vector size is 768. -Check the vecotor dimension of your embedding model, set `your_embedding` dimension equals to it. + Check the vecotor dimension of your embedding model, set `your_embedding` dimension equals to it. -```bash -export your_embedding=$(python3 -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)") -curl http://${host_ip}:6045/v1/retrieval \ - -X POST \ - -d '{"text":"What is the revenue of Nike in 2023?","embedding":"'"${your_embedding}"'"}' \ - -H 'Content-Type: application/json' -``` + ```bash + export your_embedding=$(python3 -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)") + curl http://${host_ip}:6045/v1/retrieval \ + -X POST \ + -d '{"text":"What is the revenue of Nike in 2023?","embedding":"'"${your_embedding}"'"}' \ + -H 'Content-Type: application/json' + ``` 4. TEI Reranking Service -```bash -curl http://${host_ip}:6041/rerank \ - -X POST \ - -d '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:6041/rerank \ + -X POST \ + -d '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \ + -H 'Content-Type: application/json' + ``` 5. Reranking Microservice -```bash -curl http://${host_ip}:6046/v1/reranking\ - -X POST \ - -d '{"initial_query":"What is Deep Learning?", "retrieved_docs": [{"text":"Deep Learning is not..."}, {"text":"Deep learning is..."}]}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:6046/v1/reranking\ + -X POST \ + -d '{"initial_query":"What is Deep Learning?", "retrieved_docs": [{"text":"Deep Learning is not..."}, {"text":"Deep learning is..."}]}' \ + -H 'Content-Type: application/json' + ``` 6. TGI Service -In first startup, this service will take more time to download the model files. After it's finished, the service will be ready. + In first startup, this service will take more time to download the model files. After it's finished, the service will be ready. -Try the command below to check whether the TGI service is ready. + Try the command below to check whether the TGI service is ready. -```bash -docker logs ${CONTAINER_ID} | grep Connected -``` + ```bash + docker logs ${CONTAINER_ID} | grep Connected + ``` -If the service is ready, you will get the response like below. + If the service is ready, you will get the response like below. 
-```log -2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected -``` + ``` + 2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected + ``` -Then try the `cURL` command below to validate TGI. + Then try the `cURL` command below to validate TGI. -```bash -curl http://${host_ip}:6042/generate \ - -X POST \ - -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:6042/generate \ + -X POST \ + -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \ + -H 'Content-Type: application/json' + ``` 7. LLM Microservice -```bash -curl http://${host_ip}:6047/v1/chat/completions\ - -X POST \ - -d '{"query":"What is Deep Learning?","max_new_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:6047/v1/chat/completions\ + -X POST \ + -d '{"query":"What is Deep Learning?","max_new_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \ + -H 'Content-Type: application/json' + ``` 8. MegaService -```bash -curl http://${host_ip}:8912/v1/chatqna -H "Content-Type: application/json" -d '{ - "messages": "What is the revenue of Nike in 2023?" - }' -``` + ```bash + curl http://${host_ip}:8912/v1/chatqna -H "Content-Type: application/json" -d '{ + "messages": "What is the revenue of Nike in 2023?" + }' + ``` 9. Dataprep Microservice(Optional) -If you want to update the default knowledge base, you can use the following commands: + If you want to update the default knowledge base, you can use the following commands: -Update Knowledge Base via Local File Upload: + Update Knowledge Base via Local File Upload: -```bash -curl -X POST "http://${host_ip}:6043/v1/dataprep" \ - -H "Content-Type: multipart/form-data" \ - -F "files=@./your_file.pdf" -``` + ```bash + curl -X POST "http://${host_ip}:6043/v1/dataprep" \ + -H "Content-Type: multipart/form-data" \ + -F "files=@./your_file.pdf" + ``` -This command updates a knowledge base by uploading a local file for processing. Update the file path according to your environment. + This command updates a knowledge base by uploading a local file for processing. Update the file path according to your environment. -Add Knowledge Base via HTTP Links: + Add Knowledge Base via HTTP Links: -```bash -curl -X POST "http://${host_ip}:6043/v1/dataprep" \ - -H "Content-Type: multipart/form-data" \ - -F 'link_list=["https://opea.dev"]' -``` + ```bash + curl -X POST "http://${host_ip}:6043/v1/dataprep" \ + -H "Content-Type: multipart/form-data" \ + -F 'link_list=["https://opea.dev"]' + ``` ## 🚀 Launch the UI diff --git a/ChatQnA/kubernetes/manifests/README.md b/ChatQnA/kubernetes/manifests/README.md index 4f44b65cd..cb4ff7e12 100644 --- a/ChatQnA/kubernetes/manifests/README.md +++ b/ChatQnA/kubernetes/manifests/README.md @@ -3,9 +3,9 @@ > [NOTE] > The following values must be set before you can deploy: > HUGGINGFACEHUB_API_TOKEN - +> > You can also customize the "MODEL_ID" if needed. - +> > You need to make sure you have created the directory `/mnt/opea-models` to save the cached model on the node where the ChatQnA workload is running. 
Otherwise, you need to modify the `chatqna.yaml` file to change the `model-volume` to a directory that exists on the node. ## Deploy On Xeon diff --git a/CodeGen/docker/gaudi/README.md b/CodeGen/docker/gaudi/README.md index a563f416b..5c440cade 100644 --- a/CodeGen/docker/gaudi/README.md +++ b/CodeGen/docker/gaudi/README.md @@ -110,29 +110,29 @@ docker compose up -d 1. TGI Service -```bash -curl http://${host_ip}:8028/generate \ - -X POST \ - -d '{"inputs":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","parameters":{"max_new_tokens":256, "do_sample": true}}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:8028/generate \ + -X POST \ + -d '{"inputs":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","parameters":{"max_new_tokens":256, "do_sample": true}}' \ + -H 'Content-Type: application/json' + ``` 2. LLM Microservices -```bash -curl http://${host_ip}:9000/v1/chat/completions\ - -X POST \ - -d '{"query":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","max_new_tokens":256,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:9000/v1/chat/completions\ + -X POST \ + -d '{"query":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","max_new_tokens":256,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \ + -H 'Content-Type: application/json' + ``` 3. MegaService -```bash -curl http://${host_ip}:7778/v1/codegen -H "Content-Type: application/json" -d '{ - "messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception." - }' -``` + ```bash + curl http://${host_ip}:7778/v1/codegen -H "Content-Type: application/json" -d '{ + "messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception." + }' + ``` ## 🚀 Launch the Svelte Based UI diff --git a/CodeGen/docker/xeon/README.md b/CodeGen/docker/xeon/README.md index 74b5a2b7c..d5b988064 100644 --- a/CodeGen/docker/xeon/README.md +++ b/CodeGen/docker/xeon/README.md @@ -113,29 +113,29 @@ docker compose up -d 1. TGI Service -```bash -curl http://${host_ip}:8028/generate \ - -X POST \ - -d '{"inputs":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","parameters":{"max_new_tokens":256, "do_sample": true}}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:8028/generate \ + -X POST \ + -d '{"inputs":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. 
If the request is invalid, raise an exception.","parameters":{"max_new_tokens":256, "do_sample": true}}' \ + -H 'Content-Type: application/json' + ``` 2. LLM Microservices -```bash -curl http://${host_ip}:9000/v1/chat/completions\ - -X POST \ - -d '{"query":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","max_new_tokens":256,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:9000/v1/chat/completions\ + -X POST \ + -d '{"query":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","max_new_tokens":256,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \ + -H 'Content-Type: application/json' + ``` 3. MegaService -```bash -curl http://${host_ip}:7778/v1/codegen -H "Content-Type: application/json" -d '{ - "messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception." - }' -``` + ```bash + curl http://${host_ip}:7778/v1/codegen -H "Content-Type: application/json" -d '{ + "messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception." + }' + ``` ## 🚀 Launch the UI diff --git a/CodeGen/kubernetes/manifests/README.md b/CodeGen/kubernetes/manifests/README.md index 87d6490f8..9a4383983 100644 --- a/CodeGen/kubernetes/manifests/README.md +++ b/CodeGen/kubernetes/manifests/README.md @@ -3,9 +3,9 @@ > [NOTE] > The following values must be set before you can deploy: > HUGGINGFACEHUB_API_TOKEN - +> > You can also customize the "MODEL_ID" if needed. - +> > You need to make sure you have created the directory `/mnt/opea-models` to save the cached model on the node where the CodeGen workload is running. Otherwise, you need to modify the `codegen.yaml` file to change the `model-volume` to a directory that exists on the node. > Alternatively, you can change the `codegen.yaml` to use a different type of volume, such as a persistent volume claim. diff --git a/CodeGen/kubernetes/manifests/xeon/ui/README.md b/CodeGen/kubernetes/manifests/xeon/ui/README.md index dce34d335..01ed0becf 100644 --- a/CodeGen/kubernetes/manifests/xeon/ui/README.md +++ b/CodeGen/kubernetes/manifests/xeon/ui/README.md @@ -7,30 +7,32 @@ You can use react-codegen.yaml to deploy CodeGen with reactUI. kubectl apply -f react-codegen.yaml ``` -## Prerequisites for Deploying CodeGen with ReactUI: +## Prerequisites for Deploying CodeGen with ReactUI Before deploying the react-codegen.yaml file, ensure that you have the following prerequisites in place: 1. Kubernetes installation: Make sure that you have Kubernetes installed. 2. Configuration Values: Set the following values in react-codegen.yaml before proceeding with the deployment: - #### a. HUGGINGFACEHUB_API_TOKEN (Your HuggingFace token to download your desired model from HuggingFace): + + a. 
HUGGINGFACEHUB_API_TOKEN (Your HuggingFace token to download your desired model from HuggingFace): ``` # You may set the HUGGINGFACEHUB_API_TOKEN via method: export HUGGINGFACEHUB_API_TOKEN="YourOwnToken" cd GenAIExamples/CodeGen/kubernetes/manifests/xeon/ui/ sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" react-codegen.yaml ``` - #### b. Set the proxies based on your network configuration + b. Set the proxies based on your network configuration ``` # Look for http_proxy, https_proxy, no_proxy key and fill up the value with your proxy configuration. ``` 3. MODEL_ID and model-volume (OPTIONAL): You may as well customize the "MODEL_ID" to use different model and model-volume for the volume to be mounted. 4. After completing these, you can proceed with the deployment of the react-codegen.yaml file. -## Verify Services: +## Verify Services Make sure all the pods are running, you should see total of 4 pods running: -1. codegen -2. codegen-llm-uservice -3. codegen-react-ui -4. codegen-tgi + +- codegen +- codegen-llm-uservice +- codegen-react-ui +- codegen-tgi You may open up the UI by using the codegen-react-ui endpoint in the browser. diff --git a/CodeTrans/docker/gaudi/README.md b/CodeTrans/docker/gaudi/README.md index 5faf97b78..5b8c00b07 100755 --- a/CodeTrans/docker/gaudi/README.md +++ b/CodeTrans/docker/gaudi/README.md @@ -98,37 +98,37 @@ docker compose up -d 1. TGI Service -```bash -curl http://${host_ip}:8008/generate \ - -X POST \ - -d '{"inputs":" ### System: Please translate the following Golang codes into Python codes. ### Original codes: '\'''\'''\''Golang \npackage main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n '\'''\'''\'' ### Translated codes:","parameters":{"max_new_tokens":17, "do_sample": true}}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:8008/generate \ + -X POST \ + -d '{"inputs":" ### System: Please translate the following Golang codes into Python codes. ### Original codes: '\'''\'''\''Golang \npackage main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n '\'''\'''\'' ### Translated codes:","parameters":{"max_new_tokens":17, "do_sample": true}}' \ + -H 'Content-Type: application/json' + ``` 2. LLM Microservice -```bash -curl http://${host_ip}:9000/v1/chat/completions\ - -X POST \ - -d '{"text":" ### System: Please translate the following Golang codes into Python codes. ### Original codes: '\'''\'''\''Golang \npackage main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n '\'''\'''\'' ### Translated codes:"}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:9000/v1/chat/completions\ + -X POST \ + -d '{"text":" ### System: Please translate the following Golang codes into Python codes. ### Original codes: '\'''\'''\''Golang \npackage main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n '\'''\'''\'' ### Translated codes:"}' \ + -H 'Content-Type: application/json' + ``` 3. MegaService -```bash -curl http://${host_ip}:7777/v1/codetrans \ - -H "Content-Type: application/json" \ - -d '{"language_from": "Golang","language_to": "Python","source_code": "package main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n}"}' -``` + ```bash + curl http://${host_ip}:7777/v1/codetrans \ + -H "Content-Type: application/json" \ + -d '{"language_from": "Golang","language_to": "Python","source_code": "package main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n}"}' + ``` 4. 
Nginx Service -```bash -curl http://${host_ip}:${NGINX_PORT}/v1/codetrans \ - -H "Content-Type: application/json" \ - -d '{"language_from": "Golang","language_to": "Python","source_code": "package main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n}"}' -``` + ```bash + curl http://${host_ip}:${NGINX_PORT}/v1/codetrans \ + -H "Content-Type: application/json" \ + -d '{"language_from": "Golang","language_to": "Python","source_code": "package main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n}"}' + ``` ## 🚀 Launch the UI diff --git a/CodeTrans/docker/xeon/README.md b/CodeTrans/docker/xeon/README.md index 03f144b22..b4c0830fd 100755 --- a/CodeTrans/docker/xeon/README.md +++ b/CodeTrans/docker/xeon/README.md @@ -106,37 +106,37 @@ docker compose up -d 1. TGI Service -```bash -curl http://${host_ip}:8008/generate \ - -X POST \ - -d '{"inputs":" ### System: Please translate the following Golang codes into Python codes. ### Original codes: '\'''\'''\''Golang \npackage main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n '\'''\'''\'' ### Translated codes:","parameters":{"max_new_tokens":17, "do_sample": true}}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:8008/generate \ + -X POST \ + -d '{"inputs":" ### System: Please translate the following Golang codes into Python codes. ### Original codes: '\'''\'''\''Golang \npackage main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n '\'''\'''\'' ### Translated codes:","parameters":{"max_new_tokens":17, "do_sample": true}}' \ + -H 'Content-Type: application/json' + ``` 2. LLM Microservice -```bash -curl http://${host_ip}:9000/v1/chat/completions\ - -X POST \ - -d '{"query":" ### System: Please translate the following Golang codes into Python codes. ### Original codes: '\'''\'''\''Golang \npackage main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n '\'''\'''\'' ### Translated codes:"}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:9000/v1/chat/completions\ + -X POST \ + -d '{"query":" ### System: Please translate the following Golang codes into Python codes. ### Original codes: '\'''\'''\''Golang \npackage main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n '\'''\'''\'' ### Translated codes:"}' \ + -H 'Content-Type: application/json' + ``` 3. MegaService -```bash -curl http://${host_ip}:7777/v1/codetrans \ - -H "Content-Type: application/json" \ - -d '{"language_from": "Golang","language_to": "Python","source_code": "package main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n}"}' -``` + ```bash + curl http://${host_ip}:7777/v1/codetrans \ + -H "Content-Type: application/json" \ + -d '{"language_from": "Golang","language_to": "Python","source_code": "package main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n}"}' + ``` 4. 
Nginx Service -```bash -curl http://${host_ip}:${NGINX_PORT}/v1/codetrans \ - -H "Content-Type: application/json" \ - -d '{"language_from": "Golang","language_to": "Python","source_code": "package main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n}"}' -``` + ```bash + curl http://${host_ip}:${NGINX_PORT}/v1/codetrans \ + -H "Content-Type: application/json" \ + -d '{"language_from": "Golang","language_to": "Python","source_code": "package main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n}"}' + ``` ## 🚀 Launch the UI diff --git a/CodeTrans/kubernetes/manifests/README.md b/CodeTrans/kubernetes/manifests/README.md index 709d6ea0f..5edc148cb 100644 --- a/CodeTrans/kubernetes/manifests/README.md +++ b/CodeTrans/kubernetes/manifests/README.md @@ -3,9 +3,9 @@ > [NOTE] > The following values must be set before you can deploy: > HUGGINGFACEHUB_API_TOKEN - +> > You can also customize the "MODEL_ID" if needed. - +> > You need to make sure you have created the directory `/mnt/opea-models` to save the cached model on the node where the CodeTrans workload is running. Otherwise, you need to modify the `codetrans.yaml` file to change the `model-volume` to a directory that exists on the node. ## Required Models diff --git a/DocSum/docker/gaudi/README.md b/DocSum/docker/gaudi/README.md index cf48ca885..fe765ecb0 100644 --- a/DocSum/docker/gaudi/README.md +++ b/DocSum/docker/gaudi/README.md @@ -98,29 +98,29 @@ docker compose up -d 1. TGI Service -```bash -curl http://${your_ip}:8008/generate \ - -X POST \ - -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":64, "do_sample": true}}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${your_ip}:8008/generate \ + -X POST \ + -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":64, "do_sample": true}}' \ + -H 'Content-Type: application/json' + ``` 2. LLM Microservice -```bash -curl http://${your_ip}:9000/v1/chat/docsum \ - -X POST \ - -d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${your_ip}:9000/v1/chat/docsum \ + -X POST \ + -d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' \ + -H 'Content-Type: application/json' + ``` 3. MegaService -```bash -curl http://${host_ip}:8888/v1/docsum -H "Content-Type: application/json" -d '{ - "messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5." - }' -``` + ```bash + curl http://${host_ip}:8888/v1/docsum -H "Content-Type: application/json" -d '{ + "messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5." 
+ }' + ``` ## 🚀 Launch the Svelte UI diff --git a/DocSum/docker/xeon/README.md b/DocSum/docker/xeon/README.md index 7a84d47d5..751837495 100644 --- a/DocSum/docker/xeon/README.md +++ b/DocSum/docker/xeon/README.md @@ -107,35 +107,36 @@ docker compose up -d 1. TGI Service -```bash -curl http://${your_ip}:8008/generate \ - -X POST \ - -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${your_ip}:8008/generate \ + -X POST \ + -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \ + -H 'Content-Type: application/json' + ``` 2. LLM Microservice -```bash -curl http://${your_ip}:9000/v1/chat/docsum \ - -X POST \ - -d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${your_ip}:9000/v1/chat/docsum \ + -X POST \ + -d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' \ + -H 'Content-Type: application/json' + ``` 3. MegaService -```bash -curl http://${host_ip}:8888/v1/docsum -H "Content-Type: application/json" -d '{ - "messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5." - }' -``` + ```bash + curl http://${host_ip}:8888/v1/docsum -H "Content-Type: application/json" -d '{ + "messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5." + }' + ``` Following the validation of all aforementioned microservices, we are now prepared to construct a mega-service. ## 🚀 Launch the UI Open this URL `http://{host_ip}:5173` in your browser to access the svelte based frontend. + Open this URL `http://{host_ip}:5174` in your browser to access the React based frontend. ### Svelte UI diff --git a/DocSum/kubernetes/manifests/README.md b/DocSum/kubernetes/manifests/README.md index c47e4b39f..ba0d012f8 100644 --- a/DocSum/kubernetes/manifests/README.md +++ b/DocSum/kubernetes/manifests/README.md @@ -3,9 +3,9 @@ > [NOTE] > The following values must be set before you can deploy: > HUGGINGFACEHUB_API_TOKEN - +> > You can also customize the "MODEL_ID" and "model-volume" - +> > You need to make sure you have created the directory `/mnt/opea-models` to save the cached model on the node where the DocSum workload is running. Otherwise, you need to modify the `docsum.yaml` file to change the `model-volume` to a directory that exists on the node. ## Deploy On Xeon diff --git a/DocSum/kubernetes/manifests/xeon/ui/README.md b/DocSum/kubernetes/manifests/xeon/ui/README.md index 4f7a2ad97..a1fffd4b7 100644 --- a/DocSum/kubernetes/manifests/xeon/ui/README.md +++ b/DocSum/kubernetes/manifests/xeon/ui/README.md @@ -7,30 +7,31 @@ You can use react-docsum.yaml to deploy Docsum with reactUI. 
kubectl apply -f react-docsum.yaml ``` -## Prerequisites for Deploying DocSum with ReactUI: +## Prerequisites for Deploying DocSum with ReactUI Before deploying the react-docsum.yaml file, ensure that you have the following prerequisites in place: 1. Kubernetes installation: Make sure that you have Kubernetes installed. 2. Configuration Values: Set the following values in react-docsum.yaml before proceeding with the deployment: - #### a. HUGGINGFACEHUB_API_TOKEN (Your HuggingFace token to download your desired model from HuggingFace): - ``` - # You may set the HUGGINGFACEHUB_API_TOKEN via method: - export HUGGINGFACEHUB_API_TOKEN="YourOwnToken" - cd GenAIExamples/DocSum/kubernetes/manifests/xeon/ui/ - sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" react-docsum.yaml - ``` - #### b. Set the proxies based on your network configuration - ``` - # Look for http_proxy, https_proxy, no_proxy key and fill up the value with your proxy configuration. - ``` + a. HUGGINGFACEHUB_API_TOKEN (Your HuggingFace token to download your desired model from HuggingFace): + ``` + # You may set the HUGGINGFACEHUB_API_TOKEN via method: + export HUGGINGFACEHUB_API_TOKEN="YourOwnToken" + cd GenAIExamples/DocSum/kubernetes/manifests/xeon/ui/ + sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" react-docsum.yaml + ``` + b. Set the proxies based on your network configuration + ``` + # Look for http_proxy, https_proxy, no_proxy key and fill up the value with your proxy configuration. + ``` 3. MODEL_ID and model-volume (OPTIONAL): You may as well customize the "MODEL_ID" to use different model and model-volume for the volume to be mounted. 4. After completing these, you can proceed with the deployment of the react-docsum.yaml file. -## Verify Services: +## Verify Services Make sure all the pods are running, you should see total of 4 pods running: -1. docsum -2. docsum-llm-uservice -3. docsum-react-ui -4. docsum-tgi + +- docsum +- docsum-llm-uservice +- docsum-react-ui +- docsum-tgi You may open up the UI by using the docsum-react-ui endpoint in the browser. diff --git a/FaqGen/docker/gaudi/README.md b/FaqGen/docker/gaudi/README.md index 79410bfbb..f07fbed11 100644 --- a/FaqGen/docker/gaudi/README.md +++ b/FaqGen/docker/gaudi/README.md @@ -99,29 +99,29 @@ docker compose up -d 1. TGI Service -```bash -curl http://${your_ip}:8008/generate \ - -X POST \ - -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":64, "do_sample": true}}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${your_ip}:8008/generate \ + -X POST \ + -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":64, "do_sample": true}}' \ + -H 'Content-Type: application/json' + ``` 2. LLM Microservice -```bash -curl http://${host_ip}:9000/v1/faqgen \ - -X POST \ - -d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:9000/v1/faqgen \ + -X POST \ + -d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' \ + -H 'Content-Type: application/json' + ``` 3. 
MegaService -```bash -curl http://${host_ip}:8888/v1/faqgen -H "Content-Type: application/json" -d '{ - "messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5." - }' -``` + ```bash + curl http://${host_ip}:8888/v1/faqgen -H "Content-Type: application/json" -d '{ + "messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5." + }' + ``` ## 🚀 Launch the UI diff --git a/FaqGen/docker/xeon/README.md b/FaqGen/docker/xeon/README.md index cbe3a726b..8265ff02e 100644 --- a/FaqGen/docker/xeon/README.md +++ b/FaqGen/docker/xeon/README.md @@ -98,31 +98,31 @@ docker compose up -d 1. TGI Service -```bash -curl http://${host_ip}:8008/generate \ - -X POST \ - -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:8008/generate \ + -X POST \ + -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \ + -H 'Content-Type: application/json' + ``` 2. LLM Microservice -```bash -curl http://${host_ip}:9000/v1/faqgen \ - -X POST \ - -d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:9000/v1/faqgen \ + -X POST \ + -d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' \ + -H 'Content-Type: application/json' + ``` 3. MegaService -```bash -curl http://${host_ip}:8888/v1/faqgen -H "Content-Type: application/json" -d '{ - "messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5." - }' -``` + ```bash + curl http://${host_ip}:8888/v1/faqgen -H "Content-Type: application/json" -d '{ + "messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5." + }' + ``` -Following the validation of all aforementioned microservices, we are now prepared to construct a mega-service. + Following the validation of all aforementioned microservices, we are now prepared to construct a mega-service. ## 🚀 Launch the UI diff --git a/FaqGen/kubernetes/manifests/xeon/ui/README.md b/FaqGen/kubernetes/manifests/xeon/ui/README.md index 8a0da45c0..a3817e695 100644 --- a/FaqGen/kubernetes/manifests/xeon/ui/README.md +++ b/FaqGen/kubernetes/manifests/xeon/ui/README.md @@ -7,26 +7,26 @@ You can use react-faqgen.yaml to deploy FaqGen with reactUI. 
kubectl apply -f react-faqgen.yaml ``` -## Prerequisites for Deploying FaqGen with ReactUI: +## Prerequisites for Deploying FaqGen with ReactUI Before deploying the react-faqgen.yaml file, ensure that you have the following prerequisites in place: 1. Kubernetes installation: Make sure that you have Kubernetes installed. 2. Configuration Values: Set the following values in react-faqgen.yaml before proceeding with the deployment: - #### a. HUGGINGFACEHUB_API_TOKEN (Your HuggingFace token to download your desired model from HuggingFace): - ``` - # You may set the HUGGINGFACEHUB_API_TOKEN via method: - export HUGGINGFACEHUB_API_TOKEN="YourOwnToken" - cd GenAIExamples/FaqGen/kubernetes/manifests/xeon/ui/ - sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" react-faqgen.yaml - ``` - #### b. Set the proxies based on your network configuration - ``` - # Look for http_proxy, https_proxy, no_proxy key and fill up the value with your proxy configuration. - ``` + a. HUGGINGFACEHUB_API_TOKEN (Your HuggingFace token to download your desired model from HuggingFace): + ``` + # You may set the HUGGINGFACEHUB_API_TOKEN via method: + export HUGGINGFACEHUB_API_TOKEN="YourOwnToken" + cd GenAIExamples/FaqGen/kubernetes/manifests/xeon/ui/ + sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" react-faqgen.yaml + ``` + b. Set the proxies based on your network configuration + ``` + # Look for http_proxy, https_proxy, no_proxy key and fill up the value with your proxy configuration. + ``` 3. MODEL_ID and model-volume (OPTIONAL): You may as well customize the "MODEL_ID" to use different model and model-volume for the volume to be mounted. 4. After completing these, you can proceed with the deployment of the react-faqgen.yaml file. -## Verify Services: +## Verify Services Make sure all the pods are running, you should see total of 4 pods running: 1. faqgen 2. faqgen-llm-uservice diff --git a/ProductivitySuite/docker/ui/react/README.md b/ProductivitySuite/docker/ui/react/README.md index 0ad0d23c3..97fae9be1 100644 --- a/ProductivitySuite/docker/ui/react/README.md +++ b/ProductivitySuite/docker/ui/react/README.md @@ -35,7 +35,8 @@ Here're some of the project's features: ### CODEGEN - Generate code: generate the corresponding code based on the current user's input. - ###### Screen Shot + + Screen Shot ![project-screenshot](../../../assets/img/codegen.png) ### DOC SUMMARY @@ -59,7 +60,7 @@ Here're some of the project's features: ![project-screenshot](../../../assets/img/faq_generator.png) -## 🛠️ Get it Running: +## 🛠️ Get it Running 1. Clone the repo. diff --git a/ProductivitySuite/docker/xeon/README.md b/ProductivitySuite/docker/xeon/README.md index 82b024c52..0c1da5490 100644 --- a/ProductivitySuite/docker/xeon/README.md +++ b/ProductivitySuite/docker/xeon/README.md @@ -285,202 +285,202 @@ Please refer to [keycloak_setup_guide](keycloak_setup_guide.md) for more detail 9. CodeGen LLM Microservice -```bash -curl http://${host_ip}:9001/v1/chat/completions\ - -X POST \ - -d '{"query":"def print_hello_world():"}' \ - -H 'Content-Type: application/json' -``` - -11. DocSum LLM Microservice - -```bash -curl http://${host_ip}:9002/v1/chat/docsum\ - -X POST \ - -d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5"}' \ - -H 'Content-Type: application/json' -``` - -12. 
FAQGen LLM Microservice - -```bash -curl http://${host_ip}:9003/v1/faqgen\ - -X POST \ - -d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5"}' \ - -H 'Content-Type: application/json' -``` - -13. ChatQnA MegaService - -```bash -curl http://${host_ip}:8888/v1/chatqna -H "Content-Type: application/json" -d '{ - "messages": "What is the revenue of Nike in 2023?" - }' -``` - -14. FAQGen MegaService - -```bash -curl http://${host_ip}:8889/v1/faqgen -H "Content-Type: application/json" -d '{ - "messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5." - }' -``` - -15. DocSum MegaService - -```bash -curl http://${host_ip}:8890/v1/docsum -H "Content-Type: application/json" -d '{ - "messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5." - }' -``` - -16. CodeGen MegaService - -```bash -curl http://${host_ip}:7778/v1/codegen -H "Content-Type: application/json" -d '{ - "messages": "def print_hello_world():" - }' -``` - -17. Dataprep Microservice - -If you want to update the default knowledge base, you can use the following commands: - -Update Knowledge Base via Local File Upload: - -```bash -curl -X POST "http://${host_ip}:6007/v1/dataprep" \ - -H "Content-Type: multipart/form-data" \ - -F "files=@./nke-10k-2023.pdf" -``` - -This command updates a knowledge base by uploading a local file for processing. Update the file path according to your environment. - -Add Knowledge Base via HTTP Links: - -```bash -curl -X POST "http://${host_ip}:6007/v1/dataprep" \ - -H "Content-Type: multipart/form-data" \ - -F 'link_list=["https://opea.dev"]' -``` - -This command updates a knowledge base by submitting a list of HTTP links for processing. - -Also, you are able to get the file list that you uploaded: - -```bash -curl -X POST "http://${host_ip}:6007/v1/dataprep/get_file" \ - -H "Content-Type: application/json" -``` - -To delete the file/link you uploaded: - -```bash -# delete link -curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ - -d '{"file_path": "https://opea.dev.txt"}' \ - -H "Content-Type: application/json" - -# delete file -curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ - -d '{"file_path": "nke-10k-2023.pdf"}' \ - -H "Content-Type: application/json" - -# delete all uploaded files and links -curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ - -d '{"file_path": "all"}' \ - -H "Content-Type: application/json" -``` - -18. Prompt Registry Microservice - -If you want to update the default Prompts in the application for your user, you can use the following commands: + ```bash + curl http://${host_ip}:9001/v1/chat/completions\ + -X POST \ + -d '{"query":"def print_hello_world():"}' \ + -H 'Content-Type: application/json' + ``` -```bash -curl -X 'POST' \ - http://{host_ip}:6015/v1/prompt/create \ - -H 'accept: application/json' \ - -H 'Content-Type: application/json' \ - -d '{ - "prompt_text": "test prompt", "user": "test" -}' -``` +10. 
DocSum LLM Microservice -Retrieve prompt from database based on user or prompt_id + ```bash + curl http://${host_ip}:9002/v1/chat/docsum\ + -X POST \ + -d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5"}' \ + -H 'Content-Type: application/json' + ``` -```bash -curl -X 'POST' \ - http://{host_ip}:6015/v1/prompt/get \ - -H 'accept: application/json' \ - -H 'Content-Type: application/json' \ - -d '{ - "user": "test"}' - -curl -X 'POST' \ - http://{host_ip}:6015/v1/prompt/get \ - -H 'accept: application/json' \ - -H 'Content-Type: application/json' \ - -d '{ - "user": "test", "prompt_id":"{prompt_id returned from save prompt route above}"}' -``` +11. FAQGen LLM Microservice -Delete prompt from database based on prompt_id provided + ```bash + curl http://${host_ip}:9003/v1/faqgen\ + -X POST \ + -d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5"}' \ + -H 'Content-Type: application/json' + ``` -```bash -curl -X 'POST' \ - http://{host_ip}:6015/v1/prompt/delete \ - -H 'accept: application/json' \ - -H 'Content-Type: application/json' \ - -d '{ - "user": "test", "prompt_id":"{prompt_id to be deleted}"}' -``` +12. ChatQnA MegaService -19. Chat History Microservice + ```bash + curl http://${host_ip}:8888/v1/chatqna -H "Content-Type: application/json" -d '{ + "messages": "What is the revenue of Nike in 2023?" + }' + ``` -To validate the chatHistory Microservice, you can use the following commands. +13. FAQGen MegaService -Create a sample conversation and get the message ID. + ```bash + curl http://${host_ip}:8889/v1/faqgen -H "Content-Type: application/json" -d '{ + "messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5." + }' + ``` -```bash -curl -X 'POST' \ - http://${host_ip}:6012/v1/chathistory/create \ - -H 'accept: application/json' \ - -H 'Content-Type: application/json' \ - -d '{ - "data": { - "messages": "test Messages", "user": "test" - } -}' -``` +14. DocSum MegaService -Retrieve the conversation based on user or conversation id + ```bash + curl http://${host_ip}:8890/v1/docsum -H "Content-Type: application/json" -d '{ + "messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5." + }' + ``` + +15. CodeGen MegaService + + ```bash + curl http://${host_ip}:7778/v1/codegen -H "Content-Type: application/json" -d '{ + "messages": "def print_hello_world():" + }' + ``` -```bash -curl -X 'POST' \ - http://${host_ip}:6012/v1/chathistory/get \ - -H 'accept: application/json' \ - -H 'Content-Type: application/json' \ - -d '{ - "user": "test"}' - -curl -X 'POST' \ - http://${host_ip}:6012/v1/chathistory/get \ - -H 'accept: application/json' \ - -H 'Content-Type: application/json' \ - -d '{ - "user": "test", "id":"{Conversation id to retrieve }"}' -``` +16. 
Dataprep Microservice -Delete Conversation from database based on conversation id provided. + If you want to update the default knowledge base, you can use the following commands: -```bash -curl -X 'POST' \ - http://${host_ip}:6012/v1/chathistory/delete \ - -H 'accept: application/json' \ - -H 'Content-Type: application/json' \ - -d '{ - "user": "test", "id":"{Conversation id to Delete}"}' -``` + Update Knowledge Base via Local File Upload: + + ```bash + curl -X POST "http://${host_ip}:6007/v1/dataprep" \ + -H "Content-Type: multipart/form-data" \ + -F "files=@./nke-10k-2023.pdf" + ``` + + This command updates a knowledge base by uploading a local file for processing. Update the file path according to your environment. + + Add Knowledge Base via HTTP Links: + + ```bash + curl -X POST "http://${host_ip}:6007/v1/dataprep" \ + -H "Content-Type: multipart/form-data" \ + -F 'link_list=["https://opea.dev"]' + ``` + + This command updates a knowledge base by submitting a list of HTTP links for processing. + + Also, you are able to get the file list that you uploaded: + + ```bash + curl -X POST "http://${host_ip}:6007/v1/dataprep/get_file" \ + -H "Content-Type: application/json" + ``` + + To delete the file/link you uploaded: + + ```bash + # delete link + curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ + -d '{"file_path": "https://opea.dev.txt"}' \ + -H "Content-Type: application/json" + + # delete file + curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ + -d '{"file_path": "nke-10k-2023.pdf"}' \ + -H "Content-Type: application/json" + + # delete all uploaded files and links + curl -X POST "http://${host_ip}:6007/v1/dataprep/delete_file" \ + -d '{"file_path": "all"}' \ + -H "Content-Type: application/json" + ``` + +17. Prompt Registry Microservice + + If you want to update the default Prompts in the application for your user, you can use the following commands: + + ```bash + curl -X 'POST' \ + http://{host_ip}:6015/v1/prompt/create \ + -H 'accept: application/json' \ + -H 'Content-Type: application/json' \ + -d '{ + "prompt_text": "test prompt", "user": "test" + }' + ``` + + Retrieve prompt from database based on user or prompt_id + + ```bash + curl -X 'POST' \ + http://{host_ip}:6015/v1/prompt/get \ + -H 'accept: application/json' \ + -H 'Content-Type: application/json' \ + -d '{ + "user": "test"}' + + curl -X 'POST' \ + http://{host_ip}:6015/v1/prompt/get \ + -H 'accept: application/json' \ + -H 'Content-Type: application/json' \ + -d '{ + "user": "test", "prompt_id":"{prompt_id returned from save prompt route above}"}' + ``` + + Delete prompt from database based on prompt_id provided + + ```bash + curl -X 'POST' \ + http://{host_ip}:6015/v1/prompt/delete \ + -H 'accept: application/json' \ + -H 'Content-Type: application/json' \ + -d '{ + "user": "test", "prompt_id":"{prompt_id to be deleted}"}' + ``` + +18. Chat History Microservice + + To validate the chatHistory Microservice, you can use the following commands. + + Create a sample conversation and get the message ID. 
+ + ```bash + curl -X 'POST' \ + http://${host_ip}:6012/v1/chathistory/create \ + -H 'accept: application/json' \ + -H 'Content-Type: application/json' \ + -d '{ + "data": { + "messages": "test Messages", "user": "test" + } + }' + ``` + + Retrieve the conversation based on user or conversation id + + ```bash + curl -X 'POST' \ + http://${host_ip}:6012/v1/chathistory/get \ + -H 'accept: application/json' \ + -H 'Content-Type: application/json' \ + -d '{ + "user": "test"}' + + curl -X 'POST' \ + http://${host_ip}:6012/v1/chathistory/get \ + -H 'accept: application/json' \ + -H 'Content-Type: application/json' \ + -d '{ + "user": "test", "id":"{Conversation id to retrieve }"}' + ``` + + Delete Conversation from database based on conversation id provided. + + ```bash + curl -X 'POST' \ + http://${host_ip}:6012/v1/chathistory/delete \ + -H 'accept: application/json' \ + -H 'Content-Type: application/json' \ + -d '{ + "user": "test", "id":"{Conversation id to Delete}"}' + ``` ## 🚀 Launch the UI @@ -528,7 +528,8 @@ Here're some of the project's features: ### CODEGEN - Generate code: generate the corresponding code based on the current user's input. - ###### Screen Shot + + Screen Shot ![project-screenshot](../../assets/img/codegen.png) ### DOC SUMMARY diff --git a/ProductivitySuite/kubernetes/manifests/README.md b/ProductivitySuite/kubernetes/manifests/README.md index cfb02c45b..11dd0acd9 100644 --- a/ProductivitySuite/kubernetes/manifests/README.md +++ b/ProductivitySuite/kubernetes/manifests/README.md @@ -16,7 +16,7 @@ In ProductivitySuite, it consists of following pipelines/examples and components - keycloak ``` -## Prerequisites for Deploying ProductivitySuite with ReactUI: +## Prerequisites for Deploying ProductivitySuite with ReactUI To begin with, ensure that you have following prerequisites in place: 1. Kubernetes installation: Make sure that you have Kubernetes installed. diff --git a/README.md b/README.md index a42df4a66..55a53ed1a 100644 --- a/README.md +++ b/README.md @@ -1,5 +1,3 @@ -
- # Generative AI Examples [![version](https://img.shields.io/badge/release-0.9-green)](https://github.com/opea-project/GenAIExamples/releases) @@ -7,8 +5,6 @@ --- -
- ## Introduction GenAIComps-based Generative AI examples offer streamlined deployment, testing, and scalability. All examples are fully compatible with Docker and Kubernetes, supporting a wide range of hardware platforms such as Gaudi, Xeon, and other hardwares. diff --git a/Translation/docker/gaudi/README.md b/Translation/docker/gaudi/README.md index 04ec63453..16a22e976 100644 --- a/Translation/docker/gaudi/README.md +++ b/Translation/docker/gaudi/README.md @@ -71,28 +71,28 @@ docker compose up -d 1. TGI Service -```bash -curl http://${host_ip}:8008/generate \ - -X POST \ - -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":64, "do_sample": true}}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:8008/generate \ + -X POST \ + -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":64, "do_sample": true}}' \ + -H 'Content-Type: application/json' + ``` 2. LLM Microservice -```bash -curl http://${host_ip}:9000/v1/chat/completions \ - -X POST \ - -d '{"query":"Translate this from Chinese to English:\nChinese: 我爱机器翻译。\nEnglish:"}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:9000/v1/chat/completions \ + -X POST \ + -d '{"query":"Translate this from Chinese to English:\nChinese: 我爱机器翻译。\nEnglish:"}' \ + -H 'Content-Type: application/json' + ``` 3. MegaService -```bash -curl http://${host_ip}:8888/v1/translation -H "Content-Type: application/json" -d '{ - "language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}' -``` + ```bash + curl http://${host_ip}:8888/v1/translation -H "Content-Type: application/json" -d '{ + "language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}' + ``` Following the validation of all aforementioned microservices, we are now prepared to construct a mega-service. diff --git a/Translation/docker/xeon/README.md b/Translation/docker/xeon/README.md index 915e2b3a4..25204094d 100644 --- a/Translation/docker/xeon/README.md +++ b/Translation/docker/xeon/README.md @@ -79,28 +79,28 @@ docker compose up -d 1. TGI Service -```bash -curl http://${host_ip}:8008/generate \ - -X POST \ - -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:8008/generate \ + -X POST \ + -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \ + -H 'Content-Type: application/json' + ``` 2. LLM Microservice -```bash -curl http://${host_ip}:9000/v1/chat/completions \ - -X POST \ - -d '{"query":"Translate this from Chinese to English:\nChinese: 我爱机器翻译。\nEnglish:"}' \ - -H 'Content-Type: application/json' -``` + ```bash + curl http://${host_ip}:9000/v1/chat/completions \ + -X POST \ + -d '{"query":"Translate this from Chinese to English:\nChinese: 我爱机器翻译。\nEnglish:"}' \ + -H 'Content-Type: application/json' + ``` 3. MegaService -```bash -curl http://${host_ip}:8888/v1/translation -H "Content-Type: application/json" -d '{ - "language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}' -``` + ```bash + curl http://${host_ip}:8888/v1/translation -H "Content-Type: application/json" -d '{ + "language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}' + ``` Following the validation of all aforementioned microservices, we are now prepared to construct a mega-service. 
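If any of the validation requests above fail, it usually helps to first confirm that the containers started by `docker compose up -d` are actually running before debugging the curl commands themselves. A minimal sanity check, assuming you are still in the compose directory (the service names come from the compose file and may differ):

```bash
# List the services started by docker compose and their current state
docker compose ps

# Inspect recent logs of a suspect service (replace <service> with a name shown by `docker compose ps`)
docker compose logs --tail 100 <service>
```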
diff --git a/VisualQnA/README.md b/VisualQnA/README.md index 910deda2a..a00c0602e 100644 --- a/VisualQnA/README.md +++ b/VisualQnA/README.md @@ -18,7 +18,7 @@ This example guides you through how to deploy a [LLaVA-NeXT](https://github.com/ ![llava screenshot](./assets/img/llava_screenshot1.png) ![llava-screenshot](./assets/img/llava_screenshot2.png) -# Required Models +## Required Models By default, the model is set to `llava-hf/llava-v1.6-mistral-7b-hf`. To use a different model, update the `LVM_MODEL_ID` variable in the [`set_env.sh`](./docker/gaudi/set_env.sh) file. @@ -28,13 +28,13 @@ export LVM_MODEL_ID="llava-hf/llava-v1.6-mistral-7b-hf" You can choose other llava-next models, such as `llava-hf/llava-v1.6-vicuna-13b-hf`, as needed. -# Deploy VisualQnA Service +## Deploy VisualQnA Service The VisualQnA service can be effortlessly deployed on either Intel Gaudi2 or Intel XEON Scalable Processors. Currently we support deploying VisualQnA services with docker compose. -## Setup Environment Variable +### Setup Environment Variable To set up environment variables for deploying VisualQnA services, follow these steps: @@ -65,7 +65,7 @@ To set up environment variables for deploying VisualQnA services, follow these s source ./docker/xeon/set_env.sh ``` -## Deploy VisualQnA on Gaudi +### Deploy VisualQnA on Gaudi Refer to the [Gaudi Guide](./docker/gaudi/README.md) to build docker images from source. @@ -78,7 +78,7 @@ docker compose up -d > Notice: Currently only the **Habana Driver 1.16.x** is supported for Gaudi. -## Deploy VisualQnA on Xeon +### Deploy VisualQnA on Xeon Refer to the [Xeon Guide](./docker/xeon/README.md) for more instructions on building docker images from source. diff --git a/VisualQnA/docker/gaudi/README.md b/VisualQnA/docker/gaudi/README.md index 4b9f055b8..5e0571121 100644 --- a/VisualQnA/docker/gaudi/README.md +++ b/VisualQnA/docker/gaudi/README.md @@ -91,34 +91,34 @@ Follow the instructions to validate MicroServices. 1. LLM Microservice -```bash -http_proxy="" curl http://${host_ip}:9399/v1/lvm -XPOST -d '{"image": "iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAAFUlEQVR42mP8/5+hnoEIwDiqkL4KAcT9GO0U4BxoAAAAAElFTkSuQmCC", "prompt":"What is this?"}' -H 'Content-Type: application/json' -``` + ```bash + http_proxy="" curl http://${host_ip}:9399/v1/lvm -XPOST -d '{"image": "iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAAFUlEQVR42mP8/5+hnoEIwDiqkL4KAcT9GO0U4BxoAAAAAElFTkSuQmCC", "prompt":"What is this?"}' -H 'Content-Type: application/json' + ``` 2. MegaService -```bash -curl http://${host_ip}:8888/v1/visualqna -H "Content-Type: application/json" -d '{ - "messages": [ - { - "role": "user", - "content": [ - { - "type": "text", - "text": "What'\''s in this image?" - }, - { - "type": "image_url", - "image_url": { - "url": "https://www.ilankelman.org/stopsigns/australia.jpg" - } - } - ] - } - ], - "max_tokens": 300 - }' -``` + ```bash + curl http://${host_ip}:8888/v1/visualqna -H "Content-Type: application/json" -d '{ + "messages": [ + { + "role": "user", + "content": [ + { + "type": "text", + "text": "What'\''s in this image?" + }, + { + "type": "image_url", + "image_url": { + "url": "https://www.ilankelman.org/stopsigns/australia.jpg" + } + } + ] + } + ], + "max_tokens": 300 + }' + ``` ## 🚀 Launch the UI diff --git a/VisualQnA/docker/xeon/README.md b/VisualQnA/docker/xeon/README.md index 73e30ab96..d5afda219 100644 --- a/VisualQnA/docker/xeon/README.md +++ b/VisualQnA/docker/xeon/README.md @@ -133,34 +133,34 @@ Follow the instructions to validate MicroServices. 1. 
LLM Microservice -```bash -http_proxy="" curl http://${host_ip}:9399/v1/lvm -XPOST -d '{"image": "iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAAFUlEQVR42mP8/5+hnoEIwDiqkL4KAcT9GO0U4BxoAAAAAElFTkSuQmCC", "prompt":"What is this?"}' -H 'Content-Type: application/json' -``` + ```bash + http_proxy="" curl http://${host_ip}:9399/v1/lvm -XPOST -d '{"image": "iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAAFUlEQVR42mP8/5+hnoEIwDiqkL4KAcT9GO0U4BxoAAAAAElFTkSuQmCC", "prompt":"What is this?"}' -H 'Content-Type: application/json' + ``` 2. MegaService -```bash -curl http://${host_ip}:8888/v1/visualqna -H "Content-Type: application/json" -d '{ - "messages": [ - { - "role": "user", - "content": [ - { - "type": "text", - "text": "What'\''s in this image?" - }, - { - "type": "image_url", - "image_url": { - "url": "https://www.ilankelman.org/stopsigns/australia.jpg" - } - } - ] - } - ], - "max_tokens": 300 - }' -``` + ```bash + curl http://${host_ip}:8888/v1/visualqna -H "Content-Type: application/json" -d '{ + "messages": [ + { + "role": "user", + "content": [ + { + "type": "text", + "text": "What'\''s in this image?" + }, + { + "type": "image_url", + "image_url": { + "url": "https://www.ilankelman.org/stopsigns/australia.jpg" + } + } + ] + } + ], + "max_tokens": 300 + }' + ``` ## 🚀 Launch the UI diff --git a/VisualQnA/kubernetes/manifests/README.md b/VisualQnA/kubernetes/manifests/README.md index 9973b15e7..aa92531f3 100644 --- a/VisualQnA/kubernetes/manifests/README.md +++ b/VisualQnA/kubernetes/manifests/README.md @@ -2,7 +2,7 @@ > [NOTE] > You can also customize the "LVM_MODEL_ID" if needed. - +> > You need to make sure you have created the directory `/mnt/opea-models` to save the cached model on the node where the visualqna workload is running. Otherwise, you need to modify the `visualqna.yaml` file to change the `model-volume` to a directory that exists on the node. ## Deploy On Xeon
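As with the other Kubernetes manifests above, the model cache directory has to exist on the node before the pods start, or the `model-volume` mount will fail. A minimal sketch of the two options mentioned in the note, assuming `model-volume` is a hostPath pointing at `/mnt/opea-models` (verify the exact key and path inside `visualqna.yaml` before editing it):

```bash
# Option 1: create the default cache directory on the node where the visualqna workload will run
sudo mkdir -p /mnt/opea-models

# Option 2 (illustrative only): repoint the hostPath used by model-volume to an existing directory
# before applying the manifest. Check the actual path string inside visualqna.yaml first, and
# replace /data/opea-models with a directory that exists on your node.
sed -i "s#/mnt/opea-models#/data/opea-models#g" visualqna.yaml
kubectl apply -f visualqna.yaml
```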