doc: add ChatQnA deploy on xeon example #104
Conversation
LGTM
Let's see the final effect. Thanks.
Show how to merge the vllm and TGI example into one with tabbed content for the differences. Signed-off-by: David B. Kinder <[email protected]>
@mkbhanda @hshen14 @preethivenkatesh I still need another "green" checkmark reviewer...
More suggestions ... sorry if it feels like nit-picking.
    slice-n-dice ways to enable RAG with vectordb and LLM models, but here we will
    be covering one option of doing it for convenience : we will be showcasing how
    to build an e2e chatQnA with Redis VectorDB and neural-chat-7b-v3-3 model,
    deployed on IDC. For more information on how to setup IDC instance to proceed,
In the spirit of refactor/re-use, should we have kept the "where" part in a separate document - IDC, or a desktop/server, or a VM elsewhere?
Also, the sentence is grammatically incorrect.
    be covering one option of doing it for convenience : we will be showcasing how
    to build an e2e chatQnA with Redis VectorDB and neural-chat-7b-v3-3 model,
    deployed on IDC. For more information on how to setup IDC instance to proceed,
    Please follow the instructions here (*** getting started section***). If you do
I thought "Please" is stylistically frowned upon.
    ## Prerequisites

    First step is to clone the GenAIExamples and GenAIComps. GenAIComps are
Should we have suggested using the pre-built Docker images on Docker Hub instead of the build instructions here? Stick to the v0.9 tag, or be future-proof using the latest tag?
    git checkout tags/v0.9
    ```

    The examples utilize model weights from HuggingFace and langchain.
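The v0.9-vs-latest question above is easy to experiment with. A minimal sketch of pinning a checkout to a release tag, using a throwaway local repository so it runs offline; the repository and tag stand in for a real GenAIComps clone, where the same `git checkout tags/v0.9` applies:

```python
import subprocess
import tempfile

def git(*args, cwd):
    """Run a git command in `cwd` and return its stdout."""
    return subprocess.run(("git",) + args, cwd=cwd, check=True,
                          capture_output=True, text=True).stdout.strip()

# Throwaway repository standing in for a real GenAIComps clone.
repo = tempfile.mkdtemp()
git("init", "-q", cwd=repo)
git("-c", "user.email=demo@example.com", "-c", "user.name=demo",
    "commit", "-q", "--allow-empty", "-m", "initial", cwd=repo)
git("tag", "v0.9", cwd=repo)

# Pin the working tree to the release tag, as the guide's snippet does;
# checking out the default branch (or the latest tag) would instead track
# newer, possibly incompatible, content.
git("checkout", "-q", "tags/v0.9", cwd=repo)
pinned = git("describe", "--tags", cwd=repo)
print(pinned)  # v0.9
```

Pinning trades currency for reproducibility: the doc's commands keep working against the tagged tree even after the default branch moves on.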
LangChain
    ## Prepare (Building / Pulling) Docker images

    This step will involve building/pulling ( maybe in future) relevant docker
future has become present :-)
    {"id":"e1eb0e44f56059fc01aa0334b1dac313","query":"Human: Answer the question based only on the following context:\n Deep learning is...\n Question: What is Deep Learning?","max_new_tokens":1024,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}

    ```
    You may notice reranking microservice are with state ('ID' and other meta data),
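The payload in that log is what the pipeline forwards to the LLM after retrieval and reranking: the retrieved context is folded into the prompt, and the generation parameters ride along. A sketch rebuilding that shape; the helper name is hypothetical, while the field names and default values are taken from the JSON shown above:

```python
import json

def build_llm_request(context, question, **overrides):
    """Hypothetical helper mirroring the request body in the log above:
    context is embedded in the prompt, generation parameters accompany it."""
    prompt = ("Human: Answer the question based only on the following "
              f"context:\n {context}\n Question: {question}")
    payload = {
        "query": prompt,
        "max_new_tokens": 1024,
        "top_k": 10,
        "top_p": 0.95,
        "typical_p": 0.95,
        "temperature": 0.01,
        "repetition_penalty": 1.03,
        "streaming": True,
    }
    payload.update(overrides)  # allow per-call parameter tweaks
    return payload

req = build_llm_request("Deep learning is...", "What is Deep Learning?")
print(json.dumps(req)[:60] + "...")
```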
Unclear what the message is here.
    "max_tokens": 32, "temperature": 0}'
    ```

    vLLM service generate text for the input prompt. Here is the expected result
The vLLM service generates ...
    {"generated_text":"We have all heard the buzzword, but our understanding of it is still growing. It’s a sub-field of Machine Learning, and it’s the cornerstone of today’s Machine Learning breakthroughs.\n\nDeep Learning makes machines act more like humans through their ability to generalize from very large"}
    ```

    **NOTE**: After launch the vLLM, it takes few minutes for vLLM server to load
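The expected response body is a small JSON object whose `generated_text` field carries the completion, so extracting the answer is one decode away. A sketch against a truncated copy of the output shown above:

```python
import json

# Truncated copy of the expected response body shown above.
body = ('{"generated_text":"We have all heard the buzzword, but our '
        'understanding of it is still growing."}')

resp = json.loads(body)
answer = resp.get("generated_text", "")  # "" if the field is absent
print(answer)
```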
After launching the vLLM service it takes a few minutes for the vLLM server to load
    ```

    TGI service generate text for the input prompt. Here is the expected result from TGI:
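One difference worth tabbing, per the merged-example approach: TGI's `/generate` endpoint takes a different request body than the vLLM call shown earlier, with the prompt under `inputs` and generation options under `parameters`. A sketch of that body; the host, port, and parameter values here are illustrative, not taken from the guide:

```python
import json

# Request body shape for TGI's /generate endpoint: prompt under "inputs",
# generation options under "parameters". Values are illustrative.
tgi_request = {
    "inputs": "What is Deep Learning?",
    "parameters": {"max_new_tokens": 32, "temperature": 0.01},
}

encoded = json.dumps(tgi_request)
print(encoded)
# Sent as, e.g. (placeholder host/port):
#   curl http://<host>:<port>/generate -X POST \
#        -d '<encoded body>' -H 'Content-Type: application/json'
```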
We are eating up articles like "the"; this should read "The TGI service generates ...".
    ```

    and the log shows model warm up, please wait for a while and try it later.
s/try it later/retry.
Sorry @dbkinder, I did not get to this PR prior to merge.