Refine ChatQnA README for TGI (#715)
* update chatqna readme for tgi

Signed-off-by: letonghan <[email protected]>

* update log block

Signed-off-by: letonghan <[email protected]>

---------

Signed-off-by: letonghan <[email protected]>
letonghan authored Sep 3, 2024
1 parent e5ec38c commit afc3341
Showing 5 changed files with 75 additions and 2 deletions.
13 changes: 13 additions & 0 deletions ChatQnA/README.md
@@ -224,6 +224,19 @@ Refer to the [Intel Technology enabling for Openshift readme](https://github.com

## Consume ChatQnA Service

Before consuming the ChatQnA service, make sure the TGI/vLLM service is ready. It can take up to 2 minutes to start.

```bash
# TGI example
docker logs tgi-service | grep Connected
```

Do not consume the ChatQnA service until the TGI log contains a line like the one below.

```log
2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
```
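
The check above can also be polled in a loop instead of being rerun by hand. The following is a minimal sketch, not part of the ChatQnA repository: `wait_for_log` is a hypothetical helper name, and the 120-second timeout only mirrors the "up to 2 minutes" estimate given above.

```bash
# Poll a log-producing command until its output contains a pattern
# or the timeout expires.
# Usage: wait_for_log PATTERN TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
wait_for_log() {
  wfl_pattern="$1"
  wfl_timeout="$2"
  wfl_interval="$3"
  shift 3
  wfl_waited=0
  while :; do
    # Merge stderr into stdout so the pattern is found on either stream.
    if "$@" 2>&1 | grep -q -- "$wfl_pattern"; then
      return 0
    fi
    # Give up once the time budget is spent.
    if [ "$wfl_waited" -ge "$wfl_timeout" ]; then
      return 1
    fi
    sleep "$wfl_interval"
    wfl_waited=$((wfl_waited + wfl_interval))
  done
}

# Example (container name taken from the snippet above):
# wait_for_log Connected 120 5 docker logs tgi-service && echo "TGI is ready"
```

The log command is passed as arguments, so the same helper works with any container name or with a different log source.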

Two ways of consuming ChatQnA Service:

1. Use cURL command on terminal
16 changes: 16 additions & 0 deletions ChatQnA/docker/gaudi/README.md
@@ -306,6 +306,22 @@ curl http://${host_ip}:8000/v1/reranking \

6. LLM backend Service

On first startup, this service takes longer because it has to download the model files. Once the download finishes, the service is ready.

Run the command below to check whether the LLM service is ready.

```bash
docker logs ${CONTAINER_ID} | grep Connected
```

If the service is ready, the output contains a line like the one below.

```log
2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
```
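
`${CONTAINER_ID}` has to be set before the command above will work. One possible way to look it up is sketched below, assuming the container name contains a known substring. The helper name `container_id_by_name` and the `tgi` filter are illustrative assumptions; check `docker ps` for the actual container name in your deployment.

```bash
# Print the ID of the first running container whose name matches the filter.
container_id_by_name() {
  docker ps --quiet --filter "name=$1" | head -n 1
}

# Example (the "tgi" filter is a placeholder, adjust it to your container name):
# CONTAINER_ID="$(container_id_by_name tgi)"
# docker logs "${CONTAINER_ID}" 2>&1 | grep Connected
```

The `2>&1` in the example merges the container's stderr into the pipe, so the `Connected` line is matched whichever stream it was written to.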

Then run the `cURL` command below to validate the service.

```bash
#TGI Service
curl http://${host_ip}:8005/generate \
16 changes: 16 additions & 0 deletions ChatQnA/docker/gpu/README.md
@@ -192,6 +192,22 @@ curl http://${host_ip}:8000/v1/reranking \

6. TGI Service

On first startup, this service takes longer because it has to download the model files. Once the download finishes, the service is ready.

Run the command below to check whether the TGI service is ready.

```bash
docker logs ${CONTAINER_ID} | grep Connected
```

If the service is ready, the output contains a line like the one below.

```log
2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
```
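
As a complement to the log check, the TGI router also serves a `GET /health` route, so readiness can be probed over HTTP. The following is a minimal sketch, assuming TGI is reachable on port `8008` as in this README's TGI request; `tgi_healthy` is a hypothetical helper name, not something the repository provides.

```bash
# Succeed only when the health endpoint answers with HTTP 200.
# Usage: tgi_healthy BASE_URL
tgi_healthy() {
  # -s silences progress output, -o discards the body,
  # -w prints only the HTTP status code (000 if the connection fails).
  tgi_status="$(curl -s -o /dev/null -w '%{http_code}' "$1/health")"
  [ "$tgi_status" = "200" ]
}

# Example (host_ip as exported earlier in this README):
# tgi_healthy "http://${host_ip}:8008" && echo "TGI is ready"
```

Unlike the log check, this does not depend on the exact wording of the `Connected` log line.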

Then run the `cURL` command below to validate TGI.

```bash
curl http://${host_ip}:8008/generate \
-X POST \
16 changes: 14 additions & 2 deletions ChatQnA/docker/xeon/README.md
@@ -303,9 +303,21 @@ curl http://${host_ip}:8000/v1/reranking\

6. LLM backend Service

In first startup, this service will take more time to download the LLM file. After it's finished, the service will be ready.
On first startup, this service takes longer because it has to download the model files. Once the download finishes, the service is ready.

Use `docker logs CONTAINER_ID` to check if the download is finished.
Run the command below to check whether the LLM service is ready.

```bash
docker logs ${CONTAINER_ID} | grep Connected
```

If the service is ready, the output contains a line like the one below.

```log
2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
```

Then run the `cURL` command below to validate the service.

```bash
# TGI service
16 changes: 16 additions & 0 deletions ChatQnA/docker/xeon/README_qdrant.md
@@ -276,6 +276,22 @@ curl http://${host_ip}:6046/v1/reranking\

6. TGI Service

On first startup, this service takes longer because it has to download the model files. Once the download finishes, the service is ready.

Run the command below to check whether the TGI service is ready.

```bash
docker logs ${CONTAINER_ID} | grep Connected
```

If the service is ready, the output contains a line like the one below.

```log
2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
```

Then run the `cURL` command below to validate TGI.

```bash
curl http://${host_ip}:6042/generate \
-X POST \
