Add default model in readme for FaqGen and DocSum (#693)
* update default model in readme for DocSum

Signed-off-by: Xinyao Wang <[email protected]>
XinyaoWa authored Aug 30, 2024
1 parent e6f5d13 commit d487093
Showing 7 changed files with 47 additions and 4 deletions.
10 changes: 10 additions & 0 deletions DocSum/README.md
@@ -23,6 +23,16 @@ Currently we support two ways of deploying Document Summarization services with

2. Start services using the docker images `built from source`: [Guide](./docker)

### Required Models

We set `Intel/neural-chat-7b-v3-3` as the default model; change `LLM_MODEL_ID` in `set_env.sh` if you want to use another model.

```bash
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
```

If you use a gated model, you also need to provide a [Hugging Face token](https://huggingface.co/docs/hub/security-tokens) in the `HUGGINGFACEHUB_API_TOKEN` environment variable.
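For example, a minimal sketch (here `your_hf_api_token` is a placeholder for a token you generate on Hugging Face, matching the pattern used in the compose guides):

```bash
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
```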

### Setup Environment Variables

To set up environment variables for deploying Document Summarization services, follow these steps:
5 changes: 5 additions & 0 deletions DocSum/docker/gaudi/README.md
@@ -64,6 +64,11 @@ Then run the command `docker images`, you will have the following Docker Images:

## 🚀 Start Microservices and MegaService

### Required Models

We set `Intel/neural-chat-7b-v3-3` as the default model; change `LLM_MODEL_ID` in the settings below if you want to use another model.
If you use a gated model, you also need to provide a [Hugging Face token](https://huggingface.co/docs/hub/security-tokens) in the `HUGGINGFACEHUB_API_TOKEN` environment variable.
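For example, a minimal sketch of the two relevant exports, mirroring the top-level DocSum README (`your_hf_api_token` is a placeholder; the full variable list follows in the next section):

```bash
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
```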

### Setup Environment Variables

Since the `compose.yaml` will consume some environment variables, you need to set them up in advance as below.
5 changes: 5 additions & 0 deletions DocSum/docker/xeon/README.md
@@ -73,6 +73,11 @@ Then run the command `docker images`, you will have the following Docker Images:

## 🚀 Start Microservices and MegaService

### Required Models

We set `Intel/neural-chat-7b-v3-3` as the default model; change `LLM_MODEL_ID` in the Environment Variables settings below if you want to use another model.
If you use a gated model, you also need to provide a [Hugging Face token](https://huggingface.co/docs/hub/security-tokens) in the `HUGGINGFACEHUB_API_TOKEN` environment variable.
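As in the Gaudi guide, a minimal sketch of the relevant exports (`your_hf_api_token` is a placeholder):

```bash
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
```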

### Setup Environment Variables

Since the `compose.yaml` will consume some environment variables, you need to set them up in advance as below.
4 changes: 3 additions & 1 deletion DocSum/kubernetes/README.md
@@ -20,7 +20,9 @@ These will be available on Docker Hub soon, simplifying installation.
This involves deploying the application pipeline custom resource. You can use `docsum_xeon.yaml` if you have just a Xeon cluster or `docsum_gaudi.yaml` if you have a Gaudi cluster.

1. Setup Environment variables. These are specific to the user. Skip the proxy settings if you are not operating behind one.

We use `Intel/neural-chat-7b-v3-3` as an example; if you want to use another model, change `LLM_MODEL_ID` in the export list below and change `MODEL_ID` in the manifest yaml file.
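A hypothetical manifest excerpt for the model change (the actual field layout in `docsum_xeon.yaml` / `docsum_gaudi.yaml` may differ; check the manifest itself):

```yaml
# Hypothetical excerpt -- the real manifest layout may differ.
env:
  - name: MODEL_ID
    value: "Intel/neural-chat-7b-v3-3"
```

The environment variable exports for this step follow: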

```bash
export no_proxy=${your_no_proxy}
export http_proxy=${your_http_proxy}
8 changes: 7 additions & 1 deletion FaqGen/docker/gaudi/README.md
@@ -64,6 +64,12 @@ Then run the command `docker images`, you will have the following Docker Images:

## 🚀 Start Microservices and MegaService

### Required Models

We set `meta-llama/Meta-Llama-3-8B-Instruct` as the default model; change `LLM_MODEL_ID` in the Environment Variables settings below if you want to use another model.

If you use a gated model (note that the default Meta Llama model is gated), you also need to provide a [Hugging Face token](https://huggingface.co/docs/hub/security-tokens) in the `HUGGINGFACEHUB_API_TOKEN` environment variable.

### Setup Environment Variables

Since the `compose.yaml` will consume some environment variables, you need to set them up in advance as below.
@@ -72,7 +78,7 @@ Since the `compose.yaml` will consume some environment variables, you need to set them up in advance as below.
export no_proxy=${your_no_proxy}
export http_proxy=${your_http_proxy}
export https_proxy=${your_http_proxy}
-export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
+export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct"
export TGI_LLM_ENDPOINT="http://${your_ip}:8008"
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export MEGA_SERVICE_HOST_IP=${host_ip}
8 changes: 7 additions & 1 deletion FaqGen/docker/xeon/README.md
@@ -63,6 +63,12 @@ Then run the command `docker images`, you will have the following Docker Images:

## 🚀 Start Microservices and MegaService

### Required Models

We set `meta-llama/Meta-Llama-3-8B-Instruct` as the default model; change `LLM_MODEL_ID` in the Environment Variables settings below if you want to use another model.

If you use a gated model (note that the default Meta Llama model is gated), you also need to provide a [Hugging Face token](https://huggingface.co/docs/hub/security-tokens) in the `HUGGINGFACEHUB_API_TOKEN` environment variable.

### Setup Environment Variables

Since the `compose.yaml` will consume some environment variables, you need to set them up in advance as below.
@@ -71,7 +77,7 @@ Since the `compose.yaml` will consume some environment variables, you need to set them up in advance as below.
export no_proxy=${your_no_proxy}
export http_proxy=${your_http_proxy}
export https_proxy=${your_http_proxy}
-export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
+export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct"
export TGI_LLM_ENDPOINT="http://${your_ip}:8008"
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export MEGA_SERVICE_HOST_IP=${host_ip}
11 changes: 10 additions & 1 deletion FaqGen/kubernetes/manifests/README.md
@@ -3,7 +3,16 @@
> [!NOTE]
> The following values must be set before you can deploy:
> HUGGINGFACEHUB_API_TOKEN
> You can also customize the "MODEL_ID" and "model-volume".

## Required Models

We set `meta-llama/Meta-Llama-3-8B-Instruct` as the default model; if you want to use another model, change the `--model-id` argument in `xeon/faqgen.yaml` or `gaudi/faqgen.yaml`:
```yaml
- --model-id
- 'meta-llama/Meta-Llama-3-8B-Instruct'
```

If you use a gated model, you also need to provide a [Hugging Face token](https://huggingface.co/docs/hub/security-tokens) in the `HUGGINGFACEHUB_API_TOKEN` environment variable.
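A hypothetical excerpt showing how the token might be wired into the manifest (the actual layout in `xeon/faqgen.yaml` / `gaudi/faqgen.yaml` may differ):

```yaml
# Hypothetical excerpt -- check the actual manifest for the real field names.
env:
  - name: HUGGINGFACEHUB_API_TOKEN
    value: "your-hf-token"
```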

## Deploy On Xeon

