Add default model in readme for FaqGen and DocSum #693

Merged 8 commits on Aug 30, 2024
10 changes: 10 additions & 0 deletions DocSum/README.md
@@ -23,6 +23,16 @@ Currently we support two ways of deploying Document Summarization services with

2. Start services using the docker images `built from source`: [Guide](./docker)

### Required Models

We set default model as "Intel/neural-chat-7b-v3-3", change "LLM_MODEL_ID" in "set_env.sh" if you want to use other models.

```bash
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
```

If you use a gated model, you also need to provide your [Hugging Face token](https://huggingface.co/docs/hub/security-tokens) in the "HUGGINGFACEHUB_API_TOKEN" environment variable.
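
For example, a minimal sketch of providing the token before running the deployment steps (substitute your own token value):

```bash
# Required only for gated models (e.g. Llama family); use your own Hugging Face access token.
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
```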

### Setup Environment Variable

To set up environment variables for deploying Document Summarization services, follow these steps:
5 changes: 5 additions & 0 deletions DocSum/docker/gaudi/README.md
@@ -64,6 +64,11 @@ Then run the command `docker images`, you will have the following Docker Images:

## 🚀 Start Microservices and MegaService

### Required Models

We set default model as "Intel/neural-chat-7b-v3-3", change "LLM_MODEL_ID" in following setting if you want to use other models.
If use gated models, you also need to provide [huggingface token](https://huggingface.co/docs/hub/security-tokens) to "HUGGINGFACEHUB_API_TOKEN" environment variable.
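
For example, a minimal sketch of the two exports (the full list is in the Setup Environment Variables step below):

```bash
# Default model; change this value to serve a different model.
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
# Required only for gated models; use your own Hugging Face access token.
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
```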

### Setup Environment Variables

Since the `compose.yaml` will consume some environment variables, you need to setup them in advance as below.
5 changes: 5 additions & 0 deletions DocSum/docker/xeon/README.md
@@ -73,6 +73,11 @@ Then run the command `docker images`, you will have the following Docker Images:

## 🚀 Start Microservices and MegaService

### Required Models

We set default model as "Intel/neural-chat-7b-v3-3", change "LLM_MODEL_ID" in following Environment Variables setting if you want to use other models.
If use gated models, you also need to provide [huggingface token](https://huggingface.co/docs/hub/security-tokens) to "HUGGINGFACEHUB_API_TOKEN" environment variable.
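
For example, a minimal sketch of the two exports (the full list is in the Setup Environment Variables step below):

```bash
# Default model; change this value to serve a different model.
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
# Required only for gated models; use your own Hugging Face access token.
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
```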

### Setup Environment Variables

Since the `compose.yaml` will consume some environment variables, you need to setup them in advance as below.
4 changes: 3 additions & 1 deletion DocSum/kubernetes/README.md
@@ -20,7 +20,9 @@ These will be available on Docker Hub soon, simplifying installation.
This involves deploying the application pipeline custom resource. You can use docsum_xeon.yaml if you have just a Xeon cluster or docsum_gaudi.yaml if you have a Gaudi cluster.

1. Setup Environment variables. These are specific to the user. Skip the proxy settings if you are not operating behind one.


We use "Intel/neural-chat-7b-v3-3" as an example. If you want to use other models, change "LLM_MODEL_ID" in following setting and change "MODEL_ID" in manifests yaml file.

```bash
export no_proxy=${your_no_proxy}
export http_proxy=${your_http_proxy}
8 changes: 7 additions & 1 deletion FaqGen/docker/gaudi/README.md
@@ -64,6 +64,12 @@ Then run the command `docker images`, you will have the following Docker Images:

## 🚀 Start Microservices and MegaService

### Required Models

We set default model as "meta-llama/Meta-Llama-3-8B-Instruct", change "LLM_MODEL_ID" in following Environment Variables setting if you want to use other models.

If use gated models, you also need to provide [huggingface token](https://huggingface.co/docs/hub/security-tokens) to "HUGGINGFACEHUB_API_TOKEN" environment variable.

### Setup Environment Variables

Since the `compose.yaml` will consume some environment variables, you need to setup them in advance as below.
@@ -72,7 +78,7 @@ Since the `compose.yaml` will consume some environment variables, you need to setup them in advance as below.
export no_proxy=${your_no_proxy}
export http_proxy=${your_http_proxy}
export https_proxy=${your_http_proxy}
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct"
export TGI_LLM_ENDPOINT="http://${your_ip}:8008"
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export MEGA_SERVICE_HOST_IP=${host_ip}
8 changes: 7 additions & 1 deletion FaqGen/docker/xeon/README.md
@@ -63,6 +63,12 @@ Then run the command `docker images`, you will have the following Docker Images:

## 🚀 Start Microservices and MegaService

### Required Models

We set default model as "meta-llama/Meta-Llama-3-8B-Instruct", change "LLM_MODEL_ID" in following Environment Variables setting if you want to use other models.

If use gated models, you also need to provide [huggingface token](https://huggingface.co/docs/hub/security-tokens) to "HUGGINGFACEHUB_API_TOKEN" environment variable.

### Setup Environment Variables

Since the `compose.yaml` will consume some environment variables, you need to setup them in advance as below.
@@ -71,7 +77,7 @@ Since the `compose.yaml` will consume some environment variables, you need to setup them in advance as below.
export no_proxy=${your_no_proxy}
export http_proxy=${your_http_proxy}
export https_proxy=${your_http_proxy}
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct"
export TGI_LLM_ENDPOINT="http://${your_ip}:8008"
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export MEGA_SERVICE_HOST_IP=${host_ip}
11 changes: 10 additions & 1 deletion FaqGen/kubernetes/manifests/README.md
@@ -3,7 +3,16 @@
> [NOTE]
> The following values must be set before you can deploy:
> HUGGINGFACEHUB_API_TOKEN
-> You can also customize the "MODEL_ID" and "model-volume"
+> You can also customize the "MODEL_ID" and "model-volume".

## Required Models

The default model is set to "meta-llama/Meta-Llama-3-8B-Instruct". If you want to use a different model, change the "--model-id" argument in `xeon/faqgen.yaml` or `gaudi/faqgen.yaml`.

```yaml
- --model-id
- 'meta-llama/Meta-Llama-3-8B-Instruct'
```

If you use a gated model, you also need to provide your [Hugging Face token](https://huggingface.co/docs/hub/security-tokens) in the "HUGGINGFACEHUB_API_TOKEN" environment variable.
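
For example, since the default "meta-llama/Meta-Llama-3-8B-Instruct" model is gated, a hypothetical sketch of wiring the token into the manifest before deploying could look like this (the placeholder string below is illustrative; check your copy of the manifest for the actual value):

```bash
# Your Hugging Face access token that has been granted access to the gated model.
export HUGGINGFACEHUB_API_TOKEN="your_hf_token"
# Replace the token placeholder in the manifest (placeholder name is illustrative, not the actual string).
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" xeon/faqgen.yaml
```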

## Deploy On Xeon
