Upgrade tgi-gaudi to version 2.0.6 (#551)
* Upgrade tgi-gaudi to version 2.0.6
* Fix faqgen test to align with GenAIExamples

Signed-off-by: Lianhao Lu <[email protected]>
lianhao authored Nov 14, 2024
1 parent 691bbc5 commit 915baa0
Showing 15 changed files with 22 additions and 18 deletions.
2 changes: 1 addition & 1 deletion helm-charts/agentqna/gaudi-values.yaml
@@ -8,7 +8,7 @@ tgi:
accelDevice: "gaudi"
image:
repository: ghcr.io/huggingface/tgi-gaudi
-tag: "2.0.5"
+tag: "2.0.6"
resources:
limits:
habana.ai/gaudi: 4
2 changes: 1 addition & 1 deletion helm-charts/audioqna/gaudi-values.yaml
@@ -5,7 +5,7 @@ tgi:
accelDevice: "gaudi"
image:
repository: ghcr.io/huggingface/tgi-gaudi
-tag: "2.0.5"
+tag: "2.0.6"
resources:
limits:
habana.ai/gaudi: 1
2 changes: 1 addition & 1 deletion helm-charts/chatqna/gaudi-values.yaml
@@ -9,7 +9,7 @@ tgi:
accelDevice: "gaudi"
image:
repository: ghcr.io/huggingface/tgi-gaudi
-tag: "2.0.5"
+tag: "2.0.6"
resources:
limits:
habana.ai/gaudi: 1
4 changes: 2 additions & 2 deletions helm-charts/chatqna/guardrails-gaudi-values.yaml
@@ -49,7 +49,7 @@ tgi:
accelDevice: "gaudi"
image:
repository: ghcr.io/huggingface/tgi-gaudi
-tag: "2.0.5"
+tag: "2.0.6"
resources:
limits:
habana.ai/gaudi: 1
@@ -81,7 +81,7 @@ tgi-guardrails:
LLM_MODEL_ID: "meta-llama/Meta-Llama-Guard-2-8B"
image:
repository: ghcr.io/huggingface/tgi-gaudi
-tag: "2.0.5"
+tag: "2.0.6"
resources:
limits:
habana.ai/gaudi: 1
2 changes: 1 addition & 1 deletion helm-charts/codegen/gaudi-values.yaml
@@ -5,7 +5,7 @@ tgi:
accelDevice: "gaudi"
image:
repository: ghcr.io/huggingface/tgi-gaudi
-tag: "2.0.5"
+tag: "2.0.6"
resources:
limits:
habana.ai/gaudi: 1
2 changes: 1 addition & 1 deletion helm-charts/codetrans/gaudi-values.yaml
@@ -5,7 +5,7 @@ tgi:
accelDevice: "gaudi"
image:
repository: ghcr.io/huggingface/tgi-gaudi
-tag: "2.0.5"
+tag: "2.0.6"
resources:
limits:
habana.ai/gaudi: 1
2 changes: 1 addition & 1 deletion helm-charts/common/agent/gaudi-values.yaml
@@ -9,7 +9,7 @@ tgi:
accelDevice: "gaudi"
image:
repository: ghcr.io/huggingface/tgi-gaudi
-tag: "2.0.5"
+tag: "2.0.6"
resources:
limits:
habana.ai/gaudi: 4
2 changes: 1 addition & 1 deletion helm-charts/common/tgi/gaudi-values.yaml
@@ -9,7 +9,7 @@ accelDevice: "gaudi"

image:
repository: ghcr.io/huggingface/tgi-gaudi
-tag: "2.0.5"
+tag: "2.0.6"

MAX_INPUT_LENGTH: "1024"
MAX_TOTAL_TOKENS: "2048"
2 changes: 1 addition & 1 deletion helm-charts/docsum/gaudi-values.yaml
@@ -5,7 +5,7 @@ tgi:
accelDevice: "gaudi"
image:
repository: ghcr.io/huggingface/tgi-gaudi
-tag: "2.0.5"
+tag: "2.0.6"
resources:
limits:
habana.ai/gaudi: 1
6 changes: 4 additions & 2 deletions helm-charts/faqgen/README.md
@@ -18,8 +18,10 @@ Open another terminal and run the following command to verify the service if wor

```console
curl http://localhost:8888/v1/faqgen \
-  -H "Content-Type: application/json" \
-  -d '{"messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}'
+  -H "Content-Type: multipart/form-data" \
+  -F "messages=Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5." \
+  -F "max_tokens=32" \
+  -F "stream=false"
```
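
Reviewer note: for anyone calling the endpoint from a script rather than `curl`, the new multipart form of the request can be sketched in Python as below. This is a minimal illustration, not part of the commit. The helper names `encode_multipart` and `faqgen_request` are hypothetical, and the URL and field names are simply copied from the `curl` command above.

```python
import urllib.request
import uuid


def encode_multipart(fields, boundary=None):
    """Encode a dict of text fields as a multipart/form-data body.

    Hypothetical helper: mirrors what `curl -F name=value` sends for
    plain text fields (no file uploads).
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n'
            "\r\n"
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")  # closing delimiter
    body = "".join(parts).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def faqgen_request(text, url="http://localhost:8888/v1/faqgen",
                   max_tokens=32, stream=False):
    """Build (but do not send) a POST request with the three form fields."""
    fields = {
        "messages": text,
        "max_tokens": str(max_tokens),
        "stream": "true" if stream else "false",
    }
    body, content_type = encode_multipart(fields)
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
```

Passing the result to `urllib.request.urlopen(...)` would send it, assuming the service is port-forwarded to `localhost:8888` as in the README.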

### Verify the workload through UI
2 changes: 1 addition & 1 deletion helm-charts/faqgen/gaudi-values.yaml
@@ -5,7 +5,7 @@ tgi:
accelDevice: "gaudi"
image:
repository: ghcr.io/huggingface/tgi-gaudi
-tag: "2.0.5"
+tag: "2.0.6"
resources:
limits:
habana.ai/gaudi: 1
6 changes: 4 additions & 2 deletions helm-charts/faqgen/templates/tests/test-pod.yaml
@@ -20,8 +20,10 @@ spec:
max_retry=20;
for ((i=1; i<=max_retry; i++)); do
curl http://{{ include "faqgen.fullname" . }}:{{ .Values.service.port }}/v1/faqgen -sS --fail-with-body \
-  -H "Content-Type: application/json" \
-  -d '{"messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5.","max_tokens":17}' && break;
+  -H "Content-Type: multipart/form-data" \
+  -F "messages=Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5." \
+  -F "max_tokens=32" \
+  -F "stream=false" && break;
curlcode=$?
if [[ $curlcode -eq 7 ]]; then sleep 10; else echo "curl failed with code $curlcode"; exit 1; fi;
done;
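
Reviewer note: the retry policy in this loop (retry only while the service is unreachable, i.e. curl exit code 7; fail immediately on any other error) can be sketched in Python as follows. This is an illustration under stated assumptions, not the chart's code. The function name `retry_on_connect_failure` is hypothetical, `ConnectionError` stands in for curl's exit code 7, and re-raising the last error once retries are exhausted is a choice made for this sketch; the shell loop above simply falls through.

```python
import time


def retry_on_connect_failure(call, max_retry=20, delay=10, sleep=time.sleep):
    """Run `call()` until it succeeds, retrying only on connection failures.

    ConnectionError plays the role of curl exit code 7 (could not connect).
    Any other exception propagates at once, like the `exit 1` branch.
    """
    if max_retry < 1:
        raise ValueError("max_retry must be at least 1")
    last_error = None
    for attempt in range(1, max_retry + 1):
        try:
            return call()
        except ConnectionError as err:
            last_error = err
            if attempt < max_retry:
                sleep(delay)  # service not reachable yet; wait and retry
    # Assumption of this sketch: surface the last connection error.
    raise last_error
```

With `urllib`, a refused connection surfaces as a `URLError` whose `reason` holds the underlying `ConnectionRefusedError`, so a callable built on `urlopen` would need to re-raise that reason for this helper to retry it.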
2 changes: 1 addition & 1 deletion helm-charts/visualqna/gaudi-values.yaml
@@ -9,7 +9,7 @@ tgi:
accelDevice: "gaudi"
image:
repository: ghcr.io/huggingface/tgi-gaudi
-tag: "2.0.5"
+tag: "2.0.6"
resources:
limits:
habana.ai/gaudi: 1
2 changes: 1 addition & 1 deletion microservices-connector/config/manifests/tgi_gaudi.yaml
@@ -88,7 +88,7 @@ spec:
optional: true
securityContext:
{}
-image: "ghcr.io/huggingface/tgi-gaudi:2.0.5"
+image: "ghcr.io/huggingface/tgi-gaudi:2.0.6"
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /data
Original file line number Diff line number Diff line change
@@ -26,7 +26,7 @@ Should you desire to use the Gaudi accelerator, two alternate images are used fo
For Gaudi:

- tei-embedding-service: ghcr.io/huggingface/tei-gaudi:1.5.0
-- tgi-service: ghcr.io/huggingface/tgi-gaudi:2.0.5
+- tgi-service: ghcr.io/huggingface/tgi-gaudi:2.0.6

## Deploy ChatQnA pipeline
