support multiple env-var for ENDPOINT in GMC #166
```
kubectl get svc -n switch
NAME                          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
embedding-svc                 ClusterIP   10.96.212.67    <none>        6000/TCP            11m
embedding-svc-large           ClusterIP   10.96.214.151   <none>        6000/TCP            11m
embedding-svc-small           ClusterIP   10.96.39.68     <none>        6000/TCP            11m
llm-svc                       ClusterIP   10.96.236.0     <none>        9000/TCP            11m
llm-svc-intel                 ClusterIP   10.96.10.72     <none>        9000/TCP            11m
llm-svc-llama                 ClusterIP   10.96.176.105   <none>        9000/TCP            11m
redis-vector-db               ClusterIP   10.96.190.159   <none>        6379/TCP,8001/TCP   11m
reranking-svc                 ClusterIP   10.96.224.112   <none>        8000/TCP            11m
retriever-svc                 ClusterIP   10.96.173.137   <none>        7000/TCP            11m
router-service                ClusterIP   10.96.185.113   <none>        8080/TCP            11m
tei-embedding-svc-bge-small   ClusterIP   10.96.236.40    <none>        6006/TCP            11m
tei-embedding-svc-bge15       ClusterIP   10.96.57.9      <none>        6006/TCP            11m
tei-reranking-svc             ClusterIP   10.96.34.209    <none>        8808/TCP            11m
tgi-service-intel             ClusterIP   10.96.187.199   <none>        9009/TCP            11m
tgi-service-llama             ClusterIP   10.96.41.154    <none>        9009/TCP            11m
```
The solution is to overwrite and inject the ENV into every deployment according to the specific config.
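As an illustration of that approach, here is a minimal sketch of giving each branch-specific deployment its own endpoint. The service names and ports are taken from the `kubectl get svc` output above; the deployment names are assumptions (check `kubectl get deploy -n switch` for the real ones):

```bash
# Sketch only: point each branch's microservice at its own backend by
# overwriting the shared endpoint env var on that deployment.
kubectl -n switch set env deployment/embedding-svc-large \
  TEI_EMBEDDING_ENDPOINT=http://tei-embedding-svc-bge15.switch.svc.cluster.local:6006
kubectl -n switch set env deployment/embedding-svc-small \
  TEI_EMBEDDING_ENDPOINT=http://tei-embedding-svc-bge-small.switch.svc.cluster.local:6006
kubectl -n switch set env deployment/llm-svc-intel \
  TGI_LLM_ENDPOINT=http://tgi-service-intel.switch.svc.cluster.local:9009
kubectl -n switch set env deployment/llm-svc-llama \
  TGI_LLM_ENDPOINT=http://tgi-service-llama.switch.svc.cluster.local:9009
```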
Hi @irisdingbj, it's unnecessary to set the ENV in every deployment; the endpoint and model ID can be declared per step under `config` in the GMConnector spec:

```yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
apiVersion: gmc.opea.io/v1alpha3
kind: GMConnector
metadata:
  labels:
    app.kubernetes.io/name: gmconnector
    app.kubernetes.io/managed-by: kustomize
    gmc/platform: xeon
  name: switch
  namespace: switch
spec:
  routerConfig:
    name: router
    serviceName: router-service
  nodes:
    root:
      routerType: Sequence
      steps:
        - name: Embedding
          nodeName: node1
        - name: Retriever
          data: $response
          internalService:
            serviceName: retriever-svc
            config:
              endpoint: /v1/retrieval
        - name: VectorDB
          internalService:
            serviceName: redis-vector-db
            isDownstreamService: true
        - name: Reranking
          data: $response
          internalService:
            serviceName: reranking-svc
            config:
              endpoint: /v1/reranking
        - name: TeiReranking
          internalService:
            serviceName: tei-reranking-svc
            config:
              endpoint: /rerank
            isDownstreamService: true
        - name: Llm
          nodeName: node2
    node1:
      routerType: Switch
      steps:
        - name: Embedding
          condition: embedding-model-id==large
          internalService:
            serviceName: embedding-svc-large
            config:
              endpoint: /v1/embeddings
        - name: TeiEmbedding
          internalService:
            serviceName: tei-embedding-svc-bge15
            config:
              EMBEDDING_MODEL_ID: BAAI/bge-base-en-v1.5
            isDownstreamService: true
        - name: Embedding
          condition: embedding-model-id==small
          internalService:
            serviceName: embedding-svc-small
            config:
              endpoint: /v1/embeddings
        - name: TeiEmbedding
          internalService:
            serviceName: tei-embedding-svc-bge-small
            config:
              EMBEDDING_MODEL_ID: BAAI/bge-small-en-v1.5
            isDownstreamService: true
    node2:
      routerType: Switch
      steps:
        - name: Llm
          condition: model_id==intel
          data: $response
          internalService:
            serviceName: llm-svc-intel
            config:
              endpoint: /v1/chat/completions
        - name: Tgi
          internalService:
            serviceName: tgi-service-intel
            config:
              endpoint: /generate
              LLM_MODEL_ID: Intel/neural-chat-7b-v3-3
            isDownstreamService: true
        - name: Llm
          condition: model_id==llama
          data: $response
          internalService:
            serviceName: llm-svc-llama
            config:
              endpoint: /v1/chat/completions
        - name: Tgi
          internalService:
            serviceName: tgi-service-llama
            config:
              endpoint: /generate
              LLM_MODEL_ID: openlm-research/open_llama_3b
            isDownstreamService: true
```
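For completeness, here is a hypothetical way to exercise the two Switch nodes above, assuming the router compares the `condition` keys (`embedding-model-id`, `model_id`) against fields in the request parameters. The URL, payload shape, and parameter placement are assumptions, not taken from this issue:

```bash
# Expose the router locally (service name and port 8080 from the svc listing above);
# run this in the background or a second terminal.
kubectl -n switch port-forward svc/router-service 8080:8080 &

# Hypothetical request: the values "small" and "intel" would select the
# embedding-svc-small and llm-svc-intel branches if the Switch conditions
# are matched against these parameters.
curl http://localhost:8080 -X POST \
  -H 'Content-Type: application/json' \
  -d '{"text": "What is deep learning?", "parameters": {"embedding-model-id": "small", "model_id": "intel"}}'
```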
Hi all, please refer to #206 for the final YAML files for the multiple env-var endpoint support and switch support.
For reference, the current env vars are:
- TEI_EMBEDDING_ENDPOINT
- TEI_RERANKING_ENDPOINT
- TGI_LLM_ENDPOINT
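To check which of these a given deployment currently carries, one option is below; the deployment name is an assumption, so substitute whatever `kubectl get deploy -n switch` reports:

```bash
# List the env vars currently set on the deployment and filter for the endpoints above.
kubectl -n switch set env deployment/llm-svc-intel --list | grep ENDPOINT
```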