support multiple env-var for ENDPOINT in GMC #166

Closed · irisdingbj opened this issue Jul 11, 2024 · 4 comments
irisdingbj (Collaborator) commented Jul 11, 2024:

GMC should support multiple env vars for ENDPOINT, e.g.:

TEI_EMBEDDING_ENDPOINT
TEI_RERANKING_ENDPOINT
TGI_LLM_ENDPOINT

A sample GMC yaml with switch support where this is needed:
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: gmc.opea.io/v1alpha3
kind: GMConnector
metadata:
  labels:
    app.kubernetes.io/name: gmconnector
    app.kubernetes.io/managed-by: kustomize
    gmc/platform: xeon
  name: switch
  namespace: switch
spec:
  routerConfig:
    name: router
    serviceName: router-service
  nodes:
    root:
      routerType: Sequence
      steps:
      - name: Embedding
        nodeName: node1
      - name: Retriever
        data: $response
        internalService:
          serviceName: retriever-svc
          config:
            endpoint: /v1/retrieval
      - name: VectorDB
        internalService:
          serviceName: redis-vector-db
          isDownstreamService: true
      - name: Reranking
        data: $response
        internalService:
          serviceName: reranking-svc
          config:
            endpoint: /v1/reranking
      - name: TeiReranking
        internalService:
          serviceName: tei-reranking-svc
          config:
            endpoint: /rerank
          isDownstreamService: true
      - name: Llm
        nodeName: node2
    node1:
      routerType: Switch
      steps:
        - name: Embedding
          condition: embedding-model-id==large
          internalService:
            serviceName: embedding-svc-large
            config:
              endpoint: /v1/embeddings
        - name: TeiEmbedding
          condition: embedding-model-id==large
          internalService:
            serviceName: tei-embedding-svc-bge15
            config:
              EMBEDDING_MODEL_ID: BAAI/bge-base-en-v1.5
            isDownstreamService: true
        - name: Embedding
          condition: embedding-model-id==small
          internalService:
            serviceName: embedding-svc-small
            config:
              endpoint: /v1/embeddings
        - name: TeiEmbedding
          condition: embedding-model-id==small
          internalService:
            serviceName: tei-embedding-svc-bge-small
            config:
              EMBEDDING_MODEL_ID: BAAI/bge-small-en-v1.5
            isDownstreamService: true
    node2:
      routerType: Switch
      steps:
        - name: Llm
          condition: model_id==intel
          data: $response
          internalService:
            serviceName: llm-svc-intel
            config:
              endpoint: /v1/chat/completions
        - name: Tgi
          condition: model_id==intel
          internalService:
            serviceName: tgi-service-intel
            config:
              endpoint: /generate
              LLM_MODEL_ID: Intel/neural-chat-7b-v3-3
            isDownstreamService: true
        - name: Llm
          condition: model_id==llama
          data: $response
          internalService:
            serviceName: llm-svc-llama
            config:
              endpoint: /v1/chat/completions
        - name: Tgi
          condition: model_id==llama
          internalService:
            serviceName: tgi-service-llama
            config:
              endpoint: /generate
              LLM_MODEL_ID: openlm-research/open_llama_3b
            isDownstreamService: true
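
Under this spec, node1's two Embedding services each need to reach a different TEI backend, and node2's two Llm services a different TGI backend, which a single shared endpoint variable cannot express. Roughly, the per-deployment values would have to be (service names and ports taken from the kubectl output in the next comment):

# for embedding-svc-large (bge-base backend)
TEI_EMBEDDING_ENDPOINT: http://tei-embedding-svc-bge15.switch.svc.cluster.local:6006
# for embedding-svc-small (bge-small backend)
TEI_EMBEDDING_ENDPOINT: http://tei-embedding-svc-bge-small.switch.svc.cluster.local:6006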
irisdingbj (Collaborator, Author) commented:
kubectl get svc -n switch
NAME                          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
embedding-svc                 ClusterIP   10.96.212.67    <none>        6000/TCP            11m
embedding-svc-large           ClusterIP   10.96.214.151   <none>        6000/TCP            11m
embedding-svc-small           ClusterIP   10.96.39.68     <none>        6000/TCP            11m
llm-svc                       ClusterIP   10.96.236.0     <none>        9000/TCP            11m
llm-svc-intel                 ClusterIP   10.96.10.72     <none>        9000/TCP            11m
llm-svc-llama                 ClusterIP   10.96.176.105   <none>        9000/TCP            11m
redis-vector-db               ClusterIP   10.96.190.159   <none>        6379/TCP,8001/TCP   11m
reranking-svc                 ClusterIP   10.96.224.112   <none>        8000/TCP            11m
retriever-svc                 ClusterIP   10.96.173.137   <none>        7000/TCP            11m
router-service                ClusterIP   10.96.185.113   <none>        8080/TCP            11m
tei-embedding-svc-bge-small   ClusterIP   10.96.236.40    <none>        6006/TCP            11m
tei-embedding-svc-bge15       ClusterIP   10.96.57.9      <none>        6006/TCP            11m
tei-reranking-svc             ClusterIP   10.96.34.209    <none>        8808/TCP            11m
tgi-service-intel             ClusterIP   10.96.187.199   <none>        9009/TCP            11m
tgi-service-llama             ClusterIP   10.96.41.154    <none>        9009/TCP            11m

Current env vars:

  EMBEDDING_MODEL_ID: BAAI/bge-base-en-v1.5
  EMBEDDING_SERVICE_HOST_IP: embedding-svc
  HUGGINGFACEHUB_API_TOKEN: <redacted>
  INDEX_NAME: rag-redis
  LLM_MODEL_ID: Intel/neural-chat-7b-v3-3
  LLM_SERVICE_HOST_IP: llm-svc
  REDIS_URL: redis://redis-vector-db.switch.svc.cluster.local:6379
  RERANK_MODEL_ID: BAAI/bge-reranker-large
  RERANK_SERVICE_HOST_IP: reranking-svc
  RETRIEVER_SERVICE_HOST_IP: retriever-svc
  TEI_EMBEDDING_ENDPOINT: http://tei-embedding-svc-bge-small.switch.svc.cluster.local:6006
  TEI_RERANKING_ENDPOINT: http://tei-reranking-svc.switch.svc.cluster.local:8808
  TGI_LLM_ENDPOINT: http://tgi-service-llama.switch.svc.cluster.local:9009
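
Note the clash on the LLM side: the shared TGI_LLM_ENDPOINT above is pinned to tgi-service-llama, so the model_id==intel path would reach the wrong backend. What llm-svc-intel needs instead is roughly:

TGI_LLM_ENDPOINT: http://tgi-service-intel.switch.svc.cluster.local:9009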

KfreeZ (Collaborator) commented Jul 12, 2024:

The solution is to overwrite and inject the env vars into every deployment according to its specific config.
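
For illustration, a minimal sketch of what such an injection could look like as a Deployment env override (the container name and the exact patch mechanism are assumptions; the value comes from the service list above):

# hypothetical env override merged into the llm-svc-intel Deployment
spec:
  template:
    spec:
      containers:
        - name: llm-svc-intel   # container name assumed
          env:
            - name: TGI_LLM_ENDPOINT
              value: http://tgi-service-intel.switch.svc.cluster.local:9009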

zhlsunshine (Collaborator) commented:

Hi @irisdingbj, it's unnecessary to set conditions for some components, such as TeiEmbedding and Tgi, because they are determined by the Embedding and Llm steps (a downstream service is invoked by its parent step through the injected endpoint env var rather than routed directly, so the parent's condition already selects the right pair). Does that make sense? If so, the YAML should be as below:

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: gmc.opea.io/v1alpha3
kind: GMConnector
metadata:
  labels:
    app.kubernetes.io/name: gmconnector
    app.kubernetes.io/managed-by: kustomize
    gmc/platform: xeon
  name: switch
  namespace: switch
spec:
  routerConfig:
    name: router
    serviceName: router-service
  nodes:
    root:
      routerType: Sequence
      steps:
      - name: Embedding
        nodeName: node1
      - name: Retriever
        data: $response
        internalService:
          serviceName: retriever-svc
          config:
            endpoint: /v1/retrieval
      - name: VectorDB
        internalService:
          serviceName: redis-vector-db
          isDownstreamService: true
      - name: Reranking
        data: $response
        internalService:
          serviceName: reranking-svc
          config:
            endpoint: /v1/reranking
      - name: TeiReranking
        internalService:
          serviceName: tei-reranking-svc
          config:
            endpoint: /rerank
          isDownstreamService: true
      - name: Llm
        nodeName: node2
    node1:
      routerType: Switch
      steps:
        - name: Embedding
          condition: embedding-model-id==large
          internalService:
            serviceName: embedding-svc-large
            config:
              endpoint: /v1/embeddings
        - name: TeiEmbedding
          internalService:
            serviceName: tei-embedding-svc-bge15
            config:
              EMBEDDING_MODEL_ID: BAAI/bge-base-en-v1.5
            isDownstreamService: true
        - name: Embedding
          condition: embedding-model-id==small
          internalService:
            serviceName: embedding-svc-small
            config:
              endpoint: /v1/embeddings
        - name: TeiEmbedding
          internalService:
            serviceName: tei-embedding-svc-bge-small
            config:
              EMBEDDING_MODEL_ID: BAAI/bge-small-en-v1.5
            isDownstreamService: true
    node2:
      routerType: Switch
      steps:
        - name: Llm
          condition: model_id==intel
          data: $response
          internalService:
            serviceName: llm-svc-intel
            config:
              endpoint: /v1/chat/completions
        - name: Tgi
          internalService:
            serviceName: tgi-service-intel
            config:
              endpoint: /generate
              LLM_MODEL_ID: Intel/neural-chat-7b-v3-3
            isDownstreamService: true
        - name: Llm
          condition: model_id==llama
          data: $response
          internalService:
            serviceName: llm-svc-llama
            config:
              endpoint: /v1/chat/completions
        - name: Tgi
          internalService:
            serviceName: tgi-service-llama
            config:
              endpoint: /generate
              LLM_MODEL_ID: openlm-research/open_llama_3b
            isDownstreamService: true

zhlsunshine (Collaborator) commented:

Hi all, please refer to #206 for the final YAML files for multiple env-var endpoint support and switch support.
