Merge pull request #1 from thepetk/feat/add_generate_script
Feat/add generate script
thepetk authored Jan 27, 2025
2 parents ffeb175 + 96f116e commit f8f4106
Showing 16 changed files with 192 additions and 128 deletions.
24 changes: 24 additions & 0 deletions .github/workflows/update-charts.yaml
@@ -0,0 +1,24 @@
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  verify-dependencies:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Run generate.sh and check if the ai-lab-app based charts are up-to-date
        run: |
          bash generate.sh
          if [[ ! -z $(git status -s) ]]
          then
            echo 'The script `./generate.sh` did introduce changes, which should ideally be checked in as part of the PR.'
            git status
            exit 1
          fi
4 changes: 2 additions & 2 deletions README.md
@@ -8,8 +8,8 @@ The gitops component, handled by ArgoCD for the RHDH case, is replaced by a Kube

- Creates the GitHub repository for the application.
- Copies the application source code into the new repository.
- Copies the Tekton `PipelineRun` that builds new images for the application after pull requests are merged or commits are pushed directly to the `main` branch,
and then updates the application's Deployment with the new image version by patching the Deployment directly via `oc`, rather than having ArgoCD patch it through GitOps (see the sketch below).
- Commits these changes and pushes the commit to the preferred branch of the new repository.

The source code is [here](charts/ai-software-templates/chatbot/templates/application-gitops-job.yaml).
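
For illustration, the direct patch could boil down to an `oc` invocation like the one below. This is a sketch, not the literal task: the release name and image are placeholders, and the JSON path simply mirrors the `.spec.template.spec.containers[0].image` path recorded by the chart's `tad.gitops.set/image` annotation.

```bash
# Hypothetical sketch of the direct-patch step performed after a build.
# RELEASE and IMAGE are placeholders; the real task derives them from
# pipeline parameters and the freshly built image reference.
RELEASE=my-chatbot
IMAGE=quay.io/my-quay-account/my-chatbot:abc1234

# Point the first container of the Deployment at the new image.
oc patch deployment "$RELEASE" --type=json \
  -p "[{\"op\": \"replace\", \"path\": \"/spec/template/spec/containers/0/image\", \"value\": \"$IMAGE\"}]"
```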
9 changes: 9 additions & 0 deletions charts/ai-software-templates/README.md
@@ -0,0 +1,9 @@
## ai-software-templates

Apart from [pipeline-install](./pipeline-install/) and [pipeline-setup](./pipeline-setup/), which are tools that help you install the Tekton pipelines along with your charts, the rest of the charts under this directory are based on the [redhat-ai-dev/ai-lab-app](https://github.com/redhat-ai-dev/ai-lab-app) GitOps resources repo.

### Pull the latest changes automatically

To pull the latest changes for the charts based on `ai-lab-app`, simply run the `generate.sh` script located in the root directory of this repository.

The script clones `ai-lab-app` and converts the necessary resources into Helm chart template files. Each chart has a corresponding `env` file under the `scripts/envs` directory, which configures the behavior of the conversion process.
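
For example, refreshing the charts locally is a single command; a per-chart env file could look like the sketch below (the variable names are hypothetical, check `scripts/envs` for the real ones):

```bash
# Regenerate all ai-lab-app based charts in place.
bash generate.sh

# Hypothetical shape of an env file under scripts/envs/ that steers the
# conversion of one chart (the actual variable names may differ):
# CHART_NAME=chatbot
# SOURCE_PATH=chatbot
# OUTPUT_DIR=charts/ai-software-templates/chatbot/templates
```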
20 changes: 15 additions & 5 deletions charts/ai-software-templates/chatbot/README.md
@@ -10,7 +10,12 @@ This Helm Chart deploys a Large Language Model (LLM)-enabled [chat bot applicati
This Helm Chart creates the following components:

### The Model Service
Based on the `llama.cpp` inference server and related to the [ai-lab-recipes model server](https://github.com/containers/ai-lab-recipes/tree/main/model_servers/llamacpp_python).
By default, the `chatbot-ai-sample` uses the `llama.cpp` inference server, building on the related [ai-lab-recipes model server](https://github.com/containers/ai-lab-recipes/tree/main/model_servers/llamacpp_python).

However, `vLLM` model services and existing model services are also supported, as shown in the examples below:
* For the `vLLM` model service case, set `Values.model.vllmSelected` to `true` and configure `Values.model.vllmModelServiceContainer` and `Values.model.modelName` as well.
* For the existing model service case, set `Values.model.existingModelServer` to `true` and set `Values.model.modelEndpoint` to the URL of the existing model endpoint you would like to use for this deployment.
* In case the existing model service requires bearer authentication, set `Values.model.includeModelEndpointSecret` to `true` and configure `Values.model.modelEndpointSecretName` and `Values.model.modelEndpointSecretKey`.
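
As an illustration, a `vLLM`-backed install and an existing-endpoint install could look like the following (the release name, namespace, organization, account, endpoint, and image values are placeholders, not chart defaults):

```bash
# Sketch: deploy the chart with a vLLM model service instead of llama.cpp.
# Release name, namespace, org/account names, and the vLLM image are
# placeholders; substitute your own values.
helm install my-chatbot ./charts/ai-software-templates/chatbot \
  --namespace my-namespace \
  --set gitops.githubOrgName=my-github-org \
  --set gitops.quayAccountName=my-quay-account \
  --set model.vllmSelected=true \
  --set model.vllmModelServiceContainer=quay.io/example/vllm-openai:latest \
  --set model.modelName=instructlab/granite-7b-lab

# Sketch: reuse an existing model server that requires a bearer token.
# The endpoint URL and secret name/key are placeholders; the Secret must
# already exist in the target namespace.
helm install my-chatbot ./charts/ai-software-templates/chatbot \
  --namespace my-namespace \
  --set gitops.githubOrgName=my-github-org \
  --set gitops.quayAccountName=my-quay-account \
  --set model.existingModelServer=true \
  --set model.modelEndpoint=https://models.example.com/v1 \
  --set model.includeModelEndpointSecret=true \
  --set model.modelEndpointSecretName=model-endpoint-secret \
  --set model.modelEndpointSecretKey=bearer-token
```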

### The Application
A [Streamlit](https://github.com/streamlit/streamlit) application to interact with the model service, based on the related [Chatbot Template](https://github.com/redhat-ai-dev/ai-lab-template/tree/main/templates/chatbot/content).
@@ -71,19 +76,24 @@ Kubernetes: `>= 1.27.0-0`
| application.appContainer | string | `"quay.io/redhat-ai-dev/chatbot:latest"` | The image used for the initial chatbot application interface |
| application.appPort | int | `8501` | The exposed port of the application |
| gitops.gitDefaultBranch | string | `"main"` | The default branch for the chatbot application Github repository |
| gitops.gitSecretKeyToken | string | `"GITHUB_TOKEN"` | The name of the Secret's key with the Github token value |
| gitops.gitSecretKeyToken | string | `"password"` | The name of the Secret's key with the Github token value |
| gitops.gitSecretName | string | `"github-secrets"` | The name of the Secret containing the required Github token |
| gitops.gitSourceRepo | string | `"redhat-ai-dev/ai-lab-samples"` | The Github Repository with the contents of the ai-lab sample chatbot application |
| gitops.githubOrgName | string | `""` | [REQUIRED] The Github Organization name that the chatbot application repository will be created in |
| gitops.quayAccountName | string | `""` | [REQUIRED] The quay.io account that the application image will be pushed to |
| model.dbRequired | bool | `false` | The bool variable for support of a model database |
| model.existingModelServer | bool | `false` | The bool variable for support of an existing model server |
| model.includeModelEndpointSecret | bool | `false` | The bool variable for support of bearer token authentication for the existing model server |
| model.initContainer | string | `"quay.io/redhat-ai-dev/granite-7b-lab:latest"` | The image used for the initContainer of the model service deployment |
| model.maxModelLength | int | `4096` | The maximum sequence length of the model. It is used only for the vllm case and the default value is 4096. |
| model.modelEndpoint | string | `""` | The endpoint URL of the model for the existing model service case. It is used only if existingModelServer is set to true. |
| model.modelEndpointSecretKey | string | `""` | The name of the secret field storing the bearer value for the existing model service if the endpoint requires bearer authentication. It is used only if includeModelEndpointSecret is set to true. |
| model.modelEndpointSecretName | string | `""` | The name of the secret storing the credentials for the existing model service if the endpoint requires bearer authentication. It is used only if includeModelEndpointSecret is set to true. |
| model.modelInitCommand | string | `"['/usr/bin/install', '/model/model.file', '/shared/']"` | The model service initContainer command |
| model.modelName | string | `""` | The name of the model. By default it is set to instructlab/granite-7b-lab. It is used only for the vllm and/or existing model service cases. |
| model.modelPath | string | `"/model/model.file"` | The path of the model file inside the model service container |
| model.modelServiceContainer | string | `"quay.io/ai-lab/llamacpp_python:latest"` | The image used for the model service |
| model.modelServiceContainer | string | `"quay.io/ai-lab/llamacpp_python:latest"` | The image used for the model service. For the vLLM case, see vllmModelServiceContainer |
| model.modelServicePort | int | `8001` | The exposed port of the model service |
| model.vllmSelected | bool | `false` | The bool variable for support of vllms |
| model.vllmModelServiceContainer | string | `""` | The image used for the model service for the vLLM use case |
| model.vllmSelected | bool | `false` | The bool variable for support of vllm instead of llama_cpp. Be sure that your system has GPU support for this case. |

**NOTE:** Your Helm release's name will be used as the name of the application's GitHub repository.
7 changes: 6 additions & 1 deletion charts/ai-software-templates/chatbot/README.md.gotmpl
@@ -10,7 +10,12 @@
This Helm Chart creates the following components:

### The Model Service
Based on the `llama.cpp` inference server and related to the [ai-lab-recipes model server](https://github.com/containers/ai-lab-recipes/tree/main/model_servers/llamacpp_python).
By default, the `chatbot-ai-sample` uses the `llama.cpp` inference server, building on the related [ai-lab-recipes model server](https://github.com/containers/ai-lab-recipes/tree/main/model_servers/llamacpp_python).

However, `vLLM` model services and existing model services are also supported:
* For the `vLLM` model service case, set `Values.model.vllmSelected` to `true` and configure `Values.model.vllmModelServiceContainer` and `Values.model.modelName` as well.
* For the existing model service case, set `Values.model.existingModelServer` to `true` and set `Values.model.modelEndpoint` to the URL of the existing model endpoint you would like to use for this deployment.
* In case the existing model service requires bearer authentication, set `Values.model.includeModelEndpointSecret` to `true` and configure `Values.model.modelEndpointSecretName` and `Values.model.modelEndpointSecretKey`.

### The Application
A [Streamlit](https://github.com/streamlit/streamlit) application to interact with the model service, based on the related [Chatbot Template](https://github.com/redhat-ai-dev/ai-lab-template/tree/main/templates/chatbot/content).
@@ -1,87 +1,80 @@
{{ if not .Values.model.existingModelServer }}
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/instance: {{ .Release.Name }}-model-server
    app.kubernetes.io/name: {{ .Release.Name }}-model-server
    app.kubernetes.io/part-of: {{ .Release.Name }}
  name: {{ .Release.Name }}-model-server
  namespace: {{ .Release.Namespace }}
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/instance: {{ .Release.Name }}-model-server
  template:
    metadata:
      labels:
        app.kubernetes.io/instance: {{ .Release.Name }}-model-server
    spec:
      {{- if or (not .Values.model.vllmSelected) (eq .Values.model.vllmSelected nil) }}
      initContainers:
        - name: model-file
          image: {{ .Values.model.initContainer }}
          command: {{ .Values.model.modelInitCommand }}
          volumeMounts:
            - name: model-file
              mountPath: /shared
      {{ end }}
      containers:
        {{ if .Values.model.vllmSelected }}
        - image: {{ .Values.model.vllmModelServiceContainer }}
          args: ["--model", "{{ .Values.model.modelName }}", "--port", "{{ .Values.model.modelServicePort }}", "--download-dir", "/models-cache", "--max-model-len", "{{ .Values.model.maxModelLength }}"]
          resources:
            limits:
              nvidia.com/gpu: '1'
          volumeMounts:
            - name: dshm
              mountPath: /dev/shm
            - name: models-cache
              mountPath: /models-cache
        {{ else }}
        - env:
            - name: HOST
              value: "0.0.0.0"
            - name: PORT
              value: "{{ .Values.model.modelServicePort }}"
            - name: MODEL_PATH
              value: {{ .Values.model.modelPath }}
            - name: CHAT_FORMAT
              value: openchat
          image: {{ .Values.model.modelServiceContainer }}
          volumeMounts:
            - name: model-file
              mountPath: /model
        {{ end }}
          name: app-model-service
          ports:
            - containerPort: {{ .Values.model.modelServicePort }}
          securityContext:
            runAsNonRoot: true
      {{ if .Values.model.vllmSelected }}
      volumes:
        - name: dshm
          emptyDir:
            medium: Memory
            sizeLimit: "2Gi"
        - name: models-cache
          persistentVolumeClaim:
            claimName: {{ .Release.Name }}
      tolerations:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
      {{ else }}
      volumes:
        - name: model-file
          emptyDir: {}
      {{ end }}
{{ end }}
54 changes: 25 additions & 29 deletions charts/ai-software-templates/chatbot/templates/deployment.yaml
@@ -1,46 +1,42 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    tad.gitops.set/image: ".spec.template.spec.containers[0].image"
    tad.gitops.get/image: ".spec.template.spec.containers[0].image"
    tad.gitops.set/replicas: ".spec.replicas"
    tad.gitops.get/replicas: ".spec.replicas"
  labels:
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/name: {{ .Release.Name }}
    app.kubernetes.io/part-of: {{ .Release.Name }}
  name: {{ .Release.Name }}
  namespace: {{ .Release.Namespace }}
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/instance: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app.kubernetes.io/instance: {{ .Release.Name }}
    spec:
      containers:
        - image: {{ .Values.application.appContainer }}
          name: app-inference
          envFrom:
            - configMapRef:
                name: {{ .Release.Name }}-model-config
          {{ if .Values.model.includeModelEndpointSecret }}
          env:
            - name: MODEL_ENDPOINT_BEARER
              valueFrom:
                secretKeyRef:
                  name: {{ .Values.model.modelEndpointSecretName }}
                  key: {{ .Values.model.modelEndpointSecretKey }}
          {{ end }}
          ports:
            - containerPort: {{ .Values.application.appPort }}
          securityContext:
            runAsNonRoot: true
7 changes: 3 additions & 4 deletions charts/ai-software-templates/chatbot/templates/pvc.yaml
@@ -1,12 +1,11 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/name: {{ .Release.Name }}
  name: {{ .Release.Name }}
  namespace: {{ .Release.Namespace }}
spec:
  accessModes:
    - ReadWriteOnce
12 changes: 6 additions & 6 deletions charts/ai-software-templates/chatbot/templates/route.yaml
@@ -1,11 +1,11 @@
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  labels:
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/name: {{ .Release.Name }}
  name: {{ .Release.Name }}
  namespace: {{ .Release.Namespace }}
spec:
  port:
    targetPort: {{ .Values.application.appPort }}
@@ -14,6 +14,6 @@ spec:
  termination: edge
  to:
    kind: Service
    name: {{ .Release.Name }}
    weight: 100
  wildcardPolicy: None
@@ -1,15 +1,17 @@
{{ if not .Values.model.existingModelServer }}
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/instance: {{ .Release.Name }}-model-server
    app.kubernetes.io/name: {{ .Release.Name }}-model-server
  name: {{ .Release.Name }}-model-server
  namespace: {{ .Release.Namespace }}
spec:
  ports:
    - port: {{ .Values.model.modelServicePort }}
      protocol: TCP
      targetPort: {{ .Values.model.modelServicePort }}
  selector:
    app.kubernetes.io/instance: {{ .Release.Name }}-model-server
{{ end }}