Merge pull request #1 from thepetk/feat/add_generate_script
Feat/add generate script
thepetk authored Jan 27, 2025
2 parents ffeb175 + 96f116e commit f8f4106
Showing 16 changed files with 192 additions and 128 deletions.
24 changes: 24 additions & 0 deletions .github/workflows/update-charts.yaml
@@ -0,0 +1,24 @@
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  verify-dependencies:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Run generate.sh and check if the ai-lab-app based charts are up-to-date
        run: |
          bash generate.sh
          if [[ ! -z $(git status -s) ]]
          then
            echo 'The script `./generate.sh` did introduce changes, which should ideally be checked in as part of the PR.'
            git status
            exit 1
          fi
4 changes: 2 additions & 2 deletions README.md
@@ -8,8 +8,8 @@ The gitops component, handled by ArgoCD for the RHDH case, is replaced by a Kube

- Creates the GitHub repository for the application.
- Copies the application source code into the new repository.
- Copies the Tekton `PipelineRun` that builds new images for the application after pull requests are merged or commits are pushed directly to the `main` branch,
and then updates the application's Deployment with the new image version by patching the Deployment directly via `oc`, rather than having ArgoCD patch it through GitOps (see the sketch below).
- Commits these changes and pushes the commit to the preferred branch of the new repository.

The source code is [here](charts/ai-software-templates/chatbot/templates/application-gitops-job.yaml).
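
For illustration, the direct patch could boil down to an `oc` invocation like the one below. This is a sketch, not the literal task: the release name and image are placeholders, and the JSON path simply mirrors the `.spec.template.spec.containers[0].image` path recorded by the chart's `tad.gitops.set/image` annotation.

```bash
# Hypothetical sketch of the direct-patch step performed after a build.
# RELEASE and IMAGE are placeholders; the real task derives them from
# pipeline parameters and the freshly built image reference.
RELEASE=my-chatbot
IMAGE=quay.io/my-quay-account/my-chatbot:abc1234

# Point the first container of the Deployment at the new image.
oc patch deployment "$RELEASE" --type=json \
  -p "[{\"op\": \"replace\", \"path\": \"/spec/template/spec/containers/0/image\", \"value\": \"$IMAGE\"}]"
```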
9 changes: 9 additions & 0 deletions charts/ai-software-templates/README.md
@@ -0,0 +1,9 @@
## ai-software-templates

Apart from [pipeline-install](./pipeline-install/) and [pipeline-setup](./pipeline-setup/), which are tools that help you install the Tekton pipelines along with your charts, the rest of the charts under this directory are based on the [redhat-ai-dev/ai-lab-app](https://github.com/redhat-ai-dev/ai-lab-app) GitOps resources repo.

### Pull the latest changes automatically

To pull the latest changes for the charts based on `ai-lab-app`, simply run the `generate.sh` script located in the root directory of this repository.

The script clones `ai-lab-app` and converts the necessary resources into Helm chart template files. Each chart has a corresponding `env` file under the `scripts/envs` directory, which configures the behavior of the conversion process.
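
For example, refreshing the charts locally is a single command; a per-chart env file could look like the sketch below (the variable names are hypothetical, check `scripts/envs` for the real ones):

```bash
# Regenerate all ai-lab-app based charts in place.
bash generate.sh

# Hypothetical shape of an env file under scripts/envs/ that steers the
# conversion of one chart (the actual variable names may differ):
# CHART_NAME=chatbot
# SOURCE_PATH=chatbot
# OUTPUT_DIR=charts/ai-software-templates/chatbot/templates
```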
20 changes: 15 additions & 5 deletions charts/ai-software-templates/chatbot/README.md
@@ -10,7 +10,12 @@ This Helm Chart deploys a Large Language Model (LLM)-enabled [chat bot applicati
This Helm Chart creates the following components:

### The Model Service
Based on the `llama.cpp` inference server and related to the [ai-lab-recipes model server](https://github.com/containers/ai-lab-recipes/tree/main/model_servers/llamacpp_python).
By default, the `chatbot-ai-sample` uses the `llama.cpp` inference server, building on the related [ai-lab-recipes model server](https://github.com/containers/ai-lab-recipes/tree/main/model_servers/llamacpp_python).

However, `vLLM` model services and existing model services are also supported, as shown in the examples below:
* For the `vLLM` model service case, set `Values.model.vllmSelected` to `true` and configure `Values.model.vllmModelServiceContainer` and `Values.model.modelName` as well.
* For the existing model service case, set `Values.model.existingModelServer` to `true` and set `Values.model.modelEndpoint` to the URL of the existing model endpoint you would like to use for this deployment.
* In case the existing model service requires bearer authentication, set `Values.model.includeModelEndpointSecret` to `true` and configure `Values.model.modelEndpointSecretName` and `Values.model.modelEndpointSecretKey`.
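
As an illustration, a `vLLM`-backed install and an existing-endpoint install could look like the following (the release name, namespace, organization, account, endpoint, and image values are placeholders, not chart defaults):

```bash
# Sketch: deploy the chart with a vLLM model service instead of llama.cpp.
# Release name, namespace, org/account names, and the vLLM image are
# placeholders; substitute your own values.
helm install my-chatbot ./charts/ai-software-templates/chatbot \
  --namespace my-namespace \
  --set gitops.githubOrgName=my-github-org \
  --set gitops.quayAccountName=my-quay-account \
  --set model.vllmSelected=true \
  --set model.vllmModelServiceContainer=quay.io/example/vllm-openai:latest \
  --set model.modelName=instructlab/granite-7b-lab

# Sketch: reuse an existing model server that requires a bearer token.
# The endpoint URL and secret name/key are placeholders; the Secret must
# already exist in the target namespace.
helm install my-chatbot ./charts/ai-software-templates/chatbot \
  --namespace my-namespace \
  --set gitops.githubOrgName=my-github-org \
  --set gitops.quayAccountName=my-quay-account \
  --set model.existingModelServer=true \
  --set model.modelEndpoint=https://models.example.com/v1 \
  --set model.includeModelEndpointSecret=true \
  --set model.modelEndpointSecretName=model-endpoint-secret \
  --set model.modelEndpointSecretKey=bearer-token
```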

### The Application
A [Streamlit](https://github.com/streamlit/streamlit) application to interact with the model service, based on the related [Chatbot Template](https://github.com/redhat-ai-dev/ai-lab-template/tree/main/templates/chatbot/content).
@@ -71,19 +76,24 @@ Kubernetes: `>= 1.27.0-0`
| application.appContainer | string | `"quay.io/redhat-ai-dev/chatbot:latest"` | The image used for the initial chatbot application interface |
| application.appPort | int | `8501` | The exposed port of the application |
| gitops.gitDefaultBranch | string | `"main"` | The default branch for the chatbot application Github repository |
| gitops.gitSecretKeyToken | string | `"GITHUB_TOKEN"` | The name of the Secret's key with the Github token value |
| gitops.gitSecretKeyToken | string | `"password"` | The name of the Secret's key with the Github token value |
| gitops.gitSecretName | string | `"github-secrets"` | The name of the Secret containing the required Github token |
| gitops.gitSourceRepo | string | `"redhat-ai-dev/ai-lab-samples"` | The Github Repository with the contents of the ai-lab sample chatbot application |
| gitops.githubOrgName | string | `""` | [REQUIRED] The Github Organization name that the chatbot application repository will be created in |
| gitops.quayAccountName | string | `""` | [REQUIRED] The quay.io account that the application image will be pushed to |
| model.dbRequired | bool | `false` | The bool variable for support of a model database |
| model.existingModelServer | bool | `false` | The bool variable for support of an existing model server |
| model.includeModelEndpointSecret | bool | `false` | The bool variable for support of bearer token authentication for the existing model server |
| model.initContainer | string | `"quay.io/redhat-ai-dev/granite-7b-lab:latest"` | The image used for the initContainer of the model service deployment |
| model.maxModelLength | int | `4096` | The maximum sequence length of the model. It is used only for the vllm case and the default value is 4096. |
| model.modelEndpoint | string | `""` | The endpoint URL of the model for the existing model service case. It is used only if existingModelServer is set to true. |
| model.modelEndpointSecretKey | string | `""` | The name of the secret field storing the bearer value for the existing model service if the endpoint requires bearer authentication. It is used only if includeModelEndpointSecret is set to true. |
| model.modelEndpointSecretName | string | `""` | The name of the secret storing the credentials for the existing model service if the endpoint requires bearer authentication. It is used only if includeModelEndpointSecret is set to true. |
| model.modelInitCommand | string | `"['/usr/bin/install', '/model/model.file', '/shared/']"` | The model service initContainer command |
| model.modelName | string | `""` | The name of the model. By default it is set to instructlab/granite-7b-lab. It is used only for the vllm and/or existing model service cases. |
| model.modelPath | string | `"/model/model.file"` | The path of the model file inside the model service container |
| model.modelServiceContainer | string | `"quay.io/ai-lab/llamacpp_python:latest"` | The image used for the model service |
| model.modelServiceContainer | string | `"quay.io/ai-lab/llamacpp_python:latest"` | The image used for the model service. For the vLLM case, see vllmModelServiceContainer |
| model.modelServicePort | int | `8001` | The exposed port of the model service |
| model.vllmSelected | bool | `false` | The bool variable for support of vllms |
| model.vllmModelServiceContainer | string | `""` | The image used for the model service for the vLLM use case |
| model.vllmSelected | bool | `false` | The bool variable for support of vllm instead of llama_cpp. Be sure that your system has GPU support for this case. |

**NOTE:** Your Helm release's name will be used as the name of the application's GitHub repository.
7 changes: 6 additions & 1 deletion charts/ai-software-templates/chatbot/README.md.gotmpl
@@ -10,7 +10,12 @@
This Helm Chart creates the following components:

### The Model Service
Based on the `llama.cpp` inference server and related to the [ai-lab-recipes model server](https://github.com/containers/ai-lab-recipes/tree/main/model_servers/llamacpp_python).
By default, the `chatbot-ai-sample` uses the `llama.cpp` inference server, building on the related [ai-lab-recipes model server](https://github.com/containers/ai-lab-recipes/tree/main/model_servers/llamacpp_python).

However, `vLLM` model services and existing model services are also supported:
* For the `vLLM` model service case, set `Values.model.vllmSelected` to `true` and configure `Values.model.vllmModelServiceContainer` and `Values.model.modelName` as well.
* For the existing model service case, set `Values.model.existingModelServer` to `true` and set `Values.model.modelEndpoint` to the URL of the existing model endpoint you would like to use for this deployment.
* In case the existing model service requires bearer authentication, set `Values.model.includeModelEndpointSecret` to `true` and configure `Values.model.modelEndpointSecretName` and `Values.model.modelEndpointSecretKey`.

### The Application
A [Streamlit](https://github.com/streamlit/streamlit) application to interact with the model service, based on the related [Chatbot Template](https://github.com/redhat-ai-dev/ai-lab-template/tree/main/templates/chatbot/content).
@@ -1,87 +1,80 @@
{{ if not .Values.model.existingModelServer }}
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/instance: {{ .Release.Name }}-model-server
    app.kubernetes.io/name: {{ .Release.Name }}-model-server
    app.kubernetes.io/part-of: {{ .Release.Name }}
  name: {{ .Release.Name }}-model-server
  namespace: {{ .Release.Namespace }}
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/instance: {{ .Release.Name }}-model-server
  template:
    metadata:
      labels:
        app.kubernetes.io/instance: {{ .Release.Name }}-model-server
    spec:
      {{- if or (not .Values.model.vllmSelected) (eq .Values.model.vllmSelected nil) }}
      initContainers:
        - name: model-file
          image: {{ .Values.model.initContainer }}
          command: {{ .Values.model.modelInitCommand }}
          volumeMounts:
            - name: model-file
              mountPath: /shared
      {{ end }}
      containers:
        {{ if .Values.model.vllmSelected }}
        - image: {{ .Values.model.vllmModelServiceContainer }}
          args: ["--model", "{{ .Values.model.modelName }}", "--port", "{{ .Values.model.modelServicePort }}", "--download-dir", "/models-cache", "--max-model-len", "{{ .Values.model.maxModelLength }}"]
          resources:
            limits:
              nvidia.com/gpu: '1'
          volumeMounts:
            - name: dshm
              mountPath: /dev/shm
            - name: models-cache
              mountPath: /models-cache
        {{ else }}
        - env:
            - name: HOST
              value: "0.0.0.0"
            - name: PORT
              value: "{{ .Values.model.modelServicePort }}"
            - name: MODEL_PATH
              value: {{ .Values.model.modelPath }}
            - name: CHAT_FORMAT
              value: openchat
          image: {{ .Values.model.modelServiceContainer }}
          volumeMounts:
            - name: model-file
              mountPath: /model
        {{ end }}
          name: app-model-service
          ports:
            - containerPort: {{ .Values.model.modelServicePort }}
          securityContext:
            runAsNonRoot: true
      {{ if .Values.model.vllmSelected }}
      volumes:
        - name: dshm
          emptyDir:
            medium: Memory
            sizeLimit: "2Gi"
        - name: models-cache
          persistentVolumeClaim:
            claimName: {{ .Release.Name }}
      tolerations:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
      {{ else }}
      volumes:
        - name: model-file
          emptyDir: {}
      {{ end }}
{{ end }}
54 changes: 25 additions & 29 deletions charts/ai-software-templates/chatbot/templates/deployment.yaml
@@ -1,46 +1,42 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    tad.gitops.set/image: ".spec.template.spec.containers[0].image"
    tad.gitops.get/image: ".spec.template.spec.containers[0].image"
    tad.gitops.set/replicas: ".spec.replicas"
    tad.gitops.get/replicas: ".spec.replicas"
  labels:
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/name: {{ .Release.Name }}
    app.kubernetes.io/part-of: {{ .Release.Name }}
  name: {{ .Release.Name }}
  namespace: {{ .Release.Namespace }}
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/instance: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app.kubernetes.io/instance: {{ .Release.Name }}
    spec:
      containers:
        - image: {{ .Values.application.appContainer }}
          name: app-inference
          envFrom:
            - configMapRef:
                name: {{ .Release.Name }}-model-config
          {{ if .Values.model.includeModelEndpointSecret }}
          env:
            - name: MODEL_ENDPOINT_BEARER
              valueFrom:
                secretKeyRef:
                  name: {{ .Values.model.modelEndpointSecretName }}
                  key: {{ .Values.model.modelEndpointSecretKey }}
          {{ end }}
          ports:
            - containerPort: {{ .Values.application.appPort }}
          securityContext:
            runAsNonRoot: true
7 changes: 3 additions & 4 deletions charts/ai-software-templates/chatbot/templates/pvc.yaml
@@ -1,12 +1,11 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/name: {{ .Release.Name }}
  name: {{ .Release.Name }}
  namespace: {{ .Release.Namespace }}
spec:
  accessModes:
    - ReadWriteOnce
12 changes: 6 additions & 6 deletions charts/ai-software-templates/chatbot/templates/route.yaml
@@ -1,11 +1,11 @@
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  labels:
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/name: {{ .Release.Name }}
  name: {{ .Release.Name }}
  namespace: {{ .Release.Namespace }}
spec:
  port:
    targetPort: {{ .Values.application.appPort }}
@@ -14,6 +14,6 @@ spec:
  termination: edge
  to:
    kind: Service
    name: {{ .Release.Name }}
    weight: 100
  wildcardPolicy: None
@@ -1,15 +1,17 @@
{{ if not .Values.model.existingModelServer }}
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/instance: {{ .Release.Name }}-model-server
    app.kubernetes.io/name: {{ .Release.Name }}-model-server
  name: {{ .Release.Name }}-model-server
  namespace: {{ .Release.Namespace }}
spec:
  ports:
    - port: {{ .Values.model.modelServicePort }}
      protocol: TCP
      targetPort: {{ .Values.model.modelServicePort }}
  selector:
    app.kubernetes.io/instance: {{ .Release.Name }}-model-server
{{ end }}