Skip to content

Commit

Permalink
Add Kubernetes manifest files for deploying CodeTrans (#435)
Browse files Browse the repository at this point in the history
Signed-off-by: Lianhao Lu <[email protected]>
  • Loading branch information
lianhao authored Jul 23, 2024
1 parent 8314632 commit c9548d7
Show file tree
Hide file tree
Showing 6 changed files with 839 additions and 1 deletion.
6 changes: 5 additions & 1 deletion CodeTrans/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -71,10 +71,14 @@ docker compose -f docker_compose.yaml up -d

Refer to the [Xeon Guide](./docker/xeon/README.md) for more instructions on building docker images from source.

## Deploy with Kubernetes
## Deploy using Kubernetes with GMC

Please refer to the [Code Translation Kubernetes Guide](./kubernetes/README.md)

## Deploy using Kubernetes without GMC

Please refer to the [Code Translation Kubernetes Guide](./kubernetes/manifests/README.md)

# Consume Code Translation Service

Two ways of consuming Code Translation Service:
Expand Down
41 changes: 41 additions & 0 deletions CodeTrans/kubernetes/manifests/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
<h1 align="center" id="title">Deploy CodeTrans in Kubernetes Cluster</h1>

> [NOTE]
> The following values must be set before you can deploy:
> HUGGINGFACEHUB_API_TOKEN
> You can also customize the "MODEL_ID" if needed.
> You need to make sure you have created the directory `/mnt/opea-models` to save the cached model on the node where the CodeTrans workload is running. Otherwise, you need to modify the `codetrans.yaml` file to change the `model-volume` to a directory that exists on the node.
## Deploy On Xeon

```
cd GenAIExamples/CodeTrans/kubernetes/manifests/xeon
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" codetrans.yaml
kubectl apply -f codetrans.yaml
```

## Deploy On Gaudi

```
cd GenAIExamples/CodeTrans/kubernetes/manifests/gaudi
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" codetrans.yaml
kubectl apply -f codetrans.yaml
```

## Verify Services

To verify the installation, run the command `kubectl get pod` to make sure all pods are running.

Then run the command `kubectl port-forward svc/docsum 8888:8888` to expose the CodeTrans service for access.

Open another terminal and run the following command to verify the service if working:

```console
curl http://localhost:7777/v1/codetrans \
-H 'Content-Type: application/json' \
-d '{"language_from": "Golang","language_to": "Python","source_code": "package main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n}"}'
```
312 changes: 312 additions & 0 deletions CodeTrans/kubernetes/manifests/gaudi/codetrans.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,312 @@
---
# Source: codetrans/charts/llm-uservice/templates/configmap.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: ConfigMap
metadata:
name: codetrans-llm-uservice-config
labels:
helm.sh/chart: llm-uservice-0.8.0
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: codetrans
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
data:
TGI_LLM_ENDPOINT: "http://codetrans-tgi"
HUGGINGFACEHUB_API_TOKEN: "insert-your-huggingface-token-here"
HF_HOME: "/tmp/.cache/huggingface"
http_proxy: ""
https_proxy: ""
no_proxy: ""
LANGCHAIN_TRACING_V2: "false"
LANGCHAIN_API_KEY: insert-your-langchain-key-here
LANGCHAIN_PROJECT: "opea-llm-uservice"
---
# Source: codetrans/charts/tgi/templates/configmap.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: ConfigMap
metadata:
name: codetrans-tgi-config
labels:
helm.sh/chart: tgi-0.8.0
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: codetrans
app.kubernetes.io/version: "2.1.0"
app.kubernetes.io/managed-by: Helm
data:
MODEL_ID: "HuggingFaceH4/mistral-7b-grok"
PORT: "2080"
HUGGING_FACE_HUB_TOKEN: "insert-your-huggingface-token-here"
HF_TOKEN: "insert-your-huggingface-token-here"
MAX_INPUT_TOKENS: "1024"
MAX_TOTAL_TOKENS: "4096"
http_proxy: ""
https_proxy: ""
no_proxy: ""
HABANA_LOGS: "/tmp/habana_logs"
NUMBA_CACHE_DIR: "/tmp"
TRANSFORMERS_CACHE: "/tmp/transformers_cache"
HF_HOME: "/tmp/.cache/huggingface"
---
# Source: codetrans/charts/llm-uservice/templates/service.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: Service
metadata:
name: codetrans-llm-uservice
labels:
helm.sh/chart: llm-uservice-0.8.0
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: codetrans
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
type: ClusterIP
ports:
- port: 9000
targetPort: 9000
protocol: TCP
name: llm-uservice
selector:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: codetrans
---
# Source: codetrans/charts/tgi/templates/service.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: Service
metadata:
name: codetrans-tgi
labels:
helm.sh/chart: tgi-0.8.0
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: codetrans
app.kubernetes.io/version: "2.1.0"
app.kubernetes.io/managed-by: Helm
spec:
type: ClusterIP
ports:
- port: 80
targetPort: 2080
protocol: TCP
name: tgi
selector:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: codetrans
---
# Source: codetrans/templates/service.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: Service
metadata:
name: codetrans
labels:
helm.sh/chart: codetrans-0.8.0
app.kubernetes.io/name: codetrans
app.kubernetes.io/instance: codetrans
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
type: ClusterIP
ports:
- port: 7777
targetPort: 7777
protocol: TCP
name: codetrans
selector:
app.kubernetes.io/name: codetrans
app.kubernetes.io/instance: codetrans
---
# Source: codetrans/charts/llm-uservice/templates/deployment.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: apps/v1
kind: Deployment
metadata:
name: codetrans-llm-uservice
labels:
helm.sh/chart: llm-uservice-0.8.0
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: codetrans
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: codetrans
template:
metadata:
labels:
app.kubernetes.io/name: llm-uservice
app.kubernetes.io/instance: codetrans
spec:
securityContext:
{}
containers:
- name: codetrans
envFrom:
- configMapRef:
name: codetrans-llm-uservice-config
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: false
runAsNonRoot: true
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
image: "opea/llm-tgi:latest"
imagePullPolicy: IfNotPresent
ports:
- name: llm-uservice
containerPort: 9000
protocol: TCP
volumeMounts:
- mountPath: /tmp
name: tmp
startupProbe:
exec:
command:
- curl
- http://codetrans-tgi
initialDelaySeconds: 5
periodSeconds: 5
failureThreshold: 120
resources:
{}
volumes:
- name: tmp
emptyDir: {}
---
# Source: codetrans/charts/tgi/templates/deployment.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: apps/v1
kind: Deployment
metadata:
name: codetrans-tgi
labels:
helm.sh/chart: tgi-0.8.0
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: codetrans
app.kubernetes.io/version: "2.1.0"
app.kubernetes.io/managed-by: Helm
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: codetrans
template:
metadata:
labels:
app.kubernetes.io/name: tgi
app.kubernetes.io/instance: codetrans
spec:
securityContext:
{}
containers:
- name: tgi
envFrom:
- configMapRef:
name: codetrans-tgi-config
securityContext:
{}
image: "ghcr.io/huggingface/tgi-gaudi:2.0.1"
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /data
name: model-volume
- mountPath: /tmp
name: tmp
ports:
- name: http
containerPort: 2080
protocol: TCP
resources:
limits:
habana.ai/gaudi: 1
volumes:
- name: model-volume
hostPath:
path: /mnt/opea-models
type: Directory
- name: tmp
emptyDir: {}
---
# Source: codetrans/templates/deployment.yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: apps/v1
kind: Deployment
metadata:
name: codetrans
labels:
helm.sh/chart: codetrans-0.8.0
app.kubernetes.io/name: codetrans
app.kubernetes.io/instance: codetrans
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/managed-by: Helm
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: codetrans
app.kubernetes.io/instance: codetrans
template:
metadata:
labels:
app.kubernetes.io/name: codetrans
app.kubernetes.io/instance: codetrans
spec:
securityContext:
null
containers:
- name: codetrans
env:
- name: LLM_SERVICE_HOST_IP
value: codetrans-llm-uservice
#- name: MEGA_SERVICE_PORT
# value: 7777
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
image: "opea/codetrans:latest"
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /tmp
name: tmp
ports:
- name: codetrans
containerPort: 7777
protocol: TCP
resources:
null
volumes:
- name: tmp
emptyDir: {}
Loading

0 comments on commit c9548d7

Please sign in to comment.