GKE single cluster configuration archetype #12

Closed
wants to merge 7 commits into from
GKE single cluster configuration archetype
 Demo purposes. To make it work, users will need to change some config
Ketan Umare committed Oct 31, 2019
commit 550596a498ee772e87214abc322a380891248504
9 changes: 0 additions & 9 deletions README.md

This file was deleted.

48 changes: 48 additions & 0 deletions kustomize/overlays/gke-single-cluster/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
###################################
# WORK IN PROGRESS still
###################################

SQL Database
------------
Create a SQL database (Postgres)
https://cloud.google.com/sql/docs/postgres/create-instance

Enable the SQL server to be accessed from the GKE cluster that will host the FlyteAdmin service. This can be done using private networking mode and associating the instance with the network shared with the cluster.

Create a database called "flyte" in this DB instance.

Configuring Flyte to access DB
------------------------------

In this sample we pass the username and password directly in the config file.
TODO: Example of how to use kube secrets to pass the username and password.
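
One possible way to address the TODO above is sketched below. It assumes the commented-out `passwordPath` option in the sample flyteadmin_config.yaml reads the password from a file. The Secret name `flyte-db-credentials`, the volume name `db-credentials`, and the mount directory `/var/run/credentials` are hypothetical and not part of this change.

```yaml
# Hypothetical Secret holding the database password (name and key are assumptions).
apiVersion: v1
kind: Secret
metadata:
  name: flyte-db-credentials
  namespace: flyte
type: Opaque
stringData:
  CREDENTIALS_DB_PASSWORD: "<your-db-password>"
---
# Sketch of a Deployment patch that mounts the Secret as a read-only file.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flyteadmin
  namespace: flyte
spec:
  template:
    spec:
      volumes:
        - name: db-credentials
          secret:
            secretName: flyte-db-credentials
      containers:
        - name: flyteadmin
          volumeMounts:
            - name: db-credentials
              mountPath: /var/run/credentials
              readOnly: true
```

With this layout the config would drop `password` and set `passwordPath: "/var/run/credentials/CREDENTIALS_DB_PASSWORD"`. Any init container that talks to the database (for example the migration step) would presumably need the same volume mount.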

Auth / IAM
----------

On GKE, you can follow the instructions at
https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity
to set up Workload Identity and service accounts.

Important commands:
kubectl create serviceaccount --namespace flytekit-development flyte-sandbox
gcloud iam service-accounts add-iam-policy-binding --role roles/iam.workloadIdentityUser --member "serviceAccount:flyte-sandbox.svc.id.goog[flytekit-development/flyte-sandbox]" flyte-sandbox@flyte-sandbox.iam.gserviceaccount.com
kubectl annotate serviceaccount --namespace flytekit-development flyte-sandbox iam.gke.io/gcp-service-account=flyte-sandbox@flyte-sandbox.iam.gserviceaccount.com


IAM for Flyte components
------------------------
Create the required service accounts in the GKE cluster's flyte namespace, then set serviceAccountName on the propeller and flyteadmin deployments. You may also want to set it on the various plugin
deployments.

gcloud iam service-accounts add-iam-policy-binding --role roles/iam.workloadIdentityUser --member "serviceAccount:flyte-sandbox.svc.id.goog[flyte/flyteadmin]" flyte-sandbox@flyte-sandbox.iam.gserviceaccount.com
kubectl annotate serviceaccount --namespace flyte flyteadmin iam.gke.io/gcp-service-account=flyte-sandbox@flyte-sandbox.iam.gserviceaccount.com
gcloud iam service-accounts add-iam-policy-binding --role roles/iam.workloadIdentityUser --member "serviceAccount:flyte-sandbox.svc.id.goog[flyte/flytepropeller]" flyte-sandbox@flyte-sandbox.iam.gserviceaccount.com
kubectl annotate serviceaccount --namespace flyte flytepropeller iam.gke.io/gcp-service-account=flyte-sandbox@flyte-sandbox.iam.gserviceaccount.com
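
The Kubernetes side of the steps above can also be written declaratively. The following is a minimal sketch, assuming the service accounts are managed through this overlay; the base manifests may already define some of them, in which case only the annotation and the `serviceAccountName` patch would be needed. The `gcloud` policy bindings still have to be applied separately.

```yaml
# Kubernetes service account annotated for Workload Identity
# (declarative equivalent of the `kubectl annotate` command for flytepropeller).
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flytepropeller
  namespace: flyte
  annotations:
    iam.gke.io/gcp-service-account: flyte-sandbox@flyte-sandbox.iam.gserviceaccount.com
---
# Sketch of a kustomize patch that makes the propeller Deployment run as that account.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flytepropeller
  namespace: flyte
spec:
  template:
    spec:
      serviceAccountName: flytepropeller
```

The same pair of documents, with `flyteadmin` substituted for `flytepropeller`, would cover the admin deployment.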

IAM for workflows
-----------------
As a platform admin, you will need to associate service accounts with each target namespace (one namespace per project-domain combination). Flyte allows launching workflows with service accounts, so when an end user
requests a workflow launch or declares a workflow, the right service account must already be associated with the right namespace.

TODO: Future plans to automate this creation and association
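
Until that automation exists, one option might be to let the cluster resource sync create the account alongside each namespace. This is only a sketch: it assumes the sync applies templates for resource kinds other than Namespace and substitutes `{{ namespace }}` the same way as in the namespace template in this change. The file name `ab_serviceaccount.yaml`, the account name `flyte-workflows`, and the Google service account placeholder are hypothetical.

```yaml
# clusterresource-templates/ab_serviceaccount.yaml (hypothetical file)
# One service account per project-domain namespace, annotated for Workload Identity.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flyte-workflows
  namespace: {{ namespace }}
  annotations:
    iam.gke.io/gcp-service-account: <gsa-name>@<project-id>.iam.gserviceaccount.com
```

Each namespace would still need its own `roles/iam.workloadIdentityUser` binding on the Google service account, following the same `gcloud` pattern shown earlier.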
@@ -0,0 +1,62 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flyteadmin
  namespace: flyte
spec:
  template:
    spec:
      volumes:
        - name: resource-templates
          configMap:
            name: clusterresource-template
      initContainers:
        - name: check-db-ready
          image: postgres:10.1
          command: ['sh', '-c',
            'until pg_isready -h postgres -p 5432;
            do echo waiting for database; sleep 2; done;']
        - name: run-migrations
          image: docker.io/lyft/flyteadmin:v0.1.1
          imagePullPolicy: IfNotPresent
          command: ["flyteadmin", "--logtostderr", "--config", "/etc/flyte/config/flyteadmin_config.yaml",
                    "migrate", "run"]
          volumeMounts:
            - name: config-volume
              mountPath: /etc/flyte/config
        - name: seed-projects
          image: docker.io/lyft/flyteadmin:v0.1.1
          imagePullPolicy: IfNotPresent
          command: ["flyteadmin", "--logtostderr", "--config", "/etc/flyte/config/flyteadmin_config.yaml",
                    "migrate", "seed-projects", "flytesnacks", "flytetester"]
          volumeMounts:
            - name: config-volume
              mountPath: /etc/flyte/config
        - name: sync-cluster-resources
          image: docker.io/lyft/flyteadmin:v0.1.1
          imagePullPolicy: IfNotPresent
          command: ["flyteadmin", "--logtostderr", "--config", "/etc/flyte/config/flyteadmin_config.yaml", "clusterresource", "sync"]
          volumeMounts:
            - name: resource-templates
              mountPath: /etc/flyte/clusterresource/templates
            - name: config-volume
              mountPath: /etc/flyte/config
      containers:
        - name: flyteadmin
          resources:
            limits:
              memory: "200Mi"
              cpu: "0.1"
              ephemeral-storage: "100Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: flyteadmin
  namespace: flyte
spec:
  ports:
    - name: redoc
      protocol: TCP
      port: 87
      targetPort: 8087
@@ -0,0 +1,7 @@
apiVersion: v1
kind: Namespace
metadata:
  name: {{ namespace }}
spec:
  finalizers:
    - kubernetes
30 changes: 30 additions & 0 deletions kustomize/overlays/gke-single-cluster/admindeployment/cron.yaml
@@ -0,0 +1,30 @@
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: syncresources
  namespace: flyte
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: flyteadmin
          containers:
            - name: sync-cluster-resources
              image: docker.io/lyft/flyteadmin:v0.1.1
              imagePullPolicy: IfNotPresent
              command: ["flyteadmin", "--logtostderr", "--config", "/etc/flyte/config/flyteadmin_config.yaml", "clusterresource", "sync"]
              volumeMounts:
                - name: resource-templates
                  mountPath: /etc/flyte/clusterresource/templates
                - name: config-volume
                  mountPath: /etc/flyte/config
          volumes:
            - name: resource-templates
              configMap:
                name: clusterresource-template
            - name: config-volume
              configMap:
                name: flyte-admin-config
          restartPolicy: OnFailure
@@ -0,0 +1,86 @@
logger:
  show-source: true
  level: 5
application:
  httpPort: 8088
  grpcPort: 8089
flyteadmin:
  roleNameKey: "iam.amazonaws.com/role"
  profilerPort: 10254
  metricsScope: "flyte:"
  metadataStoragePrefix:
    - "metadata"
    - "admin"
  testing:
    host: http://flyteadmin
database:
  # Create a database like postgres and override these values
  port: 5432
  username: postgres
  password: awesomesauce
  # Recommended: use passwordPath and mount the password using Kubernetes secrets or the like
  # passwordPath: "/var/run/CREDENTIALS_DB_PASSWORD"
  # host here is the IP address of the Cloud SQL DB in private mode
  host: 10.23.0.3
  dbname: flyte
  options: sslmode=disable
storage:
  type: stow
  stow:
    kind: google
    config:
      scopes: ""
      project_id: flyte-sandbox
      json: ""
  container: "flyte-sandbox"
task_resources:
  defaults:
    cpu: 200m
    gpu: 0
    memory: 500Mi
    storage: 100Mi
  limits:
    cpu: 62
    gpu: 8
    memory: 256Gi
    storage: 5Gi
domains:
  - id: development
    name: development
  - id: staging
    name: staging
  - id: production
    name: production
  - id: domain
Review comment (Contributor): let's nix this.
    name: domain
registration:
  maxWorkflowNodes: 100
scheduler:
  eventScheduler:
    scheme: local
    region: "us-east-1"
    scheduleRole: "arn:aws:iam::173840052742:role/mbadmin-development-scheduler"
    targetName: "arn:aws:sqs:us-east-1:173840052742:flyteadmin-development-scheduler"
  workflowExecutor:
    scheme: local
    region: "us-east-1"
    scheduleQueueName: "won't-work-locally"
    accountId: "173840052742"
notifications:
  type: local
  region: "us-east-1"
  publisher:
    topicName: "foo"
  processor:
    queueName: "queue"
    accountId: "bar"
  emailer:
    subject: "Notice: Execution \"{{ name }}\" has {{ phase }} in \"{{ domain }}\"."
    sender: "[email protected]"
    body: >
      Execution \"{{ name }}\" has {{ phase }} in \"{{ domain }}\". View details at
      <a href=\http://flyte.lyft.net/projects/{{ project }}/domains/{{ domain }}/executions/{{ name }}>
      http://flyte.lyft.net/projects/{{ project }}/domains/{{ domain }}/executions/{{ name }}</a>. {{ error }}
cluster_resources:
  templatePath: "/etc/flyte/clusterresource/templates"
  refresh: 5m
@@ -0,0 +1,22 @@
bases:
  - ../../../base/admindeployment

namespace: flyte

resources:
  - cron.yaml
  - service.yaml

configMapGenerator:
  # the main admin configmap
  - name: flyte-admin-config
    files:
      - flyteadmin_config.yaml
  # cluster resource templates
  - name: clusterresource-template
    files:
      # Files are read in alphabetical order. To ensure that we create the namespace first, prefix the file name with "aa".
      - clusterresource-templates/aa_namespace.yaml

patches:
  - admindeployment.yaml
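
The alphabetical-ordering comment above suggests how further templates could be sequenced. As a rough sketch, a second template (the file name `ab_quota.yaml` is hypothetical and not part of this change) would sort after `aa_namespace.yaml` and so be read after the namespace template:

```yaml
configMapGenerator:
  - name: clusterresource-template
    files:
      # "aa" sorts first, so the namespace template is read first.
      - clusterresource-templates/aa_namespace.yaml
      # Hypothetical additional template; "ab" sorts after "aa".
      - clusterresource-templates/ab_quota.yaml
```

Only the relevant generator entry is shown; the rest of the kustomization would stay as it is.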
@@ -0,0 +1,8 @@
apiVersion: v1
kind: Service
metadata:
  name: flyteadmin
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
15 changes: 15 additions & 0 deletions kustomize/overlays/gke-single-cluster/console/console.yaml
@@ -0,0 +1,15 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flyteconsole
  namespace: flyte
spec:
  template:
    spec:
      containers:
        - name: flyteconsole
          resources:
            limits:
              memory: "150Mi"
              cpu: "0.1"
              ephemeral-storage: "100Mi"
@@ -0,0 +1,8 @@
bases:
  - ../../../base/console

patches:
  - console.yaml

resources:
  - service.yaml
11 changes: 11 additions & 0 deletions kustomize/overlays/gke-single-cluster/console/service.yaml
@@ -0,0 +1,11 @@
---
# Service
apiVersion: v1
kind: Service
metadata:
  name: flyteconsole
  namespace: flyte
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
30 changes: 30 additions & 0 deletions kustomize/overlays/gke-single-cluster/datacatalog/datacatalog.yaml
@@ -0,0 +1,30 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: datacatalog
  namespace: flyte
spec:
  template:
    spec:
      initContainers:
        - name: check-db-ready
          image: postgres:10.1
          command: ['sh', '-c',
            'until pg_isready -h postgres -p 5432;
            do echo waiting for database; sleep 2; done;']
          volumeMounts:
            - name: config-volume
              mountPath: /etc/datacatalog/config
      containers:
        - name: datacatalog
          resources:
            limits:
              memory: "200Mi"
              cpu: "0.1"
              ephemeral-storage: "100Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: datacatalog
  namespace: flyte
@@ -0,0 +1,28 @@
logger:
  show-source: true
  level: 5
datacatalog:
  storage-prefix: metadata/datacatalog
  metrics-scope: "datacatalog"
  profiler-port: 10254
application:
  grpcPort: 8089
storage:
  connection:
    access-key: minio
    auth-type: accesskey
    disable-ssl: true
    endpoint: http://minio.flyte.svc.cluster.local:9000
    region: us-east-1
    secret-key: miniostorage
  cache:
    max_size_mbs: 10
    target_gc_percent: 100
  container: my-container
  type: minio
database:
  port: 5432
  username: postgres
  host: postgres
  dbname: datacatalog
  options: sslmode=disable
@@ -0,0 +1,12 @@
bases:
  - ../../../base/datacatalog

namespace: flyte

configMapGenerator:
  - name: datacatalog-config
    files:
      - datacatalog_config.yaml

patches:
  - datacatalog.yaml
19 changes: 19 additions & 0 deletions kustomize/overlays/gke-single-cluster/flyte/kustomization.yaml
@@ -0,0 +1,19 @@
bases:
  # global resources
  - ../../../base/namespace
  - ../../../dependencies/database
  - ../../../dependencies/storage

  # user plane / control plane resources
  - ../../../base/ingress
  - ../../../dependencies/contour_ingress_controller
  - ../admindeployment
  - ../datacatalog
  - ../console

  # data plane resources
  - ../../../base/wf_crd
  - ../../../base/operators/spark
  - ../../../base/adminserviceaccount
  - ../propeller
  - ../redis
54 changes: 54 additions & 0 deletions kustomize/overlays/gke-single-cluster/propeller/config.yaml
@@ -0,0 +1,54 @@
propeller:
  metadata-prefix: metadata/propeller
  workers: 4
  max-workflow-retries: 30
  workflow-reeval-duration: 30s
  downstream-eval-duration: 30s
  limit-namespace: "all"
  prof-port: 10254
  metrics-prefix: flyte
  enable-admin-launcher: true
  leader-election:
    lock-config-map:
      name: propeller-leader
      namespace: flyte
    enabled: true
    lease-duration: 15s
    renew-deadline: 10s
    retry-period: 2s
  queue:
    type: batch
    batching-interval: 2s
    batch-size: -1
    queue:
      type: bucket
      rate: 10
      capacity: 100
    sub-queue:
      type: bucket
      rate: 10
      capacity: 100
logger:
  show-source: true
  level: 5
storage:
  type: stow
  stow:
    kind: google
    config:
      scopes: ""
      project_id: flyte-sandbox
      json: ""
  container: "flyte-sandbox"
event:
  type: admin
  rate: 500
  capacity: 1000
admin:
  endpoint: flyteadmin:81
  insecure: true
# TODO: maybe we should disable the catalog cache by default?
catalog-cache:
  endpoint: datacatalog:89
  type: datacatalog
  insecure: true
31 changes: 31 additions & 0 deletions kustomize/overlays/gke-single-cluster/propeller/kustomization.yaml
@@ -0,0 +1,31 @@
bases:
  - ../../../base/propeller

namespace: flyte

configMapGenerator:
  # the main propeller configmap
  - name: flyte-propeller-config
    files:
      - config.yaml
  # the plugin-configmap
  - name: flyte-plugin-config
    files:
      - plugins/config.yaml
  # a configmap for each plugin
  - name: flyte-spark-config
    files:
      - plugins/spark/config.yaml
  - name: flyte-container-config
    files:
      - plugins/container/config.yaml
  - name: flyte-qubole-config
    files:
      - plugins/qubole/config.yaml

patches:
  - propeller.yaml
  # add the volumemount for each plugin configmap
  - plugins/spark/propeller-patch.yaml
  - plugins/container/propeller-patch.yaml
  - plugins/qubole/propeller-patch.yaml
@@ -0,0 +1,22 @@
plugins:
  enabled-plugins:
    - container
    - spark
    - waitable
    - hiveExecutor
    - sidecar
  logs:
    # Log links can link to multiple options
    # #1 Kubernetes dashboard
    kubernetes-enabled: false
    # #2 GCP stackdriver
    stackdriver-enabled: true
    gcp-project: flyte-sandbox
    stackdriver-logresourcename: flyte
  k8s:
    default-annotations:
      # Example annotation that will be applied to every k8s resource launched
      - flyte.lyft.net/deployment: base-google-gke
    # Example Environment variables that will be applied to every container executed on k8s
    default-env-vars:
      - FLYTE_CLOUD_PLATFORM: google
Empty file.
@@ -0,0 +1,17 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flytepropeller
  namespace: flyte
spec:
  template:
    spec:
      volumes:
        - name: container-config-volume
          configMap:
            name: flyte-container-config
      containers:
        - name: flytepropeller
          volumeMounts:
            - name: container-config-volume
              mountPath: /etc/flyte/config-container
@@ -0,0 +1,10 @@
plugins:
  qubole:
    # Either create this file with your username with the real token, or set the QUBOLE_API_KEY environment variable
    # See the secrets_manager.go file in the plugins repo for usage. Since the dev/test deployment of
    # this has a dummy QUBOLE_API_KEY env var built in, this fake path won't break anything.
    quboleTokenPath: "/Path/To/QUBOLE_CLIENT_TOKEN"
    resourceManagerType: redis
    redisHostPath: redis-resource-manager.flyte:6379
    redisHostKey: mypassword
    quboleLimit: 10
@@ -0,0 +1,20 @@
# This file is only for volume mounts. The configmap itself that's being mounted is sufficiently different that
# there's no benefit to having it in this folder, since the entire thing gets overridden anyways.
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flytepropeller
  namespace: flyte
spec:
  template:
    spec:
      volumes:
        - name: qubole-config-volume
          configMap:
            name: flyte-qubole-config
      containers:
        - name: flytepropeller
          volumeMounts:
            - name: qubole-config-volume
              mountPath: /etc/flyte/config-qubole
@@ -0,0 +1,16 @@
plugins:
  spark:
    spark-config-default:
      - spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version: "2"
      - spark.kubernetes.allocation.batch.size: "50"
      - spark.hadoop.fs.s3a.acl.default: "BucketOwnerFullControl"
      - spark.hadoop.fs.s3n.impl: "org.apache.hadoop.fs.s3a.S3AFileSystem"
      - spark.hadoop.fs.AbstractFileSystem.s3n.impl: "org.apache.hadoop.fs.s3a.S3A"
      - spark.hadoop.fs.s3.impl: "org.apache.hadoop.fs.s3a.S3AFileSystem"
      - spark.hadoop.fs.AbstractFileSystem.s3.impl: "org.apache.hadoop.fs.s3a.S3A"
      - spark.hadoop.fs.s3a.impl: "org.apache.hadoop.fs.s3a.S3AFileSystem"
      - spark.hadoop.fs.AbstractFileSystem.s3a.impl: "org.apache.hadoop.fs.s3a.S3A"
      - spark.hadoop.fs.s3a.multipart.threshold: "536870912"
      - spark.blacklist.enabled: "true"
      - spark.blacklist.timeout: "5m"
      - spark.task.maxfailures: "8"
@@ -0,0 +1,17 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flytepropeller
  namespace: flyte
spec:
  template:
    spec:
      volumes:
        - name: spark-config-volume
          configMap:
            name: flyte-spark-config
      containers:
        - name: flytepropeller
          volumeMounts:
            - name: spark-config-volume
              mountPath: /etc/flyte/config-spark
18 changes: 18 additions & 0 deletions kustomize/overlays/gke-single-cluster/propeller/propeller.yaml
@@ -0,0 +1,18 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flytepropeller
  namespace: flyte
spec:
  template:
    spec:
      containers:
        - name: flytepropeller
          env:
            - name: QUBOLE_API_KEY
Review comment: What happens if we don't use QUBOLE?
              value: notarealkey
          resources:
            limits:
              memory: "100Mi"
              cpu: "0.1"
              ephemeral-storage: "100Mi"
@@ -0,0 +1,5 @@
bases:
  - ../../../dependencies/redis

patches:
  - storage.yaml
11 changes: 11 additions & 0 deletions kustomize/overlays/gke-single-cluster/redis/storage.yaml
@@ -0,0 +1,11 @@
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
  namespace: flyte
spec:
  template:
    spec:
      volumes:
        - name: redis-data
          emptyDir: {}