Deploying NASA Cryo cluster and hub #1768

Merged on Oct 21, 2022 (30 commits)
Changes from 25 commits
Commits
8c398dc
Add jsonnet file and ssh keys for nasa-cryo eksctl cluster
sgibson91 Oct 13, 2022
f3e48e5
Add generated .tfvars file for nasa-cryo
sgibson91 Oct 13, 2022
92ea205
Remove db_instance_identifier from .tfvars file
sgibson91 Oct 14, 2022
99b5329
Add cluster creds for nasa-cryo
sgibson91 Oct 14, 2022
6038f97
Add a minimal cluster.yaml file for nasa-cryo
sgibson91 Oct 14, 2022
a635dd9
Update cluster deployer credentials
sgibson91 Oct 14, 2022
bad6fcc
Add support chart config
sgibson91 Oct 14, 2022
79af854
Add a command to output correct eksctl iam command to run
yuvipanda Oct 14, 2022
c6e5a76
Merge remote-tracking branch 'upstream/master' into nasa-cryo
yuvipanda Oct 14, 2022
140e01b
Remove remnant of merge conflict resolution
sgibson91 Oct 17, 2022
94b835a
Update docs to create a new terraform workspace
sgibson91 Oct 17, 2022
eb9680a
Update cluster creds
sgibson91 Oct 17, 2022
1c5734e
Move eksctl access section to after terraform section, Reference new …
sgibson91 Oct 17, 2022
4da183b
Fix cases where AWS_ variables are set in terminal
yuvipanda Oct 18, 2022
04c5ef2
Setup grafana dashboards for nasa-cryo cluster
sgibson91 Oct 18, 2022
1d7e8bb
Sketch out common.values.yaml for nasa-cryo
sgibson91 Oct 18, 2022
df66638
Begin creating hub definitions in cluster.yaml for nasa-cryo
sgibson91 Oct 18, 2022
c488aa5
Update funded_by section of common values
sgibson91 Oct 18, 2022
3bbfa56
Make the shared directory read only
sgibson91 Oct 18, 2022
6f27234
Add Authenticator config to allow restricting profiles based on GitHu…
sgibson91 Oct 19, 2022
7751d60
Correct the URL for the logo
sgibson91 Oct 19, 2022
793c6e7
Add config for staging hub
sgibson91 Oct 19, 2022
3bbd847
Add config for prod hub
sgibson91 Oct 19, 2022
008b03d
Add new cluster to deploy and validate workflow files
sgibson91 Oct 19, 2022
559c1e3
Remove serviceAccountName from common config
sgibson91 Oct 19, 2022
fedcfb2
Enable autoscaler for the nasa-cryo cluster
yuvipanda Oct 19, 2022
c46e9b5
Add note on AWS quotas to docs
sgibson91 Oct 20, 2022
95bae94
Include warning about enabling the cluster-autoscaler subchart in sup…
sgibson91 Oct 20, 2022
e21b25f
Update team names to correct capitalisation
sgibson91 Oct 21, 2022
f6ba2bd
Update domains to what the community would like to use
sgibson91 Oct 21, 2022
1 change: 1 addition & 0 deletions .github/workflows/deploy-grafana-dashboards.yaml
@@ -24,6 +24,7 @@ jobs:
          - cluster_name: awi-ciroh
          - cluster_name: callysto
          - cluster_name: 2i2c-uk
          - cluster_name: nasa-cryo
    steps:
      - name: Checkout repo
        uses: actions/checkout@v3
1 change: 1 addition & 0 deletions .github/workflows/deploy-hubs.yaml
@@ -156,6 +156,7 @@ jobs:
      failure_linked-earth: "${{ steps.declare-failure-status.outputs.failure_linked-earth }}"
      failure_awi-ciroh: "${{ steps.declare-failure-status.outputs.failure_awi-ciroh }}"
      failure_callysto: "${{ steps.declare-failure-status.outputs.failure_callysto }}"
      failure_nasa-cryo: "${{ steps.declare-failure-status.outputs.failure_nasa-cryo }}"

    # Only run this job on pushes to the default branch and when the job output is not
    # an empty list
1 change: 1 addition & 0 deletions .github/workflows/validate-clusters.yaml
@@ -50,6 +50,7 @@ jobs:
          - cluster_name: linked-earth
          - cluster_name: awi-ciroh
          - cluster_name: callysto
          - cluster_name: nasa-cryo

    steps:
      - uses: actions/checkout@v3
42 changes: 42 additions & 0 deletions config/clusters/nasa-cryo/cluster.yaml
@@ -0,0 +1,42 @@
name: nasa-cryo
provider: aws
aws:
  key: enc-deployer-credentials.secret.json
  clusterType: eks
  clusterName: nasa-cryo
  region: us-west-2
support:
  helm_chart_values_files:
    - support.values.yaml
    - enc-support.secret.values.yaml
hubs:
  - name: staging
    display_name: "NASA Cryo in the Cloud (staging)"
    domain: staging.cryointhecloud.2i2c.cloud
    helm_chart: daskhub
    auth0:
      # connection update? Also ensure the basehub Helm chart is provided a
      # matching value for jupyterhub.custom.2i2c.add_staff_user_ids_of_type!
      enabled: false
    helm_chart_values_files:
      # The order in which you list files here is the order in which they will be
      # passed to the helm upgrade command, and that order has meaning. Please check
      # that you intend for these files to be applied in this order.
      - common.values.yaml
      - staging.values.yaml
      - enc-staging.secret.values.yaml
  - name: prod
    display_name: "NASA Cryo in the Cloud (prod)"
    domain: cryointhecloud.2i2c.cloud
    helm_chart: daskhub
    auth0:
      # connection update? Also ensure the basehub Helm chart is provided a
      # matching value for jupyterhub.custom.2i2c.add_staff_user_ids_of_type!
      enabled: false
    helm_chart_values_files:
      # The order in which you list files here is the order in which they will be
      # passed to the helm upgrade command, and that order has meaning. Please check
      # that you intend for these files to be applied in this order.
      - common.values.yaml
      - prod.values.yaml
      - enc-prod.secret.values.yaml
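
Note on the ordering comment in cluster.yaml: Helm applies multiple values files left to right, and later files take precedence over earlier ones. The following is a minimal Python sketch of that override behaviour, assuming a simplified deep merge in which nested maps merge key by key while scalars and lists are replaced wholesale. The function name `merge_values` and the trimmed-down sample dictionaries are hypothetical and are not part of the deployer or of Helm itself.

```python
from copy import deepcopy


def merge_values(base, override):
    """Return a new dict in which `override` wins over `base`.

    Nested dicts are merged recursively; any other value (scalars, lists)
    from `override` replaces the value in `base`.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical, trimmed-down stand-ins for common.values.yaml and staging.values.yaml
common = {"basehub": {"jupyterhub": {"singleuser": {"defaultUrl": "/lab"}}}}
staging = {
    "basehub": {
        "jupyterhub": {
            "hub": {
                "config": {
                    "GitHubOAuthenticator": {
                        "oauth_callback_url": "https://staging.cryointhecloud.2i2c.cloud/hub/oauth_callback"
                    }
                }
            }
        }
    }
}

# Apply the files in the same order as they are listed under helm_chart_values_files
effective = {}
for values in (common, staging):
    effective = merge_values(effective, values)
```

Because each file only overrides the keys it names, the per-hub files can stay small (here, just the OAuth callback URL) while everything else is inherited from the common file.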
129 changes: 129 additions & 0 deletions config/clusters/nasa-cryo/common.values.yaml
@@ -0,0 +1,129 @@
basehub:
  nfs:
    pv:
      # from https://docs.aws.amazon.com/efs/latest/ug/mounting-fs-nfs-mount-settings.html
      mountOptions:
        - rsize=1048576
        - wsize=1048576
        - timeo=600
        - soft # We pick soft over hard, so NFS lockups don't lead to hung processes
        - retrans=2
        - noresvport
      serverIP: fs-0872256335d483d5f.efs.us-west-2.amazonaws.com
      baseShareName: /
  jupyterhub:
    custom:
      2i2c:
        add_staff_user_ids_to_admin_users: true
        add_staff_user_ids_of_type: "github"
      homepage:
        templateVars:
          org:
            name: Cryo in the Cloud
            logo_url: https://raw.githubusercontent.com/CryoInTheCloud/CryoCloudWebsite/main/cryocloud.png
            url: https://github.com/CryoInTheCloud
          designed_by:
            name: 2i2c
            url: https://2i2c.org
          operated_by:
            name: 2i2c
            url: https://2i2c.org
          # Ideally, this community would like to list more than one funder
          # Issue tracking implementation of this feature:
          # https://github.com/2i2c-org/default-hub-homepage/issues/16
          funded_by:
            name: "NASA ICESat-2 Science Team"
            url: https://icesat-2.gsfc.nasa.gov/science_definition_team
    hub:
      config:
        Authenticator:
          # We are restricting profiles based on GitHub Team membership and
          # so need to persist auth state
          enable_auth_state: true
          # This hub uses GitHub Teams auth and so we don't set
          # allowed_users in order to not deny access to valid members of
          # the listed teams. These people should have admin access though.
          admin_users:
            - tsnow03
            - JessicaS11
            - dfelikson
        JupyterHub:
          authenticator_class: github
        GitHubOAuthenticator:
          # We are restricting profiles based on GitHub Team membership and
          # so need to populate the teams in the auth state
          populate_teams_in_auth_state: true
          allowed_organizations:
            - 2i2c-org:tech-team
            - CryoInTheCloud:CryoCloudUser
            - CryoInTheCloud:CryoCloudAdvanced
          scope:
            - read:org
    singleuser:
      defaultUrl: /lab
      # User image repo: https://github.com/CryoInTheCloud/CryoCloudWebsite/tree/main/conda
      image:
        # This image is available on both Docker Hub and quay.io. We use quay.io
        # here due to its more generous pull rate limits.
        name: quay.io/cryointhecloud/cryocloudwebsite
        tag: "2022.10.12"
      storage:
        extraVolumeMounts:
          - name: home
            mountPath: /home/jovyan/shared
            subPath: _shared
            readOnly: true
      profileList:
        # The mem-guarantees are here so k8s doesn't schedule other pods
        # on these nodes.
        - display_name: "Small: m5.large"
          description: "~2 CPU, ~8G RAM"
          default: true
          allowed_teams:
            - 2i2c-org:tech-team
            - CryoInTheCloud:CryoCloudUser
            - CryoInTheCloud:CryoCloudAdvanced
          kubespawner_override:
            # Explicitly unset mem_limit, so it overrides the default memory limit we set in
            # basehub/values.yaml
            mem_limit: null
            mem_guarantee: 6.5G
            node_selector:
              node.kubernetes.io/instance-type: m5.large
        - display_name: "Medium: m5.xlarge"
          description: "~4 CPU, ~15G RAM"
          allowed_teams:
            - 2i2c-org:tech-team
            - CryoInTheCloud:CryoCloudUser
            - CryoInTheCloud:CryoCloudAdvanced
          kubespawner_override:
            mem_limit: null
            mem_guarantee: 12G
            node_selector:
              node.kubernetes.io/instance-type: m5.xlarge
        - display_name: "Large: m5.2xlarge"
          description: "~8 CPU, ~30G RAM"
          allowed_teams:
            - 2i2c-org:tech-team
            - CryoInTheCloud:CryoCloudAdvanced
          kubespawner_override:
            mem_limit: null
            mem_guarantee: 26G
            node_selector:
              node.kubernetes.io/instance-type: m5.2xlarge
        - display_name: "Huge: m5.8xlarge"
          description: "~32 CPU, ~128G RAM"
          allowed_teams:
            - 2i2c-org:tech-team
            - CryoInTheCloud:CryoCloudAdvanced
          kubespawner_override:
            mem_limit: null
            mem_guarantee: 115G
            node_selector:
              node.kubernetes.io/instance-type: m5.8xlarge
    scheduling:
      userPlaceholder:
        enabled: false
        replicas: 0
      userScheduler:
        enabled: false
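
Note on `allowed_teams` in the profileList above: the intent is that a user only sees the profiles whose `allowed_teams` list includes at least one of the GitHub teams recorded in their auth state. The following Python sketch illustrates that filtering idea only; it is not the hub's actual implementation. The function name `filter_profiles` and the shape of `user_teams` (a list of `org:team` strings) are assumptions made for illustration.

```python
def filter_profiles(profiles, user_teams):
    """Return the profiles whose allowed_teams overlap with the user's teams.

    `profiles` is a list of dicts shaped like the profileList entries;
    `user_teams` is an iterable of "org:team" strings (assumed format).
    """
    teams = set(user_teams)
    return [p for p in profiles if teams & set(p.get("allowed_teams", []))]


# Trimmed-down copies of two of the profileList entries
profiles = [
    {
        "display_name": "Small: m5.large",
        "allowed_teams": [
            "2i2c-org:tech-team",
            "CryoInTheCloud:CryoCloudUser",
            "CryoInTheCloud:CryoCloudAdvanced",
        ],
    },
    {
        "display_name": "Large: m5.2xlarge",
        "allowed_teams": [
            "2i2c-org:tech-team",
            "CryoInTheCloud:CryoCloudAdvanced",
        ],
    },
]

basic_user = [p["display_name"] for p in filter_profiles(profiles, ["CryoInTheCloud:CryoCloudUser"])]
advanced_user = [p["display_name"] for p in filter_profiles(profiles, ["CryoInTheCloud:CryoCloudAdvanced"])]
```

Under this sketch, a member of `CryoInTheCloud:CryoCloudUser` is offered only the small profile, while a member of `CryoInTheCloud:CryoCloudAdvanced` is offered both.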
25 changes: 25 additions & 0 deletions config/clusters/nasa-cryo/enc-deployer-credentials.secret.json
@@ -0,0 +1,25 @@
{
"AccessKey": {
"AccessKeyId": "ENC[AES256_GCM,data:ypqptk+AINfBDTr9yT8XI1i12+A=,iv:ZSzJEMsXfqMDwz1PUQWFD4UmDjxixs0ARfVa//iCOOw=,tag:WkM0TZhtdd4YTOdBGbimGg==,type:str]",
"SecretAccessKey": "ENC[AES256_GCM,data:gh7waVFBVdh5/7cCyq53ZWZm00HQA0W3lxKVmt01/mE9uRHlASLZfw==,iv:6azuIVBjSnEx/aXwi4Xt2LxdFXbKKOGnpAfeLdoEBtM=,tag:+MMmBcnkYXh9udYZBwA1eQ==,type:str]",
"UserName": "ENC[AES256_GCM,data:yecSFFtr/KTIFK5Z6LH4ElihJ7oE7d4=,iv:9A1RIWrNfKoOcDSJmdyvmdfGWTv09p2onTUjTzQpxa4=,tag:FMAmHMsZZ2VZ+VpI6RNqHw==,type:str]"
},
"sops": {
"kms": null,
"gcp_kms": [
{
"resource_id": "projects/two-eye-two-see/locations/global/keyRings/sops-keys/cryptoKeys/similar-hubs",
"created_at": "2022-10-17T10:43:57Z",
"enc": "CiQA4OM7eHvbSMs2Ln1giHqphcJf8uMlHXoFfuThnBwPiA6bYQUSSQDuy/p8VPlHP9sg3d5csL432NPi4NNGwe5VUeeHR9RdIcauiP5KhHiFOI7rVrGcz1SIlW5XTYmQ5Ochj4mbXgGhiZ2qKzn/PCE="
}
],
"azure_kv": null,
"hc_vault": null,
"age": null,
"lastmodified": "2022-10-17T10:43:58Z",
"mac": "ENC[AES256_GCM,data:Fr6V4xZ+3/EAKr9JBFX96ag1eMR8qm7p1HZ8yS+mZfCgWjlciHosSVjgd+gQG84xnilFh+dSpTjy4W8Vev5vfHAAdQZfl7FVg8GjdCfqT3LJabJDQMksdFNHKfux3bBHN4Uw0EB/n3VYJuNg7zocl21PjtLKMSzZtuaR3cruh9M=,iv:pFwAedyytIU1OaQfwfs4ipnAEmGhVXABHxQhSfR8xc8=,tag:vslkISuohP6V4Qlnynwk7Q==,type:str]",
"pgp": null,
"unencrypted_suffix": "_unencrypted",
"version": "3.7.3"
}
}
15 changes: 15 additions & 0 deletions config/clusters/nasa-cryo/enc-grafana-token.secret.yaml
@@ -0,0 +1,15 @@
grafana_token: ENC[AES256_GCM,data:ni6X5Vo/tkhe7yHD3lMz+7NY+rf289VToX5WBwHfH6yEbgzZBM7TRndXYo9wAvlmMVqmsrMsc3Ay7cuYnTUed5zND+lataA/iWjeKoefLxpNiv3uUl6vzYzMiymRUwLElmQe3te8iTWgAwgAY29b/w==,iv:FmihMyr/ojx1U0H4iXq+GYk7QN55zH2TuQRDyCGBWOk=,tag:kiuK3u2ajLRrSzwnDJIVug==,type:str]
sops:
    kms: []
    gcp_kms:
        - resource_id: projects/two-eye-two-see/locations/global/keyRings/sops-keys/cryptoKeys/similar-hubs
          created_at: "2022-10-18T10:32:19Z"
          enc: CiQA4OM7eHhF9zEUU0TU6TneV3t9Lo/iGr79TcOPVLMNzA3YuGUSSQDuy/p8Cg17FCCUEdV+ER7ttX8eNgM09WgCIb6jj4vJgAhvI5OZdPL1t2GOQMJOOeFkYt/dH/ClrttGqJISI1yMhaiZEniCfWU=
    azure_kv: []
    hc_vault: []
    age: []
    lastmodified: "2022-10-18T10:32:20Z"
    mac: ENC[AES256_GCM,data:O6ro11Orl2Y9SCRj+9LWtBN8d6vMFiubXeCdnxxyCEzcqZnzCquVvs68WxSiP+xZCvGlQtJZmO05IFMamVKHBrDfk8P172T5RyjFvZZ8J3VvvTgkGiRLaw2oZXhZrsnqcIxdNSGAUaapuApcDH69pEyhixRLXFFIxRw2yMyXfdk=,iv:JrIPlPvh4sJy2zLehDgneJfnTFi+crsrgC6QtVpwdN8=,tag:Sn1yYKcaGCfzdIRPLoqrRA==,type:str]
    pgp: []
    unencrypted_suffix: _unencrypted
    version: 3.7.3
21 changes: 21 additions & 0 deletions config/clusters/nasa-cryo/enc-prod.secret.values.yaml
@@ -0,0 +1,21 @@
basehub:
    jupyterhub:
        hub:
            config:
                GitHubOAuthenticator:
                    client_id: ENC[AES256_GCM,data:ilEVTmjN83Y3YnOlBKjA9EOfUuM=,iv:C8143cwC6pB0sFueQq7T1XbbmABAHu6kiLnxbZy9hVc=,tag:Jc34Po9xZZShfq5l/2CEbQ==,type:str]
                    client_secret: ENC[AES256_GCM,data:cSq7uA39SxQ9wIXAWp2FkEkGuzPXHCixp6iwRnkya9YngnUhufpIYg==,iv:dSXXOKhEt/ipOD1bWhcghks+Mjpn2oUexzJhjwEt6ek=,tag:8woToirTzLvX1WqHjbLpGw==,type:str]
sops:
    kms: []
    gcp_kms:
        - resource_id: projects/two-eye-two-see/locations/global/keyRings/sops-keys/cryptoKeys/similar-hubs
          created_at: "2022-10-19T12:21:15Z"
          enc: CiQA4OM7eHhdeYFpxkSDTcwgJSCnGcXyuW+RXURXZyqVtBExJLgSSQDuy/p8cJdVxnALmGegCdGggKwgIqy27Dtr97EyJnjFxmQMGDQsWh3vL/2xXyL4Gw8E1cYoxh7r+ecwI4YmBOO+q6rBIM2fS4I=
    azure_kv: []
    hc_vault: []
    age: []
    lastmodified: "2022-10-19T12:21:15Z"
    mac: ENC[AES256_GCM,data:DyxGjuM/4q9M8zWES9QZcIJ+hD08HPlqyGucuZOrqwLQ51UegEG0hx94bt5ZDDpbAbwDl+0tUj59NKG98dPE0cZfwS+7mKmn7Ym+2l4rmuQpGQuZv2MCUBlVt+E5xZXxOTWV83HIbPgkP8+u/LtlNAhTFR9ehRG/0sFEmeLL5/w=,iv:ANa5xKiHr+06hatYIVsO/EFMiPymuXp8RndXzAcMl0Q=,tag:UDVpNvW8rudwk2x10uFPKw==,type:str]
    pgp: []
    unencrypted_suffix: _unencrypted
    version: 3.7.3
21 changes: 21 additions & 0 deletions config/clusters/nasa-cryo/enc-staging.secret.values.yaml
@@ -0,0 +1,21 @@
basehub:
    jupyterhub:
        hub:
            config:
                GitHubOAuthenticator:
                    client_id: ENC[AES256_GCM,data:sJhtxTkW9/1/ytqDQjBfrC/U5BY=,iv:W7tJJWrTL69fzjr7BZnvvMAc487rpcgquytIzh8qCxU=,tag:oyb6+YeRPRmh/N9J7lJCog==,type:str]
                    client_secret: ENC[AES256_GCM,data:Mpdx/rFOEZ5J5aHb1tQe1PMB/V0J2NSEdjos+GTek9xeM3iUcIsbqQ==,iv:FLgvbSmzs8n1BXtPKeUp5dqj7k5R4DDr3FtTjFlxoLw=,tag:WzfGnrXWPrWbrjZ4E3Zolg==,type:str]
sops:
    kms: []
    gcp_kms:
        - resource_id: projects/two-eye-two-see/locations/global/keyRings/sops-keys/cryptoKeys/similar-hubs
          created_at: "2022-10-19T10:40:58Z"
          enc: CiQA4OM7eJLTngjXLJHRpB6CqAPHIrdQnOXV+uIL/hvLdhrX3VsSSQDuy/p8Hc7fqevvelfq7RtQEofei3NfeOG0fs0cTWalKp4A9C6/X2gdUhtO0fbxQT7g3489b/vo3un1BKduXwESQjsZMaxGV18=
    azure_kv: []
    hc_vault: []
    age: []
    lastmodified: "2022-10-19T10:40:58Z"
    mac: ENC[AES256_GCM,data:2SZIGenwl1+TxEwZ8afNW/RbfBJd50BrkAVUMFqstp2wdo3xlLY0oUKh88cEJSSeUE3fnB7fiqZru58Q9/yAuavXeMUkF2QTT+t6GEIiHMcoUzXvzrC3xGPt7/1WDRZSsf1uFw0tY3f2yHQFIaQnxY3qpWadSeknPqqvlfDrthg=,iv:IGbuX99q+4AgddeSwhoqXB8NgbZq1ZPxKDfLL+aTOnE=,tag:cKYljtQuEU2WS0v48wzR6w==,type:str]
    pgp: []
    unencrypted_suffix: _unencrypted
    version: 3.7.3
17 changes: 17 additions & 0 deletions config/clusters/nasa-cryo/enc-support.secret.values.yaml
@@ -0,0 +1,17 @@
prometheusIngressAuthSecret:
    username: ENC[AES256_GCM,data:ebPGMvUPCEaDzGY144yJZkI/+0FJuvFxcs0+lxKt1o1s6wMzkL5di+HB3NY98VIOFtbHro0cOBjtFyes3UkgSA==,iv:j4SXSqWNJjXhPcnDep2tRPsdF+9vUG+d9QW0ZRqNxGM=,tag:lAG+hK1dLSwJHVy5tT1kjA==,type:str]
    password: ENC[AES256_GCM,data:zHPuOyaT5+HC+3IKin54as4YI1nJrxCa7huIP85MEKWaHa11lyQLjAW4bQsQY1hBH/0kgW+kzVtkDlcIghtjGA==,iv:8nHsm5KipEKs7o4KV7mBQ31a6xCXMp8oBL7Ip1IWpy4=,tag:tuKlTIAiMCWnuME0EpJJKQ==,type:str]
sops:
    kms: []
    gcp_kms:
        - resource_id: projects/two-eye-two-see/locations/global/keyRings/sops-keys/cryptoKeys/similar-hubs
          created_at: "2022-10-14T10:37:09Z"
          enc: CiQA4OM7eI66DIfP/2zTsYsei1C4UzAv+/lOYjahuA9Bmh8CgXYSSQDuy/p8pHTxiQlqkmY/NUjPdfxlAT5uiYBjNBX3RdE5ikDvQnpY9BwyQy4bDs075GlkLAU4BwMax0iC0s20gPu8+01pphDS0DM=
    azure_kv: []
    hc_vault: []
    age: []
    lastmodified: "2022-10-14T10:37:09Z"
    mac: ENC[AES256_GCM,data:rIOhzKLT0i55m9Ro+1DZ56DFnM5/uyMoGgaKWsgFCCxlIXcbSjwY3opyKjI8yn2JTZjxh1/8aScxHiLfI3tVsa1hYQqnI7HVtmbqR8PYLxi4W7OTWmoc8xoaYGuSD2SWRM7s/JXcZqcBLA3JqqZUiAmfnHfqo6bcU5WclItI6I0=,iv:o+c+SeWnt9E+b9zoDnUr3lxfrDVD7aUNl7fyvK8QVm4=,tag:g01sgSXHjlZEIy3DSrKH5g==,type:str]
    pgp: []
    unencrypted_suffix: _unencrypted
    version: 3.7.3
9 changes: 9 additions & 0 deletions config/clusters/nasa-cryo/prod.values.yaml
@@ -0,0 +1,9 @@
basehub:
  userServiceAccount:
    annotations:
      eks.amazonaws.com/role-arn: arn:aws:iam::574251165169:role/nasa-cryo-prod
  jupyterhub:
    hub:
      config:
        GitHubOAuthenticator:
          oauth_callback_url: https://cryointhecloud.2i2c.cloud/hub/oauth_callback
9 changes: 9 additions & 0 deletions config/clusters/nasa-cryo/staging.values.yaml
@@ -0,0 +1,9 @@
basehub:
  userServiceAccount:
    annotations:
      eks.amazonaws.com/role-arn: arn:aws:iam::574251165169:role/nasa-cryo-staging
  jupyterhub:
    hub:
      config:
        GitHubOAuthenticator:
          oauth_callback_url: https://staging.cryointhecloud.2i2c.cloud/hub/oauth_callback
22 changes: 22 additions & 0 deletions config/clusters/nasa-cryo/support.values.yaml
@@ -0,0 +1,22 @@
prometheusIngressAuthSecret:
  enabled: true

grafana:
  ingress:
    hosts:
      - grafana.cryointhecloud.2i2c.cloud
    tls:
      - secretName: grafana-tls
        hosts:
          - grafana.cryointhecloud.2i2c.cloud

prometheus:
  server:
    ingress:
      enabled: true
      hosts:
        - prometheus.cryointhecloud.2i2c.cloud
      tls:
        - secretName: prometheus-tls
          hosts:
            - prometheus.cryointhecloud.2i2c.cloud