Deploy CarbonPlan cluster with kops + jsonnet #389
Each kops cluster we want to deploy will have some things in common, but many differences. 2i2c engineers should be able to create and manage these without a lot of cognitive overhead.

1. All clusters should have some basic opinionated defaults.
2. But everything should be customizable as necessary.
3. kops configuration is already declarative, and we should *not* add another layer on top.

jsonnet helps with this. It's a powerful declarative language that generates JSON, with conditionals, prototype inheritance, deep merging, maps, etc.

In this PR, we create two base objects that can be customized: a `cluster` object that generates a kops Cluster object, and an `instancegroup` object that generates a kops InstanceGroup object. We use these to generate our kops config in carbonplan.jsonnet. jsonnet is only used for some light abstraction and deep merging; there is no extra layer over the kops configuration itself.
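As a rough illustration of the deep merging described above, here is a minimal sketch. The `base` object and its layout are assumptions made for this example, not the library objects from this PR, and the `iam` override is purely illustrative. Only the field names and values echo the rendered output.

```jsonnet
// Hypothetical base object; the real `cluster` object in this PR may be
// structured differently.
local base = {
  apiVersion: 'kops.k8s.io/v1alpha2',
  kind: 'Cluster',
  spec: {
    channel: 'stable',
    cloudProvider: 'aws',
    iam: { allowContainerRegistry: true, legacy: false },
  },
};

// `base { ... }` is shorthand for `base + { ... }`.
base {
  metadata: { name: 'carbonplanhub.k8s.local' },
  // `+:` merges into the inherited spec, so channel and cloudProvider survive.
  // A plain `spec:` here would replace the whole inherited spec instead.
  spec+: {
    kubernetesVersion: '1.19.7',
    // Nested `+:` overrides one key of iam and keeps the other
    // (illustrative override only).
    iam+: { legacy: true },
  },
}
```

The result is still a plain kops Cluster object: every key in the output is a kops key, which is the sense in which no extra configuration layer is added.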
I've run into kubernetes/kops#11199, which I've fixed with the workaround I recorded there.
- Must always be at least 1. The autoscaler can't scale up from zero, because there would be nowhere to run the autoscaler itself!
- Apply the label that allows our hub core pods to run on this node as well.
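One way to keep the "at least 1" rule from being lost in a later override is an object-level `assert`. This is only a sketch of the idea and is not part of this PR; the `instancegroup` stand-in below is a hypothetical reduction of the real base object.

```jsonnet
// Hypothetical, cut-down stand-in for the `instancegroup` base object.
local instancegroup = {
  spec: { minSize: 0, maxSize: 20 },
};

instancegroup {
  // Evaluated against the final merged object, so a later override that
  // sets minSize back to 0 fails at render time.
  assert self.spec.minSize >= 1 : 'master minSize must be at least 1',
  spec+: {
    role: 'Master',
    minSize: 1,
    maxSize: 3,
    // Label that lets the hub core pods schedule on this node.
    nodeLabels+: { 'hub.jupyter.org/pool-name': 'core-pool' },
  },
}
```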
I can render this to YAML with
---
{
"apiVersion": "kops.k8s.io/v1alpha2",
"kind": "Cluster",
"metadata": {
"name": "carbonplanhub.k8s.local"
},
"spec": {
"api": {
"loadBalancer": {
"class": "Classic",
"type": "Public"
}
},
"authorization": {
"rbac": { }
},
"channel": "stable",
"cloudProvider": "aws",
"clusterAutoscaler": {
"enabled": true
},
"configBase": "s3://2i2c-carbonplan-kops-state",
"containerRuntime": "docker",
"dns": {
"kubeDNS": {
"provider": "CoreDNS"
}
},
"etcdClusters": [
{
"cpuRequest": "200m",
"etcdMembers": [
{
"instanceGroup": "master",
"name": "a"
}
],
"memoryRequest": "100Mi",
"name": "main"
},
{
"cpuRequest": "100m",
"etcdMembers": [
{
"instanceGroup": "master",
"name": "a"
}
],
"memoryRequest": "100Mi",
"name": "events"
}
],
"iam": {
"allowContainerRegistry": true,
"legacy": false
},
"kubeControllerManager": {
"featureGates": {
"LegacyNodeRoleBehavior": "false",
"ServiceNodeExclusion": "false"
}
},
"kubelet": {
"anonymousAuth": false,
"featureGates": {
"LegacyNodeRoleBehavior": "false",
"ServiceNodeExclusion": "false"
}
},
"kubernetesApiAccess": [
"0.0.0.0/0"
],
"kubernetesVersion": "1.19.7",
"masterPublicName": "api.carbonplanhub.k8s.local",
"networkCIDR": "172.20.0.0/16",
"networking": {
"calico": {
"majorVersion": "v3"
}
},
"nonMasqueradeCIDR": "100.64.0.0/10",
"sshAccess": [
"0.0.0.0/0"
],
"subnets": [
{
"cidr": "172.20.32.0/19",
"name": "us-west-2a",
"type": "Public",
"zone": "us-west-2a"
}
],
"topology": {
"dns": {
"type": "Public"
},
"masters": "public",
"nodes": "public"
}
}
}
---
{
"apiVersion": "kops.k8s.io/v1alpha2",
"kind": "InstanceGroup",
"metadata": {
"labels": {
"kops.k8s.io/cluster": "carbonplanhub.k8s.local"
},
"name": "master"
},
"spec": {
"cloudLabels": {
"k8s.io/cluster-autoscaler/node-template/label/hub.jupyter.org/pool-name": "core-pool"
},
"image": "099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20210119.1",
"machineType": "t3.medium",
"maxSize": 3,
"minSize": 1,
"nodeLabels": {
"hub.jupyter.org/pool-name": "core-pool"
},
"role": "Master",
"subnets": [
"us-west-2a"
],
"taints": [ ]
}
}
---
{
"apiVersion": "kops.k8s.io/v1alpha2",
"kind": "InstanceGroup",
"metadata": {
"labels": {
"kops.k8s.io/cluster": "carbonplanhub.k8s.local"
},
"name": "notebook-r5-large"
},
"spec": {
"cloudLabels": {
"k8s.io/cluster-autoscaler/node-template/label/hub.jupyter.org/pool-name": "notebook-r5-large",
"k8s.io/cluster-autoscaler/node-template/taint/hub.jupyter.org/dedicated": "user:NoSchedule",
"k8s.io/cluster-autoscaler/node-template/taint/hub.jupyter.org_dedicated": "user:NoSchedule"
},
"image": "099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20210119.1",
"machineType": "r5.large",
"maxSize": 20,
"minSize": 0,
"nodeLabels": {
"hub.jupyter.org/pool-name": "notebook-r5-large"
},
"role": "Node",
"subnets": [
"us-west-2a"
],
"taints": [
"hub.jupyter.org_dedicated=user:NoSchedule",
"hub.jupyter.org/dedicated=user:NoSchedule"
]
}
}
---
{
"apiVersion": "kops.k8s.io/v1alpha2",
"kind": "InstanceGroup",
"metadata": {
"labels": {
"kops.k8s.io/cluster": "carbonplanhub.k8s.local"
},
"name": "notebook-r5-xlarge"
},
"spec": {
"cloudLabels": {
"k8s.io/cluster-autoscaler/node-template/label/hub.jupyter.org/pool-name": "notebook-r5-xlarge",
"k8s.io/cluster-autoscaler/node-template/taint/hub.jupyter.org/dedicated": "user:NoSchedule",
"k8s.io/cluster-autoscaler/node-template/taint/hub.jupyter.org_dedicated": "user:NoSchedule"
},
"image": "099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20210119.1",
"machineType": "r5.xlarge",
"maxSize": 20,
"minSize": 0,
"nodeLabels": {
"hub.jupyter.org/pool-name": "notebook-r5-xlarge"
},
"role": "Node",
"subnets": [
"us-west-2a"
],
"taints": [
"hub.jupyter.org_dedicated=user:NoSchedule",
"hub.jupyter.org/dedicated=user:NoSchedule"
]
}
}
---
{
"apiVersion": "kops.k8s.io/v1alpha2",
"kind": "InstanceGroup",
"metadata": {
"labels": {
"kops.k8s.io/cluster": "carbonplanhub.k8s.local"
},
"name": "notebook-r5-2xlarge"
},
"spec": {
"cloudLabels": {
"k8s.io/cluster-autoscaler/node-template/label/hub.jupyter.org/pool-name": "notebook-r5-2xlarge",
"k8s.io/cluster-autoscaler/node-template/taint/hub.jupyter.org/dedicated": "user:NoSchedule",
"k8s.io/cluster-autoscaler/node-template/taint/hub.jupyter.org_dedicated": "user:NoSchedule"
},
"image": "099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20210119.1",
"machineType": "r5.2xlarge",
"maxSize": 20,
"minSize": 0,
"nodeLabels": {
"hub.jupyter.org/pool-name": "notebook-r5-2xlarge"
},
"role": "Node",
"subnets": [
"us-west-2a"
],
"taints": [
"hub.jupyter.org_dedicated=user:NoSchedule",
"hub.jupyter.org/dedicated=user:NoSchedule"
]
}
}
---
{
"apiVersion": "kops.k8s.io/v1alpha2",
"kind": "InstanceGroup",
"metadata": {
"labels": {
"kops.k8s.io/cluster": "carbonplanhub.k8s.local"
},
"name": "notebook-r5-8xlarge"
},
"spec": {
"cloudLabels": {
"k8s.io/cluster-autoscaler/node-template/label/hub.jupyter.org/pool-name": "notebook-r5-8xlarge",
"k8s.io/cluster-autoscaler/node-template/taint/hub.jupyter.org/dedicated": "user:NoSchedule",
"k8s.io/cluster-autoscaler/node-template/taint/hub.jupyter.org_dedicated": "user:NoSchedule"
},
"image": "099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20210119.1",
"machineType": "r5.8xlarge",
"maxSize": 20,
"minSize": 0,
"nodeLabels": {
"hub.jupyter.org/pool-name": "notebook-r5-8xlarge"
},
"role": "Node",
"subnets": [
"us-west-2a"
],
"taints": [
"hub.jupyter.org_dedicated=user:NoSchedule",
"hub.jupyter.org/dedicated=user:NoSchedule"
]
}
}
...
It does end with
After that, I did:

kops create -f carbonplan.kops.yaml
kops update cluster carbonplanhub.k8s.local --yes --ssh-public-key ssh-key.pub # I generated a key
kops rolling-update cluster --yes

Tada!
If you add new instancegroups or modify config, you'll need to:

kops replace -f carbonplan.kops.yaml --force # the --force will add new resources too, not just modify existing ones
# If you are deleting an instancegroup, you also need to do `kops delete instancegroup` here
kops update cluster carbonplanhub.k8s.local --yes
kops rolling-update cluster --yes # Create new nodepools if needed
},
_config+:: {
  zone: zone,
  masterInstanceGroupName: data.master.metadata.name
I really like this!
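For readers new to the `_config+::` pattern in the hunk above, here is a minimal sketch of how a hidden field can feed several places in the output. Everything except the names `_config`, `zone` and `masterInstanceGroupName` is an assumption made for illustration.

```jsonnet
// Hypothetical reduction of the pattern; not the PR's actual `cluster` object.
local cluster = {
  // `::` makes the field hidden: usable inside the object, absent from output.
  // `error` values are lazy, so they only fire if a caller forgets to override.
  _config:: {
    zone: error 'zone is required',
    masterInstanceGroupName: error 'masterInstanceGroupName is required',
  },
  spec: {
    // `$` is late-bound, so it sees the merged _config of the final object.
    subnets: [{ name: $._config.zone, zone: $._config.zone, type: 'Public' }],
    etcdClusters: [
      {
        name: n,
        etcdMembers: [{ name: 'a', instanceGroup: $._config.masterInstanceGroupName }],
      }
      for n in ['main', 'events']
    ],
  },
};

cluster {
  // `+::` merges into the hidden field rather than replacing it.
  _config+:: { zone: 'us-west-2a', masterInstanceGroupName: 'master' },
}
```

The caller sets each input once, and the rendered JSON contains only kops fields because `_config` never appears in the output.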
machineType: "t3.medium",
subnets: [zone],
nodeLabels+: {
  "hub.jupyter.org/pool-name": "core-pool"
Heh, this was the label I was missing in my kops cluster (copied from the pangeo hubs).
},
// Needs to be at least 1
minSize: 1,
maxSize: 3,
OK, in the cluster I deployed before, I had min = 0 and max = 1 (copied from the pangeo hubs).
I think the values here are actually more sensible.
    "hub.jupyter.org/dedicated=user:NoSchedule"
  ],
},
} + n for n in nodes
Lovely!
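The `{ ... } + n for n in nodes` shape in the hunk above is an array comprehension that merges shared defaults with each per-node override. Here is a minimal sketch of the idea. The defaults and the `nodes` list are assumptions based on the rendered output, and only the comprehension shape comes from the diff.

```jsonnet
// Hypothetical per-node overrides; each one only states what differs.
local nodes = [
  { metadata+: { name: 'notebook-r5-large' }, spec+: { machineType: 'r5.large' } },
  { metadata+: { name: 'notebook-r5-xlarge' }, spec+: { machineType: 'r5.xlarge' } },
];

// One InstanceGroup per entry: shared defaults deep-merged with the override.
[
  {
    apiVersion: 'kops.k8s.io/v1alpha2',
    kind: 'InstanceGroup',
    metadata: { labels: { 'kops.k8s.io/cluster': 'carbonplanhub.k8s.local' } },
    spec: {
      role: 'Node',
      minSize: 0,
      maxSize: 20,
      subnets: ['us-west-2a'],
      taints: ['hub.jupyter.org/dedicated=user:NoSchedule'],
    },
  } + n
  for n in nodes
]
```

Because each override uses `+:`, adding another instance type is a one-line change to `nodes`, and the shared taints and size limits stay in one place.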
I will only say... I really like jsonnet!
Ref: #291