…into remove-post-install-hook
engedaam committed Aug 21, 2024
2 parents b950af4 + bb87d5b commit 9d99148
Showing 65 changed files with 1,080 additions and 738 deletions.
@@ -3,7 +3,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
- controller-gen.kubebuilder.io/version: v0.15.0
+ controller-gen.kubebuilder.io/version: v0.16.1
name: ec2nodeclasses.karpenter.k8s.aws
spec:
group: karpenter.k8s.aws
@@ -164,24 +164,18 @@ spec:
gp2 volumes, this represents the baseline performance of the volume and the
rate at which the volume accumulates I/O credits for bursting.
The following are the supported values for each volume type:
* gp3: 3,000-16,000 IOPS
* io1: 100-64,000 IOPS
* io2: 100-64,000 IOPS
For io1 and io2 volumes, we guarantee 64,000 IOPS only for Instances built
on the Nitro System (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-types.html#ec2-nitro-instances).
Other instance families guarantee performance up to 32,000 IOPS.
This parameter is supported for io1, io2, and gp3 volumes only. This parameter
is not supported for gp2, st1, sc1, or standard volumes.
format: int64
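The IOPS ranges in the description above can be sketched as a small lookup. This is an illustrative check only: the schema in this hunk declares just `format: int64`, and the names `IOPS_RANGES` and `iops_allowed` are hypothetical, not part of Karpenter.

```python
# Supported provisioned-IOPS ranges per volume type, copied from the field
# description. Volume types missing from the table do not accept the parameter.
IOPS_RANGES = {
    "gp3": (3_000, 16_000),
    "io1": (100, 64_000),
    "io2": (100, 64_000),
}


def iops_allowed(volume_type: str, iops: int) -> bool:
    """Return True if `iops` lies in the documented range for `volume_type`."""
    bounds = IOPS_RANGES.get(volume_type)
    if bounds is None:
        # gp2, st1, sc1, and standard volumes do not support the parameter.
        return False
    low, high = bounds
    return low <= iops <= high
```

The sketch does not model the Nitro caveat: per the description, io1 and io2 volumes are only guaranteed 64,000 IOPS on Nitro-based instances and up to 32,000 IOPS elsewhere.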
@@ -204,16 +198,12 @@ spec:
a volume size. The following are the supported volumes sizes for each volume
type:
* gp2 and gp3: 1-16,384
* io1 and io2: 4-16,384
* st1 and sc1: 125-16,384
* standard: 1-1,024
pattern: ^((?:[1-9][0-9]{0,3}|[1-4][0-9]{4}|[5][0-8][0-9]{3}|59000)Gi|(?:[1-9][0-9]{0,3}|[1-5][0-9]{4}|[6][0-3][0-9]{3}|64000)G|([1-9]||[1-5][0-7]|58)Ti|([1-9]||[1-5][0-9]|6[0-3]|64)T)$
type: string
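To see what the `pattern` above accepts, it can be exercised directly. The sketch below copies the regular expression verbatim from the hunk and only splits it across lines for readability. The helper name `volume_size_matches` is hypothetical, and Python's `re` engine is assumed to behave like the API server's validator for this pattern.

```python
import re

# The volumeSize pattern from the CRD, split at its four top-level alternatives.
VOLUME_SIZE_PATTERN = re.compile(
    r"^((?:[1-9][0-9]{0,3}|[1-4][0-9]{4}|[5][0-8][0-9]{3}|59000)Gi"
    r"|(?:[1-9][0-9]{0,3}|[1-5][0-9]{4}|[6][0-3][0-9]{3}|64000)G"
    r"|([1-9]||[1-5][0-7]|58)Ti"
    r"|([1-9]||[1-5][0-9]|6[0-3]|64)T)$"
)


def volume_size_matches(value: str) -> bool:
    """Return True if `value` satisfies the CRD's volumeSize pattern."""
    return VOLUME_SIZE_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["20Gi", "59000Gi", "59001Gi", "64000G", "17Ti", "18Ti", "Ti"]:
        print(candidate, volume_size_matches(candidate))
```

Read as written, the `Ti` and `T` alternatives contain an empty branch (`||`), so a bare `Ti` matches, and the `[1-5][0-7]` branch rejects values such as `18Ti`. The diff does not say whether either behavior is intended.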
@@ -392,14 +382,12 @@ spec:
description: |-
MetadataOptions for the generated launch template of provisioned nodes.
This specifies the exposure of the Instance Metadata Service to
provisioned EC2 nodes. For more information,
see Instance Metadata and User Data
(https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html)
in the Amazon Elastic Compute Cloud User Guide.
Refer to recommended, security best practices
(https://aws.github.io/aws-eks-best-practices/security/docs/iam/#restrict-access-to-the-instance-profile-assigned-to-the-worker-node)
for limiting exposure of Instance Metadata and User Data to pods.
@@ -414,7 +402,6 @@ spec:
nodes. If metadata options is non-nil, but this parameter is not specified,
the default state is "enabled".
If you specify a value of "disabled", instance metadata will not be accessible
on the node.
enum:
@@ -450,14 +437,12 @@ spec:
requests. If metadata options is non-nil, but this parameter is not
specified, the default state is "required".
If the state is optional, one can choose to retrieve instance metadata with
or without a signed token header on the request. If one retrieves the IAM
role credentials without a token, the version 1.0 role credentials are
returned. If one retrieves the IAM role credentials using a valid signed
token, the version 2.0 role credentials are returned.
If the state is "required", one must send a signed token header with any
instance metadata retrieval requests. In this state, retrieving the IAM
role credentials always returns the version 2.0 credentials; the version
@@ -693,12 +678,7 @@ spec:
- Unknown
type: string
type:
- description: |-
- type of condition in CamelCase or in foo.example.com/CamelCase.
- ---
- Many .condition.type values are consistent across resources like Available, but because arbitrary conditions can be
- useful (see .node.status.conditions), the ability to deconflict is important.
- The regex it matches is (dns1123SubdomainFmt/)?(qualifiedNameFmt)
+ description: type of condition in CamelCase or in foo.example.com/CamelCase.
maxLength: 316
pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
type: string
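The condition `type` constraints above (`maxLength: 316` plus the `pattern`) can be tried out with a short check. This is a sketch, not the API server's validation code: the pattern is copied verbatim from the hunk, the helper name `condition_type_valid` is hypothetical, and Python's `re` is assumed to agree with the server-side engine for this expression.

```python
import re

# Optional lowercase DNS-subdomain prefix ending in "/", then a qualified name.
CONDITION_TYPE_PATTERN = re.compile(
    r"^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?"
    r"(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$"
)
MAX_LENGTH = 316  # maxLength from the schema


def condition_type_valid(value: str) -> bool:
    """Return True if `value` passes both the length limit and the pattern."""
    if len(value) > MAX_LENGTH:
        return False
    return CONDITION_TYPE_PATTERN.fullmatch(value) is not None
```

Under this reading, `Ready` and `foo.example.com/CamelCase` pass, while `Foo.example.com/Ready` fails because the prefix must be lowercase, and `Ready-` fails because the name must end in an alphanumeric character.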
@@ -864,24 +844,18 @@ spec:
gp2 volumes, this represents the baseline performance of the volume and the
rate at which the volume accumulates I/O credits for bursting.
The following are the supported values for each volume type:
* gp3: 3,000-16,000 IOPS
* io1: 100-64,000 IOPS
* io2: 100-64,000 IOPS
For io1 and io2 volumes, we guarantee 64,000 IOPS only for Instances built
on the Nitro System (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-types.html#ec2-nitro-instances).
Other instance families guarantee performance up to 32,000 IOPS.
This parameter is supported for io1, io2, and gp3 volumes only. This parameter
is not supported for gp2, st1, sc1, or standard volumes.
format: int64
@@ -904,16 +878,12 @@ spec:
a volume size. The following are the supported volumes sizes for each volume
type:
* gp2 and gp3: 1-16,384
* io1 and io2: 4-16,384
* st1 and sc1: 125-16,384
* standard: 1-1,024
pattern: ^((?:[1-9][0-9]{0,3}|[1-4][0-9]{4}|[5][0-8][0-9]{3}|59000)Gi|(?:[1-9][0-9]{0,3}|[1-5][0-9]{4}|[6][0-3][0-9]{3}|64000)G|([1-9]||[1-5][0-7]|58)Ti|([1-9]||[1-5][0-9]|6[0-3]|64)T)$
type: string
@@ -978,14 +948,12 @@ spec:
description: |-
MetadataOptions for the generated launch template of provisioned nodes.
This specifies the exposure of the Instance Metadata Service to
provisioned EC2 nodes. For more information,
see Instance Metadata and User Data
(https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html)
in the Amazon Elastic Compute Cloud User Guide.
Refer to recommended, security best practices
(https://aws.github.io/aws-eks-best-practices/security/docs/iam/#restrict-access-to-the-instance-profile-assigned-to-the-worker-node)
for limiting exposure of Instance Metadata and User Data to pods.
@@ -1000,7 +968,6 @@ spec:
nodes. If metadata options is non-nil, but this parameter is not specified,
the default state is "enabled".
If you specify a value of "disabled", instance metadata will not be accessible
on the node.
enum:
@@ -1036,14 +1003,12 @@ spec:
requests. If metadata options is non-nil, but this parameter is not
specified, the default state is "required".
If the state is optional, one can choose to retrieve instance metadata with
or without a signed token header on the request. If one retrieves the IAM
role credentials without a token, the version 1.0 role credentials are
returned. If one retrieves the IAM role credentials using a valid signed
token, the version 2.0 role credentials are returned.
If the state is "required", one must send a signed token header with any
instance metadata retrieval requests. In this state, retrieving the IAM
role credentials always returns the version 2.0 credentials; the version
@@ -1269,12 +1234,7 @@ spec:
- Unknown
type: string
type:
- description: |-
- type of condition in CamelCase or in foo.example.com/CamelCase.
- ---
- Many .condition.type values are consistent across resources like Available, but because arbitrary conditions can be
- useful (see .node.status.conditions), the ability to deconflict is important.
- The regex it matches is (dns1123SubdomainFmt/)?(qualifiedNameFmt)
+ description: type of condition in CamelCase or in foo.example.com/CamelCase.
maxLength: 316
pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
type: string
20 changes: 3 additions & 17 deletions charts/karpenter-crd/templates/karpenter.sh_nodeclaims.yaml
@@ -3,7 +3,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
- controller-gen.kubebuilder.io/version: v0.15.0
+ controller-gen.kubebuilder.io/version: v0.16.1
name: nodeclaims.karpenter.sh
spec:
group: karpenter.sh
@@ -262,19 +262,15 @@ spec:
description: |-
TerminationGracePeriod is the maximum duration the controller will wait before forcefully deleting the pods on a node, measured from when deletion is first initiated.
Warning: this feature takes precedence over a Pod's terminationGracePeriodSeconds value, and bypasses any blocked PDBs or the karpenter.sh/do-not-disrupt annotation.
This field is intended to be used by cluster administrators to enforce that nodes can be cycled within a given time period.
When set, drifted nodes will begin draining even if there are pods blocking eviction. Draining will respect PDBs and the do-not-disrupt annotation until the TGP is reached.
Karpenter will preemptively delete pods so their terminationGracePeriodSeconds align with the node's terminationGracePeriod.
If a pod would be terminated without being granted its full terminationGracePeriodSeconds prior to the node timeout,
that pod will be deleted at T = node timeout - pod terminationGracePeriodSeconds.
The feature can also be used to allow maximum time limits for long-running jobs which can delay node termination with preStop hooks.
If left undefined, the controller will wait indefinitely for pods to be drained.
pattern: ^([0-9]+(s|m|h))+$
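The `terminationGracePeriod` pattern and the `T = node timeout - pod terminationGracePeriodSeconds` rule described above can be illustrated with a short sketch. It is not Karpenter's implementation. The helper names are hypothetical, segments are simply summed (the pattern permits any order and repeated units), and clamping the offset at zero when the pod's grace period exceeds the node's is an assumption the description does not state.

```python
import re

# Pattern copied from the CRD: one or more <number><unit> segments.
TGP_PATTERN = re.compile(r"^([0-9]+(s|m|h))+$")
SEGMENT = re.compile(r"([0-9]+)(s|m|h)")
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_grace_period(value: str) -> int:
    """Convert a value such as "1h30m" to seconds, rejecting non-matching input."""
    if TGP_PATTERN.fullmatch(value) is None:
        raise ValueError(f"invalid terminationGracePeriod: {value!r}")
    return sum(int(number) * UNIT_SECONDS[unit] for number, unit in SEGMENT.findall(value))


def pod_delete_offset(node_tgp_seconds: int, pod_tgp_seconds: int) -> int:
    """Seconds after node deletion starts at which the pod would be deleted.

    Implements T = node timeout - pod terminationGracePeriodSeconds.
    Clamping at 0 is an assumption for pods whose grace period is longer
    than the node's.
    """
    return max(0, node_tgp_seconds - pod_tgp_seconds)
```

For example, with a node `terminationGracePeriod` of `1h` and a pod `terminationGracePeriodSeconds` of 600, `pod_delete_offset(parse_grace_period("1h"), 600)` gives 3000, meaning the pod is deleted 50 minutes after node deletion begins so that it still receives its full 10 minutes.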
@@ -350,12 +346,7 @@ spec:
- Unknown
type: string
type:
- description: |-
- type of condition in CamelCase or in foo.example.com/CamelCase.
- ---
- Many .condition.type values are consistent across resources like Available, but because arbitrary conditions can be
- useful (see .node.status.conditions), the ability to deconflict is important.
- The regex it matches is (dns1123SubdomainFmt/)?(qualifiedNameFmt)
+ description: type of condition in CamelCase or in foo.example.com/CamelCase.
maxLength: 316
pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
type: string
@@ -798,12 +789,7 @@ spec:
- Unknown
type: string
type:
- description: |-
- type of condition in CamelCase or in foo.example.com/CamelCase.
- ---
- Many .condition.type values are consistent across resources like Available, but because arbitrary conditions can be
- useful (see .node.status.conditions), the ability to deconflict is important.
- The regex it matches is (dns1123SubdomainFmt/)?(qualifiedNameFmt)
+ description: type of condition in CamelCase or in foo.example.com/CamelCase.
maxLength: 316
pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
type: string
20 changes: 3 additions & 17 deletions charts/karpenter-crd/templates/karpenter.sh_nodepools.yaml
@@ -3,7 +3,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
- controller-gen.kubebuilder.io/version: v0.15.0
+ controller-gen.kubebuilder.io/version: v0.16.1
name: nodepools.karpenter.sh
spec:
group: karpenter.sh
@@ -392,19 +392,15 @@ spec:
description: |-
TerminationGracePeriod is the maximum duration the controller will wait before forcefully deleting the pods on a node, measured from when deletion is first initiated.
Warning: this feature takes precedence over a Pod's terminationGracePeriodSeconds value, and bypasses any blocked PDBs or the karpenter.sh/do-not-disrupt annotation.
This field is intended to be used by cluster administrators to enforce that nodes can be cycled within a given time period.
When set, drifted nodes will begin draining even if there are pods blocking eviction. Draining will respect PDBs and the do-not-disrupt annotation until the TGP is reached.
Karpenter will preemptively delete pods so their terminationGracePeriodSeconds align with the node's terminationGracePeriod.
If a pod would be terminated without being granted its full terminationGracePeriodSeconds prior to the node timeout,
that pod will be deleted at T = node timeout - pod terminationGracePeriodSeconds.
The feature can also be used to allow maximum time limits for long-running jobs which can delay node termination with preStop hooks.
If left undefined, the controller will wait indefinitely for pods to be drained.
pattern: ^([0-9]+(s|m|h))+$
@@ -476,12 +472,7 @@ spec:
- Unknown
type: string
type:
- description: |-
- type of condition in CamelCase or in foo.example.com/CamelCase.
- ---
- Many .condition.type values are consistent across resources like Available, but because arbitrary conditions can be
- useful (see .node.status.conditions), the ability to deconflict is important.
- The regex it matches is (dns1123SubdomainFmt/)?(qualifiedNameFmt)
+ description: type of condition in CamelCase or in foo.example.com/CamelCase.
maxLength: 316
pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
type: string
@@ -1047,12 +1038,7 @@ spec:
- Unknown
type: string
type:
- description: |-
- type of condition in CamelCase or in foo.example.com/CamelCase.
- ---
- Many .condition.type values are consistent across resources like Available, but because arbitrary conditions can be
- useful (see .node.status.conditions), the ability to deconflict is important.
- The regex it matches is (dns1123SubdomainFmt/)?(qualifiedNameFmt)
+ description: type of condition in CamelCase or in foo.example.com/CamelCase.
maxLength: 316
pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
type: string
2 changes: 1 addition & 1 deletion designs/node-ownership.md
@@ -17,7 +17,7 @@ _Note: This internal Machine CR will come in as an alpha API and an internal des

## Background

- Karpenter currently creates the node object on the Kubernetes api server immediately after creating the VM instance. Kubernetes cloud providers (EKS, AKS, GKE, etc.) assume that, ultimately, the kubelet will be the entity responsible for registering the node to the api-server. This is reflected [through the userData](https://github.com/awslabs/amazon-eks-ami/blob/master/files/bootstrap.sh) where KubeletConfig can be set [that is only properly propogated for all values when the kubelet is the node creator](https://github.com/kubernetes/kubernetes/blob/39c76ba2edeadb84a115cc3fbd9204a2177f1c28/pkg/kubelet/kubelet_node_status.go#L286). However, Karpenter’s current architecture necessitates that it both launches the VM instance and creates the node object on the Kubernetes API server in succession (more on this [below](#why-does-karpenter-createoperate-on-the-node-at-all)).
+ Karpenter currently creates the node object on the Kubernetes api server immediately after creating the VM instance. Kubernetes cloud providers (EKS, AKS, GKE, etc.) assume that, ultimately, the kubelet will be the entity responsible for registering the node to the api-server. This is reflected [through the userData](https://github.com/awslabs/amazon-eks-ami/blob/master/files/bootstrap.sh) where KubeletConfig can be set [that is only properly propagated for all values when the kubelet is the node creator](https://github.com/kubernetes/kubernetes/blob/39c76ba2edeadb84a115cc3fbd9204a2177f1c28/pkg/kubelet/kubelet_node_status.go#L286). However, Karpenter’s current architecture necessitates that it both launches the VM instance and creates the node object on the Kubernetes API server in succession (more on this [below](#why-does-karpenter-createoperate-on-the-node-at-all)).

This document describes the current node creation flow for Karpenter as well as the rationale for why Karpenter originally created the node object. It then calls out the specific problems with this approach and recommends an alternative approach to creating the Node object that solves for the current approach’s problems.
