Skip to content

Commit

Permalink
Merge branch 'main' into feat/crd-additional-annotations
Browse files Browse the repository at this point in the history
  • Loading branch information
victor-keltio authored Dec 10, 2024
2 parents cd0968a + f28522f commit 2d73a1b
Show file tree
Hide file tree
Showing 163 changed files with 15,363 additions and 7,483 deletions.
2 changes: 1 addition & 1 deletion .github/actions/e2e/run-tests-private-cluster/action.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -93,7 +93,7 @@ runs:
CLUSTER_VPC_ID: ${{ env.CLUSTER_VPC_ID }}
EKS_CLUSTER_SG: ${{ env.EKS_CLUSTER_SG }}
CLEANUP: ${{ inputs.cleanup }}
uses: aws-actions/aws-codebuild-run-build@540238d832197229bfaab9785feb4fb8450f6396 # v1.0.17
uses: aws-actions/aws-codebuild-run-build@4d15a47425739ac2296ba5e7eee3bdd4bfbdd767 # v1.0.18
with:
project-name: E2EPrivateClusterCodeBuildProject-us-east-1
buildspec-override: |
Expand Down
2 changes: 1 addition & 1 deletion .github/actions/install-deps/action.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ runs:
# Root path permission workaround for caching https://github.com/actions/cache/issues/845#issuecomment-1252594999
- run: sudo chown "$USER" /usr/local
shell: bash
- uses: actions/cache@6849a6489940f00c2f30c0fb92c6274307ccb58a # v4.1.2
- uses: actions/cache@1bd1e32a3bdc45362d1e726936510720a7c30a57 # v4.2.0
id: cache-toolchain
with:
path: |
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/e2e-matrix.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,7 @@ jobs:
statuses: write # ./.github/actions/commit-status/start
uses: ./.github/workflows/e2e-upgrade.yaml
with:
from_git_ref: 2f4cebea345e6a399ea00149ec7a41269739bb3b
from_git_ref: 2fb10b6d330ac9662bd35dc81124c7666f66e453
to_git_ref: ${{ inputs.git_ref }}
region: ${{ inputs.region }}
k8s_version: ${{ inputs.k8s_version }}
Expand Down
2 changes: 2 additions & 0 deletions ADOPTERS.md
Original file line number Diff line number Diff line change
Expand Up @@ -31,8 +31,10 @@ If you are open to others contacting you about your use of Karpenter on Slack, a
| GlobalDots | Using Karpenter to scale Kubernetes clusters for a lot of our clients & for internal needs | `@vainkop` | [GlobalDots](https://globaldots.com) |
| Grafana Labs | Using Karpenter as our Autoscaling tool on EKS | `@paulajulve`, `@logyball` | [Homepage](https://grafana.com/) & [Blog](https://grafana.com/blog/2023/11/09/how-grafana-labs-switched-to-karpenter-to-reduce-costs-and-complexities-in-amazon-eks/) |
| H2O.ai | Dynamically scaling CPU and GPU nodes for AI workloads | `@Ophir Zahavi`, `@Asaf Oren` | [H2O.ai](https://h2o.ai/) |
| HENNGE K.K. | Dynamically scaling production workloads in Tokyo region | `@furqan.habibi`, `@Hans Gunawan` | [HENNGE](https://hennge.com/global/) |
| Homa | Using Karpenter to manage dynamically big instances and save cost effectively with disruptions | `@afreyermuth98`, `@alexbescond` | [Homa](https://www.homagames.com/) |
| idealo | Scaling multi-arch IPv6 clusters hosting web and event-driven applications | `@Heiko Rothe` | [Homepage](https://www.idealo.de) |
| Kaltura | Using karpenter to deliver video to millions of end users | `@Ido Ziv` | [Homepage](https://corp.kaltura.com/) |
| Livspace | Replacement for cluster autoscaler on production and staging EKS clusters | `@praveen-livspace` | [Homepage](https://www.livspace.com) |
| Nexxiot | Easier, Safer, Cleaner Global Transportation - Using Karpenter to manage EKS nodes | `@Alex Berger` | [Homepage](https://nexxiot.com/) |
| Nirvana Money | Building healthy, happy financial lives - Using Karpenter to manage all-Spot clusters | `@DWSR` | [Homepage](https://www.nirvana.money/) |
Expand Down
4 changes: 2 additions & 2 deletions charts/karpenter-crd/Chart.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,8 @@ apiVersion: v2
name: karpenter-crd
description: A Helm chart for Karpenter Custom Resource Definitions (CRDs).
type: application
version: 1.0.0
appVersion: 1.0.0
version: 1.1.0
appVersion: 1.1.0
keywords:
- cluster
- node
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -118,7 +118,7 @@ spec:
additionalProperties:
type: string
description: |-
Tags is a map of key/value tags used to select subnets
Tags is a map of key/value tags used to select amis.
Specifying '*' for a value selects all values for a given tag key.
maxProperties: 20
type: object
Expand Down Expand Up @@ -488,7 +488,7 @@ spec:
additionalProperties:
type: string
description: |-
Tags is a map of key/value tags used to select subnets
Tags is a map of key/value tags used to select security groups.
Specifying '*' for a value selects all values for a given tag key.
maxProperties: 20
type: object
Expand Down Expand Up @@ -595,6 +595,9 @@ spec:
items:
description: AMI contains resolved AMI selector values utilized for node launch
properties:
deprecated:
description: Deprecation status of the AMI
type: boolean
id:
description: ID of the AMI
type: string
Expand Down Expand Up @@ -698,7 +701,7 @@ spec:
type: string
securityGroups:
description: |-
SecurityGroups contains the current Security Groups values that are available to the
SecurityGroups contains the current security group values that are available to the
cluster under the SecurityGroups selectors.
items:
description: SecurityGroup contains resolved SecurityGroup selector values utilized for node launch
Expand All @@ -715,7 +718,7 @@ spec:
type: array
subnets:
description: |-
Subnets contains the current Subnet values that are available to the
Subnets contains the current subnet values that are available to the
cluster under the subnet selectors.
items:
description: Subnet contains resolved Subnet selector values utilized for node launch
Expand Down
4 changes: 2 additions & 2 deletions charts/karpenter/Chart.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,8 @@ apiVersion: v2
name: karpenter
description: A Helm chart for Karpenter, an open-source node provisioning project built for Kubernetes.
type: application
version: 1.0.0
appVersion: 1.0.0
version: 1.1.0
appVersion: 1.1.0
keywords:
- cluster
- node
Expand Down
22 changes: 13 additions & 9 deletions charts/karpenter/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@

A Helm chart for Karpenter, an open-source node provisioning project built for Kubernetes.

![Version: 1.0.0](https://img.shields.io/badge/Version-1.0.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.0.0](https://img.shields.io/badge/AppVersion-1.0.0-informational?style=flat-square)
![Version: 1.1.0](https://img.shields.io/badge/Version-1.1.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.1.0](https://img.shields.io/badge/AppVersion-1.1.0-informational?style=flat-square)

## Documentation

Expand All @@ -15,7 +15,7 @@ You can follow the detailed installation instruction in the [documentation](http
```bash
helm upgrade --install --namespace karpenter --create-namespace \
karpenter oci://public.ecr.aws/karpenter/karpenter \
--version 1.0.0 \
--version 1.1.0 \
--set "serviceAccount.annotations.eks\.amazonaws\.com/role-arn=${KARPENTER_IAM_ROLE_ARN}" \
--set settings.clusterName=${CLUSTER_NAME} \
--set settings.interruptionQueue=${CLUSTER_NAME} \
Expand All @@ -27,13 +27,13 @@ helm upgrade --install --namespace karpenter --create-namespace \
As the OCI Helm chart is signed by [Cosign](https://github.com/sigstore/cosign) as part of the release process you can verify the chart before installing it by running the following command.

```shell
cosign verify public.ecr.aws/karpenter/karpenter:1.0.0 \
cosign verify public.ecr.aws/karpenter/karpenter:1.1.0 \
--certificate-oidc-issuer=https://token.actions.githubusercontent.com \
--certificate-identity-regexp='https://github\.com/aws/karpenter-provider-aws/\.github/workflows/release\.yaml@.+' \
--certificate-github-workflow-repository=aws/karpenter-provider-aws \
--certificate-github-workflow-name=Release \
--certificate-github-workflow-ref=refs/tags/v1.0.0 \
--annotations version=1.0.0
--certificate-github-workflow-ref=refs/tags/v1.1.0 \
--annotations version=1.1.0
```

## Values
Expand All @@ -44,13 +44,14 @@ cosign verify public.ecr.aws/karpenter/karpenter:1.0.0 \
| additionalClusterRoleRules | list | `[]` | Specifies additional rules for the core ClusterRole. |
| additionalLabels | object | `{}` | Additional labels to add into metadata. |
| affinity | object | `{"nodeAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"karpenter.sh/nodepool","operator":"DoesNotExist"}]}]}},"podAntiAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":[{"topologyKey":"kubernetes.io/hostname"}]}}` | Affinity rules for scheduling the pod. If an explicit label selector is not provided for pod affinity or pod anti-affinity one will be created from the pod selector labels. |
| controller.containerName | string | `"controller"` | Distinguishing container name (containerName: karpenter-controller). |
| controller.env | list | `[]` | Additional environment variables for the controller pod. |
| controller.envFrom | list | `[]` | |
| controller.extraVolumeMounts | list | `[]` | Additional volumeMounts for the controller pod. |
| controller.healthProbe.port | int | `8081` | The container port to use for http health probe. |
| controller.image.digest | string | `"sha256:1eb1073b9f4ed804634aabf320e4d6e822bb61c0f5ecfd9c3a88f05f1ca4c5c5"` | SHA256 digest of the controller image. |
| controller.image.digest | string | `"sha256:51bca600197c7c6e6e0838549664b2c12c3f8dd4b23744ab28202ae97ca5aed1"` | SHA256 digest of the controller image. |
| controller.image.repository | string | `"public.ecr.aws/karpenter/controller"` | Repository path to the controller image. |
| controller.image.tag | string | `"1.0.0"` | Tag of the controller image. |
| controller.image.tag | string | `"1.1.0"` | Tag of the controller image. |
| controller.metrics.port | int | `8080` | The container port to use for metrics. |
| controller.resources | object | `{}` | Resources for the controller pod. |
| controller.sidecarContainer | list | `[]` | Additional sidecarContainer config |
Expand All @@ -75,19 +76,22 @@ cosign verify public.ecr.aws/karpenter/karpenter:1.0.0 \
| priorityClassName | string | `"system-cluster-critical"` | PriorityClass name for the pod. |
| replicas | int | `2` | Number of replicas. |
| revisionHistoryLimit | int | `10` | The number of old ReplicaSets to retain to allow rollback. |
| service.annotations | object | `{}` | Additional annotations for the Service. |
| serviceAccount.annotations | object | `{}` | Additional annotations for the ServiceAccount. |
| serviceAccount.create | bool | `true` | Specifies if a ServiceAccount should be created. |
| serviceAccount.name | string | `""` | The name of the ServiceAccount to use. If not set and create is true, a name is generated using the fullname template. |
| serviceMonitor.additionalLabels | object | `{}` | Additional labels for the ServiceMonitor. |
| serviceMonitor.enabled | bool | `false` | Specifies whether a ServiceMonitor should be created. |
| serviceMonitor.endpointConfig | object | `{}` | Configuration on `http-metrics` endpoint for the ServiceMonitor. Not to be used to add additional endpoints. See the Prometheus operator documentation for configurable fields https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#endpoint |
| settings | object | `{"batchIdleDuration":"1s","batchMaxDuration":"10s","clusterCABundle":"","clusterEndpoint":"","clusterName":"","featureGates":{"spotToSpotConsolidation":false},"interruptionQueue":"","isolatedVPC":false,"reservedENIs":"0","vmMemoryOverheadPercent":0.075}` | Global Settings to configure Karpenter |
| settings | object | `{"batchIdleDuration":"1s","batchMaxDuration":"10s","clusterCABundle":"","clusterEndpoint":"","clusterName":"","eksControlPlane":false,"featureGates":{"nodeRepair":false,"spotToSpotConsolidation":false},"interruptionQueue":"","isolatedVPC":false,"reservedENIs":"0","vmMemoryOverheadPercent":0.075}` | Global Settings to configure Karpenter |
| settings.batchIdleDuration | string | `"1s"` | The maximum amount of time with no new ending pods that if exceeded ends the current batching window. If pods arrive faster than this time, the batching window will be extended up to the maxDuration. If they arrive slower, the pods will be batched separately. |
| settings.batchMaxDuration | string | `"10s"` | The maximum length of a batch window. The longer this is, the more pods we can consider for provisioning at one time which usually results in fewer but larger nodes. |
| settings.clusterCABundle | string | `""` | Cluster CA bundle for TLS configuration of provisioned nodes. If not set, this is taken from the controller's TLS configuration for the API server. |
| settings.clusterEndpoint | string | `""` | Cluster endpoint. If not set, will be discovered during startup (EKS only) |
| settings.clusterName | string | `""` | Cluster name. |
| settings.featureGates | object | `{"spotToSpotConsolidation":false}` | Feature Gate configuration values. Feature Gates will follow the same graduation process and requirements as feature gates in Kubernetes. More information here https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/#feature-gates-for-alpha-or-beta-features |
| settings.eksControlPlane | bool | `false` | Marking this true means that your cluster is running with an EKS control plane and Karpenter should attempt to discover cluster details from the DescribeCluster API |
| settings.featureGates | object | `{"nodeRepair":false,"spotToSpotConsolidation":false}` | Feature Gate configuration values. Feature Gates will follow the same graduation process and requirements as feature gates in Kubernetes. More information here https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/#feature-gates-for-alpha-or-beta-features |
| settings.featureGates.nodeRepair | bool | `false` | nodeRepair is ALPHA and is disabled by default. Setting this to true will enable node repair. |
| settings.featureGates.spotToSpotConsolidation | bool | `false` | spotToSpotConsolidation is ALPHA and is disabled by default. Setting this to true will enable spot replacement consolidation for both single and multi-node consolidation. |
| settings.interruptionQueue | string | `""` | Interruption queue is the name of the SQS queue used for processing interruption events from EC2 Interruption handling is disabled if not specified. Enabling interruption handling may require additional permissions on the controller service account. Additional permissions are outlined in the docs. |
| settings.isolatedVPC | bool | `false` | If true then assume we can't reach AWS services which don't have a VPC endpoint This also has the effect of disabling look-ups to the AWS pricing endpoint |
Expand Down
9 changes: 8 additions & 1 deletion charts/karpenter/templates/deployment.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -56,8 +56,11 @@ spec:
{{- if .Values.hostNetwork }}
hostNetwork: true
{{- end }}
{{- with .Values.schedulerName }}
schedulerName: {{ . | quote }}
{{- end }}
containers:
- name: controller
- name: {{ .Values.controller.containerName | default "controller" }}
securityContext:
runAsUser: 65532
runAsGroup: 65532
Expand Down Expand Up @@ -128,6 +131,10 @@ spec:
- name: ISOLATED_VPC
value: "{{ . }}"
{{- end }}
{{- with .Values.settings.eksControlPlane }}
- name: EKS_CONTROL_PLANE
value: "{{ . }}"
{{- end }}
{{- with .Values.settings.vmMemoryOverheadPercent }}
- name: VM_MEMORY_OVERHEAD_PERCENT
value: "{{ . }}"
Expand Down
12 changes: 9 additions & 3 deletions charts/karpenter/values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -59,6 +59,8 @@ terminationGracePeriodSeconds:
# -- Bind the pod to the host network.
# This is required when using a custom CNI.
hostNetwork: false
# -- Specify which Kubernetes scheduler should dispatch the pod.
schedulerName: default-scheduler
# -- Configure the DNS Policy for the pod
dnsPolicy: ClusterFirst
# -- Configure DNS Config for the pod
Expand Down Expand Up @@ -100,13 +102,15 @@ extraVolumes: []
# expirationSeconds: 86400
# path: token
controller:
# -- Distinguishing container name (containerName: karpenter-controller).
containerName: controller
image:
# -- Repository path to the controller image.
repository: public.ecr.aws/karpenter/controller
# -- Tag of the controller image.
tag: 1.0.0
tag: 1.1.0
# -- SHA256 digest of the controller image.
digest: sha256:1eb1073b9f4ed804634aabf320e4d6e822bb61c0f5ecfd9c3a88f05f1ca4c5c5
digest: sha256:51bca600197c7c6e6e0838549664b2c12c3f8dd4b23744ab28202ae97ca5aed1
# -- Additional environment variables for the controller pod.
env: []
# - name: AWS_REGION
Expand Down Expand Up @@ -166,6 +170,8 @@ settings:
# -- If true then assume we can't reach AWS services which don't have a VPC endpoint
# This also has the effect of disabling look-ups to the AWS pricing endpoint
isolatedVPC: false
# Marking this true means that your cluster is running with an EKS control plane and Karpenter should attempt to discover cluster details from the DescribeCluster API
eksControlPlane: false
# -- The VM memory overhead as a percent that will be subtracted from the total memory for all instance types. The value of `0.075` equals to 7.5%.
vmMemoryOverheadPercent: 0.075
# -- Interruption queue is the name of the SQS queue used for processing interruption events from EC2
Expand All @@ -181,6 +187,6 @@ settings:
# -- spotToSpotConsolidation is ALPHA and is disabled by default.
# Setting this to true will enable spot replacement consolidation for both single and multi-node consolidation.
spotToSpotConsolidation: false
# -- nodeRepair is ALPHA and is disabled by default.
# -- nodeRepair is ALPHA and is disabled by default.
# Setting this to true will enable node repair.
nodeRepair: false
1 change: 1 addition & 0 deletions cmd/controller/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -63,6 +63,7 @@ func main() {
op.PricingProvider,
op.AMIProvider,
op.LaunchTemplateProvider,
op.VersionProvider,
op.InstanceTypesProvider,
)...).
Start(ctx)
Expand Down
Loading

0 comments on commit 2d73a1b

Please sign in to comment.