From 1b0b9f5f7b7ee5146ee5432c12aa3e37952cb679 Mon Sep 17 00:00:00 2001 From: Himanshu Sharma <79965161+himanshu-kun@users.noreply.github.com> Date: Sat, 25 Jun 2022 14:55:11 +0530 Subject: [PATCH] Sync with upstream v1.21.3 (#129) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * Set maxAsgNamesPerDescribe to the new maximum value While this was previously effectively limited to 50, `DescribeAutoScalingGroups` now supports fetching 100 ASG per calls on all regions, matching what's documented: https://docs.aws.amazon.com/autoscaling/ec2/APIReference/API_DescribeAutoScalingGroups.html ``` AutoScalingGroupNames.member.N The names of the Auto Scaling groups. By default, you can only specify up to 50 names. You can optionally increase this limit using the MaxRecords parameter. MaxRecords The maximum number of items to return with this call. The default value is 50 and the maximum value is 100. ``` Doubling this halves API calls on large clusters, which should help to prevent throttling. * Break out unmarshal from GenerateEC2InstanceTypes Refactor to allow for optimisation * Optimise GenerateEC2InstanceTypes unmarshal memory usage The pricing json for us-east-1 is currently 129MB. Currently fetching this into memory and parsing results in a large memory footprint on startup, and can lead to the autoscaler being OOMKilled. Change the ReadAll/Unmarshal logic to a stream decoder to significantly reduce the memory use. * use aws sdk to find region * Merge pull request #4274 from kinvolk/imran/cloud-provider-packet-fix Cloud provider[Packet] fixes * Fix templated nodeinfo names collisions in BinpackingNodeEstimator Both upscale's `getUpcomingNodeInfos` and the binpacking estimator now uses the same shared DeepCopyTemplateNode function and inherits its naming pattern, which is great as that fixes a long standing bug. 
Due to that, `getUpcomingNodeInfos` will enrich the cluster snapshots with generated nodeinfos and nodes having predictable names (template name + an incremental ordinal starting at 0) for upcoming nodes. Later, when looking for fitting nodes for unschedulable pods that the upcoming nodes don't satisfy (FitsAnyNodeMatching failing due to node capacity, pod anti-affinity, ...), the binpacking estimator will also build virtual nodes and place them in a snapshot fork to evaluate scheduler predicates. Those temporary virtual nodes are built using the same pattern (template name and an index ordinal also starting at 0) as the one previously used by `getUpcomingNodeInfos`, which means the estimator will generate the same nodeinfo/node names for nodegroups having upcoming nodes. But adding nodes with the same name to an existing cluster snapshot isn't allowed, so the evaluation attempt will fail. In practice this blocks re-upscales for nodegroups having upcoming nodes, which can cause a significant delay. 
* Improve misleading log Signed-off-by: Sylvain Rabot * dont proactively decrement azure cache for unregistered nodes * annotate fakeNodes so that cloudprovider implementations can identify them if needed * move annotations to cloudprovider package * Cluster Autoscaler 1.21.1 * CA - AWS - Instance List Update 03-10-21 - 1.21 release branch * CA - AWS - Instance List Update 29-10-21 - 1.21 release branch * Cluster-Autoscaler update AWS EC2 instance types with g5, m6 and r6 * CA - AWS Instance List Update - 13/12/21 - 1.21 * Merge pull request #4497 from marwanad/add-more-azure-instance-types add more azure instance types * Cluster Autoscaler 1.21.2 * Add `--feature-gates` flag to support scale up on volume limits (CSI migration enabled) Signed-off-by: ialidzhikov * [Cherry pick 1.21] Remove TestDeleteBlob UT Signed-off-by: Zhecheng Li * cherry-pick #4022 [cluster-autoscaler] Publish node group min/max metrics * Skipping metrics tests added in #4022 Each test works in isolation, but they cause panic when the entire suite is run (ex. make test-in-docker), because the underlying metrics library panics when the same metric is registered twice. (cherry picked from commit 52392b3707cb8192bd2841b6f2e8da9678c13fd9) * cherry-pick #4162 and #4172 [cluster-autoscaler]Add flag to control DaemonSet eviction on non-empty nodes & Allow DaemonSet pods to opt in/out from eviction. * CA - AWS Cloud Provider - 1.21 Static Instance List Update 02-06-2022 * fix instance type fallback Instead of logging a fatal error, log a standard error and fall back to loading instance types from the static list. 
* Cluster Autoscaler - 1.21.3 release * FAQ updated * Sync_changes file updated Co-authored-by: Benjamin Pineau Co-authored-by: Adrian Lai Co-authored-by: darkpssngr Co-authored-by: Kubernetes Prow Robot Co-authored-by: Sylvain Rabot Co-authored-by: Marwan Ahmed Co-authored-by: Jakub Tużnik Co-authored-by: GuyTempleton Co-authored-by: sturman <4456572+sturman@users.noreply.github.com> Co-authored-by: Maciek Pytel Co-authored-by: ialidzhikov Co-authored-by: Zhecheng Li Co-authored-by: Shubham Kuchhal Co-authored-by: Todd Neal --- cluster-autoscaler/FAQ.md | 486 +++--- .../SYNC-CHANGES/SYNC-CHANGES-1.21.md | 32 + .../cloudprovider/aws/auto_scaling_test.go | 14 +- .../cloudprovider/aws/aws_cloud_provider.go | 5 +- .../aws/aws_cloud_provider_test.go | 19 + .../cloudprovider/aws/aws_manager.go | 4 +- .../cloudprovider/aws/aws_util.go | 101 +- .../cloudprovider/aws/aws_util_test.go | 119 +- .../cloudprovider/aws/ec2_instance_types.go | 1218 +++++++++++++- .../azure/azure_instance_types.go | 1490 ++++++++++++++++- .../azure/azure_instance_types/gen.go | 3 +- .../cloudprovider/azure/azure_scale_set.go | 17 +- .../azure/azure_scale_set_test.go | 77 + .../cloudprovider/azure/azure_util_test.go | 26 - .../cloudprovider/cloud_provider.go | 10 + .../cloudprovider/packet/README.md | 31 +- .../packet/packet_cloud_provider.go | 13 +- .../packet/packet_manager_rest.go | 44 +- .../clusterstate/clusterstate.go | 9 +- .../config/autoscaling_options.go | 2 + cluster-autoscaler/core/scale_down.go | 15 +- cluster-autoscaler/core/scale_down_test.go | 85 +- cluster-autoscaler/core/scale_up_test.go | 2 +- cluster-autoscaler/core/static_autoscaler.go | 8 +- .../estimator/binpacking_estimator.go | 3 +- cluster-autoscaler/main.go | 9 +- cluster-autoscaler/metrics/metrics.go | 33 +- cluster-autoscaler/metrics/metrics_test.go | 44 + .../utils/daemonset/daemonset.go | 20 + .../utils/daemonset/daemonset_test.go | 70 + .../utils/scheduler/scheduler.go | 6 +- cluster-autoscaler/version/version.go | 
2 +- 32 files changed, 3545 insertions(+), 472 deletions(-) create mode 100644 cluster-autoscaler/metrics/metrics_test.go diff --git a/cluster-autoscaler/FAQ.md b/cluster-autoscaler/FAQ.md index bb699928273d..b8f8efbce099 100644 --- a/cluster-autoscaler/FAQ.md +++ b/cluster-autoscaler/FAQ.md @@ -12,51 +12,57 @@ this document: # Table of Contents: * [Basics](#basics) - * [What is Cluster Autoscaler?](#what-is-cluster-autoscaler) - * [When does Cluster Autoscaler change the size of a cluster?](#when-does-cluster-autoscaler-change-the-size-of-a-cluster) - * [What types of pods can prevent CA from removing a node?](#what-types-of-pods-can-prevent-ca-from-removing-a-node) - * [Which version on Cluster Autoscaler should I use in my cluster?](#which-version-on-cluster-autoscaler-should-i-use-in-my-cluster) - * [Is Cluster Autoscaler an Alpha, Beta or GA product?](#is-cluster-autoscaler-an-alpha-beta-or-ga-product) - * [What are the Service Level Objectives for Cluster Autoscaler?](#what-are-the-service-level-objectives-for-cluster-autoscaler) - * [How does Horizontal Pod Autoscaler work with Cluster Autoscaler?](#how-does-horizontal-pod-autoscaler-work-with-cluster-autoscaler) - * [What are the key best practices for running Cluster Autoscaler?](#what-are-the-key-best-practices-for-running-cluster-autoscaler) - * [Should I use a CPU-usage-based node autoscaler with Kubernetes?](#should-i-use-a-cpu-usage-based-node-autoscaler-with-kubernetes) - * [How is Cluster Autoscaler different from CPU-usage-based node autoscalers?](#how-is-cluster-autoscaler-different-from-cpu-usage-based-node-autoscalers) - * [Is Cluster Autoscaler compatible with CPU-usage-based node autoscalers?](#is-cluster-autoscaler-compatible-with-cpu-usage-based-node-autoscalers) - * [How does Cluster Autoscaler work with Pod Priority and Preemption?](#how-does-cluster-autoscaler-work-with-pod-priority-and-preemption) - * [How does Cluster Autoscaler remove nodes?](#how-does-cluster-autoscaler-remove-nodes) 
+ * [What is Cluster Autoscaler?](#what-is-cluster-autoscaler) + * [When does Cluster Autoscaler change the size of a cluster?](#when-does-cluster-autoscaler-change-the-size-of-a-cluster) + * [What types of pods can prevent CA from removing a node?](#what-types-of-pods-can-prevent-ca-from-removing-a-node) + * [Which version on Cluster Autoscaler should I use in my cluster?](#which-version-on-cluster-autoscaler-should-i-use-in-my-cluster) + * [Is Cluster Autoscaler an Alpha, Beta or GA product?](#is-cluster-autoscaler-an-alpha-beta-or-ga-product) + * [What are the Service Level Objectives for Cluster Autoscaler?](#what-are-the-service-level-objectives-for-cluster-autoscaler) + * [How does Horizontal Pod Autoscaler work with Cluster Autoscaler?](#how-does-horizontal-pod-autoscaler-work-with-cluster-autoscaler) + * [What are the key best practices for running Cluster Autoscaler?](#what-are-the-key-best-practices-for-running-cluster-autoscaler) + * [Should I use a CPU-usage-based node autoscaler with Kubernetes?](#should-i-use-a-cpu-usage-based-node-autoscaler-with-kubernetes) + * [How is Cluster Autoscaler different from CPU-usage-based node autoscalers?](#how-is-cluster-autoscaler-different-from-cpu-usage-based-node-autoscalers) + * [Is Cluster Autoscaler compatible with CPU-usage-based node autoscalers?](#is-cluster-autoscaler-compatible-with-cpu-usage-based-node-autoscalers) + * [How does Cluster Autoscaler work with Pod Priority and Preemption?](#how-does-cluster-autoscaler-work-with-pod-priority-and-preemption) + * [How does Cluster Autoscaler remove nodes?](#how-does-cluster-autoscaler-remove-nodes) * [How to?](#how-to) - * [I'm running cluster with nodes in multiple zones for HA purposes. 
Is that supported by Cluster Autoscaler?](#im-running-cluster-with-nodes-in-multiple-zones-for-ha-purposes-is-that-supported-by-cluster-autoscaler) - * [How can I monitor Cluster Autoscaler?](#how-can-i-monitor-cluster-autoscaler) - * [How can I scale my cluster to just 1 node?](#how-can-i-scale-my-cluster-to-just-1-node) - * [How can I scale a node group to 0?](#how-can-i-scale-a-node-group-to-0) - * [How can I prevent Cluster Autoscaler from scaling down a particular node?](#how-can-i-prevent-cluster-autoscaler-from-scaling-down-a-particular-node) - * [How can I configure overprovisioning with Cluster Autoscaler?](#how-can-i-configure-overprovisioning-with-cluster-autoscaler) + * [I'm running cluster with nodes in multiple zones for HA purposes. Is that supported by Cluster Autoscaler?](#im-running-cluster-with-nodes-in-multiple-zones-for-ha-purposes-is-that-supported-by-cluster-autoscaler) + * [How can I monitor Cluster Autoscaler?](#how-can-i-monitor-cluster-autoscaler) + * [How can I see all the events from Cluster Autoscaler?](#how-can-i-see-all-events-from-cluster-autoscaler) + * [How can I scale my cluster to just 1 node?](#how-can-i-scale-my-cluster-to-just-1-node) + * [How can I scale a node group to 0?](#how-can-i-scale-a-node-group-to-0) + * [How can I prevent Cluster Autoscaler from scaling down a particular node?](#how-can-i-prevent-cluster-autoscaler-from-scaling-down-a-particular-node) + * [How can I prevent Cluster Autoscaler from scaling down non-empty nodes?](#how-can-i-prevent-cluster-autoscaler-from-scaling-down-non-empty-nodes) + * [How can I configure overprovisioning with Cluster Autoscaler?](#how-can-i-configure-overprovisioning-with-cluster-autoscaler) + * [How can I enable/disable eviction for a specific DaemonSet](#how-can-i-enabledisable-eviction-for-a-specific-daemonset) + * [How can I enable Cluster Autoscaler to scale up when Node's max volume count is exceeded (CSI migration 
enabled)?](#how-can-i-enable-cluster-autoscaler-to-scale-up-when-nodes-max-volume-count-is-exceeded-csi-migration-enabled) * [Internals](#internals) - * [Are all of the mentioned heuristics and timings final?](#are-all-of-the-mentioned-heuristics-and-timings-final) - * [How does scale-up work?](#how-does-scale-up-work) - * [How does scale-down work?](#how-does-scale-down-work) - * [Does CA work with PodDisruptionBudget in scale-down?](#does-ca-work-with-poddisruptionbudget-in-scale-down) - * [Does CA respect GracefulTermination in scale-down?](#does-ca-respect-gracefultermination-in-scale-down) - * [How does CA deal with unready nodes?](#how-does-ca-deal-with-unready-nodes) - * [How fast is Cluster Autoscaler?](#how-fast-is-cluster-autoscaler) - * [How fast is HPA when combined with CA?](#how-fast-is-hpa-when-combined-with-ca) - * [Where can I find the designs of the upcoming features?](#where-can-i-find-the-designs-of-the-upcoming-features) - * [What are Expanders?](#what-are-expanders) - * [Does CA respect node affinity when selecting node groups to scale up?](#does-ca-respect-node-affinity-when-selecting-node-groups-to-scale-up) - * [What are the parameters to CA?](#what-are-the-parameters-to-ca) + * [Are all of the mentioned heuristics and timings final?](#are-all-of-the-mentioned-heuristics-and-timings-final) + * [How does scale-up work?](#how-does-scale-up-work) + * [How does scale-down work?](#how-does-scale-down-work) + * [Does CA work with PodDisruptionBudget in scale-down?](#does-ca-work-with-poddisruptionbudget-in-scale-down) + * [Does CA respect GracefulTermination in scale-down?](#does-ca-respect-gracefultermination-in-scale-down) + * [How does CA deal with unready nodes?](#how-does-ca-deal-with-unready-nodes) + * [How fast is Cluster Autoscaler?](#how-fast-is-cluster-autoscaler) + * [How fast is HPA when combined with CA?](#how-fast-is-hpa-when-combined-with-ca) + * [Where can I find the designs of the upcoming 
features?](#where-can-i-find-the-designs-of-the-upcoming-features) + * [What are Expanders?](#what-are-expanders) + * [Does CA respect node affinity when selecting node groups to scale up?](#does-ca-respect-node-affinity-when-selecting-node-groups-to-scale-up) + * [What are the parameters to CA?](#what-are-the-parameters-to-ca) * [Troubleshooting](#troubleshooting) - * [I have a couple of nodes with low utilization, but they are not scaled down. Why?](#i-have-a-couple-of-nodes-with-low-utilization-but-they-are-not-scaled-down-why) - * [How to set PDBs to enable CA to move kube-system pods?](#how-to-set-pdbs-to-enable-ca-to-move-kube-system-pods) - * [I have a couple of pending pods, but there was no scale-up?](#i-have-a-couple-of-pending-pods-but-there-was-no-scale-up) - * [CA doesn’t work, but it used to work yesterday. Why?](#ca-doesnt-work-but-it-used-to-work-yesterday-why) - * [How can I check what is going on in CA ?](#how-can-i-check-what-is-going-on-in-ca-) - * [What events are emitted by CA?](#what-events-are-emitted-by-ca) - * [What happens in scale-up when I have no more quota in the cloud provider?](#what-happens-in-scale-up-when-i-have-no-more-quota-in-the-cloud-provider) + * [I have a couple of nodes with low utilization, but they are not scaled down. Why?](#i-have-a-couple-of-nodes-with-low-utilization-but-they-are-not-scaled-down-why) + * [How to set PDBs to enable CA to move kube-system pods?](#how-to-set-pdbs-to-enable-ca-to-move-kube-system-pods) + * [I have a couple of pending pods, but there was no scale-up?](#i-have-a-couple-of-pending-pods-but-there-was-no-scale-up) + * [CA doesn’t work, but it used to work yesterday. Why?](#ca-doesnt-work-but-it-used-to-work-yesterday-why) + * [How can I check what is going on in CA ?](#how-can-i-check-what-is-going-on-in-ca-) + * [What events are emitted by CA?](#what-events-are-emitted-by-ca) + * [My cluster is below minimum / above maximum number of nodes, but CA did not fix that! 
Why?](#my-cluster-is-below-minimum--above-maximum-number-of-nodes-but-ca-did-not-fix-that-why) + * [What happens in scale-up when I have no more quota in the cloud provider?](#what-happens-in-scale-up-when-i-have-no-more-quota-in-the-cloud-provider) * [Developer](#developer) - * [How can I run e2e tests?](#how-can-i-run-e2e-tests) - * [How should I test my code before submitting PR?](#how-should-i-test-my-code-before-submitting-pr) - * [How can I update CA dependencies (particularly k8s.io/kubernetes)?](#how-can-i-update-ca-dependencies-particularly-k8siokubernetes) + * [What go version should be used to compile CA?](#what-go-version-should-be-used-to-compile-ca) + * [How can I run e2e tests?](#how-can-i-run-e2e-tests) + * [How should I test my code before submitting PR?](#how-should-i-test-my-code-before-submitting-pr) + * [How can I update CA dependencies (particularly k8s.io/kubernetes)?](#how-can-i-update-ca-dependencies-particularly-k8siokubernetes) * [In the context of Gardener](#in-the-context-of-gardener) * [How do I rebase this fork of autoscaler with upstream?](#how-do-i-rebase-this-fork-of-autoscaler-with-upstream) @@ -82,12 +88,12 @@ Cluster Autoscaler decreases the size of the cluster when some nodes are consist * Pods with restrictive PodDisruptionBudget. * Kube-system pods that: - * are not run on the node by default, * - * don't have a [pod disruption budget](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/#how-disruption-budgets-work) set or their PDB is too restrictive (since CA 0.6). + * are not run on the node by default, * + * don't have a [pod disruption budget](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/#how-disruption-budgets-work) set or their PDB is too restrictive (since CA 0.6). * Pods that are not backed by a controller object (so not created by deployment, replica set, job, stateful set etc). * * Pods with local storage. 
* * Pods that cannot be moved elsewhere due to various constraints (lack of resources, non-matching node selectors or affinity, -matching anti-affinity, etc) + matching anti-affinity, etc) * Pods that have the following annotation set: ``` "cluster-autoscaler.kubernetes.io/safe-to-evict": "false" @@ -98,7 +104,7 @@ matching anti-affinity, etc) "cluster-autoscaler.kubernetes.io/safe-to-evict": "true" ``` -__Or__ you have have overridden this behaviour with one of the relevant flags. [See below for more information on these flags.](#what-are-the-parameters-to-ca) +__Or__ you have overridden this behaviour with one of the relevant flags. [See below for more information on these flags.](#what-are-the-parameters-to-ca) ### Which version on Cluster Autoscaler should I use in my cluster? @@ -108,22 +114,22 @@ See [Cluster Autoscaler Releases](https://github.com/kubernetes/autoscaler/tree/ Since version 1.0.0 we consider CA as GA. It means that: - * We have enough confidence that it does what it is expected to do. Each commit goes through a big suite of unit tests - with more than 75% coverage (on average). We have a series of e2e tests that validate that CA works well on - [GCE](https://k8s-testgrid.appspot.com/sig-autoscaling#gce-autoscaling) - and [GKE](https://k8s-testgrid.appspot.com/sig-autoscaling#gke-autoscaling). - Due to the missing testing infrastructure, AWS (or any other cloud provider) compatibility - tests are not the part of the standard development or release procedure. - However there is a number of AWS users who run CA in their production environment and submit new code, patches and bug reports. - * It was tested that CA scales well. CA should handle up to 1000 nodes running 30 pods each. Our testing procedure is described - [here](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/scalability_tests.md). 
- * Most of the pain-points reported by the users (like too short graceful termination support) were fixed, however - some of the less critical feature requests are yet to be implemented. - * CA has decent monitoring, logging and eventing. - * CA tries to handle most of the error situations in the cluster (like cloud provider stockouts, broken nodes, etc). The cases handled can however vary from cloudprovider to cloudprovider. - * CA developers are committed to maintaining and supporting CA in the foreseeable future. - -All of the previous versions (earlier that 1.0.0) are considered beta. +* We have enough confidence that it does what it is expected to do. Each commit goes through a big suite of unit tests + with more than 75% coverage (on average). We have a series of e2e tests that validate that CA works well on + [GCE](https://k8s-testgrid.appspot.com/sig-autoscaling#gce-autoscaling) + and [GKE](https://k8s-testgrid.appspot.com/sig-autoscaling#gke-autoscaling). + Due to the missing testing infrastructure, AWS (or any other cloud provider) compatibility + tests are not the part of the standard development or release procedure. + However there is a number of AWS users who run CA in their production environment and submit new code, patches and bug reports. +* It was tested that CA scales well. CA should handle up to 1000 nodes running 30 pods each. Our testing procedure is described + [here](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/scalability_tests.md). +* Most of the pain-points reported by the users (like too short graceful termination support) were fixed, however + some of the less critical feature requests are yet to be implemented. +* CA has decent monitoring, logging and eventing. +* CA tries to handle most of the error situations in the cluster (like cloud provider stockouts, broken nodes, etc). The cases handled can however vary from cloudprovider to cloudprovider. 
+* CA developers are committed to maintaining and supporting CA in the foreseeable future. + +All of the previous versions (earlier than 1.0.0) are considered beta. ### What are the Service Level Objectives for Cluster Autoscaler? @@ -223,9 +229,9 @@ priority pod preemption. Older versions of CA won't take priorities into account. More about Pod Priority and Preemption: - * [Priority in Kubernetes API](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/scheduling/pod-priority-api.md), - * [Pod Preemption in Kubernetes](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/scheduling/pod-preemption.md), - * [Pod Priority and Preemption tutorial](https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/). +* [Priority in Kubernetes API](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/scheduling/pod-priority-api.md), +* [Pod Preemption in Kubernetes](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/scheduling/pod-preemption.md), +* [Pod Priority and Preemption tutorial](https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/). ### How does Cluster Autoscaler remove nodes? @@ -267,6 +273,16 @@ respectively under `/metrics` and `/health-check`. Metrics are provided in Prometheus format and their detailed description is available [here](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/metrics.md). +### How can I see all events from Cluster Autoscaler? + +By default, the Cluster Autoscaler will deduplicate similar events that occur within a 5 minute +window. This is done to improve scalability performance where many similar events might be +triggered in a short timespan, such as when there are too many unscheduled pods. + +In some cases, such as for debugging or when scalability of events is not an issue, you might +want to see all the events coming from the Cluster Autoscaler. 
In these scenarios you should +use the `--record-duplicated-events` command line flag. + ### How can I scale my cluster to just 1 node? Prior to version 0.6, Cluster Autoscaler was not touching nodes that were running important @@ -276,7 +292,7 @@ CA could not scale the cluster down and the user could end up with a completely If the user configures a [PodDisruptionBudget](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/) for the kube-system pod, then the default strategy of not touching the node running this pod is overridden with PDB settings. So, to enable kube-system pods migration, one should set -[minAvailable](https://kubernetes.io/docs/api-reference/v1.7/#poddisruptionbudgetspec-v1beta1-policy) +[minAvailable](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.21/#poddisruptionbudget-v1-policy) to 0 (or <= N if there are N+1 pod replicas.) See also [I have a couple of nodes with low utilization, but they are not scaled down. Why?](#i-have-a-couple-of-nodes-with-low-utilization-but-they-are-not-scaled-down-why) @@ -313,6 +329,13 @@ It can be added to (or removed from) a node using kubectl: kubectl annotate node cluster-autoscaler.kubernetes.io/scale-down-disabled=true ``` +### How can I prevent Cluster Autoscaler from scaling down non-empty nodes? + +CA might scale down non-empty nodes with utilization below a threshold +(configurable with `--scale-down-utilization-threshold` flag). + +To prevent this behavior, set the utilization threshold to `0`. + ### How can I configure overprovisioning with Cluster Autoscaler? Below solution works since version 1.1 (to be shipped with Kubernetes 1.9). @@ -344,9 +367,9 @@ export ENABLE_POD_PRIORITY=true For AWS using kops, see [this issue](https://github.com/kubernetes/autoscaler/issues/1410#issuecomment-439840945). 2. Define priority class for overprovisioning pods. Priority -1 will be reserved for -overprovisioning pods as it is the lowest priority that triggers scaling clusters. 
Other pods need -to use priority 0 or higher in order to be able to preempt overprovisioning pods. You can use -following definitions. + overprovisioning pods as it is the lowest priority that triggers scaling clusters. Other pods need + to use priority 0 or higher in order to be able to preempt overprovisioning pods. You can use + following definitions. **For 1.10, and below:** @@ -360,10 +383,10 @@ globalDefault: false description: "Priority class used by overprovisioning." ``` -**For 1.11:** +**For 1.11+:** ```yaml -apiVersion: scheduling.k8s.io/v1beta1 +apiVersion: scheduling.k8s.io/v1 kind: PriorityClass metadata: name: overprovisioning @@ -373,16 +396,16 @@ description: "Priority class used by overprovisioning." ``` 3. Change pod priority cutoff in CA to -10 so pause pods are taken into account during scale down -and scale up. Set flag ```expendable-pods-priority-cutoff``` to -10. If you already use priority -preemption then pods with priorities between -10 and -1 won't be best effort anymore. + and scale up. Set flag ```expendable-pods-priority-cutoff``` to -10. If you already use priority + preemption then pods with priorities between -10 and -1 won't be best effort anymore. 4. Create service account that will be used by Horizontal Cluster Proportional Autoscaler which needs -specific roles. More details [here](https://github.com/kubernetes-incubator/cluster-proportional-autoscaler/tree/master/examples#rbac-configurations) + specific roles. More details [here](https://github.com/kubernetes-incubator/cluster-proportional-autoscaler/tree/master/examples#rbac-configurations) 5. Create deployments that will reserve resources. "overprovisioning" deployment will reserve -resources and "overprovisioning-autoscaler" deployment will change the size of reserved resources. 
-You can use following definitions (you need to change service account for "overprovisioning-autoscaler" -deployment to the one created in the previous step): + resources and "overprovisioning-autoscaler" deployment will change the size of reserved resources. + You can use following definitions (you need to change service account for "overprovisioning-autoscaler" + deployment to the one created in the previous step): ```yaml apiVersion: apps/v1 @@ -439,6 +462,41 @@ spec: serviceAccountName: cluster-proportional-autoscaler-service-account ``` +### How can I enable/disable eviction for a specific DaemonSet + +Cluster Autoscaler will evict DaemonSets based on its configuration, which is +common for the entire cluster. It is possible, however, to specify the desired +behavior on a per pod basis. All DaemonSet pods will be evicted when they have +the following annotation. + +``` +"cluster-autoscaler.kubernetes.io/enable-ds-eviction": "true" +``` + +It is also possible to disable DaemonSet pods eviction explicitly: + + +``` +"cluster-autoscaler.kubernetes.io/enable-ds-eviction": "false" +``` + +Note that this annotation needs to be specified on DaemonSet pods, not the +DaemonSet object itself. In order to do that for all DaemonSet pods, it is +sufficient to modify the pod spec in the DaemonSet object. + +This annotation has no effect on pods that are not a part of any DaemonSet. + +### How can I enable Cluster Autoscaler to scale up when Node's max volume count is exceeded (CSI migration enabled)? + +Kubernetes scheduler will fail to schedule a Pod to a Node if the Node's max volume count is exceeded. In such a case, to enable Cluster Autoscaler to scale up in a Kubernetes cluster with [CSI migration](https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/625-csi-migration/README.md) enabled, the appropriate CSI related feature gates have to be specified for the Cluster Autoscaler (if the corresponding feature gates are not enabled by default). 
+ +For example: +``` +--feature-gates=CSIMigration=true,CSIMigration{Provider}=true,InTreePlugin{Provider}Unregister=true +``` + +For a complete list of the feature gates and their default values per Kubernetes versions, refer to the [Feature Gates documentation](https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/). + **************** # Internals @@ -471,30 +529,37 @@ If there are multiple node groups that, if increased, would help with getting so different strategies can be selected for choosing which node group is increased. Check [What are Expanders?](#what-are-expanders) section to learn more about strategies. It may take some time before the created nodes appear in Kubernetes. It almost entirely -depends on the cloud provider and the speed of node provisioning. Cluster -Autoscaler expects requested nodes to appear within 15 minutes +depends on the cloud provider and the speed of node provisioning, including the +[TLS bootstrapping process](https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/). +Cluster Autoscaler expects requested nodes to appear within 15 minutes (configured by `--max-node-provision-time` flag.) After this time, if they are still unregistered, it stops considering them in simulations and may attempt to scale up a different group if the pods are still pending. It will also attempt to remove any nodes left unregistered after this time. +> Note: Cluster Autoscaler is **not** responsible for behaviour and registration +> to the cluster of the new nodes it creates. The responsibility of registering the new nodes +> into your cluster lies with the cluster provisioning tooling you use. +> Example: If you use kubeadm to provision your cluster, it is up to you to automatically +> execute `kubeadm join` at boot time via some script. + ### How does scale-down work? 
Every 10 seconds (configurable by `--scan-interval` flag), if no scale-up is needed, Cluster Autoscaler checks which nodes are unneeded. A node is considered for removal when **all** below conditions hold: -* The sum of cpu and memory requests of all pods running on this node is smaller +* The sum of cpu and memory requests of all pods running on this node (DaemonSet pods and Mirror pods are included by default but this is configurable with `--ignore-daemonsets-utilization` and `--ignore-mirror-pods-utilization` flags) is smaller than 50% of the node's allocatable. (Before 1.1.0, node capacity was used instead of allocatable.) Utilization threshold can be configured using `--scale-down-utilization-threshold` flag. * All pods running on the node (except these that run on all nodes by default, like manifest-run pods -or pods created by daemonsets) can be moved to other nodes. See -[What types of pods can prevent CA from removing a node?](#what-types-of-pods-can-prevent-ca-from-removing-a-node) section for more details on what pods don't fulfill this condition, even if there is space for them elsewhere. -While checking this condition, the new locations of all movable pods are memorized. -With that, Cluster Autoscaler knows where each pod can be moved, and which nodes -depend on which other nodes in terms of pod migration. Of course, it may happen that eventually -the scheduler will place the pods somewhere else. + or pods created by daemonsets) can be moved to other nodes. See + [What types of pods can prevent CA from removing a node?](#what-types-of-pods-can-prevent-ca-from-removing-a-node) section for more details on what pods don't fulfill this condition, even if there is space for them elsewhere. + While checking this condition, the new locations of all movable pods are memorized. + With that, Cluster Autoscaler knows where each pod can be moved, and which nodes + depend on which other nodes in terms of pod migration. 
Of course, it may happen that eventually + the scheduler will place the pods somewhere else. * It doesn't have scale-down disabled annotation (see [How can I prevent Cluster Autoscaler from scaling down a particular node?](#how-can-i-prevent-cluster-autoscaler-from-scaling-down-a-particular-node)) @@ -510,6 +575,17 @@ What happens when a non-empty node is terminated? As mentioned above, all pods s elsewhere. Cluster Autoscaler does this by evicting them and tainting the node, so they aren't scheduled there again. +DaemonSet pods may also be evicted. This can be configured separately for empty +(i.e. containing only DaemonSet pods) and non-empty nodes with +`--daemonset-eviction-for-empty-nodes` and +`--daemonset-eviction-for-occupied-nodes` flags, respectively. Note that the +default behavior is different on each flag: by default DaemonSet pods eviction +will happen only on occupied nodes. Individual DaemonSet pods can also +explicitly choose to be evicted (or not). See [How can I enable/disable eviction +for a specific +DaemonSet](#how-can-i-enabledisable-eviction-for-a-specific-daemonset) for more +details. + Example scenario: Nodes A, B, C, X, Y. @@ -607,22 +683,27 @@ Expanders can be selected by passing the name to the `--expander` flag, i.e. Currently Cluster Autoscaler has 5 expanders: * `random` - this is the default expander, and should be used when you don't have a particular -need for the node groups to scale differently. + need for the node groups to scale differently. * `most-pods` - selects the node group that would be able to schedule the most pods when scaling -up. This is useful when you are using nodeSelector to make sure certain pods land on certain nodes. -Note that this won't cause the autoscaler to select bigger nodes vs. smaller, as it can add multiple -smaller nodes at once. + up. This is useful when you are using nodeSelector to make sure certain pods land on certain nodes. 
+ Note that this won't cause the autoscaler to select bigger nodes vs. smaller, as it can add multiple + smaller nodes at once. * `least-waste` - selects the node group that will have the least idle CPU (if tied, unused memory) -after scale-up. This is useful when you have different classes of nodes, for example, high CPU or high memory nodes, and only want to expand those when there are pending pods that need a lot of those resources. + after scale-up. This is useful when you have different classes of nodes, for example, high CPU or high memory nodes, and only want to expand those when there are pending pods that need a lot of those resources. * `price` - select the node group that will cost the least and, at the same time, whose machines -would match the cluster size. This expander is described in more details -[HERE](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/pricing.md). Currently it works only for GCE and GKE (patches welcome.) + would match the cluster size. This expander is described in more details + [HERE](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/pricing.md). Currently it works only for GCE, GKE and Equinix Metal (patches welcome.) * `priority` - selects the node group that has the highest priority assigned by the user. It's configuration is described in more details [here](expander/priority/readme.md) +From 1.23.0 onwards, multiple expanders may be passed, i.e. +`./cluster-autoscaler --expander=priority,least-waste` + +This will cause the `least-waste` expander to be used as a fallback in the event that the priority expander selects multiple node groups. In general, a list of expanders can be used, where the output of one is passed to the next and the final decision is made by randomly selecting one. An expander must not appear in the list more than once. + ### Does CA respect node affinity when selecting node groups to scale up? 
CA respects `nodeSelector` and `requiredDuringSchedulingIgnoredDuringExecution` in nodeAffinity given that you have labelled your node groups accordingly. If there is a pod that cannot be scheduled with either `nodeSelector` or `requiredDuringSchedulingIgnoredDuringExecution` specified, CA will only consider node groups that satisfy those requirements for expansion. @@ -635,59 +716,65 @@ However, CA does not consider "soft" constraints like `preferredDuringScheduling The following startup parameters are supported for cluster autoscaler: -| Parameter | Description | Default | -| --- | --- | --- | -| `cluster-name` | Autoscaled cluster name, if available | "" -| `address` | The address to expose prometheus metrics | :8085 -| `kubernetes` | Kubernetes API Server location. Leave blank for default | "" -| `kubeconfig` | Path to kubeconfig file with authorization and API Server location information | "" -| `cloud-config` | The path to the cloud provider configuration file. Empty string for no configuration file | "" -| `namespace` | Namespace in which cluster-autoscaler run | "kube-system" -| `scale-down-enabled` | Should CA scale down the cluster | true -| `scale-down-delay-after-add` | How long after scale up that scale down evaluation resumes | 10 minutes -| `scale-down-delay-after-delete` | How long after node deletion that scale down evaluation resumes, defaults to scan-interval | scan-interval -| `scale-down-delay-after-failure` | How long after scale down failure that scale down evaluation resumes | 3 minutes -| `scale-down-unneeded-time` | How long a node should be unneeded before it is eligible for scale down | 10 minutes -| `scale-down-unready-time` | How long an unready node should be unneeded before it is eligible for scale down | 20 minutes -| `scale-down-utilization-threshold` | Node utilization level, defined as sum of requested resources divided by capacity, below which a node can be considered for scale down | 0.5 -| 
`scale-down-non-empty-candidates-count` | Maximum number of non empty nodes considered in one iteration as candidates for scale down with drain
Lower value means better CA responsiveness but possible slower scale down latency
Higher value can affect CA performance with big clusters (hundreds of nodes)
Set to non positive value to turn this heuristic off - CA will not limit the number of nodes it considers." | 30 -| `scale-down-candidates-pool-ratio` | A ratio of nodes that are considered as additional non empty candidates for
scale down when some candidates from previous iteration are no longer valid
Lower value means better CA responsiveness but possible slower scale down latency
Higher value can affect CA performance with big clusters (hundreds of nodes)
Set to 1.0 to turn this heuristics off - CA will take all nodes as additional candidates. | 0.1 -| `scale-down-candidates-pool-min-count` | Minimum number of nodes that are considered as additional non empty candidates
for scale down when some candidates from previous iteration are no longer valid.
When calculating the pool size for additional candidates we take
`max(#nodes * scale-down-candidates-pool-ratio, scale-down-candidates-pool-min-count)` | 50 -| `scan-interval` | How often cluster is reevaluated for scale up or down | 10 seconds -| `max-nodes-total` | Maximum number of nodes in all node groups. Cluster autoscaler will not grow the cluster beyond this number. | 0 -| `cores-total` | Minimum and maximum number of cores in cluster, in the format :. Cluster autoscaler will not scale the cluster beyond these numbers. | 320000 -| `memory-total` | Minimum and maximum number of gigabytes of memory in cluster, in the format :. Cluster autoscaler will not scale the cluster beyond these numbers. | 6400000 -| `gpu-total` | Minimum and maximum number of different GPUs in cluster, in the format ::. Cluster autoscaler will not scale the cluster beyond these numbers. Can be passed multiple times. CURRENTLY THIS FLAG ONLY WORKS ON GKE. | "" -| `cloud-provider` | Cloud provider type. | gce -| `max-empty-bulk-delete` | Maximum number of empty nodes that can be deleted at the same time. | 10 -| `max-graceful-termination-sec` | Maximum number of seconds CA waits for pod termination when trying to scale down a node. | 600 -| `max-total-unready-percentage` | Maximum percentage of unready nodes in the cluster. After this is exceeded, CA halts operations | 45 -| `ok-total-unready-count` | Number of allowed unready nodes, irrespective of max-total-unready-percentage | 3 -| `max-node-provision-time` | Maximum time CA waits for node to be provisioned | 15 minutes -| `nodes` | sets min,max size and other configuration data for a node group in a format accepted by cloud provider. Can be used multiple times. Format: :: | "" -| `node-group-auto-discovery` | One or more definition(s) of node group auto-discovery.
A definition is expressed `:[[=]]`
The `aws`, `gce`, and `azure` cloud providers are currently supported. AWS matches by ASG tags, e.g. `asg:tag=tagKey,anotherTagKey`
GCE matches by IG name prefix, and requires you to specify min and max nodes per IG, e.g. `mig:namePrefix=pfx,min=0,max=10`
Azure matches by tags on VMSS, e.g. `label:foo=bar`, and will auto-detect `min` and `max` tags on the VMSS to set scaling limits.
Can be used multiple times | "" -| `estimator` | Type of resource estimator to be used in scale up | binpacking -| `expander` | Type of node group expander to be used in scale up. | random -| `write-status-configmap` | Should CA write status information to a configmap | true -| `status-config-map-name` | The name of the status ConfigMap that CA writes | cluster-autoscaler-status -| `max-inactivity` | Maximum time from last recorded autoscaler activity before automatic restart | 10 minutes -| `max-failing-time` | Maximum time from last recorded successful autoscaler run before automatic restart | 15 minutes -| `balance-similar-node-groups` | Detect similar node groups and balance the number of nodes between them | false -| `balancing-ignore-label` | Define a node label that should be ignored when considering node group similarity. One label per flag occurrence. | "" -| `node-autoprovisioning-enabled` | Should CA autoprovision node groups when needed | false -| `max-autoprovisioned-node-group-count` | The maximum number of autoprovisioned groups in the cluster | 15 -| `unremovable-node-recheck-timeout` | The timeout before we check again a node that couldn't be removed before | 5 minutes -| `expendable-pods-priority-cutoff` | Pods with priority below cutoff will be expendable. They can be killed without any consideration during scale down and they don't cause scale up. Pods with null priority (PodPriority disabled) are non expendable | -10 -| `regional` | Cluster is regional | false -| `leader-elect` | Start a leader election client and gain leadership before executing the main loop.
Enable this when running replicated components for high availability | true -| `leader-elect-lease-duration` | The duration that non-leader candidates will wait after observing a leadership
renewal until attempting to acquire leadership of a led but unrenewed leader slot.
This is effectively the maximum duration that a leader can be stopped before it is replaced by another candidate.
This is only applicable if leader election is enabled | 15 seconds -| `leader-elect-renew-deadline` | The interval between attempts by the active cluster-autoscaler to renew a leadership slot before it stops leading.
This must be less than or equal to the lease duration.
This is only applicable if leader election is enabled | 10 seconds -| `leader-elect-retry-period` | The duration the clients should wait between attempting acquisition and renewal of a leadership.
This is only applicable if leader election is enabled | 2 seconds -| `leader-elect-resource-lock` | The type of resource object that is used for locking during leader election.
Supported options are `endpoints` (default) and `configmaps` | "endpoints" -| `aws-use-static-instance-list` | Should CA fetch instance types in runtime or use a static list. AWS only | false -| `skip-nodes-with-system-pods` | If true cluster autoscaler will never delete nodes with pods from kube-system (except for DaemonSet or mirror pods) | true -| `skip-nodes-with-local-storage`| If true cluster autoscaler will never delete nodes with pods with local storage, e.g. EmptyDir or HostPath | true -| `min-replica-count` | Minimum number or replicas that a replica set or replication controller should have to allow their pods deletion in scale down | 0 +| Parameter | Description | Default | +|-----------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------| +| `cluster-name` | Autoscaled cluster name, if available | "" | +| `address` | The address to expose prometheus metrics | :8085 | +| `kubernetes` | Kubernetes API Server location. Leave blank for default | "" | +| `kubeconfig` | Path to kubeconfig file with authorization and API Server location information | "" | +| `cloud-config` | The path to the cloud provider configuration file. 
Empty string for no configuration file | "" | +| `namespace` | Namespace in which cluster-autoscaler runs | "kube-system" | +| `scale-down-enabled` | Should CA scale down the cluster | true | +| `scale-down-delay-after-add` | How long after scale up that scale down evaluation resumes | 10 minutes | +| `scale-down-delay-after-delete` | How long after node deletion that scale down evaluation resumes, defaults to scan-interval | scan-interval | +| `scale-down-delay-after-failure` | How long after scale down failure that scale down evaluation resumes | 3 minutes | +| `scale-down-unneeded-time` | How long a node should be unneeded before it is eligible for scale down | 10 minutes | +| `scale-down-unready-time` | How long an unready node should be unneeded before it is eligible for scale down | 20 minutes | +| `scale-down-utilization-threshold` | Node utilization level, defined as sum of requested resources divided by capacity, below which a node can be considered for scale down | 0.5 | +| `scale-down-non-empty-candidates-count` | Maximum number of non empty nodes considered in one iteration as candidates for scale down with drain
Lower value means better CA responsiveness but possible slower scale down latency
Higher value can affect CA performance with big clusters (hundreds of nodes)
Set to non positive value to turn this heuristic off - CA will not limit the number of nodes it considers. | 30 | +| `scale-down-candidates-pool-ratio` | A ratio of nodes that are considered as additional non empty candidates for
scale down when some candidates from previous iteration are no longer valid
Lower value means better CA responsiveness but possible slower scale down latency
Higher value can affect CA performance with big clusters (hundreds of nodes)
Set to 1.0 to turn this heuristic off - CA will take all nodes as additional candidates. | 0.1 | +| `scale-down-candidates-pool-min-count` | Minimum number of nodes that are considered as additional non empty candidates
for scale down when some candidates from previous iteration are no longer valid.
When calculating the pool size for additional candidates we take
`max(#nodes * scale-down-candidates-pool-ratio, scale-down-candidates-pool-min-count)` | 50 | +| `scan-interval` | How often cluster is reevaluated for scale up or down | 10 seconds | +| `max-nodes-total` | Maximum number of nodes in all node groups. Cluster autoscaler will not grow the cluster beyond this number. | 0 | +| `cores-total` | Minimum and maximum number of cores in cluster, in the format :. Cluster autoscaler will not scale the cluster beyond these numbers. | 320000 | +| `memory-total` | Minimum and maximum number of gigabytes of memory in cluster, in the format :. Cluster autoscaler will not scale the cluster beyond these numbers. | 6400000 | +| `gpu-total` | Minimum and maximum number of different GPUs in cluster, in the format ::. Cluster autoscaler will not scale the cluster beyond these numbers. Can be passed multiple times. CURRENTLY THIS FLAG ONLY WORKS ON GKE. | "" | +| `cloud-provider` | Cloud provider type. | gce | +| `max-empty-bulk-delete` | Maximum number of empty nodes that can be deleted at the same time. | 10 | +| `max-graceful-termination-sec` | Maximum number of seconds CA waits for pod termination when trying to scale down a node. | 600 | +| `max-total-unready-percentage` | Maximum percentage of unready nodes in the cluster. After this is exceeded, CA halts operations | 45 | +| `ok-total-unready-count` | Number of allowed unready nodes, irrespective of max-total-unready-percentage | 3 | +| `max-node-provision-time` | Maximum time CA waits for node to be provisioned | 15 minutes | +| `nodes` | sets min,max size and other configuration data for a node group in a format accepted by cloud provider. Can be used multiple times. Format: :: | "" | +| `node-group-auto-discovery` | One or more definition(s) of node group auto-discovery.
A definition is expressed `:[[=]]`
The `aws`, `gce`, and `azure` cloud providers are currently supported. AWS matches by ASG tags, e.g. `asg:tag=tagKey,anotherTagKey`
GCE matches by IG name prefix, and requires you to specify min and max nodes per IG, e.g. `mig:namePrefix=pfx,min=0,max=10`
Azure matches by tags on VMSS, e.g. `label:foo=bar`, and will auto-detect `min` and `max` tags on the VMSS to set scaling limits.
Can be used multiple times | "" | +| `emit-per-nodegroup-metrics` | If true, emit per node group metrics. | false | +| `estimator` | Type of resource estimator to be used in scale up | binpacking | +| `expander` | Type of node group expander to be used in scale up. | random | +| `write-status-configmap` | Should CA write status information to a configmap | true | +| `status-config-map-name` | The name of the status ConfigMap that CA writes | cluster-autoscaler-status | +| `max-inactivity` | Maximum time from last recorded autoscaler activity before automatic restart | 10 minutes | +| `max-failing-time` | Maximum time from last recorded successful autoscaler run before automatic restart | 15 minutes | +| `balance-similar-node-groups` | Detect similar node groups and balance the number of nodes between them | false | +| `balancing-ignore-label` | Define a node label that should be ignored when considering node group similarity. One label per flag occurrence. | "" | +| `node-autoprovisioning-enabled` | Should CA autoprovision node groups when needed | false | +| `max-autoprovisioned-node-group-count` | The maximum number of autoprovisioned groups in the cluster | 15 | +| `unremovable-node-recheck-timeout` | The timeout before we check again a node that couldn't be removed before | 5 minutes | +| `expendable-pods-priority-cutoff` | Pods with priority below cutoff will be expendable. They can be killed without any consideration during scale down and they don't cause scale up. Pods with null priority (PodPriority disabled) are non expendable | -10 | +| `regional` | Cluster is regional | false | +| `leader-elect` | Start a leader election client and gain leadership before executing the main loop.
Enable this when running replicated components for high availability | true | +| `leader-elect-lease-duration` | The duration that non-leader candidates will wait after observing a leadership
renewal until attempting to acquire leadership of a led but unrenewed leader slot.
This is effectively the maximum duration that a leader can be stopped before it is replaced by another candidate.
This is only applicable if leader election is enabled | 15 seconds | +| `leader-elect-renew-deadline` | The interval between attempts by the active cluster-autoscaler to renew a leadership slot before it stops leading.
This must be less than or equal to the lease duration.
This is only applicable if leader election is enabled | 10 seconds | +| `leader-elect-retry-period` | The duration the clients should wait between attempting acquisition and renewal of a leadership.
This is only applicable if leader election is enabled | 2 seconds | +| `leader-elect-resource-lock` | The type of resource object that is used for locking during leader election.
Supported options are `endpoints` (default) and `configmaps` | "endpoints" | +| `aws-use-static-instance-list` | Should CA fetch instance types in runtime or use a static list. AWS only | false | +| `skip-nodes-with-system-pods` | If true cluster autoscaler will never delete nodes with pods from kube-system (except for DaemonSet or mirror pods) | true | +| `skip-nodes-with-local-storage` | If true cluster autoscaler will never delete nodes with pods with local storage, e.g. EmptyDir or HostPath | true | +| `min-replica-count` | Minimum number of replicas that a replica set or replication controller should have to allow their pods deletion in scale down | 0 | +| `daemonset-eviction-for-empty-nodes` | Whether DaemonSet pods will be gracefully terminated from empty nodes | false | +| `daemonset-eviction-for-occupied-nodes` | Whether DaemonSet pods will be gracefully terminated from non-empty nodes | true | +| `feature-gates` | A set of key=value pairs that describe feature gates for alpha/experimental features. | "" | +| `cordon-node-before-terminating` | Should CA cordon nodes before terminating during downscale process | false | +| `record-duplicated-events` | Enable the autoscaler to print duplicated events within a 5 minute window. | false | # Troubleshooting: @@ -711,6 +798,8 @@ CA doesn't remove underutilized nodes if they are running pods [that it shouldn' * using large custom value for `--scale-down-delay-after-delete` or `--scan-interval`, which delays CA action. +* make sure `--scale-down-enabled` parameter in command is not set to false + ### How to set PDBs to enable CA to move kube-system pods? By default, kube-system pods prevent CA from removing nodes on which they are running. 
Users can manually add PDBs for the kube-system pods that can be safely rescheduled elsewhere: @@ -722,15 +811,15 @@ kubectl create poddisruptionbudget --namespace=kube-system --selector Here's how to do it for some common pods: * kube-dns can safely be rescheduled as long as there are supposed to be at least 2 of these pods. In 1.7, this will always be -the case. For 1.6 and earlier, edit kube-dns-autoscaler config map as described -[here](https://kubernetes.io/docs/tasks/administer-cluster/dns-horizontal-autoscaling/#tuning-autoscaling-parameters), -adding preventSinglePointFailure parameter. For example: + the case. For 1.6 and earlier, edit kube-dns-autoscaler config map as described + [here](https://kubernetes.io/docs/tasks/administer-cluster/dns-horizontal-autoscaling/#tuning-autoscaling-parameters), + adding preventSinglePointFailure parameter. For example: ``` linear:'{"coresPerReplica":256,"nodesPerReplica":16,"preventSinglePointFailure":true}' ``` * Metrics Server is best left alone, as restarting it causes the loss of metrics for >1 minute, as well as metrics -in dashboard from the last 15 minutes. Metrics Server downtime also means effective HPA downtime as it relies on metrics. Add PDB for it only if you're sure you don't mind. + in dashboard from the last 15 minutes. Metrics Server downtime also means effective HPA downtime as it relies on metrics. Add PDB for it only if you're sure you don't mind. ### I have a couple of pending pods, but there was no scale-up? @@ -772,7 +861,7 @@ Most likely it's due to a problem with the cluster. Steps to debug: * Check if cluster autoscaler is up and running. In version 0.5 and later, it periodically publishes the kube-system/cluster-autoscaler-status config map. Check last update time annotation. It should be no more than 3 min (usually 10 sec old). -* Check in the above config map if cluster and node groups are in the healthy state. If not, check if there are unready nodes. 
+* Check in the above config map if cluster and node groups are in the healthy state. If not, check if there are unready nodes. If some nodes appear unready despite being Ready in the Node object, check `resourceUnready` count. If there are any nodes marked as `resourceUnready`, it is most likely a problem with the device driver failing to install a new resource (e.g. GPU). `resourceUnready` count is only available in CA version 1.24 and later. If both the cluster and CA appear healthy: @@ -857,6 +946,21 @@ Depending on how long scale-ups have been failing, it may wait up to 30 minutes # Developer: +### What go version should be used to compile CA? + +Cluster Autoscaler generally tries to use the same go version that is used by embedded Kubernetes code. +For example CA 1.21 will use the same go version as Kubernetes 1.21. Only the officially used go +version is supported and CA may not compile using other versions. + +The source of truth for the used go version is builder/Dockerfile. + +Warning: do NOT rely on go version specified in go.mod file. It is only meant to control go mod +behavior and is not indicative of the go version actually used by CA. In particular go 1.17 changes go mod +behavior in a way that is incompatible with existing Kubernetes tooling. +Following [Kubernetes example](https://github.com/kubernetes/kubernetes/pull/105563#issuecomment-960915506) +we have decided to pin version specified in go.mod to 1.16 for now (even though both Kubernetes +and CA no longer compile using go 1.16). + ### How can I run e2e tests? 1. Set up environment and build e2e.go as described in the [Kubernetes docs](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-testing/e2e-tests.md#building-and-running-the-tests). 
@@ -867,7 +971,7 @@ Depending on how long scale-ups have been failing, it may wait up to 30 minutes export KUBE_ENABLE_CLUSTER_AUTOSCALER=true export KUBE_AUTOSCALER_ENABLE_SCALE_DOWN=true ``` - This is the minimum number of nodes required for all e2e tests to pass. The tests should also pass if you set higher maximum nodes limit. + This is the minimum number of nodes required for all e2e tests to pass. The tests should also pass if you set higher maximum nodes limit. 3. Run `go run hack/e2e.go -- --verbose-commands --up` to bring up your cluster. 4. SSH to the control plane (previously referred to as master) node and edit `/etc/kubernetes/manifests/cluster-autoscaler.manifest` (you will need sudo for this). * If you want to test your custom changes set `image` to point at your own CA image. @@ -876,9 +980,9 @@ Depending on how long scale-ups have been failing, it may wait up to 30 minutes ```sh go run hack/e2e.go -- --verbose-commands --test --test_args="--ginkgo.focus=\[Feature:ClusterSizeAutoscaling" ``` - It will take >1 hour to run the full suite. You may want to redirect output to file, as there will be plenty of it. + It will take >1 hour to run the full suite. You may want to redirect output to file, as there will be plenty of it. - Test runner may be missing default credentials. On GCE they can be provided with: + Test runner may be missing default credentials. On GCE they can be provided with: ```sh gcloud beta auth application-default login ``` @@ -902,13 +1006,13 @@ To test your PR: 1. Run Cluster Autoscaler e2e tests if you can. We are running our e2e tests on GCE and we can't guarantee the tests are passing on every cloud provider. 2. If you can't run e2e we ask you to do a following manual test at the -minimum, using Cluster-Autoscaler image containing your changes and using -configuration required to activate them: - i. Create a deployment. Scale it up, so that some pods don't fit onto existing - nodes. 
Wait for new nodes to be added by Cluster Autoscaler and confirm all - pods have been scheduled successfully. - ii. Scale the deployment down to a single replica and confirm that the - cluster scales down. + minimum, using Cluster-Autoscaler image containing your changes and using + configuration required to activate them: + i. Create a deployment. Scale it up, so that some pods don't fit onto existing + nodes. Wait for new nodes to be added by Cluster Autoscaler and confirm all + pods have been scheduled successfully. + ii. Scale the deployment down to a single replica and confirm that the + cluster scales down. 3. Run a manual test following the basic use case of your change. Confirm that nodes are added or removed as expected. Once again, we ask you to use common sense to decide what needs to be tested. @@ -926,24 +1030,22 @@ unexpected problems coming from version incompatibilities. To sync the repositories' vendored k8s libraries, we have a script that takes a released version of k8s and updates the `replace` directives of each k8s -sub-library. +sub-library. It can be used with custom kubernetes fork, by default it uses +`git@github.com:kubernetes/kubernetes.git`. Example execution looks like this: ``` -./hack/update-vendor.sh 1.20.0-alpha.1 +./hack/update-vendor.sh 1.20.0-alpha.1 git@github.com:kubernetes/kubernetes.git ``` -Caveats: - - `update-vendor.sh` is called directly in shell (no docker is used) therefore its operation may differ from environment to environment. - - It is important that go version, which isn in use in the shell in which `update-vendor.sh` is called, matches the `go ` directive specified in `go.mod` file - in `kubernetes/kubernetes` revision against which revendoring is done. - - `update-vendor.sh` automatically runs unit tests as part of verification process. 
If one needs to suppress that, it can be done by overriding `VERIFY_COMMAND` variable (`VERIFY_COMMAND=true ./hack/update-vendor.sh ...`) - - If one wants to only add new libraries to `go.mod-extra`, but not change the base `go.mod`, `-r` should be used with kubernetes/kubernets revision, which was used last time `update-vendor.sh` was called. One can determine that revision by looking at `git log` in Cluster Autoscaler repository. Following command will do the trick `git log | grep "Updating vendor against"`. - +If you need to update vendor to an unreleased commit of Kubernetes, you can use the breakglass script: +``` +./hack/submodule-k8s.sh git@github.com:kubernetes/kubernetes.git +``` # In the context of Gardener: -### How do I rebase this fork of autoscaler with upstream? +### How do I rebase this fork of autoscaler with upstream? Please consider reading the answer [above](#how-can-i-update-ca-dependencies-particularly-k8siokubernetes) beforehand of updating the dependencies for better understanding. @@ -995,7 +1097,7 @@ git rebase master Resolve the rebase-conflicts by appropriately accepting the incoming changes or the current changes. -Tip: Accept all the incoming changes for the go.mod and modules.txt file. This file will anyways be re-generated in the next step. +Tip: Accept all the incoming changes for the go.mod and modules.txt file. This file will anyways be re-generated in the next step. Once all the rebase-conflicts are resolved, execute following script: @@ -1003,10 +1105,10 @@ Once all the rebase-conflicts are resolved, execute following script: VERIFY_COMMAND=true ./hack/update-vendor.sh ``` -The script automatically runs the unit-tests for all the providers and of the core-logic, it can be disabled by +The script automatically runs the unit-tests for all the providers and of the core-logic, it can be disabled by setting the `VERIFY_COMMAND=true` while runniing the script. 
-The script shall create a directory under the `/tmp`, and logs of the execution-progress is also available there. +The script shall create a directory under the `/tmp`, and logs of the execution-progress is also available there. Once script is successfully executed, execute following commands to confirm the correctness. ``` # You must see a new commit created by the script containing the commit-hash. @@ -1054,7 +1156,7 @@ Resolve the merge-conflicts by appropriately accepting the incoming changes or t #### Step 3: - For syncing with kubernetes/autoscaler < v1.21.0 + For syncing with kubernetes/autoscaler < v1.21.0 This fork of the autoscaler vendors the [machine-controller-manager](https://github.com/gardener/machine-controller-manager) aka MCM from Gardener project. As the MCM itself vendors the `k8s.io` in it, we need to make following change to the [`update-vendor`](https://github.com/gardener/autoscaler/blob/master/cluster-autoscaler/hack/update-vendor.sh) script: @@ -1090,10 +1192,10 @@ Once all the merge-conflicts are resolved, execute following script: VERIFY_COMMAND=true ./hack/update-vendor.sh ``` -The script automatically runs the unit-tests for all the providers and of the core-logic, it can be disabled by +The script automatically runs the unit-tests for all the providers and of the core-logic, it can be disabled by setting the `VERIFY_COMMAND=true` while runniing the script. -The script shall create a directory under the `/tmp`, and logs of the execution-progress is also available there. +The script shall create a directory under the `/tmp`, and logs of the execution-progress is also available there. Once script is successfully executed, execute following commands to confirm the correctness. ``` # You must see a new commit created by the script containing the commit-hash. 
@@ -1163,7 +1265,7 @@ git log | grep "Updating vendor against" # Please save commit-hash from the fir #### Step 3: -Update the [`go.mod-extra`](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/go.mod-extra) file to reflect the following: +Update the [`go.mod`](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/go.mod) file to reflect the following: ``` require ( ... @@ -1171,7 +1273,7 @@ require ( ) ``` -If you are interested in vendoring MCM from the local-system or personal fork for testing, you can also add the `replace` section in `go.mod-extra` as shown below. +If you are interested in vendoring MCM from your local system or a personal fork for testing, you can also add a `replace` section to `go.mod` as shown below. ``` // Replace the <$GOPATH> with the actual go-path. @@ -1184,37 +1286,15 @@ replace ( ) ``` -#### Step 4(Optional): +#### Step 4: This fork of the autoscaler vendors the [machine-controller-manager](https://github.com/gardener/machine-controller-manager) aka MCM from Gardener project. As the MCM itself vendors the `k8s.io` in it, we need to make following change to the [`update-vendor`](https://github.com/gardener/autoscaler/blob/master/cluster-autoscaler/hack/update-vendor.sh) script: -Disable the check of implicit-dependencies of go.mod by commenting out following code in the update-vendor script. - +Run the breakglass script, providing the k8s commit ID you saved: ``` -# if [[ "${IMPLICIT_FOUND}" == "true" ]]; then -# err_rerun "Implicit dependencies missing from go.mod-extra" -# fi +./hack/submodule-k8s.sh git@github.com:kubernetes/kubernetes.git ``` -Populate the `K8S_REV` variable in the script with the commit-hash you saved above, as `-r` flag of the original-script in the command-line doesn't work for few environments. Eg. 
- -``` ---K8S_REV="master" -++K8S_REV="3eb90c19d0cf90b756c3e08e32c6495b91e0aeed" -``` - -#### Step 5: - -Execute the script below: - -``` -VERIFY_COMMAND=true ./hack/update-vendor.sh -``` - -The script automatically runs the unit-tests for all the providers and of the core-logic, it can be disabled by -setting the `VERIFY_COMMAND=true` while runniing the script. - -The script shall create a directory under the `/tmp`, and logs of the execution-progress is also available there. Once script is successfully executed, execute following commands to confirm the correctness. ``` # You must see a new commit created by the script containing the commit-hash. diff --git a/cluster-autoscaler/SYNC-CHANGES/SYNC-CHANGES-1.21.md b/cluster-autoscaler/SYNC-CHANGES/SYNC-CHANGES-1.21.md index 2cf920fad6ac..0b0a897ef27c 100644 --- a/cluster-autoscaler/SYNC-CHANGES/SYNC-CHANGES-1.21.md +++ b/cluster-autoscaler/SYNC-CHANGES/SYNC-CHANGES-1.21.md @@ -7,6 +7,13 @@ - [During merging](#during-merging) - [During vendoring k8s](#during-vendoring-k8s) - [Others](#others) +- [v1.21.1](#v1211) + - [Synced with which upstream CA](#synced-with-which-upstream-ca-1) + - [Changes made](#changes-made-1) + - [To FAQ](#to-faq-1) + - [During merging](#during-merging-1) + - [During vendoring k8s](#during-vendoring-k8s-1) + - [Others](#others-1) # v1.21.0 @@ -38,3 +45,28 @@ - cluster-autoscaler/cloudprovider/builder/builder_all.go - cluster-autoscaler/cloudprovider/mcm - cluster-autoscaler/integration + + +# v1.21.1 + + +## Synced with which upstream CA + +[v1.21.3](https://github.com/kubernetes/autoscaler/tree/cluster-autoscaler-1.21.3/cluster-autoscaler) + +## Changes made + +### To FAQ + +- included new questions and answers +- included new steps to vendor a new MCM version + +### During merging + +- included new EC2 instance types in `cluster-autoscaler/cloudprovider/aws/ec2_instance_types.go` + +### During vendoring k8s +This fork still vendors k8s 1.21.0, while upstream 1.21.3 vendors k8s 1.25.0 + 
+### Others +_None_ \ No newline at end of file diff --git a/cluster-autoscaler/cloudprovider/aws/auto_scaling_test.go b/cluster-autoscaler/cloudprovider/aws/auto_scaling_test.go index 90e5dba7e261..8b85887574e8 100644 --- a/cluster-autoscaler/cloudprovider/aws/auto_scaling_test.go +++ b/cluster-autoscaler/cloudprovider/aws/auto_scaling_test.go @@ -27,22 +27,22 @@ import ( "github.com/stretchr/testify/require" ) -func TestMoreThen50Groups(t *testing.T) { +func TestMoreThen100Groups(t *testing.T) { service := &AutoScalingMock{} autoScalingWrapper := &autoScalingWrapper{ autoScaling: service, } - // Generate 51 ASG names - names := make([]string, 51) + // Generate 101 ASG names + names := make([]string, 101) for i := 0; i < len(names); i++ { names[i] = fmt.Sprintf("asg-%d", i) } - // First batch, first 50 elements + // First batch, first 100 elements service.On("DescribeAutoScalingGroupsPages", &autoscaling.DescribeAutoScalingGroupsInput{ - AutoScalingGroupNames: aws.StringSlice(names[:50]), + AutoScalingGroupNames: aws.StringSlice(names[:100]), MaxRecords: aws.Int64(maxRecordsReturnedByAPI), }, mock.AnythingOfType("func(*autoscaling.DescribeAutoScalingGroupsOutput, bool) bool"), @@ -51,10 +51,10 @@ func TestMoreThen50Groups(t *testing.T) { fn(testNamedDescribeAutoScalingGroupsOutput("asg-1", 1, "test-instance-id"), false) }).Return(nil) - // Second batch, element 51 + // Second batch, element 101 service.On("DescribeAutoScalingGroupsPages", &autoscaling.DescribeAutoScalingGroupsInput{ - AutoScalingGroupNames: aws.StringSlice([]string{"asg-50"}), + AutoScalingGroupNames: aws.StringSlice([]string{"asg-100"}), MaxRecords: aws.Int64(maxRecordsReturnedByAPI), }, mock.AnythingOfType("func(*autoscaling.DescribeAutoScalingGroupsOutput, bool) bool"), diff --git a/cluster-autoscaler/cloudprovider/aws/aws_cloud_provider.go b/cluster-autoscaler/cloudprovider/aws/aws_cloud_provider.go index 821b124ee698..e2aee7a36048 100644 --- 
a/cluster-autoscaler/cloudprovider/aws/aws_cloud_provider.go +++ b/cluster-autoscaler/cloudprovider/aws/aws_cloud_provider.go @@ -362,7 +362,10 @@ func BuildAWS(opts config.AutoscalingOptions, do cloudprovider.NodeGroupDiscover generatedInstanceTypes, err := GenerateEC2InstanceTypes(region) if err != nil { - klog.Fatalf("Failed to generate AWS EC2 Instance Types: %v", err) + klog.Errorf("Failed to generate AWS EC2 Instance Types: %v, falling back to static list with last update time: %s", err, lastUpdateTime) + } + if generatedInstanceTypes == nil { + generatedInstanceTypes = map[string]*InstanceType{} } // fallback on the static list if we miss any instance types in the generated output // credits to: https://github.com/lyft/cni-ipvlan-vpc-k8s/pull/80 diff --git a/cluster-autoscaler/cloudprovider/aws/aws_cloud_provider_test.go b/cluster-autoscaler/cloudprovider/aws/aws_cloud_provider_test.go index 587833779335..ad93facf0755 100644 --- a/cluster-autoscaler/cloudprovider/aws/aws_cloud_provider_test.go +++ b/cluster-autoscaler/cloudprovider/aws/aws_cloud_provider_test.go @@ -17,6 +17,7 @@ limitations under the License. 
package aws import ( + "os" "testing" "github.com/aws/aws-sdk-go/aws" @@ -26,6 +27,7 @@ import ( "github.com/stretchr/testify/mock" apiv1 "k8s.io/api/core/v1" "k8s.io/autoscaler/cluster-autoscaler/cloudprovider" + "k8s.io/autoscaler/cluster-autoscaler/config" ) type AutoScalingMock struct { @@ -148,6 +150,23 @@ func TestBuildAwsCloudProvider(t *testing.T) { assert.NoError(t, err) } +func TestInstanceTypeFallback(t *testing.T) { + resourceLimiter := cloudprovider.NewResourceLimiter( + map[string]int64{cloudprovider.ResourceNameCores: 1, cloudprovider.ResourceNameMemory: 10000000}, + map[string]int64{cloudprovider.ResourceNameCores: 10, cloudprovider.ResourceNameMemory: 100000000}) + + do := cloudprovider.NodeGroupDiscoveryOptions{} + opts := config.AutoscalingOptions{} + + os.Setenv("AWS_REGION", "non-existent-region") + defer os.Unsetenv("AWS_REGION") + + // This test ensures that no klog.Fatalf calls occur when constructing the AWS cloud provider. Specifically it is + // intended to ensure that instance type fallback works correctly in the event of an error enumerating instance + // types. 
+ _ = BuildAWS(opts, do, resourceLimiter) +} + func TestName(t *testing.T) { provider := testProvider(t, testAwsManager) assert.Equal(t, provider.Name(), cloudprovider.AwsProviderName) diff --git a/cluster-autoscaler/cloudprovider/aws/aws_manager.go b/cluster-autoscaler/cloudprovider/aws/aws_manager.go index d9f387e34851..1db5c6be9af3 100644 --- a/cluster-autoscaler/cloudprovider/aws/aws_manager.go +++ b/cluster-autoscaler/cloudprovider/aws/aws_manager.go @@ -49,7 +49,7 @@ const ( operationWaitTimeout = 5 * time.Second operationPollInterval = 100 * time.Millisecond maxRecordsReturnedByAPI = 100 - maxAsgNamesPerDescribe = 50 + maxAsgNamesPerDescribe = 100 refreshInterval = 1 * time.Minute autoDiscovererTypeASG = "asg" asgAutoDiscovererKeyTag = "tag" @@ -312,7 +312,7 @@ func (m *AwsManager) getAsgTemplate(asg *asg) (*asgTemplate, error) { region := az[0 : len(az)-1] if len(asg.AvailabilityZones) > 1 { - klog.Warningf("Found multiple availability zones for ASG %q; using %s\n", asg.Name, az) + klog.V(4).Infof("Found multiple availability zones for ASG %q; using %s for %s label\n", asg.Name, az, apiv1.LabelFailureDomainBetaZone) } instanceTypeName, err := m.buildInstanceType(asg) diff --git a/cluster-autoscaler/cloudprovider/aws/aws_util.go b/cluster-autoscaler/cloudprovider/aws/aws_util.go index 6085ffe89e64..ded250ad7eef 100644 --- a/cluster-autoscaler/cloudprovider/aws/aws_util.go +++ b/cluster-autoscaler/cloudprovider/aws/aws_util.go @@ -20,21 +20,26 @@ import ( "encoding/json" "errors" "fmt" - "github.com/aws/aws-sdk-go/aws/endpoints" - "io/ioutil" - klog "k8s.io/klog/v2" + "io" "net/http" "os" "regexp" "strconv" "strings" + + "github.com/aws/aws-sdk-go/aws" + "github.com/aws/aws-sdk-go/aws/ec2metadata" + "github.com/aws/aws-sdk-go/aws/endpoints" + "github.com/aws/aws-sdk-go/aws/session" + + klog "k8s.io/klog/v2" ) var ( - ec2MetaDataServiceUrl = "http://169.254.169.254/latest/dynamic/instance-identity/document" + ec2MetaDataServiceUrl = "http://169.254.169.254" 
ec2PricingServiceUrlTemplate = "https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/%s/index.json" ec2PricingServiceUrlTemplateCN = "https://pricing.cn-north-1.amazonaws.com.cn/offers/v1.0/cn/AmazonEC2/current/%s/index.json" - staticListLastUpdateTime = "2020-12-07" + staticListLastUpdateTime = "2022-06-02" ) type response struct { @@ -82,16 +87,9 @@ func GenerateEC2InstanceTypes(region string) (map[string]*InstanceType, error) { defer res.Body.Close() - body, err := ioutil.ReadAll(res.Body) + unmarshalled, err := unmarshalProductsResponse(res.Body) if err != nil { - klog.Warningf("Error parsing %s skipping...\n", url) - continue - } - - var unmarshalled = response{} - err = json.Unmarshal(body, &unmarshalled) - if err != nil { - klog.Warningf("Error unmarshalling %s, skip...\n", url) + klog.Warningf("Error parsing %s skipping...\n%s\n", url, err) continue } @@ -127,6 +125,58 @@ func GetStaticEC2InstanceTypes() (map[string]*InstanceType, string) { return InstanceTypes, staticListLastUpdateTime } +func unmarshalProductsResponse(r io.Reader) (*response, error) { + dec := json.NewDecoder(r) + t, err := dec.Token() + if err != nil { + return nil, err + } + if delim, ok := t.(json.Delim); !ok || delim.String() != "{" { + return nil, errors.New("Invalid products json") + } + + unmarshalled := response{map[string]product{}} + + for dec.More() { + t, err = dec.Token() + if err != nil { + return nil, err + } + + if t == "products" { + tt, err := dec.Token() + if err != nil { + return nil, err + } + if delim, ok := tt.(json.Delim); !ok || delim.String() != "{" { + return nil, errors.New("Invalid products json") + } + for dec.More() { + productCode, err := dec.Token() + if err != nil { + return nil, err + } + + prod := product{} + if err = dec.Decode(&prod); err != nil { + return nil, err + } + unmarshalled.Products[productCode.(string)] = prod + } + } + } + + t, err = dec.Token() + if err != nil { + return nil, err + } + if delim, ok := t.(json.Delim); 
!ok || delim.String() != "}" { + return nil, errors.New("Invalid products json") + } + + return &unmarshalled, nil +} + func parseMemory(memory string) int64 { reg, err := regexp.Compile("[^0-9\\.]+") if err != nil { @@ -155,26 +205,13 @@ func GetCurrentAwsRegion() (string, error) { region, present := os.LookupEnv("AWS_REGION") if !present { - klog.V(1).Infof("fetching %s\n", ec2MetaDataServiceUrl) - res, err := http.Get(ec2MetaDataServiceUrl) - if err != nil { - return "", fmt.Errorf("Error fetching %s", ec2MetaDataServiceUrl) - } - - defer res.Body.Close() - - body, err := ioutil.ReadAll(res.Body) + c := aws.NewConfig(). + WithEndpoint(ec2MetaDataServiceUrl) + sess, err := session.NewSession() if err != nil { - return "", fmt.Errorf("Error parsing %s", ec2MetaDataServiceUrl) + return "", fmt.Errorf("failed to create session") } - - var unmarshalled = map[string]string{} - err = json.Unmarshal(body, &unmarshalled) - if err != nil { - klog.Warningf("Error unmarshalling %s, skip...\n", ec2MetaDataServiceUrl) - } - - region = unmarshalled["region"] + return ec2metadata.New(sess, c).Region() } return region, nil diff --git a/cluster-autoscaler/cloudprovider/aws/aws_util_test.go b/cluster-autoscaler/cloudprovider/aws/aws_util_test.go index 6027babd8900..e29860b41f6c 100644 --- a/cluster-autoscaler/cloudprovider/aws/aws_util_test.go +++ b/cluster-autoscaler/cloudprovider/aws/aws_util_test.go @@ -17,12 +17,14 @@ limitations under the License. 
package aws import ( - "github.com/stretchr/testify/assert" "net/http" "net/http/httptest" "os" "strconv" + "strings" "testing" + + "github.com/stretchr/testify/assert" ) func TestGetStaticEC2InstanceTypes(t *testing.T) { @@ -111,3 +113,118 @@ func TestGetCurrentAwsRegionWithRegionEnv(t *testing.T) { assert.Nil(t, err) assert.Equal(t, region, result) } + +func TestUnmarshalProductsResponse(t *testing.T) { + body := ` +{ + "products": { + "VVD8BG8WWFD3DAZN" : { + "sku" : "VVD8BG8WWFD3DAZN", + "productFamily" : "Compute Instance", + "attributes" : { + "servicecode" : "AmazonEC2", + "location" : "US East (N. Virginia)", + "locationType" : "AWS Region", + "instanceType" : "r5b.4xlarge", + "currentGeneration" : "Yes", + "instanceFamily" : "Memory optimized", + "vcpu" : "16", + "physicalProcessor" : "Intel Xeon Platinum 8259 (Cascade Lake)", + "clockSpeed" : "3.1 GHz", + "memory" : "128 GiB", + "storage" : "EBS only", + "networkPerformance" : "Up to 10 Gigabit", + "processorArchitecture" : "64-bit", + "tenancy" : "Shared", + "operatingSystem" : "Linux", + "licenseModel" : "No License required", + "usagetype" : "UnusedBox:r5b.4xlarge", + "operation" : "RunInstances:0004", + "availabilityzone" : "NA", + "capacitystatus" : "UnusedCapacityReservation", + "classicnetworkingsupport" : "false", + "dedicatedEbsThroughput" : "10 Gbps", + "ecu" : "NA", + "enhancedNetworkingSupported" : "Yes", + "instancesku" : "G4NFAXD9TGJM3RY8", + "intelAvxAvailable" : "Yes", + "intelAvx2Available" : "No", + "intelTurboAvailable" : "No", + "marketoption" : "OnDemand", + "normalizationSizeFactor" : "32", + "preInstalledSw" : "SQL Std", + "servicename" : "Amazon Elastic Compute Cloud", + "vpcnetworkingsupport" : "true" + } + }, + "C36QEQQQJ8ZR7N32" : { + "sku" : "C36QEQQQJ8ZR7N32", + "productFamily" : "Compute Instance", + "attributes" : { + "servicecode" : "AmazonEC2", + "location" : "US East (N. 
Virginia)", + "locationType" : "AWS Region", + "instanceType" : "d3en.8xlarge", + "currentGeneration" : "Yes", + "instanceFamily" : "Storage optimized", + "vcpu" : "32", + "physicalProcessor" : "Intel Xeon Platinum 8259 (Cascade Lake)", + "clockSpeed" : "3.1 GHz", + "memory" : "128 GiB", + "storage" : "16 x 14000 HDD", + "networkPerformance" : "50 Gigabit", + "processorArchitecture" : "64-bit", + "tenancy" : "Dedicated", + "operatingSystem" : "SUSE", + "licenseModel" : "No License required", + "usagetype" : "DedicatedRes:d3en.8xlarge", + "operation" : "RunInstances:000g", + "availabilityzone" : "NA", + "capacitystatus" : "AllocatedCapacityReservation", + "classicnetworkingsupport" : "false", + "dedicatedEbsThroughput" : "5000 Mbps", + "ecu" : "NA", + "enhancedNetworkingSupported" : "Yes", + "instancesku" : "2XW3BCEZ83WMGFJY", + "intelAvxAvailable" : "Yes", + "intelAvx2Available" : "Yes", + "intelTurboAvailable" : "Yes", + "marketoption" : "OnDemand", + "normalizationSizeFactor" : "64", + "preInstalledSw" : "NA", + "processorFeatures" : "AVX; AVX2; Intel AVX; Intel AVX2; Intel AVX512; Intel Turbo", + "servicename" : "Amazon Elastic Compute Cloud", + "vpcnetworkingsupport" : "true" + } + } + } +} +` + r := strings.NewReader(body) + resp, err := unmarshalProductsResponse(r) + assert.Nil(t, err) + assert.Len(t, resp.Products, 2) + assert.NotNil(t, resp.Products["VVD8BG8WWFD3DAZN"]) + assert.NotNil(t, resp.Products["C36QEQQQJ8ZR7N32"]) + assert.Equal(t, resp.Products["VVD8BG8WWFD3DAZN"].Attributes.InstanceType, "r5b.4xlarge") + assert.Equal(t, resp.Products["C36QEQQQJ8ZR7N32"].Attributes.InstanceType, "d3en.8xlarge") + + invalidJsonTests := map[string]string{ + "[": "[", + "]": "]", + "}": "}", + "{": "{", + "Plain text": "invalid", + "List": "[]", + "Invalid products ([])": `{"products":[]}`, + "Invalid product ([])": `{"products":{"zz":[]}}`, + } + for name, body := range invalidJsonTests { + t.Run(name, func(t *testing.T) { + r := strings.NewReader(body) + resp, err 
:= unmarshalProductsResponse(r) + assert.NotNil(t, err) + assert.Nil(t, resp) + }) + } +} diff --git a/cluster-autoscaler/cloudprovider/aws/ec2_instance_types.go b/cluster-autoscaler/cloudprovider/aws/ec2_instance_types.go index 0d8b29173e52..517f2763e608 100644 --- a/cluster-autoscaler/cloudprovider/aws/ec2_instance_types.go +++ b/cluster-autoscaler/cloudprovider/aws/ec2_instance_types.go @@ -418,6 +418,78 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 10752, GPU: 0, }, + "c6a": { + InstanceType: "c6a", + VCPU: 192, + MemoryMb: 0, + GPU: 0, + }, + "c6a.12xlarge": { + InstanceType: "c6a.12xlarge", + VCPU: 48, + MemoryMb: 98304, + GPU: 0, + }, + "c6a.16xlarge": { + InstanceType: "c6a.16xlarge", + VCPU: 64, + MemoryMb: 131072, + GPU: 0, + }, + "c6a.24xlarge": { + InstanceType: "c6a.24xlarge", + VCPU: 96, + MemoryMb: 196608, + GPU: 0, + }, + "c6a.2xlarge": { + InstanceType: "c6a.2xlarge", + VCPU: 8, + MemoryMb: 16384, + GPU: 0, + }, + "c6a.32xlarge": { + InstanceType: "c6a.32xlarge", + VCPU: 128, + MemoryMb: 262144, + GPU: 0, + }, + "c6a.48xlarge": { + InstanceType: "c6a.48xlarge", + VCPU: 192, + MemoryMb: 393216, + GPU: 0, + }, + "c6a.4xlarge": { + InstanceType: "c6a.4xlarge", + VCPU: 16, + MemoryMb: 32768, + GPU: 0, + }, + "c6a.8xlarge": { + InstanceType: "c6a.8xlarge", + VCPU: 32, + MemoryMb: 65536, + GPU: 0, + }, + "c6a.large": { + InstanceType: "c6a.large", + VCPU: 2, + MemoryMb: 4096, + GPU: 0, + }, + "c6a.metal": { + InstanceType: "c6a.metal", + VCPU: 192, + MemoryMb: 393216, + GPU: 0, + }, + "c6a.xlarge": { + InstanceType: "c6a.xlarge", + VCPU: 4, + MemoryMb: 8192, + GPU: 0, + }, "c6g": { InstanceType: "c6g", VCPU: 64, @@ -538,6 +610,252 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 8192, GPU: 0, }, + "c6gn": { + InstanceType: "c6gn", + VCPU: 64, + MemoryMb: 0, + GPU: 0, + }, + "c6gn.12xlarge": { + InstanceType: "c6gn.12xlarge", + VCPU: 48, + MemoryMb: 98304, + GPU: 0, + }, + "c6gn.16xlarge": { + InstanceType: "c6gn.16xlarge", + VCPU: 
64, + MemoryMb: 131072, + GPU: 0, + }, + "c6gn.2xlarge": { + InstanceType: "c6gn.2xlarge", + VCPU: 8, + MemoryMb: 16384, + GPU: 0, + }, + "c6gn.4xlarge": { + InstanceType: "c6gn.4xlarge", + VCPU: 16, + MemoryMb: 32768, + GPU: 0, + }, + "c6gn.8xlarge": { + InstanceType: "c6gn.8xlarge", + VCPU: 32, + MemoryMb: 65536, + GPU: 0, + }, + "c6gn.large": { + InstanceType: "c6gn.large", + VCPU: 2, + MemoryMb: 4096, + GPU: 0, + }, + "c6gn.medium": { + InstanceType: "c6gn.medium", + VCPU: 1, + MemoryMb: 2048, + GPU: 0, + }, + "c6gn.metal": { + InstanceType: "c6gn.metal", + VCPU: 64, + MemoryMb: 131072, + GPU: 0, + }, + "c6gn.xlarge": { + InstanceType: "c6gn.xlarge", + VCPU: 4, + MemoryMb: 8192, + GPU: 0, + }, + "c6i": { + InstanceType: "c6i", + VCPU: 128, + MemoryMb: 0, + GPU: 0, + }, + "c6i.12xlarge": { + InstanceType: "c6i.12xlarge", + VCPU: 48, + MemoryMb: 98304, + GPU: 0, + }, + "c6i.16xlarge": { + InstanceType: "c6i.16xlarge", + VCPU: 64, + MemoryMb: 131072, + GPU: 0, + }, + "c6i.24xlarge": { + InstanceType: "c6i.24xlarge", + VCPU: 96, + MemoryMb: 196608, + GPU: 0, + }, + "c6i.2xlarge": { + InstanceType: "c6i.2xlarge", + VCPU: 8, + MemoryMb: 16384, + GPU: 0, + }, + "c6i.32xlarge": { + InstanceType: "c6i.32xlarge", + VCPU: 128, + MemoryMb: 262144, + GPU: 0, + }, + "c6i.4xlarge": { + InstanceType: "c6i.4xlarge", + VCPU: 16, + MemoryMb: 32768, + GPU: 0, + }, + "c6i.8xlarge": { + InstanceType: "c6i.8xlarge", + VCPU: 32, + MemoryMb: 65536, + GPU: 0, + }, + "c6i.large": { + InstanceType: "c6i.large", + VCPU: 2, + MemoryMb: 4096, + GPU: 0, + }, + "c6i.metal": { + InstanceType: "c6i.metal", + VCPU: 128, + MemoryMb: 262144, + GPU: 0, + }, + "c6i.xlarge": { + InstanceType: "c6i.xlarge", + VCPU: 4, + MemoryMb: 8192, + GPU: 0, + }, + "c6id": { + InstanceType: "c6id", + VCPU: 128, + MemoryMb: 0, + GPU: 0, + }, + "c6id.12xlarge": { + InstanceType: "c6id.12xlarge", + VCPU: 48, + MemoryMb: 98304, + GPU: 0, + }, + "c6id.16xlarge": { + InstanceType: "c6id.16xlarge", + VCPU: 64, + MemoryMb: 
131072, + GPU: 0, + }, + "c6id.24xlarge": { + InstanceType: "c6id.24xlarge", + VCPU: 96, + MemoryMb: 196608, + GPU: 0, + }, + "c6id.2xlarge": { + InstanceType: "c6id.2xlarge", + VCPU: 8, + MemoryMb: 16384, + GPU: 0, + }, + "c6id.32xlarge": { + InstanceType: "c6id.32xlarge", + VCPU: 128, + MemoryMb: 262144, + GPU: 0, + }, + "c6id.4xlarge": { + InstanceType: "c6id.4xlarge", + VCPU: 16, + MemoryMb: 32768, + GPU: 0, + }, + "c6id.8xlarge": { + InstanceType: "c6id.8xlarge", + VCPU: 32, + MemoryMb: 65536, + GPU: 0, + }, + "c6id.large": { + InstanceType: "c6id.large", + VCPU: 2, + MemoryMb: 4096, + GPU: 0, + }, + "c6id.metal": { + InstanceType: "c6id.metal", + VCPU: 128, + MemoryMb: 262144, + GPU: 0, + }, + "c6id.xlarge": { + InstanceType: "c6id.xlarge", + VCPU: 4, + MemoryMb: 8192, + GPU: 0, + }, + "c7g": { + InstanceType: "c7g", + VCPU: 64, + MemoryMb: 0, + GPU: 0, + }, + "c7g.12xlarge": { + InstanceType: "c7g.12xlarge", + VCPU: 48, + MemoryMb: 98304, + GPU: 0, + }, + "c7g.16xlarge": { + InstanceType: "c7g.16xlarge", + VCPU: 64, + MemoryMb: 131072, + GPU: 0, + }, + "c7g.2xlarge": { + InstanceType: "c7g.2xlarge", + VCPU: 8, + MemoryMb: 16384, + GPU: 0, + }, + "c7g.4xlarge": { + InstanceType: "c7g.4xlarge", + VCPU: 16, + MemoryMb: 32768, + GPU: 0, + }, + "c7g.8xlarge": { + InstanceType: "c7g.8xlarge", + VCPU: 32, + MemoryMb: 65536, + GPU: 0, + }, + "c7g.large": { + InstanceType: "c7g.large", + VCPU: 2, + MemoryMb: 4096, + GPU: 0, + }, + "c7g.medium": { + InstanceType: "c7g.medium", + VCPU: 1, + MemoryMb: 2048, + GPU: 0, + }, + "c7g.xlarge": { + InstanceType: "c7g.xlarge", + VCPU: 4, + MemoryMb: 8192, + GPU: 0, + }, "cc2.8xlarge": { InstanceType: "cc2.8xlarge", VCPU: 32, @@ -640,6 +958,18 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 16384, GPU: 0, }, + "dl1": { + InstanceType: "dl1", + VCPU: 96, + MemoryMb: 0, + GPU: 0, + }, + "dl1.24xlarge": { + InstanceType: "dl1.24xlarge", + VCPU: 96, + MemoryMb: 786432, + GPU: 0, + }, "f1": { InstanceType: "f1", VCPU: 64, 
@@ -709,55 +1039,187 @@ var InstanceTypes = map[string]*InstanceType{ "g3s.xlarge": { InstanceType: "g3s.xlarge", VCPU: 4, - MemoryMb: 31232, + MemoryMb: 31232, + GPU: 1, + }, + "g4ad": { + InstanceType: "g4ad", + VCPU: 192, + MemoryMb: 0, + GPU: 2, + }, + "g4ad.16xlarge": { + InstanceType: "g4ad.16xlarge", + VCPU: 64, + MemoryMb: 262144, + GPU: 4, + }, + "g4ad.2xlarge": { + InstanceType: "g4ad.2xlarge", + VCPU: 8, + MemoryMb: 32768, + GPU: 1, + }, + "g4ad.4xlarge": { + InstanceType: "g4ad.4xlarge", + VCPU: 16, + MemoryMb: 65536, + GPU: 1, + }, + "g4ad.8xlarge": { + InstanceType: "g4ad.8xlarge", + VCPU: 32, + MemoryMb: 131072, + GPU: 2, + }, + "g4ad.xlarge": { + InstanceType: "g4ad.xlarge", + VCPU: 4, + MemoryMb: 16384, + GPU: 1, + }, + "g4dn": { + InstanceType: "g4dn", + VCPU: 96, + MemoryMb: 0, + GPU: 8, + }, + "g4dn.12xlarge": { + InstanceType: "g4dn.12xlarge", + VCPU: 48, + MemoryMb: 196608, + GPU: 4, + }, + "g4dn.16xlarge": { + InstanceType: "g4dn.16xlarge", + VCPU: 64, + MemoryMb: 262144, + GPU: 1, + }, + "g4dn.2xlarge": { + InstanceType: "g4dn.2xlarge", + VCPU: 8, + MemoryMb: 32768, + GPU: 1, + }, + "g4dn.4xlarge": { + InstanceType: "g4dn.4xlarge", + VCPU: 16, + MemoryMb: 65536, + GPU: 1, + }, + "g4dn.8xlarge": { + InstanceType: "g4dn.8xlarge", + VCPU: 32, + MemoryMb: 131072, + GPU: 1, + }, + "g4dn.metal": { + InstanceType: "g4dn.metal", + VCPU: 96, + MemoryMb: 393216, + GPU: 8, + }, + "g4dn.xlarge": { + InstanceType: "g4dn.xlarge", + VCPU: 4, + MemoryMb: 16384, + GPU: 1, + }, + "g5": { + InstanceType: "g5", + VCPU: 192, + MemoryMb: 0, + GPU: 8, + }, + "g5.12xlarge": { + InstanceType: "g5.12xlarge", + VCPU: 48, + MemoryMb: 196608, + GPU: 4, + }, + "g5.16xlarge": { + InstanceType: "g5.16xlarge", + VCPU: 64, + MemoryMb: 262144, + GPU: 1, + }, + "g5.24xlarge": { + InstanceType: "g5.24xlarge", + VCPU: 96, + MemoryMb: 393216, + GPU: 4, + }, + "g5.2xlarge": { + InstanceType: "g5.2xlarge", + VCPU: 8, + MemoryMb: 32768, + GPU: 1, + }, + "g5.48xlarge": { + 
InstanceType: "g5.48xlarge", + VCPU: 192, + MemoryMb: 786432, + GPU: 8, + }, + "g5.4xlarge": { + InstanceType: "g5.4xlarge", + VCPU: 16, + MemoryMb: 65536, + GPU: 1, + }, + "g5.8xlarge": { + InstanceType: "g5.8xlarge", + VCPU: 32, + MemoryMb: 131072, + GPU: 1, + }, + "g5.xlarge": { + InstanceType: "g5.xlarge", + VCPU: 4, + MemoryMb: 16384, GPU: 1, }, - "g4dn": { - InstanceType: "g4dn", - VCPU: 96, + "g5g": { + InstanceType: "g5g", + VCPU: 64, MemoryMb: 0, - GPU: 8, - }, - "g4dn.12xlarge": { - InstanceType: "g4dn.12xlarge", - VCPU: 48, - MemoryMb: 196608, - GPU: 4, + GPU: 2, }, - "g4dn.16xlarge": { - InstanceType: "g4dn.16xlarge", + "g5g.16xlarge": { + InstanceType: "g5g.16xlarge", VCPU: 64, - MemoryMb: 262144, - GPU: 1, + MemoryMb: 131072, + GPU: 2, }, - "g4dn.2xlarge": { - InstanceType: "g4dn.2xlarge", + "g5g.2xlarge": { + InstanceType: "g5g.2xlarge", VCPU: 8, - MemoryMb: 32768, + MemoryMb: 16384, GPU: 1, }, - "g4dn.4xlarge": { - InstanceType: "g4dn.4xlarge", + "g5g.4xlarge": { + InstanceType: "g5g.4xlarge", VCPU: 16, - MemoryMb: 65536, + MemoryMb: 32768, GPU: 1, }, - "g4dn.8xlarge": { - InstanceType: "g4dn.8xlarge", + "g5g.8xlarge": { + InstanceType: "g5g.8xlarge", VCPU: 32, - MemoryMb: 131072, + MemoryMb: 65536, GPU: 1, }, - "g4dn.metal": { - InstanceType: "g4dn.metal", - VCPU: 96, - MemoryMb: 393216, - GPU: 8, + "g5g.metal": { + InstanceType: "g5g.metal", + VCPU: 64, + MemoryMb: 131072, + GPU: 2, }, - "g4dn.xlarge": { - InstanceType: "g4dn.xlarge", + "g5g.xlarge": { + InstanceType: "g5g.xlarge", VCPU: 4, - MemoryMb: 16384, + MemoryMb: 8192, GPU: 1, }, "h1": { @@ -790,6 +1252,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 131072, GPU: 0, }, + "hpc6a.48xlarge": { + InstanceType: "hpc6a.48xlarge", + VCPU: 96, + MemoryMb: 393216, + GPU: 0, + }, "hs1.8xlarge": { InstanceType: "hs1.8xlarge", VCPU: 16, @@ -820,6 +1288,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 249856, GPU: 0, }, + "i2.large": { + InstanceType: "i2.large", + VCPU: 2, + 
MemoryMb: 15360, + GPU: 0, + }, "i2.xlarge": { InstanceType: "i2.xlarge", VCPU: 4, @@ -934,6 +1408,102 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 499712, GPU: 0, }, + "i4i": { + InstanceType: "i4i", + VCPU: 128, + MemoryMb: 0, + GPU: 0, + }, + "i4i.16xlarge": { + InstanceType: "i4i.16xlarge", + VCPU: 64, + MemoryMb: 524288, + GPU: 0, + }, + "i4i.2xlarge": { + InstanceType: "i4i.2xlarge", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, + "i4i.32xlarge": { + InstanceType: "i4i.32xlarge", + VCPU: 128, + MemoryMb: 1048576, + GPU: 0, + }, + "i4i.4xlarge": { + InstanceType: "i4i.4xlarge", + VCPU: 16, + MemoryMb: 131072, + GPU: 0, + }, + "i4i.8xlarge": { + InstanceType: "i4i.8xlarge", + VCPU: 32, + MemoryMb: 262144, + GPU: 0, + }, + "i4i.large": { + InstanceType: "i4i.large", + VCPU: 2, + MemoryMb: 16384, + GPU: 0, + }, + "i4i.metal": { + InstanceType: "i4i.metal", + VCPU: 128, + MemoryMb: 1048576, + GPU: 0, + }, + "i4i.xlarge": { + InstanceType: "i4i.xlarge", + VCPU: 4, + MemoryMb: 32768, + GPU: 0, + }, + "im4gn": { + InstanceType: "im4gn", + VCPU: 64, + MemoryMb: 0, + GPU: 0, + }, + "im4gn.16xlarge": { + InstanceType: "im4gn.16xlarge", + VCPU: 64, + MemoryMb: 262144, + GPU: 0, + }, + "im4gn.2xlarge": { + InstanceType: "im4gn.2xlarge", + VCPU: 8, + MemoryMb: 32768, + GPU: 0, + }, + "im4gn.4xlarge": { + InstanceType: "im4gn.4xlarge", + VCPU: 16, + MemoryMb: 65536, + GPU: 0, + }, + "im4gn.8xlarge": { + InstanceType: "im4gn.8xlarge", + VCPU: 32, + MemoryMb: 131072, + GPU: 0, + }, + "im4gn.large": { + InstanceType: "im4gn.large", + VCPU: 2, + MemoryMb: 8192, + GPU: 0, + }, + "im4gn.xlarge": { + InstanceType: "im4gn.xlarge", + VCPU: 4, + MemoryMb: 16384, + GPU: 0, + }, "inf1": { InstanceType: "inf1", VCPU: 96, @@ -964,6 +1534,42 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 8192, GPU: 0, }, + "is4gen.2xlarge": { + InstanceType: "is4gen.2xlarge", + VCPU: 8, + MemoryMb: 49152, + GPU: 0, + }, + "is4gen.4xlarge": { + InstanceType: "is4gen.4xlarge", + 
VCPU: 16, + MemoryMb: 98304, + GPU: 0, + }, + "is4gen.8xlarge": { + InstanceType: "is4gen.8xlarge", + VCPU: 32, + MemoryMb: 196608, + GPU: 0, + }, + "is4gen.large": { + InstanceType: "is4gen.large", + VCPU: 2, + MemoryMb: 12288, + GPU: 0, + }, + "is4gen.medium": { + InstanceType: "is4gen.medium", + VCPU: 1, + MemoryMb: 6144, + GPU: 0, + }, + "is4gen.xlarge": { + InstanceType: "is4gen.xlarge", + VCPU: 4, + MemoryMb: 24576, + GPU: 0, + }, "m1.large": { InstanceType: "m1.large", VCPU: 2, @@ -1433,15 +2039,15 @@ var InstanceTypes = map[string]*InstanceType{ GPU: 0, }, "m5zn.2xlarge": { + InstanceType: "m5zn.2xlarge", VCPU: 8, MemoryMb: 32768, - InstanceType: "m5zn.2xlarge", GPU: 0, }, "m5zn.3xlarge": { + InstanceType: "m5zn.3xlarge", VCPU: 12, MemoryMb: 49152, - InstanceType: "m5zn.3xlarge", GPU: 0, }, "m5zn.6xlarge": { @@ -1456,6 +2062,78 @@ var InstanceTypes = map[string]*InstanceType{ InstanceType: "m5zn.12xlarge", GPU: 0, }, + "m6a": { + InstanceType: "m6a", + VCPU: 192, + MemoryMb: 0, + GPU: 0, + }, + "m6a.12xlarge": { + InstanceType: "m6a.12xlarge", + VCPU: 48, + MemoryMb: 196608, + GPU: 0, + }, + "m6a.16xlarge": { + InstanceType: "m6a.16xlarge", + VCPU: 64, + MemoryMb: 262144, + GPU: 0, + }, + "m6a.24xlarge": { + InstanceType: "m6a.24xlarge", + VCPU: 96, + MemoryMb: 393216, + GPU: 0, + }, + "m6a.2xlarge": { + InstanceType: "m6a.2xlarge", + VCPU: 8, + MemoryMb: 32768, + GPU: 0, + }, + "m6a.32xlarge": { + InstanceType: "m6a.32xlarge", + VCPU: 128, + MemoryMb: 524288, + GPU: 0, + }, + "m6a.48xlarge": { + InstanceType: "m6a.48xlarge", + VCPU: 192, + MemoryMb: 786432, + GPU: 0, + }, + "m6a.4xlarge": { + InstanceType: "m6a.4xlarge", + VCPU: 16, + MemoryMb: 65536, + GPU: 0, + }, + "m6a.8xlarge": { + InstanceType: "m6a.8xlarge", + VCPU: 32, + MemoryMb: 131072, + GPU: 0, + }, + "m6a.large": { + InstanceType: "m6a.large", + VCPU: 2, + MemoryMb: 8192, + GPU: 0, + }, + "m6a.metal": { + InstanceType: "m6a.metal", + VCPU: 192, + MemoryMb: 786432, + GPU: 0, + }, + 
"m6a.xlarge": { + InstanceType: "m6a.xlarge", + VCPU: 4, + MemoryMb: 16384, + GPU: 0, + }, "m6g": { InstanceType: "m6g", VCPU: 64, @@ -1504,74 +2182,152 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 4096, GPU: 0, }, - "m6g.metal": { - InstanceType: "m6g.metal", - VCPU: 64, - MemoryMb: 262144, + "m6g.metal": { + InstanceType: "m6g.metal", + VCPU: 64, + MemoryMb: 262144, + GPU: 0, + }, + "m6g.xlarge": { + InstanceType: "m6g.xlarge", + VCPU: 4, + MemoryMb: 16384, + GPU: 0, + }, + "m6gd": { + InstanceType: "m6gd", + VCPU: 64, + MemoryMb: 0, + GPU: 0, + }, + "m6gd.12xlarge": { + InstanceType: "m6gd.12xlarge", + VCPU: 48, + MemoryMb: 196608, + GPU: 0, + }, + "m6gd.16xlarge": { + InstanceType: "m6gd.16xlarge", + VCPU: 64, + MemoryMb: 262144, + GPU: 0, + }, + "m6gd.2xlarge": { + InstanceType: "m6gd.2xlarge", + VCPU: 8, + MemoryMb: 32768, + GPU: 0, + }, + "m6gd.4xlarge": { + InstanceType: "m6gd.4xlarge", + VCPU: 16, + MemoryMb: 65536, + GPU: 0, + }, + "m6gd.8xlarge": { + InstanceType: "m6gd.8xlarge", + VCPU: 32, + MemoryMb: 131072, + GPU: 0, + }, + "m6gd.large": { + InstanceType: "m6gd.large", + VCPU: 2, + MemoryMb: 8192, + GPU: 0, + }, + "m6gd.medium": { + InstanceType: "m6gd.medium", + VCPU: 1, + MemoryMb: 4096, + GPU: 0, + }, + "m6gd.metal": { + InstanceType: "m6gd.metal", + VCPU: 64, + MemoryMb: 262144, + GPU: 0, + }, + "m6gd.xlarge": { + InstanceType: "m6gd.xlarge", + VCPU: 4, + MemoryMb: 16384, + GPU: 0, + }, + "m6i.metal": { + InstanceType: "m6i.metal", + VCPU: 128, + MemoryMb: 524288, GPU: 0, }, - "m6g.xlarge": { - InstanceType: "m6g.xlarge", + "m6i.xlarge": { + InstanceType: "m6i.xlarge", VCPU: 4, MemoryMb: 16384, GPU: 0, }, - "m6gd": { - InstanceType: "m6gd", - VCPU: 64, + "m6id": { + InstanceType: "m6id", + VCPU: 128, MemoryMb: 0, GPU: 0, }, - "m6gd.12xlarge": { - InstanceType: "m6gd.12xlarge", + "m6id.12xlarge": { + InstanceType: "m6id.12xlarge", VCPU: 48, MemoryMb: 196608, GPU: 0, }, - "m6gd.16xlarge": { - InstanceType: "m6gd.16xlarge", + 
"m6id.16xlarge": { + InstanceType: "m6id.16xlarge", VCPU: 64, MemoryMb: 262144, GPU: 0, }, - "m6gd.2xlarge": { - InstanceType: "m6gd.2xlarge", + "m6id.24xlarge": { + InstanceType: "m6id.24xlarge", + VCPU: 96, + MemoryMb: 393216, + GPU: 0, + }, + "m6id.2xlarge": { + InstanceType: "m6id.2xlarge", VCPU: 8, MemoryMb: 32768, GPU: 0, }, - "m6gd.4xlarge": { - InstanceType: "m6gd.4xlarge", + "m6id.32xlarge": { + InstanceType: "m6id.32xlarge", + VCPU: 128, + MemoryMb: 524288, + GPU: 0, + }, + "m6id.4xlarge": { + InstanceType: "m6id.4xlarge", VCPU: 16, MemoryMb: 65536, GPU: 0, }, - "m6gd.8xlarge": { - InstanceType: "m6gd.8xlarge", + "m6id.8xlarge": { + InstanceType: "m6id.8xlarge", VCPU: 32, MemoryMb: 131072, GPU: 0, }, - "m6gd.large": { - InstanceType: "m6gd.large", + "m6id.large": { + InstanceType: "m6id.large", VCPU: 2, MemoryMb: 8192, GPU: 0, }, - "m6gd.medium": { - InstanceType: "m6gd.medium", - VCPU: 1, - MemoryMb: 4096, - GPU: 0, - }, - "m6gd.metal": { - InstanceType: "m6gd.metal", - VCPU: 64, - MemoryMb: 262144, + "m6id.metal": { + InstanceType: "m6id.metal", + VCPU: 128, + MemoryMb: 524288, GPU: 0, }, - "m6gd.xlarge": { - InstanceType: "m6gd.xlarge", + "m6id.xlarge": { + InstanceType: "m6id.xlarge", VCPU: 4, MemoryMb: 16384, GPU: 0, @@ -1630,9 +2386,15 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 8192, GPU: 0, }, - "m6i.xlarge": { - InstanceType: "m6i.xlarge", - VCPU: 4, + "mac2": { + InstanceType: "mac2", + VCPU: 12, + MemoryMb: 0, + GPU: 0, + }, + "mac2.metal": { + InstanceType: "mac2.metal", + VCPU: 12, MemoryMb: 16384, GPU: 0, }, @@ -1696,12 +2458,30 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 786432, GPU: 8, }, + "p4d": { + InstanceType: "p4d", + VCPU: 96, + MemoryMb: 0, + GPU: 8, + }, "p4d.24xlarge": { InstanceType: "p4d.24xlarge", VCPU: 96, MemoryMb: 1179648, GPU: 8, }, + "p4de": { + InstanceType: "p4de", + VCPU: 96, + MemoryMb: 0, + GPU: 8, + }, + "p4de.24xlarge": { + InstanceType: "p4de.24xlarge", + VCPU: 96, + MemoryMb: 
1179648, + GPU: 8, + }, "r3": { InstanceType: "r3", VCPU: 32, @@ -2296,6 +3076,72 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 32768, GPU: 0, }, + "r6i": { + InstanceType: "r6i", + VCPU: 128, + MemoryMb: 0, + GPU: 0, + }, + "r6i.12xlarge": { + InstanceType: "r6i.12xlarge", + VCPU: 48, + MemoryMb: 393216, + GPU: 0, + }, + "r6i.16xlarge": { + InstanceType: "r6i.16xlarge", + VCPU: 64, + MemoryMb: 524288, + GPU: 0, + }, + "r6i.24xlarge": { + InstanceType: "r6i.24xlarge", + VCPU: 96, + MemoryMb: 786432, + GPU: 0, + }, + "r6i.2xlarge": { + InstanceType: "r6i.2xlarge", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, + "r6i.32xlarge": { + InstanceType: "r6i.32xlarge", + VCPU: 128, + MemoryMb: 1048576, + GPU: 0, + }, + "r6i.4xlarge": { + InstanceType: "r6i.4xlarge", + VCPU: 16, + MemoryMb: 131072, + GPU: 0, + }, + "r6i.8xlarge": { + InstanceType: "r6i.8xlarge", + VCPU: 32, + MemoryMb: 262144, + GPU: 0, + }, + "r6i.large": { + InstanceType: "r6i.large", + VCPU: 2, + MemoryMb: 16384, + GPU: 0, + }, + "r6i.metal": { + InstanceType: "r6i.metal", + VCPU: 128, + MemoryMb: 1048576, + GPU: 0, + }, + "r6i.xlarge": { + InstanceType: "r6i.xlarge", + VCPU: 4, + MemoryMb: 32768, + GPU: 0, + }, "t1.micro": { InstanceType: "t1.micro", VCPU: 1, @@ -2344,6 +3190,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 16384, GPU: 0, }, + "t3": { + InstanceType: "t3", + VCPU: 8, + MemoryMb: 0, + GPU: 0, + }, "t3.2xlarge": { InstanceType: "t3.2xlarge", VCPU: 8, @@ -2440,6 +3292,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 0, GPU: 0, }, + "u-12tb1.112xlarge": { + InstanceType: "u-12tb1.112xlarge", + VCPU: 448, + MemoryMb: 12582912, + GPU: 0, + }, "u-12tb1.metal": { InstanceType: "u-12tb1.metal", VCPU: 448, @@ -2500,22 +3358,40 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 0, GPU: 0, }, - "u-9tb1.metal": { - InstanceType: "u-9tb1.metal", + "u-9tb1.112xlarge": { + InstanceType: "u-9tb1.112xlarge", VCPU: 448, MemoryMb: 9437184, GPU: 0, }, - 
"u-9tb1.112xlarge": { + "u-9tb1.metal": { + InstanceType: "u-9tb1.metal", VCPU: 448, MemoryMb: 9437184, - InstanceType: "u-9tb1.112xlarge", GPU: 0, }, - "u-12tb1.112xlarge": { - VCPU: 448, - MemoryMb: 12582912, - InstanceType: "u-12tb1.112xlarge", + "vt1": { + InstanceType: "vt1", + VCPU: 96, + MemoryMb: 0, + GPU: 0, + }, + "vt1.24xlarge": { + InstanceType: "vt1.24xlarge", + VCPU: 96, + MemoryMb: 196608, + GPU: 0, + }, + "vt1.3xlarge": { + InstanceType: "vt1.3xlarge", + VCPU: 12, + MemoryMb: 24576, + GPU: 0, + }, + "vt1.6xlarge": { + InstanceType: "vt1.6xlarge", + VCPU: 24, + MemoryMb: 49152, GPU: 0, }, "x1": { @@ -2578,6 +3454,192 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 124928, GPU: 0, }, + "x2gd": { + InstanceType: "x2gd", + VCPU: 64, + MemoryMb: 0, + GPU: 0, + }, + "x2gd.12xlarge": { + InstanceType: "x2gd.12xlarge", + VCPU: 48, + MemoryMb: 786432, + GPU: 0, + }, + "x2gd.16xlarge": { + InstanceType: "x2gd.16xlarge", + VCPU: 64, + MemoryMb: 1048576, + GPU: 0, + }, + "x2gd.2xlarge": { + InstanceType: "x2gd.2xlarge", + VCPU: 8, + MemoryMb: 131072, + GPU: 0, + }, + "x2gd.4xlarge": { + InstanceType: "x2gd.4xlarge", + VCPU: 16, + MemoryMb: 262144, + GPU: 0, + }, + "x2gd.8xlarge": { + InstanceType: "x2gd.8xlarge", + VCPU: 32, + MemoryMb: 524288, + GPU: 0, + }, + "x2gd.large": { + InstanceType: "x2gd.large", + VCPU: 2, + MemoryMb: 32768, + GPU: 0, + }, + "x2gd.medium": { + InstanceType: "x2gd.medium", + VCPU: 1, + MemoryMb: 16384, + GPU: 0, + }, + "x2gd.metal": { + InstanceType: "x2gd.metal", + VCPU: 64, + MemoryMb: 1048576, + GPU: 0, + }, + "x2gd.xlarge": { + InstanceType: "x2gd.xlarge", + VCPU: 4, + MemoryMb: 65536, + GPU: 0, + }, + "x2idn": { + InstanceType: "x2idn", + VCPU: 128, + MemoryMb: 0, + GPU: 0, + }, + "x2idn.16xlarge": { + InstanceType: "x2idn.16xlarge", + VCPU: 64, + MemoryMb: 1048576, + GPU: 0, + }, + "x2idn.24xlarge": { + InstanceType: "x2idn.24xlarge", + VCPU: 96, + MemoryMb: 1572864, + GPU: 0, + }, + "x2idn.32xlarge": { + 
InstanceType: "x2idn.32xlarge", + VCPU: 128, + MemoryMb: 2097152, + GPU: 0, + }, + "x2idn.metal": { + InstanceType: "x2idn.metal", + VCPU: 128, + MemoryMb: 2097152, + GPU: 0, + }, + "x2iedn": { + InstanceType: "x2iedn", + VCPU: 128, + MemoryMb: 0, + GPU: 0, + }, + "x2iedn.16xlarge": { + InstanceType: "x2iedn.16xlarge", + VCPU: 64, + MemoryMb: 2097152, + GPU: 0, + }, + "x2iedn.24xlarge": { + InstanceType: "x2iedn.24xlarge", + VCPU: 96, + MemoryMb: 3145728, + GPU: 0, + }, + "x2iedn.2xlarge": { + InstanceType: "x2iedn.2xlarge", + VCPU: 8, + MemoryMb: 262144, + GPU: 0, + }, + "x2iedn.32xlarge": { + InstanceType: "x2iedn.32xlarge", + VCPU: 128, + MemoryMb: 4194304, + GPU: 0, + }, + "x2iedn.4xlarge": { + InstanceType: "x2iedn.4xlarge", + VCPU: 16, + MemoryMb: 524288, + GPU: 0, + }, + "x2iedn.8xlarge": { + InstanceType: "x2iedn.8xlarge", + VCPU: 32, + MemoryMb: 1048576, + GPU: 0, + }, + "x2iedn.metal": { + InstanceType: "x2iedn.metal", + VCPU: 128, + MemoryMb: 4194304, + GPU: 0, + }, + "x2iedn.xlarge": { + InstanceType: "x2iedn.xlarge", + VCPU: 4, + MemoryMb: 131072, + GPU: 0, + }, + "x2iezn": { + InstanceType: "x2iezn", + VCPU: 48, + MemoryMb: 0, + GPU: 0, + }, + "x2iezn.12xlarge": { + InstanceType: "x2iezn.12xlarge", + VCPU: 48, + MemoryMb: 1572864, + GPU: 0, + }, + "x2iezn.2xlarge": { + InstanceType: "x2iezn.2xlarge", + VCPU: 8, + MemoryMb: 262144, + GPU: 0, + }, + "x2iezn.4xlarge": { + InstanceType: "x2iezn.4xlarge", + VCPU: 16, + MemoryMb: 524288, + GPU: 0, + }, + "x2iezn.6xlarge": { + InstanceType: "x2iezn.6xlarge", + VCPU: 24, + MemoryMb: 786432, + GPU: 0, + }, + "x2iezn.8xlarge": { + InstanceType: "x2iezn.8xlarge", + VCPU: 32, + MemoryMb: 1048576, + GPU: 0, + }, + "x2iezn.metal": { + InstanceType: "x2iezn.metal", + VCPU: 48, + MemoryMb: 1572864, + GPU: 0, + }, "z1d": { InstanceType: "z1d", VCPU: 48, diff --git a/cluster-autoscaler/cloudprovider/azure/azure_instance_types.go b/cluster-autoscaler/cloudprovider/azure/azure_instance_types.go index 
09d4fd6bdd2e..8a97fbdd6757 100644 --- a/cluster-autoscaler/cloudprovider/azure/azure_instance_types.go +++ b/cluster-autoscaler/cloudprovider/azure/azure_instance_types.go @@ -328,6 +328,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 65536, GPU: 0, }, + "Standard_D16_v5": { + InstanceType: "Standard_D16_v5", + VCPU: 16, + MemoryMb: 65536, + GPU: 0, + }, "Standard_D16a_v3": { InstanceType: "Standard_D16a_v3", VCPU: 16, @@ -340,6 +346,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 65536, GPU: 0, }, + "Standard_D16ads_v5": { + InstanceType: "Standard_D16ads_v5", + VCPU: 16, + MemoryMb: 65536, + GPU: 0, + }, "Standard_D16as_v3": { InstanceType: "Standard_D16as_v3", VCPU: 16, @@ -352,18 +364,36 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 65536, GPU: 0, }, + "Standard_D16as_v5": { + InstanceType: "Standard_D16as_v5", + VCPU: 16, + MemoryMb: 65536, + GPU: 0, + }, "Standard_D16d_v4": { InstanceType: "Standard_D16d_v4", VCPU: 16, MemoryMb: 65536, GPU: 0, }, + "Standard_D16d_v5": { + InstanceType: "Standard_D16d_v5", + VCPU: 16, + MemoryMb: 65536, + GPU: 0, + }, "Standard_D16ds_v4": { InstanceType: "Standard_D16ds_v4", VCPU: 16, MemoryMb: 65536, GPU: 0, }, + "Standard_D16ds_v5": { + InstanceType: "Standard_D16ds_v5", + VCPU: 16, + MemoryMb: 65536, + GPU: 0, + }, "Standard_D16s_v3": { InstanceType: "Standard_D16s_v3", VCPU: 16, @@ -376,6 +406,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 65536, GPU: 0, }, + "Standard_D16s_v5": { + InstanceType: "Standard_D16s_v5", + VCPU: 16, + MemoryMb: 65536, + GPU: 0, + }, "Standard_D1_v2": { InstanceType: "Standard_D1_v2", VCPU: 1, @@ -412,6 +448,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 8192, GPU: 0, }, + "Standard_D2_v5": { + InstanceType: "Standard_D2_v5", + VCPU: 2, + MemoryMb: 8192, + GPU: 0, + }, "Standard_D2a_v3": { InstanceType: "Standard_D2a_v3", VCPU: 2, @@ -424,6 +466,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 8192, GPU: 0, }, + 
"Standard_D2ads_v5": { + InstanceType: "Standard_D2ads_v5", + VCPU: 2, + MemoryMb: 8192, + GPU: 0, + }, "Standard_D2as_v3": { InstanceType: "Standard_D2as_v3", VCPU: 2, @@ -436,18 +484,36 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 8192, GPU: 0, }, + "Standard_D2as_v5": { + InstanceType: "Standard_D2as_v5", + VCPU: 2, + MemoryMb: 8192, + GPU: 0, + }, "Standard_D2d_v4": { InstanceType: "Standard_D2d_v4", VCPU: 2, MemoryMb: 8192, GPU: 0, }, + "Standard_D2d_v5": { + InstanceType: "Standard_D2d_v5", + VCPU: 2, + MemoryMb: 8192, + GPU: 0, + }, "Standard_D2ds_v4": { InstanceType: "Standard_D2ds_v4", VCPU: 2, MemoryMb: 8192, GPU: 0, }, + "Standard_D2ds_v5": { + InstanceType: "Standard_D2ds_v5", + VCPU: 2, + MemoryMb: 8192, + GPU: 0, + }, "Standard_D2s_v3": { InstanceType: "Standard_D2s_v3", VCPU: 2, @@ -460,6 +526,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 8192, GPU: 0, }, + "Standard_D2s_v5": { + InstanceType: "Standard_D2s_v5", + VCPU: 2, + MemoryMb: 8192, + GPU: 0, + }, "Standard_D3": { InstanceType: "Standard_D3", VCPU: 4, @@ -478,6 +550,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 131072, GPU: 0, }, + "Standard_D32_v5": { + InstanceType: "Standard_D32_v5", + VCPU: 32, + MemoryMb: 131072, + GPU: 0, + }, "Standard_D32a_v3": { InstanceType: "Standard_D32a_v3", VCPU: 32, @@ -490,6 +568,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 131072, GPU: 0, }, + "Standard_D32ads_v5": { + InstanceType: "Standard_D32ads_v5", + VCPU: 32, + MemoryMb: 131072, + GPU: 0, + }, "Standard_D32as_v3": { InstanceType: "Standard_D32as_v3", VCPU: 32, @@ -502,18 +586,36 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 131072, GPU: 0, }, + "Standard_D32as_v5": { + InstanceType: "Standard_D32as_v5", + VCPU: 32, + MemoryMb: 131072, + GPU: 0, + }, "Standard_D32d_v4": { InstanceType: "Standard_D32d_v4", VCPU: 32, MemoryMb: 131072, GPU: 0, }, + "Standard_D32d_v5": { + InstanceType: "Standard_D32d_v5", + VCPU: 32, + MemoryMb: 
131072, + GPU: 0, + }, "Standard_D32ds_v4": { InstanceType: "Standard_D32ds_v4", VCPU: 32, MemoryMb: 131072, GPU: 0, }, + "Standard_D32ds_v5": { + InstanceType: "Standard_D32ds_v5", + VCPU: 32, + MemoryMb: 131072, + GPU: 0, + }, "Standard_D32s_v3": { InstanceType: "Standard_D32s_v3", VCPU: 32, @@ -526,6 +628,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 131072, GPU: 0, }, + "Standard_D32s_v5": { + InstanceType: "Standard_D32s_v5", + VCPU: 32, + MemoryMb: 131072, + GPU: 0, + }, "Standard_D3_v2": { InstanceType: "Standard_D3_v2", VCPU: 4, @@ -556,6 +664,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 196608, GPU: 0, }, + "Standard_D48_v5": { + InstanceType: "Standard_D48_v5", + VCPU: 48, + MemoryMb: 196608, + GPU: 0, + }, "Standard_D48a_v3": { InstanceType: "Standard_D48a_v3", VCPU: 48, @@ -568,6 +682,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 196608, GPU: 0, }, + "Standard_D48ads_v5": { + InstanceType: "Standard_D48ads_v5", + VCPU: 48, + MemoryMb: 196608, + GPU: 0, + }, "Standard_D48as_v3": { InstanceType: "Standard_D48as_v3", VCPU: 48, @@ -580,18 +700,36 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 196608, GPU: 0, }, + "Standard_D48as_v5": { + InstanceType: "Standard_D48as_v5", + VCPU: 48, + MemoryMb: 196608, + GPU: 0, + }, "Standard_D48d_v4": { InstanceType: "Standard_D48d_v4", VCPU: 48, MemoryMb: 196608, GPU: 0, }, + "Standard_D48d_v5": { + InstanceType: "Standard_D48d_v5", + VCPU: 48, + MemoryMb: 196608, + GPU: 0, + }, "Standard_D48ds_v4": { InstanceType: "Standard_D48ds_v4", VCPU: 48, MemoryMb: 196608, GPU: 0, }, + "Standard_D48ds_v5": { + InstanceType: "Standard_D48ds_v5", + VCPU: 48, + MemoryMb: 196608, + GPU: 0, + }, "Standard_D48s_v3": { InstanceType: "Standard_D48s_v3", VCPU: 48, @@ -604,6 +742,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 196608, GPU: 0, }, + "Standard_D48s_v5": { + InstanceType: "Standard_D48s_v5", + VCPU: 48, + MemoryMb: 196608, + GPU: 0, + }, 
"Standard_D4_v2": { InstanceType: "Standard_D4_v2", VCPU: 8, @@ -628,6 +772,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 16384, GPU: 0, }, + "Standard_D4_v5": { + InstanceType: "Standard_D4_v5", + VCPU: 4, + MemoryMb: 16384, + GPU: 0, + }, "Standard_D4a_v3": { InstanceType: "Standard_D4a_v3", VCPU: 4, @@ -640,6 +790,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 16384, GPU: 0, }, + "Standard_D4ads_v5": { + InstanceType: "Standard_D4ads_v5", + VCPU: 4, + MemoryMb: 16384, + GPU: 0, + }, "Standard_D4as_v3": { InstanceType: "Standard_D4as_v3", VCPU: 4, @@ -652,18 +808,36 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 16384, GPU: 0, }, + "Standard_D4as_v5": { + InstanceType: "Standard_D4as_v5", + VCPU: 4, + MemoryMb: 16384, + GPU: 0, + }, "Standard_D4d_v4": { InstanceType: "Standard_D4d_v4", VCPU: 4, MemoryMb: 16384, GPU: 0, }, + "Standard_D4d_v5": { + InstanceType: "Standard_D4d_v5", + VCPU: 4, + MemoryMb: 16384, + GPU: 0, + }, "Standard_D4ds_v4": { InstanceType: "Standard_D4ds_v4", VCPU: 4, MemoryMb: 16384, GPU: 0, }, + "Standard_D4ds_v5": { + InstanceType: "Standard_D4ds_v5", + VCPU: 4, + MemoryMb: 16384, + GPU: 0, + }, "Standard_D4s_v3": { InstanceType: "Standard_D4s_v3", VCPU: 4, @@ -676,6 +850,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 16384, GPU: 0, }, + "Standard_D4s_v5": { + InstanceType: "Standard_D4s_v5", + VCPU: 4, + MemoryMb: 16384, + GPU: 0, + }, "Standard_D5_v2": { InstanceType: "Standard_D5_v2", VCPU: 16, @@ -700,6 +880,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 262144, GPU: 0, }, + "Standard_D64_v5": { + InstanceType: "Standard_D64_v5", + VCPU: 64, + MemoryMb: 262144, + GPU: 0, + }, "Standard_D64a_v3": { InstanceType: "Standard_D64a_v3", VCPU: 64, @@ -712,6 +898,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 262144, GPU: 0, }, + "Standard_D64ads_v5": { + InstanceType: "Standard_D64ads_v5", + VCPU: 64, + MemoryMb: 262144, + GPU: 0, + }, "Standard_D64as_v3": { 
InstanceType: "Standard_D64as_v3", VCPU: 64, @@ -724,18 +916,36 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 262144, GPU: 0, }, + "Standard_D64as_v5": { + InstanceType: "Standard_D64as_v5", + VCPU: 64, + MemoryMb: 262144, + GPU: 0, + }, "Standard_D64d_v4": { InstanceType: "Standard_D64d_v4", VCPU: 64, MemoryMb: 262144, GPU: 0, }, + "Standard_D64d_v5": { + InstanceType: "Standard_D64d_v5", + VCPU: 64, + MemoryMb: 262144, + GPU: 0, + }, "Standard_D64ds_v4": { InstanceType: "Standard_D64ds_v4", VCPU: 64, MemoryMb: 262144, GPU: 0, }, + "Standard_D64ds_v5": { + InstanceType: "Standard_D64ds_v5", + VCPU: 64, + MemoryMb: 262144, + GPU: 0, + }, "Standard_D64s_v3": { InstanceType: "Standard_D64s_v3", VCPU: 64, @@ -748,6 +958,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 262144, GPU: 0, }, + "Standard_D64s_v5": { + InstanceType: "Standard_D64s_v5", + VCPU: 64, + MemoryMb: 262144, + GPU: 0, + }, "Standard_D8_v3": { InstanceType: "Standard_D8_v3", VCPU: 8, @@ -760,6 +976,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 32768, GPU: 0, }, + "Standard_D8_v5": { + InstanceType: "Standard_D8_v5", + VCPU: 8, + MemoryMb: 32768, + GPU: 0, + }, "Standard_D8a_v3": { InstanceType: "Standard_D8a_v3", VCPU: 8, @@ -772,6 +994,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 32768, GPU: 0, }, + "Standard_D8ads_v5": { + InstanceType: "Standard_D8ads_v5", + VCPU: 8, + MemoryMb: 32768, + GPU: 0, + }, "Standard_D8as_v3": { InstanceType: "Standard_D8as_v3", VCPU: 8, @@ -784,18 +1012,36 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 32768, GPU: 0, }, + "Standard_D8as_v5": { + InstanceType: "Standard_D8as_v5", + VCPU: 8, + MemoryMb: 32768, + GPU: 0, + }, "Standard_D8d_v4": { InstanceType: "Standard_D8d_v4", VCPU: 8, MemoryMb: 32768, GPU: 0, }, + "Standard_D8d_v5": { + InstanceType: "Standard_D8d_v5", + VCPU: 8, + MemoryMb: 32768, + GPU: 0, + }, "Standard_D8ds_v4": { InstanceType: "Standard_D8ds_v4", VCPU: 8, MemoryMb: 32768, GPU: 
0, }, + "Standard_D8ds_v5": { + InstanceType: "Standard_D8ds_v5", + VCPU: 8, + MemoryMb: 32768, + GPU: 0, + }, "Standard_D8s_v3": { InstanceType: "Standard_D8s_v3", VCPU: 8, @@ -808,94 +1054,328 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 32768, GPU: 0, }, + "Standard_D8s_v5": { + InstanceType: "Standard_D8s_v5", + VCPU: 8, + MemoryMb: 32768, + GPU: 0, + }, + "Standard_D96_v5": { + InstanceType: "Standard_D96_v5", + VCPU: 96, + MemoryMb: 393216, + GPU: 0, + }, "Standard_D96a_v4": { InstanceType: "Standard_D96a_v4", VCPU: 96, MemoryMb: 393216, GPU: 0, }, + "Standard_D96ads_v5": { + InstanceType: "Standard_D96ads_v5", + VCPU: 96, + MemoryMb: 393216, + GPU: 0, + }, "Standard_D96as_v4": { InstanceType: "Standard_D96as_v4", VCPU: 96, MemoryMb: 393216, GPU: 0, }, - "Standard_DC1s_v2": { - InstanceType: "Standard_DC1s_v2", - VCPU: 1, - MemoryMb: 4096, + "Standard_D96as_v5": { + InstanceType: "Standard_D96as_v5", + VCPU: 96, + MemoryMb: 393216, GPU: 0, }, - "Standard_DC2s": { - InstanceType: "Standard_DC2s", - VCPU: 2, - MemoryMb: 8192, + "Standard_D96d_v5": { + InstanceType: "Standard_D96d_v5", + VCPU: 96, + MemoryMb: 393216, GPU: 0, }, - "Standard_DC2s_v2": { - InstanceType: "Standard_DC2s_v2", - VCPU: 2, - MemoryMb: 8192, + "Standard_D96ds_v5": { + InstanceType: "Standard_D96ds_v5", + VCPU: 96, + MemoryMb: 393216, GPU: 0, }, - "Standard_DC4s": { - InstanceType: "Standard_DC4s", - VCPU: 4, - MemoryMb: 16384, + "Standard_D96s_v5": { + InstanceType: "Standard_D96s_v5", + VCPU: 96, + MemoryMb: 393216, GPU: 0, }, - "Standard_DC4s_v2": { - InstanceType: "Standard_DC4s_v2", - VCPU: 4, - MemoryMb: 16384, + "Standard_DC16ads_v5": { + InstanceType: "Standard_DC16ads_v5", + VCPU: 16, + MemoryMb: 65536, GPU: 0, }, - "Standard_DC8_v2": { - InstanceType: "Standard_DC8_v2", - VCPU: 8, - MemoryMb: 32768, + "Standard_DC16as_v5": { + InstanceType: "Standard_DC16as_v5", + VCPU: 16, + MemoryMb: 65536, GPU: 0, }, - "Standard_DS1": { - InstanceType: "Standard_DS1", - VCPU: 1, 
- MemoryMb: 3072, + "Standard_DC16ds_v3": { + InstanceType: "Standard_DC16ds_v3", + VCPU: 16, + MemoryMb: 131072, GPU: 0, }, - "Standard_DS11": { - InstanceType: "Standard_DS11", - VCPU: 2, - MemoryMb: 14336, + "Standard_DC16s_v3": { + InstanceType: "Standard_DC16s_v3", + VCPU: 16, + MemoryMb: 131072, GPU: 0, }, - "Standard_DS11-1_v2": { - InstanceType: "Standard_DS11-1_v2", - VCPU: 2, - MemoryMb: 14336, + "Standard_DC1ds_v3": { + InstanceType: "Standard_DC1ds_v3", + VCPU: 1, + MemoryMb: 8192, GPU: 0, }, - "Standard_DS11_v2": { - InstanceType: "Standard_DS11_v2", - VCPU: 2, - MemoryMb: 14336, + "Standard_DC1s_v2": { + InstanceType: "Standard_DC1s_v2", + VCPU: 1, + MemoryMb: 4096, GPU: 0, }, - "Standard_DS11_v2_Promo": { - InstanceType: "Standard_DS11_v2_Promo", - VCPU: 2, - MemoryMb: 14336, + "Standard_DC1s_v3": { + InstanceType: "Standard_DC1s_v3", + VCPU: 1, + MemoryMb: 8192, GPU: 0, }, - "Standard_DS12": { - InstanceType: "Standard_DS12", - VCPU: 4, - MemoryMb: 28672, + "Standard_DC24ds_v3": { + InstanceType: "Standard_DC24ds_v3", + VCPU: 24, + MemoryMb: 196608, GPU: 0, }, - "Standard_DS12-1_v2": { - InstanceType: "Standard_DS12-1_v2", - VCPU: 4, - MemoryMb: 28672, + "Standard_DC24s_v3": { + InstanceType: "Standard_DC24s_v3", + VCPU: 24, + MemoryMb: 196608, + GPU: 0, + }, + "Standard_DC2ads_v5": { + InstanceType: "Standard_DC2ads_v5", + VCPU: 2, + MemoryMb: 8192, + GPU: 0, + }, + "Standard_DC2as_v5": { + InstanceType: "Standard_DC2as_v5", + VCPU: 2, + MemoryMb: 8192, + GPU: 0, + }, + "Standard_DC2ds_v3": { + InstanceType: "Standard_DC2ds_v3", + VCPU: 2, + MemoryMb: 16384, + GPU: 0, + }, + "Standard_DC2s": { + InstanceType: "Standard_DC2s", + VCPU: 2, + MemoryMb: 8192, + GPU: 0, + }, + "Standard_DC2s_v2": { + InstanceType: "Standard_DC2s_v2", + VCPU: 2, + MemoryMb: 8192, + GPU: 0, + }, + "Standard_DC2s_v3": { + InstanceType: "Standard_DC2s_v3", + VCPU: 2, + MemoryMb: 16384, + GPU: 0, + }, + "Standard_DC32ads_v5": { + InstanceType: "Standard_DC32ads_v5", + VCPU: 
32, + MemoryMb: 131072, + GPU: 0, + }, + "Standard_DC32as_v5": { + InstanceType: "Standard_DC32as_v5", + VCPU: 32, + MemoryMb: 131072, + GPU: 0, + }, + "Standard_DC32ds_v3": { + InstanceType: "Standard_DC32ds_v3", + VCPU: 32, + MemoryMb: 262144, + GPU: 0, + }, + "Standard_DC32s_v3": { + InstanceType: "Standard_DC32s_v3", + VCPU: 32, + MemoryMb: 262144, + GPU: 0, + }, + "Standard_DC48ads_v5": { + InstanceType: "Standard_DC48ads_v5", + VCPU: 48, + MemoryMb: 196608, + GPU: 0, + }, + "Standard_DC48as_v5": { + InstanceType: "Standard_DC48as_v5", + VCPU: 48, + MemoryMb: 196608, + GPU: 0, + }, + "Standard_DC48ds_v3": { + InstanceType: "Standard_DC48ds_v3", + VCPU: 48, + MemoryMb: 393216, + GPU: 0, + }, + "Standard_DC48s_v3": { + InstanceType: "Standard_DC48s_v3", + VCPU: 48, + MemoryMb: 393216, + GPU: 0, + }, + "Standard_DC4ads_v5": { + InstanceType: "Standard_DC4ads_v5", + VCPU: 4, + MemoryMb: 16384, + GPU: 0, + }, + "Standard_DC4as_v5": { + InstanceType: "Standard_DC4as_v5", + VCPU: 4, + MemoryMb: 16384, + GPU: 0, + }, + "Standard_DC4ds_v3": { + InstanceType: "Standard_DC4ds_v3", + VCPU: 4, + MemoryMb: 32768, + GPU: 0, + }, + "Standard_DC4s": { + InstanceType: "Standard_DC4s", + VCPU: 4, + MemoryMb: 16384, + GPU: 0, + }, + "Standard_DC4s_v2": { + InstanceType: "Standard_DC4s_v2", + VCPU: 4, + MemoryMb: 16384, + GPU: 0, + }, + "Standard_DC4s_v3": { + InstanceType: "Standard_DC4s_v3", + VCPU: 4, + MemoryMb: 32768, + GPU: 0, + }, + "Standard_DC64ads_v5": { + InstanceType: "Standard_DC64ads_v5", + VCPU: 64, + MemoryMb: 262144, + GPU: 0, + }, + "Standard_DC64as_v5": { + InstanceType: "Standard_DC64as_v5", + VCPU: 64, + MemoryMb: 262144, + GPU: 0, + }, + "Standard_DC8_v2": { + InstanceType: "Standard_DC8_v2", + VCPU: 8, + MemoryMb: 32768, + GPU: 0, + }, + "Standard_DC8ads_v5": { + InstanceType: "Standard_DC8ads_v5", + VCPU: 8, + MemoryMb: 32768, + GPU: 0, + }, + "Standard_DC8as_v5": { + InstanceType: "Standard_DC8as_v5", + VCPU: 8, + MemoryMb: 32768, + GPU: 0, + }, + 
"Standard_DC8ds_v3": { + InstanceType: "Standard_DC8ds_v3", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, + "Standard_DC8s_v3": { + InstanceType: "Standard_DC8s_v3", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, + "Standard_DC96ads_v5": { + InstanceType: "Standard_DC96ads_v5", + VCPU: 96, + MemoryMb: 393216, + GPU: 0, + }, + "Standard_DC96as_v5": { + InstanceType: "Standard_DC96as_v5", + VCPU: 96, + MemoryMb: 393216, + GPU: 0, + }, + "Standard_DS1": { + InstanceType: "Standard_DS1", + VCPU: 1, + MemoryMb: 3072, + GPU: 0, + }, + "Standard_DS11": { + InstanceType: "Standard_DS11", + VCPU: 2, + MemoryMb: 14336, + GPU: 0, + }, + "Standard_DS11-1_v2": { + InstanceType: "Standard_DS11-1_v2", + VCPU: 2, + MemoryMb: 14336, + GPU: 0, + }, + "Standard_DS11_v2": { + InstanceType: "Standard_DS11_v2", + VCPU: 2, + MemoryMb: 14336, + GPU: 0, + }, + "Standard_DS11_v2_Promo": { + InstanceType: "Standard_DS11_v2_Promo", + VCPU: 2, + MemoryMb: 14336, + GPU: 0, + }, + "Standard_DS12": { + InstanceType: "Standard_DS12", + VCPU: 4, + MemoryMb: 28672, + GPU: 0, + }, + "Standard_DS12-1_v2": { + InstanceType: "Standard_DS12-1_v2", + VCPU: 4, + MemoryMb: 28672, GPU: 0, }, "Standard_DS12-2_v2": { @@ -1054,18 +1534,60 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 57344, GPU: 0, }, + "Standard_E104i_v5": { + InstanceType: "Standard_E104i_v5", + VCPU: 104, + MemoryMb: 688128, + GPU: 0, + }, + "Standard_E104id_v5": { + InstanceType: "Standard_E104id_v5", + VCPU: 104, + MemoryMb: 688128, + GPU: 0, + }, + "Standard_E104ids_v5": { + InstanceType: "Standard_E104ids_v5", + VCPU: 104, + MemoryMb: 688128, + GPU: 0, + }, + "Standard_E104is_v5": { + InstanceType: "Standard_E104is_v5", + VCPU: 104, + MemoryMb: 688128, + GPU: 0, + }, + "Standard_E16-4ads_v5": { + InstanceType: "Standard_E16-4ads_v5", + VCPU: 16, + MemoryMb: 131072, + GPU: 0, + }, "Standard_E16-4as_v4": { InstanceType: "Standard_E16-4as_v4", VCPU: 16, MemoryMb: 131072, GPU: 0, }, + "Standard_E16-4as_v5": { + InstanceType: 
"Standard_E16-4as_v5", + VCPU: 16, + MemoryMb: 131072, + GPU: 0, + }, "Standard_E16-4ds_v4": { InstanceType: "Standard_E16-4ds_v4", VCPU: 16, MemoryMb: 131072, GPU: 0, }, + "Standard_E16-4ds_v5": { + InstanceType: "Standard_E16-4ds_v5", + VCPU: 16, + MemoryMb: 131072, + GPU: 0, + }, "Standard_E16-4s_v3": { InstanceType: "Standard_E16-4s_v3", VCPU: 16, @@ -1078,18 +1600,42 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 131072, GPU: 0, }, + "Standard_E16-4s_v5": { + InstanceType: "Standard_E16-4s_v5", + VCPU: 16, + MemoryMb: 131072, + GPU: 0, + }, + "Standard_E16-8ads_v5": { + InstanceType: "Standard_E16-8ads_v5", + VCPU: 16, + MemoryMb: 131072, + GPU: 0, + }, "Standard_E16-8as_v4": { InstanceType: "Standard_E16-8as_v4", VCPU: 16, MemoryMb: 131072, GPU: 0, }, + "Standard_E16-8as_v5": { + InstanceType: "Standard_E16-8as_v5", + VCPU: 16, + MemoryMb: 131072, + GPU: 0, + }, "Standard_E16-8ds_v4": { InstanceType: "Standard_E16-8ds_v4", VCPU: 16, MemoryMb: 131072, GPU: 0, }, + "Standard_E16-8ds_v5": { + InstanceType: "Standard_E16-8ds_v5", + VCPU: 16, + MemoryMb: 131072, + GPU: 0, + }, "Standard_E16-8s_v3": { InstanceType: "Standard_E16-8s_v3", VCPU: 16, @@ -1102,6 +1648,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 131072, GPU: 0, }, + "Standard_E16-8s_v5": { + InstanceType: "Standard_E16-8s_v5", + VCPU: 16, + MemoryMb: 131072, + GPU: 0, + }, "Standard_E16_v3": { InstanceType: "Standard_E16_v3", VCPU: 16, @@ -1114,30 +1666,60 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 131072, GPU: 0, }, + "Standard_E16_v5": { + InstanceType: "Standard_E16_v5", + VCPU: 16, + MemoryMb: 131072, + GPU: 0, + }, "Standard_E16a_v4": { InstanceType: "Standard_E16a_v4", VCPU: 16, MemoryMb: 131072, GPU: 0, }, + "Standard_E16ads_v5": { + InstanceType: "Standard_E16ads_v5", + VCPU: 16, + MemoryMb: 131072, + GPU: 0, + }, "Standard_E16as_v4": { InstanceType: "Standard_E16as_v4", VCPU: 16, MemoryMb: 131072, GPU: 0, }, + "Standard_E16as_v5": { + InstanceType: 
"Standard_E16as_v5", + VCPU: 16, + MemoryMb: 131072, + GPU: 0, + }, "Standard_E16d_v4": { InstanceType: "Standard_E16d_v4", VCPU: 16, MemoryMb: 131072, GPU: 0, }, + "Standard_E16d_v5": { + InstanceType: "Standard_E16d_v5", + VCPU: 16, + MemoryMb: 131072, + GPU: 0, + }, "Standard_E16ds_v4": { InstanceType: "Standard_E16ds_v4", VCPU: 16, MemoryMb: 131072, GPU: 0, }, + "Standard_E16ds_v5": { + InstanceType: "Standard_E16ds_v5", + VCPU: 16, + MemoryMb: 131072, + GPU: 0, + }, "Standard_E16s_v3": { InstanceType: "Standard_E16s_v3", VCPU: 16, @@ -1150,6 +1732,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 131072, GPU: 0, }, + "Standard_E16s_v5": { + InstanceType: "Standard_E16s_v5", + VCPU: 16, + MemoryMb: 131072, + GPU: 0, + }, "Standard_E20_v3": { InstanceType: "Standard_E20_v3", VCPU: 20, @@ -1162,30 +1750,60 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 163840, GPU: 0, }, + "Standard_E20_v5": { + InstanceType: "Standard_E20_v5", + VCPU: 20, + MemoryMb: 163840, + GPU: 0, + }, "Standard_E20a_v4": { InstanceType: "Standard_E20a_v4", VCPU: 20, MemoryMb: 163840, GPU: 0, }, + "Standard_E20ads_v5": { + InstanceType: "Standard_E20ads_v5", + VCPU: 20, + MemoryMb: 163840, + GPU: 0, + }, "Standard_E20as_v4": { InstanceType: "Standard_E20as_v4", VCPU: 20, MemoryMb: 163840, GPU: 0, }, + "Standard_E20as_v5": { + InstanceType: "Standard_E20as_v5", + VCPU: 20, + MemoryMb: 163840, + GPU: 0, + }, "Standard_E20d_v4": { InstanceType: "Standard_E20d_v4", VCPU: 20, MemoryMb: 163840, GPU: 0, }, + "Standard_E20d_v5": { + InstanceType: "Standard_E20d_v5", + VCPU: 20, + MemoryMb: 163840, + GPU: 0, + }, "Standard_E20ds_v4": { InstanceType: "Standard_E20ds_v4", VCPU: 20, MemoryMb: 163840, GPU: 0, }, + "Standard_E20ds_v5": { + InstanceType: "Standard_E20ds_v5", + VCPU: 20, + MemoryMb: 163840, + GPU: 0, + }, "Standard_E20s_v3": { InstanceType: "Standard_E20s_v3", VCPU: 20, @@ -1198,6 +1816,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 163840, GPU: 0, }, 
+ "Standard_E20s_v5": { + InstanceType: "Standard_E20s_v5", + VCPU: 20, + MemoryMb: 163840, + GPU: 0, + }, "Standard_E2_v3": { InstanceType: "Standard_E2_v3", VCPU: 2, @@ -1210,30 +1834,60 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 16384, GPU: 0, }, + "Standard_E2_v5": { + InstanceType: "Standard_E2_v5", + VCPU: 2, + MemoryMb: 16384, + GPU: 0, + }, "Standard_E2a_v4": { InstanceType: "Standard_E2a_v4", VCPU: 2, MemoryMb: 16384, GPU: 0, }, + "Standard_E2ads_v5": { + InstanceType: "Standard_E2ads_v5", + VCPU: 2, + MemoryMb: 16384, + GPU: 0, + }, "Standard_E2as_v4": { InstanceType: "Standard_E2as_v4", VCPU: 2, MemoryMb: 16384, GPU: 0, }, + "Standard_E2as_v5": { + InstanceType: "Standard_E2as_v5", + VCPU: 2, + MemoryMb: 16384, + GPU: 0, + }, "Standard_E2d_v4": { InstanceType: "Standard_E2d_v4", VCPU: 2, MemoryMb: 16384, GPU: 0, }, + "Standard_E2d_v5": { + InstanceType: "Standard_E2d_v5", + VCPU: 2, + MemoryMb: 16384, + GPU: 0, + }, "Standard_E2ds_v4": { InstanceType: "Standard_E2ds_v4", VCPU: 2, MemoryMb: 16384, GPU: 0, }, + "Standard_E2ds_v5": { + InstanceType: "Standard_E2ds_v5", + VCPU: 2, + MemoryMb: 16384, + GPU: 0, + }, "Standard_E2s_v3": { InstanceType: "Standard_E2s_v3", VCPU: 2, @@ -1246,18 +1900,42 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 16384, GPU: 0, }, + "Standard_E2s_v5": { + InstanceType: "Standard_E2s_v5", + VCPU: 2, + MemoryMb: 16384, + GPU: 0, + }, + "Standard_E32-16ads_v5": { + InstanceType: "Standard_E32-16ads_v5", + VCPU: 32, + MemoryMb: 262144, + GPU: 0, + }, "Standard_E32-16as_v4": { InstanceType: "Standard_E32-16as_v4", VCPU: 32, MemoryMb: 262144, GPU: 0, }, + "Standard_E32-16as_v5": { + InstanceType: "Standard_E32-16as_v5", + VCPU: 32, + MemoryMb: 262144, + GPU: 0, + }, "Standard_E32-16ds_v4": { InstanceType: "Standard_E32-16ds_v4", VCPU: 32, MemoryMb: 262144, GPU: 0, }, + "Standard_E32-16ds_v5": { + InstanceType: "Standard_E32-16ds_v5", + VCPU: 32, + MemoryMb: 262144, + GPU: 0, + }, "Standard_E32-16s_v3": { 
InstanceType: "Standard_E32-16s_v3", VCPU: 32, @@ -1270,18 +1948,42 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 262144, GPU: 0, }, + "Standard_E32-16s_v5": { + InstanceType: "Standard_E32-16s_v5", + VCPU: 32, + MemoryMb: 262144, + GPU: 0, + }, + "Standard_E32-8ads_v5": { + InstanceType: "Standard_E32-8ads_v5", + VCPU: 32, + MemoryMb: 262144, + GPU: 0, + }, "Standard_E32-8as_v4": { InstanceType: "Standard_E32-8as_v4", VCPU: 32, MemoryMb: 262144, GPU: 0, }, + "Standard_E32-8as_v5": { + InstanceType: "Standard_E32-8as_v5", + VCPU: 32, + MemoryMb: 262144, + GPU: 0, + }, "Standard_E32-8ds_v4": { InstanceType: "Standard_E32-8ds_v4", VCPU: 32, MemoryMb: 262144, GPU: 0, }, + "Standard_E32-8ds_v5": { + InstanceType: "Standard_E32-8ds_v5", + VCPU: 32, + MemoryMb: 262144, + GPU: 0, + }, "Standard_E32-8s_v3": { InstanceType: "Standard_E32-8s_v3", VCPU: 32, @@ -1294,6 +1996,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 262144, GPU: 0, }, + "Standard_E32-8s_v5": { + InstanceType: "Standard_E32-8s_v5", + VCPU: 32, + MemoryMb: 262144, + GPU: 0, + }, "Standard_E32_v3": { InstanceType: "Standard_E32_v3", VCPU: 32, @@ -1306,30 +2014,60 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 262144, GPU: 0, }, + "Standard_E32_v5": { + InstanceType: "Standard_E32_v5", + VCPU: 32, + MemoryMb: 262144, + GPU: 0, + }, "Standard_E32a_v4": { InstanceType: "Standard_E32a_v4", VCPU: 32, MemoryMb: 262144, GPU: 0, }, + "Standard_E32ads_v5": { + InstanceType: "Standard_E32ads_v5", + VCPU: 32, + MemoryMb: 262144, + GPU: 0, + }, "Standard_E32as_v4": { InstanceType: "Standard_E32as_v4", VCPU: 32, MemoryMb: 262144, GPU: 0, }, + "Standard_E32as_v5": { + InstanceType: "Standard_E32as_v5", + VCPU: 32, + MemoryMb: 262144, + GPU: 0, + }, "Standard_E32d_v4": { InstanceType: "Standard_E32d_v4", VCPU: 32, MemoryMb: 262144, GPU: 0, }, + "Standard_E32d_v5": { + InstanceType: "Standard_E32d_v5", + VCPU: 32, + MemoryMb: 262144, + GPU: 0, + }, "Standard_E32ds_v4": { InstanceType: 
"Standard_E32ds_v4", VCPU: 32, MemoryMb: 262144, GPU: 0, }, + "Standard_E32ds_v5": { + InstanceType: "Standard_E32ds_v5", + VCPU: 32, + MemoryMb: 262144, + GPU: 0, + }, "Standard_E32s_v3": { InstanceType: "Standard_E32s_v3", VCPU: 32, @@ -1342,18 +2080,42 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 262144, GPU: 0, }, + "Standard_E32s_v5": { + InstanceType: "Standard_E32s_v5", + VCPU: 32, + MemoryMb: 262144, + GPU: 0, + }, + "Standard_E4-2ads_v5": { + InstanceType: "Standard_E4-2ads_v5", + VCPU: 4, + MemoryMb: 32768, + GPU: 0, + }, "Standard_E4-2as_v4": { InstanceType: "Standard_E4-2as_v4", VCPU: 4, MemoryMb: 32768, GPU: 0, }, + "Standard_E4-2as_v5": { + InstanceType: "Standard_E4-2as_v5", + VCPU: 4, + MemoryMb: 32768, + GPU: 0, + }, "Standard_E4-2ds_v4": { InstanceType: "Standard_E4-2ds_v4", VCPU: 4, MemoryMb: 32768, GPU: 0, }, + "Standard_E4-2ds_v5": { + InstanceType: "Standard_E4-2ds_v5", + VCPU: 4, + MemoryMb: 32768, + GPU: 0, + }, "Standard_E4-2s_v3": { InstanceType: "Standard_E4-2s_v3", VCPU: 4, @@ -1366,6 +2128,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 32768, GPU: 0, }, + "Standard_E4-2s_v5": { + InstanceType: "Standard_E4-2s_v5", + VCPU: 4, + MemoryMb: 32768, + GPU: 0, + }, "Standard_E48_v3": { InstanceType: "Standard_E48_v3", VCPU: 48, @@ -1378,30 +2146,60 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 393216, GPU: 0, }, + "Standard_E48_v5": { + InstanceType: "Standard_E48_v5", + VCPU: 48, + MemoryMb: 393216, + GPU: 0, + }, "Standard_E48a_v4": { InstanceType: "Standard_E48a_v4", VCPU: 48, MemoryMb: 393216, GPU: 0, }, + "Standard_E48ads_v5": { + InstanceType: "Standard_E48ads_v5", + VCPU: 48, + MemoryMb: 393216, + GPU: 0, + }, "Standard_E48as_v4": { InstanceType: "Standard_E48as_v4", VCPU: 48, MemoryMb: 393216, GPU: 0, }, + "Standard_E48as_v5": { + InstanceType: "Standard_E48as_v5", + VCPU: 48, + MemoryMb: 393216, + GPU: 0, + }, "Standard_E48d_v4": { InstanceType: "Standard_E48d_v4", VCPU: 48, MemoryMb: 393216, 
GPU: 0, }, + "Standard_E48d_v5": { + InstanceType: "Standard_E48d_v5", + VCPU: 48, + MemoryMb: 393216, + GPU: 0, + }, "Standard_E48ds_v4": { InstanceType: "Standard_E48ds_v4", VCPU: 48, MemoryMb: 393216, GPU: 0, }, + "Standard_E48ds_v5": { + InstanceType: "Standard_E48ds_v5", + VCPU: 48, + MemoryMb: 393216, + GPU: 0, + }, "Standard_E48s_v3": { InstanceType: "Standard_E48s_v3", VCPU: 48, @@ -1414,6 +2212,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 393216, GPU: 0, }, + "Standard_E48s_v5": { + InstanceType: "Standard_E48s_v5", + VCPU: 48, + MemoryMb: 393216, + GPU: 0, + }, "Standard_E4_v3": { InstanceType: "Standard_E4_v3", VCPU: 4, @@ -1426,30 +2230,60 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 32768, GPU: 0, }, + "Standard_E4_v5": { + InstanceType: "Standard_E4_v5", + VCPU: 4, + MemoryMb: 32768, + GPU: 0, + }, "Standard_E4a_v4": { InstanceType: "Standard_E4a_v4", VCPU: 4, MemoryMb: 32768, GPU: 0, }, + "Standard_E4ads_v5": { + InstanceType: "Standard_E4ads_v5", + VCPU: 4, + MemoryMb: 32768, + GPU: 0, + }, "Standard_E4as_v4": { InstanceType: "Standard_E4as_v4", VCPU: 4, MemoryMb: 32768, GPU: 0, }, + "Standard_E4as_v5": { + InstanceType: "Standard_E4as_v5", + VCPU: 4, + MemoryMb: 32768, + GPU: 0, + }, "Standard_E4d_v4": { InstanceType: "Standard_E4d_v4", VCPU: 4, MemoryMb: 32768, GPU: 0, }, + "Standard_E4d_v5": { + InstanceType: "Standard_E4d_v5", + VCPU: 4, + MemoryMb: 32768, + GPU: 0, + }, "Standard_E4ds_v4": { InstanceType: "Standard_E4ds_v4", VCPU: 4, MemoryMb: 32768, GPU: 0, }, + "Standard_E4ds_v5": { + InstanceType: "Standard_E4ds_v5", + VCPU: 4, + MemoryMb: 32768, + GPU: 0, + }, "Standard_E4s_v3": { InstanceType: "Standard_E4s_v3", VCPU: 4, @@ -1462,18 +2296,42 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 32768, GPU: 0, }, + "Standard_E4s_v5": { + InstanceType: "Standard_E4s_v5", + VCPU: 4, + MemoryMb: 32768, + GPU: 0, + }, + "Standard_E64-16ads_v5": { + InstanceType: "Standard_E64-16ads_v5", + VCPU: 64, + 
MemoryMb: 524288, + GPU: 0, + }, "Standard_E64-16as_v4": { InstanceType: "Standard_E64-16as_v4", VCPU: 64, MemoryMb: 524288, GPU: 0, }, + "Standard_E64-16as_v5": { + InstanceType: "Standard_E64-16as_v5", + VCPU: 64, + MemoryMb: 524288, + GPU: 0, + }, "Standard_E64-16ds_v4": { InstanceType: "Standard_E64-16ds_v4", VCPU: 64, MemoryMb: 516096, GPU: 0, }, + "Standard_E64-16ds_v5": { + InstanceType: "Standard_E64-16ds_v5", + VCPU: 64, + MemoryMb: 524288, + GPU: 0, + }, "Standard_E64-16s_v3": { InstanceType: "Standard_E64-16s_v3", VCPU: 64, @@ -1486,18 +2344,42 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 516096, GPU: 0, }, + "Standard_E64-16s_v5": { + InstanceType: "Standard_E64-16s_v5", + VCPU: 64, + MemoryMb: 524288, + GPU: 0, + }, + "Standard_E64-32ads_v5": { + InstanceType: "Standard_E64-32ads_v5", + VCPU: 64, + MemoryMb: 524288, + GPU: 0, + }, "Standard_E64-32as_v4": { InstanceType: "Standard_E64-32as_v4", VCPU: 64, MemoryMb: 524288, GPU: 0, }, + "Standard_E64-32as_v5": { + InstanceType: "Standard_E64-32as_v5", + VCPU: 64, + MemoryMb: 524288, + GPU: 0, + }, "Standard_E64-32ds_v4": { InstanceType: "Standard_E64-32ds_v4", VCPU: 64, MemoryMb: 516096, GPU: 0, }, + "Standard_E64-32ds_v5": { + InstanceType: "Standard_E64-32ds_v5", + VCPU: 64, + MemoryMb: 524288, + GPU: 0, + }, "Standard_E64-32s_v3": { InstanceType: "Standard_E64-32s_v3", VCPU: 64, @@ -1510,6 +2392,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 516096, GPU: 0, }, + "Standard_E64-32s_v5": { + InstanceType: "Standard_E64-32s_v5", + VCPU: 64, + MemoryMb: 524288, + GPU: 0, + }, "Standard_E64_v3": { InstanceType: "Standard_E64_v3", VCPU: 64, @@ -1522,30 +2410,60 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 516096, GPU: 0, }, + "Standard_E64_v5": { + InstanceType: "Standard_E64_v5", + VCPU: 64, + MemoryMb: 524288, + GPU: 0, + }, "Standard_E64a_v4": { InstanceType: "Standard_E64a_v4", VCPU: 64, MemoryMb: 524288, GPU: 0, }, + "Standard_E64ads_v5": { + InstanceType: 
"Standard_E64ads_v5", + VCPU: 64, + MemoryMb: 524288, + GPU: 0, + }, "Standard_E64as_v4": { InstanceType: "Standard_E64as_v4", VCPU: 64, MemoryMb: 524288, GPU: 0, }, + "Standard_E64as_v5": { + InstanceType: "Standard_E64as_v5", + VCPU: 64, + MemoryMb: 524288, + GPU: 0, + }, "Standard_E64d_v4": { InstanceType: "Standard_E64d_v4", VCPU: 64, MemoryMb: 516096, GPU: 0, }, + "Standard_E64d_v5": { + InstanceType: "Standard_E64d_v5", + VCPU: 64, + MemoryMb: 524288, + GPU: 0, + }, "Standard_E64ds_v4": { InstanceType: "Standard_E64ds_v4", VCPU: 64, MemoryMb: 516096, GPU: 0, }, + "Standard_E64ds_v5": { + InstanceType: "Standard_E64ds_v5", + VCPU: 64, + MemoryMb: 524288, + GPU: 0, + }, "Standard_E64i_v3": { InstanceType: "Standard_E64i_v3", VCPU: 64, @@ -1570,18 +2488,42 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 516096, GPU: 0, }, + "Standard_E64s_v5": { + InstanceType: "Standard_E64s_v5", + VCPU: 64, + MemoryMb: 524288, + GPU: 0, + }, + "Standard_E8-2ads_v5": { + InstanceType: "Standard_E8-2ads_v5", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, "Standard_E8-2as_v4": { InstanceType: "Standard_E8-2as_v4", VCPU: 8, MemoryMb: 65536, GPU: 0, }, + "Standard_E8-2as_v5": { + InstanceType: "Standard_E8-2as_v5", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, "Standard_E8-2ds_v4": { InstanceType: "Standard_E8-2ds_v4", VCPU: 8, MemoryMb: 65536, GPU: 0, }, + "Standard_E8-2ds_v5": { + InstanceType: "Standard_E8-2ds_v5", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, "Standard_E8-2s_v3": { InstanceType: "Standard_E8-2s_v3", VCPU: 8, @@ -1594,18 +2536,42 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 65536, GPU: 0, }, + "Standard_E8-2s_v5": { + InstanceType: "Standard_E8-2s_v5", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, + "Standard_E8-4ads_v5": { + InstanceType: "Standard_E8-4ads_v5", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, "Standard_E8-4as_v4": { InstanceType: "Standard_E8-4as_v4", VCPU: 8, MemoryMb: 65536, GPU: 0, }, + "Standard_E8-4as_v5": { + InstanceType: 
"Standard_E8-4as_v5", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, "Standard_E8-4ds_v4": { InstanceType: "Standard_E8-4ds_v4", VCPU: 8, MemoryMb: 65536, GPU: 0, }, + "Standard_E8-4ds_v5": { + InstanceType: "Standard_E8-4ds_v5", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, "Standard_E8-4s_v3": { InstanceType: "Standard_E8-4s_v3", VCPU: 8, @@ -1618,6 +2584,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 65536, GPU: 0, }, + "Standard_E8-4s_v5": { + InstanceType: "Standard_E8-4s_v5", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, "Standard_E80ids_v4": { InstanceType: "Standard_E80ids_v4", VCPU: 80, @@ -1642,30 +2614,60 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 65536, GPU: 0, }, + "Standard_E8_v5": { + InstanceType: "Standard_E8_v5", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, "Standard_E8a_v4": { InstanceType: "Standard_E8a_v4", VCPU: 8, MemoryMb: 65536, GPU: 0, }, + "Standard_E8ads_v5": { + InstanceType: "Standard_E8ads_v5", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, "Standard_E8as_v4": { InstanceType: "Standard_E8as_v4", VCPU: 8, MemoryMb: 65536, GPU: 0, }, + "Standard_E8as_v5": { + InstanceType: "Standard_E8as_v5", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, "Standard_E8d_v4": { InstanceType: "Standard_E8d_v4", VCPU: 8, MemoryMb: 65536, GPU: 0, }, + "Standard_E8d_v5": { + InstanceType: "Standard_E8d_v5", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, "Standard_E8ds_v4": { InstanceType: "Standard_E8ds_v4", VCPU: 8, MemoryMb: 65536, GPU: 0, }, + "Standard_E8ds_v5": { + InstanceType: "Standard_E8ds_v5", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, "Standard_E8s_v3": { InstanceType: "Standard_E8s_v3", VCPU: 8, @@ -1678,30 +2680,240 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 65536, GPU: 0, }, + "Standard_E8s_v5": { + InstanceType: "Standard_E8s_v5", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, + "Standard_E96-24ads_v5": { + InstanceType: "Standard_E96-24ads_v5", + VCPU: 96, + MemoryMb: 688128, + GPU: 0, + }, 
"Standard_E96-24as_v4": { InstanceType: "Standard_E96-24as_v4", VCPU: 96, MemoryMb: 688128, GPU: 0, }, + "Standard_E96-24as_v5": { + InstanceType: "Standard_E96-24as_v5", + VCPU: 96, + MemoryMb: 688128, + GPU: 0, + }, + "Standard_E96-24ds_v5": { + InstanceType: "Standard_E96-24ds_v5", + VCPU: 96, + MemoryMb: 688128, + GPU: 0, + }, + "Standard_E96-24s_v5": { + InstanceType: "Standard_E96-24s_v5", + VCPU: 96, + MemoryMb: 688128, + GPU: 0, + }, + "Standard_E96-48ads_v5": { + InstanceType: "Standard_E96-48ads_v5", + VCPU: 96, + MemoryMb: 688128, + GPU: 0, + }, "Standard_E96-48as_v4": { InstanceType: "Standard_E96-48as_v4", VCPU: 96, MemoryMb: 688128, GPU: 0, }, + "Standard_E96-48as_v5": { + InstanceType: "Standard_E96-48as_v5", + VCPU: 96, + MemoryMb: 688128, + GPU: 0, + }, + "Standard_E96-48ds_v5": { + InstanceType: "Standard_E96-48ds_v5", + VCPU: 96, + MemoryMb: 688128, + GPU: 0, + }, + "Standard_E96-48s_v5": { + InstanceType: "Standard_E96-48s_v5", + VCPU: 96, + MemoryMb: 688128, + GPU: 0, + }, + "Standard_E96_v5": { + InstanceType: "Standard_E96_v5", + VCPU: 96, + MemoryMb: 688128, + GPU: 0, + }, "Standard_E96a_v4": { InstanceType: "Standard_E96a_v4", VCPU: 96, MemoryMb: 688128, GPU: 0, }, + "Standard_E96ads_v5": { + InstanceType: "Standard_E96ads_v5", + VCPU: 96, + MemoryMb: 688128, + GPU: 0, + }, "Standard_E96as_v4": { InstanceType: "Standard_E96as_v4", VCPU: 96, MemoryMb: 688128, GPU: 0, }, + "Standard_E96as_v5": { + InstanceType: "Standard_E96as_v5", + VCPU: 96, + MemoryMb: 688128, + GPU: 0, + }, + "Standard_E96d_v5": { + InstanceType: "Standard_E96d_v5", + VCPU: 96, + MemoryMb: 688128, + GPU: 0, + }, + "Standard_E96ds_v5": { + InstanceType: "Standard_E96ds_v5", + VCPU: 96, + MemoryMb: 688128, + GPU: 0, + }, + "Standard_E96s_v5": { + InstanceType: "Standard_E96s_v5", + VCPU: 96, + MemoryMb: 688128, + GPU: 0, + }, + "Standard_EC16ads_v5": { + InstanceType: "Standard_EC16ads_v5", + VCPU: 16, + MemoryMb: 131072, + GPU: 0, + }, + "Standard_EC16as_v5": { + 
InstanceType: "Standard_EC16as_v5", + VCPU: 16, + MemoryMb: 131072, + GPU: 0, + }, + "Standard_EC20ads_v5": { + InstanceType: "Standard_EC20ads_v5", + VCPU: 20, + MemoryMb: 163840, + GPU: 0, + }, + "Standard_EC20as_v5": { + InstanceType: "Standard_EC20as_v5", + VCPU: 20, + MemoryMb: 163840, + GPU: 0, + }, + "Standard_EC2ads_v5": { + InstanceType: "Standard_EC2ads_v5", + VCPU: 2, + MemoryMb: 16384, + GPU: 0, + }, + "Standard_EC2as_v5": { + InstanceType: "Standard_EC2as_v5", + VCPU: 2, + MemoryMb: 16384, + GPU: 0, + }, + "Standard_EC32ads_v5": { + InstanceType: "Standard_EC32ads_v5", + VCPU: 32, + MemoryMb: 196608, + GPU: 0, + }, + "Standard_EC32as_v5": { + InstanceType: "Standard_EC32as_v5", + VCPU: 32, + MemoryMb: 196608, + GPU: 0, + }, + "Standard_EC48ads_v5": { + InstanceType: "Standard_EC48ads_v5", + VCPU: 48, + MemoryMb: 393216, + GPU: 0, + }, + "Standard_EC48as_v5": { + InstanceType: "Standard_EC48as_v5", + VCPU: 48, + MemoryMb: 393216, + GPU: 0, + }, + "Standard_EC4ads_v5": { + InstanceType: "Standard_EC4ads_v5", + VCPU: 4, + MemoryMb: 32768, + GPU: 0, + }, + "Standard_EC4as_v5": { + InstanceType: "Standard_EC4as_v5", + VCPU: 4, + MemoryMb: 32768, + GPU: 0, + }, + "Standard_EC64ads_v5": { + InstanceType: "Standard_EC64ads_v5", + VCPU: 64, + MemoryMb: 524288, + GPU: 0, + }, + "Standard_EC64as_v5": { + InstanceType: "Standard_EC64as_v5", + VCPU: 64, + MemoryMb: 524288, + GPU: 0, + }, + "Standard_EC8ads_v5": { + InstanceType: "Standard_EC8ads_v5", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, + "Standard_EC8as_v5": { + InstanceType: "Standard_EC8as_v5", + VCPU: 8, + MemoryMb: 65536, + GPU: 0, + }, + "Standard_EC96ads_v5": { + InstanceType: "Standard_EC96ads_v5", + VCPU: 96, + MemoryMb: 688128, + GPU: 0, + }, + "Standard_EC96as_v5": { + InstanceType: "Standard_EC96as_v5", + VCPU: 96, + MemoryMb: 688128, + GPU: 0, + }, + "Standard_EC96iads_v5": { + InstanceType: "Standard_EC96iads_v5", + VCPU: 96, + MemoryMb: 688128, + GPU: 0, + }, + "Standard_EC96ias_v5": { + 
InstanceType: "Standard_EC96ias_v5", + VCPU: 96, + MemoryMb: 688128, + GPU: 0, + }, "Standard_F1": { InstanceType: "Standard_F1", VCPU: 1, @@ -1810,6 +3022,36 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 16384, GPU: 0, }, + "Standard_FX12mds": { + InstanceType: "Standard_FX12mds", + VCPU: 12, + MemoryMb: 258048, + GPU: 0, + }, + "Standard_FX24mds": { + InstanceType: "Standard_FX24mds", + VCPU: 24, + MemoryMb: 516096, + GPU: 0, + }, + "Standard_FX36mds": { + InstanceType: "Standard_FX36mds", + VCPU: 36, + MemoryMb: 774144, + GPU: 0, + }, + "Standard_FX48mds": { + InstanceType: "Standard_FX48mds", + VCPU: 48, + MemoryMb: 1032192, + GPU: 0, + }, + "Standard_FX4mds": { + InstanceType: "Standard_FX4mds", + VCPU: 4, + MemoryMb: 86016, + GPU: 0, + }, "Standard_G1": { InstanceType: "Standard_G1", VCPU: 2, @@ -1966,10 +3208,34 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 114688, GPU: 0, }, + "Standard_HB120-16rs_v3": { + InstanceType: "Standard_HB120-16rs_v3", + VCPU: 120, + MemoryMb: 458752, + GPU: 0, + }, + "Standard_HB120-32rs_v3": { + InstanceType: "Standard_HB120-32rs_v3", + VCPU: 120, + MemoryMb: 458752, + GPU: 0, + }, + "Standard_HB120-64rs_v3": { + InstanceType: "Standard_HB120-64rs_v3", + VCPU: 120, + MemoryMb: 458752, + GPU: 0, + }, + "Standard_HB120-96rs_v3": { + InstanceType: "Standard_HB120-96rs_v3", + VCPU: 120, + MemoryMb: 458752, + GPU: 0, + }, "Standard_HB120rs_v2": { InstanceType: "Standard_HB120rs_v2", VCPU: 120, - MemoryMb: 479232, + MemoryMb: 466944, GPU: 0, }, "Standard_HB120rs_v3": { @@ -1981,13 +3247,13 @@ var InstanceTypes = map[string]*InstanceType{ "Standard_HB60rs": { InstanceType: "Standard_HB60rs", VCPU: 60, - MemoryMb: 228352, + MemoryMb: 233472, GPU: 0, }, "Standard_HC44rs": { InstanceType: "Standard_HC44rs", VCPU: 44, - MemoryMb: 334848, + MemoryMb: 360448, GPU: 0, }, "Standard_L16s": { @@ -2068,6 +3334,18 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 3891200, GPU: 0, }, + "Standard_M128dms_v2": { + 
InstanceType: "Standard_M128dms_v2", + VCPU: 128, + MemoryMb: 3985408, + GPU: 0, + }, + "Standard_M128ds_v2": { + InstanceType: "Standard_M128ds_v2", + VCPU: 128, + MemoryMb: 2097152, + GPU: 0, + }, "Standard_M128m": { InstanceType: "Standard_M128m", VCPU: 128, @@ -2080,12 +3358,24 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 3891200, GPU: 0, }, + "Standard_M128ms_v2": { + InstanceType: "Standard_M128ms_v2", + VCPU: 128, + MemoryMb: 3985408, + GPU: 0, + }, "Standard_M128s": { InstanceType: "Standard_M128s", VCPU: 128, MemoryMb: 2048000, GPU: 0, }, + "Standard_M128s_v2": { + InstanceType: "Standard_M128s_v2", + VCPU: 128, + MemoryMb: 2097152, + GPU: 0, + }, "Standard_M16-4ms": { InstanceType: "Standard_M16-4ms", VCPU: 16, @@ -2104,6 +3394,30 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 447488, GPU: 0, }, + "Standard_M192idms_v2": { + InstanceType: "Standard_M192idms_v2", + VCPU: 192, + MemoryMb: 4194304, + GPU: 0, + }, + "Standard_M192ids_v2": { + InstanceType: "Standard_M192ids_v2", + VCPU: 192, + MemoryMb: 2097152, + GPU: 0, + }, + "Standard_M192ims_v2": { + InstanceType: "Standard_M192ims_v2", + VCPU: 192, + MemoryMb: 4194304, + GPU: 0, + }, + "Standard_M192is_v2": { + InstanceType: "Standard_M192is_v2", + VCPU: 192, + MemoryMb: 2097152, + GPU: 0, + }, "Standard_M208ms_v2": { InstanceType: "Standard_M208ms_v2", VCPU: 208, @@ -2128,6 +3442,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 896000, GPU: 0, }, + "Standard_M32dms_v2": { + InstanceType: "Standard_M32dms_v2", + VCPU: 32, + MemoryMb: 896000, + GPU: 0, + }, "Standard_M32ls": { InstanceType: "Standard_M32ls", VCPU: 32, @@ -2140,6 +3460,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 896000, GPU: 0, }, + "Standard_M32ms_v2": { + InstanceType: "Standard_M32ms_v2", + VCPU: 32, + MemoryMb: 896000, + GPU: 0, + }, "Standard_M32ts": { InstanceType: "Standard_M32ts", VCPU: 32, @@ -2188,6 +3514,18 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 
1792000, GPU: 0, }, + "Standard_M64dms_v2": { + InstanceType: "Standard_M64dms_v2", + VCPU: 64, + MemoryMb: 1835008, + GPU: 0, + }, + "Standard_M64ds_v2": { + InstanceType: "Standard_M64ds_v2", + VCPU: 64, + MemoryMb: 1048576, + GPU: 0, + }, "Standard_M64ls": { InstanceType: "Standard_M64ls", VCPU: 64, @@ -2206,12 +3544,24 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 1792000, GPU: 0, }, + "Standard_M64ms_v2": { + InstanceType: "Standard_M64ms_v2", + VCPU: 64, + MemoryMb: 1835008, + GPU: 0, + }, "Standard_M64s": { InstanceType: "Standard_M64s", VCPU: 64, MemoryMb: 1024000, GPU: 0, }, + "Standard_M64s_v2": { + InstanceType: "Standard_M64s_v2", + VCPU: 64, + MemoryMb: 1048576, + GPU: 0, + }, "Standard_M8-2ms": { InstanceType: "Standard_M8-2ms", VCPU: 8, @@ -2257,7 +3607,7 @@ var InstanceTypes = map[string]*InstanceType{ "Standard_NC16as_T4_v3": { InstanceType: "Standard_NC16as_T4_v3", VCPU: 16, - MemoryMb: 114688, + MemoryMb: 112640, GPU: 1, }, "Standard_NC24": { @@ -2323,7 +3673,7 @@ var InstanceTypes = map[string]*InstanceType{ "Standard_NC64as_T4_v3": { InstanceType: "Standard_NC64as_T4_v3", VCPU: 64, - MemoryMb: 458752, + MemoryMb: 450560, GPU: 4, }, "Standard_NC6_Promo": { @@ -2344,6 +3694,12 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 114688, GPU: 1, }, + "Standard_NC8as_T4_v3": { + InstanceType: "Standard_NC8as_T4_v3", + VCPU: 8, + MemoryMb: 57344, + GPU: 1, + }, "Standard_ND12s": { InstanceType: "Standard_ND12s", VCPU: 12, @@ -2362,22 +3718,22 @@ var InstanceTypes = map[string]*InstanceType{ MemoryMb: 458752, GPU: 4, }, + "Standard_ND40rs_v2": { + InstanceType: "Standard_ND40rs_v2", + VCPU: 40, + MemoryMb: 688128, + GPU: 8, + }, "Standard_ND6s": { InstanceType: "Standard_ND6s", VCPU: 6, MemoryMb: 114688, GPU: 1, }, - "Standard_NC8as_T4_v3": { - InstanceType: "Standard_NC8as_T4_v3", - VCPU: 8, - MemoryMb: 57344, - GPU: 1, - }, - "Standard_ND40rs_v2": { - InstanceType: "Standard_ND40rs_v2", - VCPU: 40, - MemoryMb: 688128, + 
"Standard_ND96amsr_A100_v4": { + InstanceType: "Standard_ND96amsr_A100_v4", + VCPU: 96, + MemoryMb: 1970176, GPU: 8, }, "Standard_NV12": { diff --git a/cluster-autoscaler/cloudprovider/azure/azure_instance_types/gen.go b/cluster-autoscaler/cloudprovider/azure/azure_instance_types/gen.go index ca2495f8c4a9..567406262f2e 100644 --- a/cluster-autoscaler/cloudprovider/azure/azure_instance_types/gen.go +++ b/cluster-autoscaler/cloudprovider/azure/azure_instance_types/gen.go @@ -1,3 +1,4 @@ +//go:build ignore // +build ignore /* @@ -32,7 +33,7 @@ import ( ) var packageTemplate = template.Must(template.New("").Parse(`/* -Copyright 2018 The Kubernetes Authors. +Copyright The Kubernetes Authors. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. diff --git a/cluster-autoscaler/cloudprovider/azure/azure_scale_set.go b/cluster-autoscaler/cloudprovider/azure/azure_scale_set.go index e5d49bad31da..e64f230e23ed 100644 --- a/cluster-autoscaler/cloudprovider/azure/azure_scale_set.go +++ b/cluster-autoscaler/cloudprovider/azure/azure_scale_set.go @@ -340,7 +340,7 @@ func (scaleSet *ScaleSet) Belongs(node *apiv1.Node) (bool, error) { } // DeleteInstances deletes the given instances. All instances must be controlled by the same ASG. 
-func (scaleSet *ScaleSet) DeleteInstances(instances []*azureRef) error { +func (scaleSet *ScaleSet) DeleteInstances(instances []*azureRef, hasUnregisteredNodes bool) error { if len(instances) == 0 { return nil } @@ -405,9 +405,12 @@ func (scaleSet *ScaleSet) DeleteInstances(instances []*azureRef) error { // Proactively decrement scale set size so that we don't // go below minimum node count if cache data is stale - scaleSet.sizeMutex.Lock() - scaleSet.curSize -= int64(len(instanceIDs)) - scaleSet.sizeMutex.Unlock() + // only do it for non-unregistered nodes + if !hasUnregisteredNodes { + scaleSet.sizeMutex.Lock() + scaleSet.curSize -= int64(len(instanceIDs)) + scaleSet.sizeMutex.Unlock() + } // Proactively set the status of the instances to be deleted in cache for _, instance := range instancesToDelete { @@ -432,6 +435,7 @@ func (scaleSet *ScaleSet) DeleteNodes(nodes []*apiv1.Node) error { } refs := make([]*azureRef, 0, len(nodes)) + hasUnregisteredNodes := false for _, node := range nodes { belongs, err := scaleSet.Belongs(node) if err != nil { @@ -442,13 +446,16 @@ func (scaleSet *ScaleSet) DeleteNodes(nodes []*apiv1.Node) error { return fmt.Errorf("%s belongs to a different asg than %s", node.Name, scaleSet.Id()) } + if node.Annotations[cloudprovider.FakeNodeReasonAnnotation] == cloudprovider.FakeNodeUnregistered { + hasUnregisteredNodes = true + } ref := &azureRef{ Name: node.Spec.ProviderID, } refs = append(refs, ref) } - return scaleSet.DeleteInstances(refs) + return scaleSet.DeleteInstances(refs, hasUnregisteredNodes) } // Id returns ScaleSet id. 
diff --git a/cluster-autoscaler/cloudprovider/azure/azure_scale_set_test.go b/cluster-autoscaler/cloudprovider/azure/azure_scale_set_test.go index 53291a8d48a0..dcb6185a1974 100644 --- a/cluster-autoscaler/cloudprovider/azure/azure_scale_set_test.go +++ b/cluster-autoscaler/cloudprovider/azure/azure_scale_set_test.go @@ -18,6 +18,7 @@ package azure import ( "fmt" + metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" "net/http" "testing" "time" @@ -346,6 +347,82 @@ func TestDeleteNodes(t *testing.T) { assert.Equal(t, instance2.Status.State, cloudprovider.InstanceDeleting) } +func TestDeleteNodeUnregistered(t *testing.T) { + ctrl := gomock.NewController(t) + defer ctrl.Finish() + + manager := newTestAzureManager(t) + vmssName := "test-asg" + var vmssCapacity int64 = 2 + + expectedScaleSets := []compute.VirtualMachineScaleSet{ + { + Name: &vmssName, + Sku: &compute.Sku{ + Capacity: &vmssCapacity, + }, + }, + } + expectedVMSSVMs := newTestVMSSVMList(2) + + mockVMSSClient := mockvmssclient.NewMockInterface(ctrl) + mockVMSSClient.EXPECT().List(gomock.Any(), manager.config.ResourceGroup).Return(expectedScaleSets, nil).Times(2) + mockVMSSClient.EXPECT().DeleteInstancesAsync(gomock.Any(), manager.config.ResourceGroup, gomock.Any(), gomock.Any()).Return(nil, nil) + mockVMSSClient.EXPECT().WaitForAsyncOperationResult(gomock.Any(), gomock.Any()).Return(&http.Response{StatusCode: http.StatusOK}, nil).AnyTimes() + manager.azClient.virtualMachineScaleSetsClient = mockVMSSClient + mockVMSSVMClient := mockvmssvmclient.NewMockInterface(ctrl) + mockVMSSVMClient.EXPECT().List(gomock.Any(), manager.config.ResourceGroup, "test-asg", gomock.Any()).Return(expectedVMSSVMs, nil).AnyTimes() + manager.azClient.virtualMachineScaleSetVMsClient = mockVMSSVMClient + err := manager.forceRefresh() + assert.NoError(t, err) + + resourceLimiter := cloudprovider.NewResourceLimiter( + map[string]int64{cloudprovider.ResourceNameCores: 1, cloudprovider.ResourceNameMemory: 10000000}, + 
map[string]int64{cloudprovider.ResourceNameCores: 10, cloudprovider.ResourceNameMemory: 100000000}) + provider, err := BuildAzureCloudProvider(manager, resourceLimiter) + assert.NoError(t, err) + + registered := manager.RegisterNodeGroup( + newTestScaleSet(manager, "test-asg")) + manager.explicitlyConfigured["test-asg"] = true + assert.True(t, registered) + err = manager.forceRefresh() + assert.NoError(t, err) + + scaleSet, ok := provider.NodeGroups()[0].(*ScaleSet) + assert.True(t, ok) + + targetSize, err := scaleSet.TargetSize() + assert.NoError(t, err) + assert.Equal(t, 2, targetSize) + + // annotate node with unregistered annotation + annotations := make(map[string]string) + annotations[cloudprovider.FakeNodeReasonAnnotation] = cloudprovider.FakeNodeUnregistered + nodesToDelete := []*apiv1.Node{ + { + ObjectMeta: metav1.ObjectMeta{ + Annotations: annotations, + }, + Spec: apiv1.NodeSpec{ + ProviderID: "azure://" + fmt.Sprintf(fakeVirtualMachineScaleSetVMID, 0), + }, + }, + } + err = scaleSet.DeleteNodes(nodesToDelete) + assert.NoError(t, err) + + // Ensure the cached size has NOT been proactively decremented + targetSize, err = scaleSet.TargetSize() + assert.NoError(t, err) + assert.Equal(t, 2, targetSize) + + // Ensure that the status for the instances is Deleting + instance0, found := scaleSet.getInstanceByProviderID("azure://" + fmt.Sprintf(fakeVirtualMachineScaleSetVMID, 0)) + assert.True(t, found, true) + assert.Equal(t, instance0.Status.State, cloudprovider.InstanceDeleting) +} + func TestDeleteNoConflictRequest(t *testing.T) { ctrl := gomock.NewController(t) defer ctrl.Finish() diff --git a/cluster-autoscaler/cloudprovider/azure/azure_util_test.go b/cluster-autoscaler/cloudprovider/azure/azure_util_test.go index 24d3b6c48fa8..50538ddc2c57 100644 --- a/cluster-autoscaler/cloudprovider/azure/azure_util_test.go +++ b/cluster-autoscaler/cloudprovider/azure/azure_util_test.go @@ -36,12 +36,6 @@ import ( "k8s.io/legacy-cloud-providers/azure/retry" ) -const
( - testAccountName = "account" - storageAccountClientErrMsg = "Server failed to authenticate the request. Make sure the value of Authorization " + - "header is formed correctly including the signature" -) - func GetTestAzureUtil(t *testing.T) *AzUtil { return &AzUtil{manager: newTestAzureManager(t)} } @@ -305,26 +299,6 @@ func TestIsAzureRequestsThrottled(t *testing.T) { } } -func TestDeleteBlob(t *testing.T) { - ctrl := gomock.NewController(t) - defer ctrl.Finish() - - azUtil := GetTestAzureUtil(t) - mockSAClient := mockstorageaccountclient.NewMockInterface(ctrl) - mockSAClient.EXPECT().ListKeys( - gomock.Any(), - azUtil.manager.config.ResourceGroup, - testAccountName).Return(storage.AccountListKeysResult{ - Keys: &[]storage.AccountKey{ - {Value: to.StringPtr("dmFsdWUK")}, - }, - }, nil) - azUtil.manager.azClient.storageAccountsClient = mockSAClient - - err := azUtil.DeleteBlob(testAccountName, "vhd", "blob") - assert.True(t, strings.Contains(err.Error(), storageAccountClientErrMsg)) -} - func TestDeleteVirtualMachine(t *testing.T) { ctrl := gomock.NewController(t) defer ctrl.Finish() diff --git a/cluster-autoscaler/cloudprovider/cloud_provider.go b/cluster-autoscaler/cloudprovider/cloud_provider.go index 2828588aa3cb..e754105e292b 100644 --- a/cluster-autoscaler/cloudprovider/cloud_provider.go +++ b/cluster-autoscaler/cloudprovider/cloud_provider.go @@ -256,6 +256,16 @@ func (c InstanceErrorClass) String() string { } } +const ( + // FakeNodeReasonAnnotation is an annotation added to the fake placeholder nodes CA has created + // Note that these don't map to real nodes in k8s and are merely used for error handling + FakeNodeReasonAnnotation = "k8s.io/cluster-autoscaler/fake-node-reason" + // FakeNodeUnregistered represents a node that is identified by CA as unregistered + FakeNodeUnregistered = "unregistered" + // FakeNodeCreateError represents a node that is identified by CA as a created node with errors + FakeNodeCreateError = "create-error" +) + // PricingModel
contains information about the node price and how it changes in time. type PricingModel interface { // NodePrice returns a price of running the given node for a given period of time. diff --git a/cluster-autoscaler/cloudprovider/packet/README.md b/cluster-autoscaler/cloudprovider/packet/README.md index c82c19b28284..daf2fe073f60 100644 --- a/cluster-autoscaler/cloudprovider/packet/README.md +++ b/cluster-autoscaler/cloudprovider/packet/README.md @@ -79,6 +79,35 @@ affinity: - t1.small.x86 ``` +## CCM and Controller node labels + +### CCM +By default, autoscaler assumes that you have an older deprecated version of `packet-ccm` installed in your +cluster. If however, that is not the case and you've migrated to the new `cloud-provider-equinix-metal` CCM, +then this must be told to autoscaler. This can be done via setting an environment variable in the deployment: +``` +env: + - name: INSTALLED_CCM + value: cloud-provider-equinix-metal +``` +**NOTE**: As a prerequisite, ensure that all worker nodes in your cluster have the prefix `equinixmetal://` in +the Node spec `.spec.providerID`. If there are any existing worker nodes with prefix `packet://`, then drain +the node, remove the node and restart the kubelet on that worker node to re-register the node in the cluster, +this would ensure that `cloud-provider-equinix-metal` CCM sets the uuid with prefix `equinixmetal://` to the +field `.spec.ProviderID`. + +### Controller node labels + +Autoscaler assumes that control plane nodes in your cluster are identified by the label +`node-role.kubernetes.io/master`. If for some reason, this assumption is not true in your case, then set the +environment variable in the deployment: + +``` +env: + - name: PACKET_CONTROLLER_NODE_IDENTIFIER_LABEL + value: