Cluster Autoscaler 1.31.0

@MaciekPytel MaciekPytel released this 28 Aug 16:04
· 43 commits to cluster-autoscaler-release-1.31 since this release
8bda275

Changelog

General

  • Cluster Autoscaler can now provision nodes before all pending pods are created and marked as unschedulable by the scheduler. This behavior is disabled by default and can be enabled with the --enable-proactive-scaleup flag. The new --pod-injection-limit flag allows fine-tuning this behavior. (#7145)
    • This functionality can significantly speed up node provisioning when hundreds or thousands of pods are created at the same time, and can lead to better scale-up decisions in those cases.
    • Injecting too many pods can make CA unstable, depending on the number of NodeGroups and the scalability of the particular cloud provider integration. --pod-injection-limit can help control this.
  • Added support for ProvisioningRequest v1 API. (#7195)
  • Users can now use the in-cluster Kubernetes configuration when self-hosting cluster-autoscaler as a pod within their cluster. (#7156)
  • Faster handling of failed scale ups, useful especially with multiple quota or stockout errors across the cluster. (#7087)
  • Bin packing will be cut short after exceeding "maxBinpackingDuration". The "maxBinpackingDuration" is set using a new flag, "--max-binpacking-time". This can prevent rare cases where CA becomes unresponsive in scenarios with a very large number of pending pods. (#6556)
  • Added a new least-nodes expander (#6792)
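
To illustrate the "--max-binpacking-time" item above, here is a minimal sketch of a packing loop that stops once a time budget is exhausted. It is not the Cluster Autoscaler implementation (which is written in Go). The function name, the first-fit heuristic, the single-number pod size, and the injectable clock are all assumptions made for illustration.

```python
import time
from typing import Callable, List, Tuple


def binpack_with_deadline(
    pod_sizes: List[int],
    node_capacity: int,
    max_duration_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[List[int]], List[int]]:
    """First-fit packing that is cut short once max_duration_s has elapsed.

    Hypothetical helper: assumes every pod fits on an empty node.
    Returns (nodes, unprocessed), where unprocessed holds the pods that
    were not examined before the deadline.
    """
    start = clock()
    nodes: List[List[int]] = []  # pods placed on each simulated node
    free: List[int] = []         # remaining capacity of each simulated node

    for i, size in enumerate(pod_sizes):
        # Check the time budget before handling each pod.
        if clock() - start > max_duration_s:
            return nodes, pod_sizes[i:]

        # First fit: reuse the first node with enough room.
        for n, room in enumerate(free):
            if size <= room:
                nodes[n].append(size)
                free[n] -= size
                break
        else:
            # No existing node fits, so simulate adding a new one.
            nodes.append([size])
            free.append(node_capacity - size)

    return nodes, []
```

With a generous budget, every pod is packed and the second return value is empty. With a tight budget, the caller receives a partial result plus the pods that were never examined, so it can act on the partial answer instead of blocking.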

AWS

  • Fix an issue in the Kubernetes Cluster Autoscaler where actual AWS instances could be incorrectly scaled down instead of placeholders. (#6911)
  • Fix an issue with reading taints on Managed Node Groups scaled to zero, which could cause scale-up of nodes with taints that pending pods don't tolerate (#6482)

Azure

  • ACTION REQUIRED: VMSS GPU Nodes must now also include the kubernetes.azure.com/accelerator label in addition to accelerator. (#7203)
  • From now on, users should refer to https://cloud-provider-azure.sigs.k8s.io/install/configs/ for the configuration interface (#6947)
  • Fixed an issue where environment variables were not being passed in when a config file exists (#6947)
  • Fixed an issue where some cloud provider configurations were not being validated when UseManagedIdentityExtension is set to true (#6947)
  • Renamed several fields in the config file; the old names are still accepted and take precedence: useWorkloadIdentityExtension to useFederatedWorkloadIdentityExtension, vmssCacheTTL to vmssCacheTTLInSeconds, vmssVmsCacheTTL to vmssVirtualMachinesCacheTTLInSeconds, enableVmssFlex to enableVmssFlexNodes (#6947)
  • Renamed several environment variables; the old names are still accepted and take precedence: ARM_USE_MANAGED_IDENTITY_EXTENSION to ARM_USE_FEDERATED_WORKLOAD_IDENTITY_EXTENSION, AZURE_VMSS_CACHE_TTL to AZURE_VMSS_CACHE_TTL_IN_SECONDS, AZURE_VMSS_VMS_CACHE_TTL to AZURE_VMSS_VMS_CACHE_TTL_IN_SECONDS, AZURE_ENABLE_VMSS_FLEX to AZURE_ENABLE_VMSS_FLEX_NODES (#6947)
  • Fix some cases where the instance cache is outdated but not getting refreshed (#7116)
  • Support cloud provider AAD certificate authentication (#7003)
  • The getVMSS API is now called when using spot instances, to obtain more up-to-date information (#6470)
  • The AZURE_CLUSTER_AUTOSCALER_USER_AGENT_SUFFIX variable can be used to customize the user agent for the Azure provider of cluster-autoscaler. Setting this to -my-user-agent results in a user agent like Go/go1.22.5 (amd64-linux) go-autorest/v14.2.1 cluster-autoscaler-my-user-agent/v1.31.0-alpha.2. (#7033)
  • You can now optionally specify a default min and max size for Azure VMSSs through the auto discovery tags. Explicit min and max tags on VMSSs will still be given priority over the default. (#6863)
  • Skips Azure-specific node labels that might mistakenly categorize nodegroups as different when, in reality, they are similar. (#6634)
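
The config-file renames listed above, with old names still accepted and taking precedence, can be pictured with the following sketch. It is not the Azure provider's code. The helper name normalize_config is hypothetical, and only the field names themselves come from the notes above.

```python
from typing import Any, Dict, Mapping

# Old field name -> new field name, as listed in the release notes.
RENAMED_FIELDS = {
    "useWorkloadIdentityExtension": "useFederatedWorkloadIdentityExtension",
    "vmssCacheTTL": "vmssCacheTTLInSeconds",
    "vmssVmsCacheTTL": "vmssVirtualMachinesCacheTTLInSeconds",
    "enableVmssFlex": "enableVmssFlexNodes",
}


def normalize_config(raw: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy of raw that uses only the new field names.

    Hypothetical helper: if both the old and the new name are present,
    the value under the old name wins, mirroring the stated precedence.
    """
    result = dict(raw)
    for old, new in RENAMED_FIELDS.items():
        if old in result:
            # The old name overrides any value given under the new name.
            result[new] = result.pop(old)
    return result
```

For example, a config that sets both vmssCacheTTL and vmssCacheTTLInSeconds ends up with only vmssCacheTTLInSeconds, carrying the value that was given under the old name.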

Cluster API

  • Added configurable autoscaling options to the clusterapi provider, allowing users to configure options such as --scale-down-unneeded-time at the per-node-group level. (#6743)

GCE

  • The GCE cloud provider will use the Instance.List API to list MIG instances. The IGM.ListManagedInstances API will be used as a fallback mechanism and for listing instances for MIGs that have instances in creating or deleting states. This should improve performance in clusters with a large number of NodeGroups. (#6955)

Hetzner

  • Fixed exhausted node groups not backing off for Hetzner Provider (#6750)

Images

  • registry.k8s.io/autoscaling/cluster-autoscaler:v1.31.0
  • registry.k8s.io/autoscaling/cluster-autoscaler-arm64:v1.31.0
  • registry.k8s.io/autoscaling/cluster-autoscaler-amd64:v1.31.0
  • registry.k8s.io/autoscaling/cluster-autoscaler-s390x:v1.31.0

Full Changelog: cluster-autoscaler-1.30.0...cluster-autoscaler-1.31.0