Commit

Azure: Remove AKS vmType
Signed-off-by: Jack Francis <[email protected]>
jackfrancis committed Oct 10, 2023
1 parent e7bf3ec commit 5117ffb
Showing 10 changed files with 11 additions and 985 deletions.
63 changes: 4 additions & 59 deletions cluster-autoscaler/cloudprovider/azure/README.md
@@ -81,8 +81,7 @@ k8s.io_cluster-autoscaler_node-template_autoscaling-options_scaledownunreadytime
Cluster autoscaler supports four Kubernetes cluster options on Azure:

- [**vmss**](#vmss-deployment): Autoscale VMSS instances by setting the Azure cloud provider's `vmType` parameter to `vmss` or to an empty string. This supports clusters deployed with [aks-engine][].
- [**standard**](#standard-deployment): Autoscale VMAS instances by setting the Azure cloud provider's `vmType` parameter to `standard`. This supports clusters deployed with [aks-engine][].
- [**aks**](#aks-deployment): Supports an Azure Kubernetes Service ([AKS][]) cluster.
- [**standard**](#standard-deployment): Autoscale VMAS (Virtual Machine Availability Set) VMs by setting the Azure cloud provider's `vmType` parameter to `standard`. This supports clusters deployed with [aks-engine][].

> **_NOTE_**: only the `vmss` option supports scaling down to zero nodes.
@@ -250,74 +249,21 @@ To run a cluster autoscaler pod with Azure managed service identity (MSI), use [

> **_WARNING_**: Cluster autoscaler depends on user-provided deployment parameters to provision new nodes. After upgrading your Kubernetes cluster, cluster autoscaler must also be redeployed with new parameters to prevent provisioning nodes with an old version.

### AKS deployment
## AKS Autoscaler

#### AKS + VMSS

Autoscaling VM scale sets with AKS is supported for Kubernetes v1.12.4 and later. The option to enable cluster autoscaler is available in the [Azure Portal][] or with the [Azure CLI][]:
Node Pool Autoscaling is a first class feature of your AKS cluster. The option to enable cluster autoscaler is available in the [Azure Portal][] or with the [Azure CLI][]:

```sh
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--kubernetes-version 1.13.5 \
--kubernetes-version 1.25.11 \
--node-count 1 \
--enable-vmss \
--enable-cluster-autoscaler \
--min-count 1 \
--max-count 3
```

#### AKS + Availability Set

The CLI based deployment only support VMSS and manual deployment is needed if availability set is used.

Prerequisites:

- Get Azure credentials from the [**Permissions**](#permissions) step above.
- Get the cluster name with the `az aks list` command.
- Get the name of a node pool from the value of the label **agentpool**

```sh
kubectl get nodes --show-labels
```

Make a copy of [cluster-autoscaler-aks.yaml](examples/cluster-autoscaler-aks.yaml). Fill in the placeholder values for
the `cluster-autoscaler-azure` secret data by base64-encoding each of your Azure credential fields.

- ClientID: `<base64-encoded-client-id>`
- ClientSecret: `<base64-encoded-client-secret>`
- ResourceGroup: `<base64-encoded-resource-group>` (Note: ResourceGroup is case-sensitive)
- SubscriptionID: `<base64-encoded-subscription-id>`
- TenantID: `<base64-encoded-tenant-id>`
- ClusterName: `<base64-encoded-clustername>`
- NodeResourceGroup: `<base64-encoded-node-resource-group>` (Note: node resource group is not resource group and can be obtained in the corresponding label of the nodepool)

> **_NOTE_**: Use a command such as `echo $CLIENT_ID | base64` to encode each of the fields above.

In the `cluster-autoscaler` spec, find the `image:` field and replace `{{ ca_version }}` with a specific cluster autoscaler release.

Below that, in the `command:` section, update the `--nodes=` arguments to reference your node limits and node pool name. For example, if node pool "k8s-nodepool-1" should scale from 1 to 10 nodes:

```yaml
- --nodes=1:10:k8s-nodepool-1
```

or to autoscale multiple VM scale sets:

```yaml
- --nodes=1:10:k8s-nodepool-1
- --nodes=1:10:k8s-nodepool-2
```

Then deploy cluster-autoscaler by running

```sh
kubectl create -f cluster-autoscaler-aks.yaml
```

To deploy in AKS with `Helm 3`, please refer to [helm installation tutorial][].

Please see the [AKS autoscaler documentation][] for details.

## Rate limit and back-off retries
@@ -339,7 +285,6 @@ The new version of [Azure client][] supports rate limit and back-off retries whe

> **_NOTE_**: * These rate limit configs can be set per-client. Customizing `QPS` and `Bucket` through environment variables per client is not supported.

[AKS]: https://docs.microsoft.com/azure/aks/
[AKS autoscaler documentation]: https://docs.microsoft.com/azure/aks/autoscaler
[aks-engine]: https://github.com/Azure/aks-engine
[Azure CLI]: https://docs.microsoft.com/cli/azure/install-azure-cli
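The `--nodes=` arguments in the removed AKS + Availability Set instructions pack a minimum size, a maximum size, and a node pool name into one colon-separated value (for example `1:10:k8s-nodepool-1`). As a rough illustration only, the Go sketch below parses a value of that shape. `nodeGroupSpec` and `parseNodesFlag` are hypothetical names invented for this example and are not taken from the cluster-autoscaler code; the range checks are likewise an assumption about what a reasonable parser would reject.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nodeGroupSpec is a hypothetical holder for one "<min>:<max>:<name>" value.
type nodeGroupSpec struct {
	Min  int
	Max  int
	Name string
}

// parseNodesFlag splits a "<min>:<max>:<name>" value into its three parts.
// Anything after the second colon is kept as the name.
func parseNodesFlag(value string) (nodeGroupSpec, error) {
	parts := strings.SplitN(value, ":", 3)
	if len(parts) != 3 {
		return nodeGroupSpec{}, fmt.Errorf("expected <min>:<max>:<name>, got %q", value)
	}
	minSize, err := strconv.Atoi(parts[0])
	if err != nil {
		return nodeGroupSpec{}, fmt.Errorf("invalid min size %q: %w", parts[0], err)
	}
	maxSize, err := strconv.Atoi(parts[1])
	if err != nil {
		return nodeGroupSpec{}, fmt.Errorf("invalid max size %q: %w", parts[1], err)
	}
	// Assumed sanity checks: sizes are non-negative and max is not below min.
	if minSize < 0 || maxSize < minSize {
		return nodeGroupSpec{}, fmt.Errorf("invalid size range %d:%d", minSize, maxSize)
	}
	if parts[2] == "" {
		return nodeGroupSpec{}, fmt.Errorf("node pool name is empty in %q", value)
	}
	return nodeGroupSpec{Min: minSize, Max: maxSize, Name: parts[2]}, nil
}

func main() {
	spec, err := parseNodesFlag("1:10:k8s-nodepool-1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("pool %s scales between %d and %d nodes\n", spec.Name, spec.Min, spec.Max)
}
```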
4 changes: 2 additions & 2 deletions cluster-autoscaler/cloudprovider/azure/azure_cache.go
@@ -39,7 +39,7 @@ var (
// azureCache is used for caching cluster resources state.
//
// It is needed to:
// - keep track of node groups (AKS, VM and VMSS types) in the cluster,
// - keep track of node groups (VM and VMSS types) in the cluster,
// - keep track of instances and which node group they belong to,
// - limit repetitive Azure API calls.
type azureCache struct {
@@ -174,7 +174,7 @@ func (m *azureCache) fetchAzureResources() error {
} else {
return err
}
case vmTypeStandard, vmTypeAKS:
case vmTypeStandard:
// List all VMs in the RG.
vmResult, err := m.fetchVirtualMachines()
if err == nil {
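After this change, the `switch` in `fetchAzureResources` no longer has an AKS case. The following is a minimal sketch of that dispatch, not the actual implementation: the constant values are assumed from the `vmType` strings mentioned in the README, and `resourceKind`, along with its error branch for unrecognized types, is hypothetical.

```go
package main

import "fmt"

// Assumed values, mirroring the vmType strings named in the README.
const (
	vmTypeVMSS     = "vmss"
	vmTypeStandard = "standard"
)

// resourceKind is a hypothetical helper that reports which kind of Azure
// resource would be listed for a given vmType once the AKS case is gone.
func resourceKind(vmType string) (string, error) {
	switch vmType {
	case vmTypeVMSS:
		return "virtual machine scale sets", nil
	case vmTypeStandard:
		return "virtual machines", nil
	default:
		// Assumption for this sketch: unknown types, including the
		// removed "aks" value, are reported as errors.
		return "", fmt.Errorf("unsupported vmType %q", vmType)
	}
}

func main() {
	for _, t := range []string{"vmss", "standard", "aks"} {
		kind, err := resourceKind(t)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> list %s\n", t, kind)
	}
}
```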
7 changes: 0 additions & 7 deletions cluster-autoscaler/cloudprovider/azure/azure_client.go
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,6 @@ import (

klog "k8s.io/klog/v2"

"sigs.k8s.io/cloud-provider-azure/pkg/azureclients/containerserviceclient"
"sigs.k8s.io/cloud-provider-azure/pkg/azureclients/diskclient"
"sigs.k8s.io/cloud-provider-azure/pkg/azureclients/interfaceclient"
"sigs.k8s.io/cloud-provider-azure/pkg/azureclients/storageaccountclient"
@@ -151,7 +150,6 @@ type azClient struct {
interfacesClient interfaceclient.Interface
disksClient diskclient.Interface
storageAccountsClient storageaccountclient.Interface
managedKubernetesServicesClient containerserviceclient.Interface
skuClient compute.ResourceSkusClient
}

@@ -274,10 +272,6 @@ func newAzClient(cfg *Config, env *azure.Environment) (*azClient, error) {
disksClient := diskclient.New(diskClientConfig)
klog.V(5).Infof("Created disks client with authorizer: %v", disksClient)

aksClientConfig := azClientConfig.WithRateLimiter(cfg.KubernetesServiceRateLimit)
kubernetesServicesClient := containerserviceclient.New(aksClientConfig)
klog.V(5).Infof("Created kubernetes services client with authorizer: %v", kubernetesServicesClient)

// Reference on why selecting ResourceManagerEndpoint as baseURI -
// https://github.com/Azure/go-autorest/blob/main/autorest/azure/environments.go
skuClient := compute.NewResourceSkusClientWithBaseURI(azClientConfig.ResourceManagerEndpoint, cfg.SubscriptionID)
@@ -292,7 +286,6 @@ func newAzClient(cfg *Config, env *azure.Environment) (*azClient, error) {
deploymentsClient: deploymentsClient,
virtualMachinesClient: virtualMachinesClient,
storageAccountsClient: storageAccountsClient,
managedKubernetesServicesClient: kubernetesServicesClient,
skuClient: skuClient,
}, nil
}
16 changes: 0 additions & 16 deletions cluster-autoscaler/cloudprovider/azure/azure_config.go
@@ -111,11 +111,6 @@ type Config struct {
Deployment string `json:"deployment" yaml:"deployment"`
DeploymentParameters map[string]interface{} `json:"deploymentParameters" yaml:"deploymentParameters"`

//Configs only for AKS
ClusterName string `json:"clusterName" yaml:"clusterName"`
//Config only for AKS
NodeResourceGroup string `json:"nodeResourceGroup" yaml:"nodeResourceGroup"`

// VMSS metadata cache TTL in seconds, only applies for vmss type
VmssCacheTTL int64 `json:"vmssCacheTTL" yaml:"vmssCacheTTL"`

@@ -174,8 +169,6 @@ func BuildAzureConfig(configReader io.Reader) (*Config, error) {
cfg.AADClientCertPath = os.Getenv("ARM_CLIENT_CERT_PATH")
cfg.AADClientCertPassword = os.Getenv("ARM_CLIENT_CERT_PASSWORD")
cfg.Deployment = os.Getenv("ARM_DEPLOYMENT")
cfg.ClusterName = os.Getenv("AZURE_CLUSTER_NAME")
cfg.NodeResourceGroup = os.Getenv("AZURE_NODE_RESOURCE_GROUP")

subscriptionID, err := getSubscriptionIdFromInstanceMetadata()
if err != nil {
@@ -474,8 +467,6 @@ func (cfg *Config) TrimSpace() {
cfg.AADClientCertPath = strings.TrimSpace(cfg.AADClientCertPath)
cfg.AADClientCertPassword = strings.TrimSpace(cfg.AADClientCertPassword)
cfg.Deployment = strings.TrimSpace(cfg.Deployment)
cfg.ClusterName = strings.TrimSpace(cfg.ClusterName)
cfg.NodeResourceGroup = strings.TrimSpace(cfg.NodeResourceGroup)
}

func (cfg *Config) validate() error {
@@ -493,13 +484,6 @@ func (cfg *Config) validate() error {
}
}

if cfg.VMType == vmTypeAKS {
// Cluster name is a mandatory param to proceed.
if cfg.ClusterName == "" {
return fmt.Errorf("cluster name not set for type %+v", cfg.VMType)
}
}

if cfg.SubscriptionID == "" {
return fmt.Errorf("subscription ID not set")
}
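Taken together, the `azure_config.go` hunks mean the cluster name and node resource group are no longer read from the environment, trimmed, or validated. A minimal sketch of the remaining flow, under stated assumptions, might look like the following. `sketchConfig` and `buildFromEnv` are hypothetical stand-ins for the real `Config` and `BuildAzureConfig`, the subscription ID is passed in directly rather than fetched from instance metadata, and the environment lookup is injected so the example runs without real credentials.

```go
package main

import (
	"fmt"
	"strings"
)

// sketchConfig is a hypothetical, trimmed-down stand-in for Config.
// It deliberately has no ClusterName or NodeResourceGroup field.
type sketchConfig struct {
	SubscriptionID string
	Deployment     string
}

// buildFromEnv fills the sketch config from a lookup function
// (os.Getenv in real use), trims whitespace, and validates the result.
func buildFromEnv(getenv func(string) string, subscriptionID string) (*sketchConfig, error) {
	cfg := &sketchConfig{
		SubscriptionID: strings.TrimSpace(subscriptionID),
		Deployment:     strings.TrimSpace(getenv("ARM_DEPLOYMENT")),
	}
	// Same check and message as the validate() hunk shown above.
	if cfg.SubscriptionID == "" {
		return nil, fmt.Errorf("subscription ID not set")
	}
	return cfg, nil
}

func main() {
	env := map[string]string{
		"ARM_DEPLOYMENT": " my-deployment ",
		// Still set in the environment, but nothing reads it any more.
		"AZURE_CLUSTER_NAME": "my-aks-cluster",
	}
	getenv := func(key string) string { return env[key] }

	cfg, err := buildFromEnv(getenv, "sub-123")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("deployment=%q subscription=%q\n", cfg.Deployment, cfg.SubscriptionID)
}
```

Injecting the lookup function instead of calling `os.Getenv` directly is a choice made for this sketch so that it can be exercised with a plain map.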
