
Not able to upgrade AKS cluster using terraform module - Minor version of node pool version 27 is bigger than control plane version #465

dunefro opened this issue Nov 1, 2023 · 19 comments

@dunefro

dunefro commented Nov 1, 2023

Is there an existing issue for this?

  • I have searched the existing issues

Terraform Version

1.6.2

Module Version

7.4.0

AzureRM Provider Version

3.69.0

Affected Resource(s)/Data Source(s)

module.aks.azurerm_kubernetes_cluster.main

Terraform Configuration Files

resource "azurerm_user_assigned_identity" "cluster" {
  location            = var.location
  name                = var.name
  resource_group_name = var.resource_group_name
}

resource "azurerm_role_assignment" "network_contributor_cluster" {
  scope                = var.vnet_id
  role_definition_name = "Network Contributor"
  principal_id         = azurerm_user_assigned_identity.cluster.principal_id
}

module "aks" {
  source                    = "Azure/aks/azurerm"
  version                   = "7.4.0"
  resource_group_name       = var.resource_group_name
  cluster_name              = var.name
  location                  = var.location
  prefix                    = "prefix"
  workload_identity_enabled = var.workload_identity_enabled

  # agent configuration
  # agents_availability_zones = []
  agents_count         = local.intial_node_pool_min_count
  agents_max_count     = local.intial_node_pool_max_count
  agents_min_count     = local.intial_node_pool_min_count
  agents_pool_name     = "initial"
  agents_size          = var.intial_node_pool_instance_type
  agents_tags          = local.tags
  orchestrator_version = var.kubernetes_version

  # autoscaler configuration
  auto_scaler_profile_enabled                          = true
  auto_scaler_profile_expander                         = "random"
  auto_scaler_profile_max_graceful_termination_sec     = "180"
  auto_scaler_profile_max_node_provisioning_time       = "5m"
  auto_scaler_profile_max_unready_nodes                = 0
  auto_scaler_profile_scale_down_delay_after_add       = "2m"
  auto_scaler_profile_scale_down_delay_after_delete    = "30s"
  auto_scaler_profile_scale_down_unneeded              = "1m"
  auto_scaler_profile_scale_down_unready               = "2m"
  auto_scaler_profile_scale_down_utilization_threshold = "0.3"

  # cluster level configurations
  api_server_authorized_ip_ranges            = var.allowed_ip_ranges
  create_role_assignment_network_contributor = false
  enable_auto_scaling                        = true
  enable_host_encryption                     = true
  identity_ids                               = [azurerm_user_assigned_identity.cluster.id]
  identity_type                              = "UserAssigned"
  kubernetes_version                         = var.kubernetes_version

  # storage
  storage_profile_blob_driver_enabled         = var.enable_blob_driver
  storage_profile_disk_driver_enabled         = var.enable_disk_driver
  storage_profile_disk_driver_version         = var.disk_driver_version
  storage_profile_file_driver_enabled         = var.enable_file_driver
  storage_profile_snapshot_controller_enabled = var.enable_snapshot_controller
  storage_profile_enabled                     = var.enable_storage_profile

  # network configuration
  network_plugin             = var.network_plugin
  vnet_subnet_id             = var.subnet_id
  net_profile_dns_service_ip = var.dns_ip
  net_profile_service_cidr   = var.service_cidr
  net_profile_pod_cidr       = var.pod_cidr
  # net_profile_docker_bridge_cidr = "10.244.0.10"

  node_pools = local.node_pools

  oidc_issuer_enabled = var.oidc_issuer_enabled
  os_disk_size_gb     = var.disk_size

  # makes the initial node pool have a taint `CriticalAddonsOnly=true:NoSchedule`
  # helpful in scheduling important workloads 
  only_critical_addons_enabled = true

  private_cluster_enabled = var.private_cluster_enabled

  # rbac 
  rbac_aad                          = false
  role_based_access_control_enabled = false

  tags = local.tags
}

tfvars variables values

################################################################################
# Cluster
################################################################################

variable "name" {
  description = "Name of the cluster"
  type        = string
}
variable "kubernetes_version" {
  description = "Version of the kubernetes engine"
  default     = "1.27"
  type        = string
}

variable "oidc_issuer_enabled" {
  description = "Enable OIDC for the cluster"
  default     = true
  type        = bool
}

variable "disk_size" {
  description = "Disk size of the initial node pool in GB"
  default     = "100"
  type        = string
}

variable "intial_node_pool_instance_type" {
  description = "Instance size of the initial node pool"
  default     = "Standard_D2s_v5"
  type        = string
}

variable "intial_node_pool_spot_instance_type" {
  description = "Instance size of the initial node pool"
  default     = "Standard_D4s_v5"
  type        = string
}

variable "workload_identity_enabled" {
  description = "Enable workload identity in the cluster"
  default     = true
  type        = bool
}

variable "control_plane" {
  description = "Whether the cluster is control plane"
  type        = bool

}

variable "enable_storage_profile" {
  description = "Enable storage profile for the cluster. If disabled `enable_blob_driver`, `enable_file_driver`, `enable_disk_driver` and `enable_snapshot_controller` will have no impact"
  type        = bool
  default     = true
}

variable "enable_blob_driver" {
  description = "Enable blob storage provider"
  type        = bool
  default     = true
}

variable "enable_file_driver" {
  description = "Enable file storage provider"
  type        = bool
  default     = true
}

variable "enable_disk_driver" {
  description = "Enable disk storage provider"
  type        = bool
  default     = true
}

variable "disk_driver_version" {
  description = "Version of disk driver. Supported values `v1` and `v2`"
  type        = string
  default     = "v1"
}
variable "enable_snapshot_controller" {
  description = "Enable snapshot controller"
  type        = bool
  default     = true
}
################################################################################
# Network
################################################################################

variable "vnet_id" {
  description = "Vnet ID for the cluster"
  type        = string
}

variable "subnet_id" {
  description = "Subnet Id for the cluster"
  type        = string
}

variable "network_plugin" {
  description = "Network plugin to use for cluster"
  type        = string
  default     = "kubenet"
}

variable "pod_cidr" {
  description = "CIDR of the pod in cluster"
  default     = "10.244.0.0/16"
  type        = string
}

variable "service_cidr" {
  description = "CIDR of the services in cluster"
  default     = "10.255.0.0/16"
  type        = string
}

variable "dns_ip" {
  description = "IP from service CIDR used for internal DNS"
  default     = "10.255.0.10"
  type        = string
}

variable "allowed_ip_ranges" {
  description = "Allowed IP ranges to connect to the cluster"
  default     = ["0.0.0.0/0"]
  type        = list(string)
}

variable "private_cluster_enabled" {
  description = "Private cluster"
  default     = false
  type        = bool
}

################################################################################
# Generic
################################################################################

variable "resource_group_name" {
  description = "Name of the resource group"
  type        = string
}

variable "location" {
  description = "Location of the resource group"
  type        = string
}

variable "tags" {
  description = "A map of tags to add to all resources"
  type        = map(string)
  default     = {}
}


# locals.tf
locals {
  tags = merge(
    {
      "terraform-module" = "terraform-azure-cluster"
      "terraform"        = "true"
      "cluster-name"     = var.name
    },
    var.tags
  )
  intial_node_pool_min_count = var.control_plane ? 2 : 1
  intial_node_pool_max_count = var.control_plane ? 3 : 2
  node_pools = {
    spot = {
      name            = "spotpool"
      node_count      = 1
      max_count       = 20
      min_count       = 1
      os_disk_size_gb = 100
      priority        = "Spot"
      vm_size         = var.intial_node_pool_spot_instance_type

      # mandatory to pass otherwise node pool will be recreated
      enable_auto_scaling     = true
      custom_ca_trust_enabled = false
      enable_host_encryption  = false
      enable_node_public_ip   = false
      eviction_policy         = "Delete"
      orchestrator_version    = var.kubernetes_version
      node_taints = [
        "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"
      ]
      tags           = local.tags
      zones          = []
      vnet_subnet_id = var.subnet_id
    }
  }
}

Debug Output/Panic Output

│ Error: updating Default Node Pool Agent Pool (Subscription: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx"
│ Resource Group Name: "tfy-az-prod"
│ Managed Cluster Name: "az-prod-eaus"
│ Agent Pool Name: "initial") performing CreateOrUpdate: agentpools.AgentPoolsClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: Code="NodePoolMcVersionIncompatible" Message="Node pool version 1.27.3 and control plane version 1.26.6 are incompatible. Minor version of node pool version 27 is bigger than control plane version 26. For more information, please check https://aka.ms/aks/UpgradeVersionRules"
│ 
│   with module.aks.azurerm_kubernetes_cluster.main,
│   on .terraform/modules/aks/main.tf line 17, in resource "azurerm_kubernetes_cluster" "main":
│   17: resource "azurerm_kubernetes_cluster" "main" {

Expected Behaviour

Ideally the control plane should be updated first and then the node pool. To resolve this issue I had to first update the control plane from the portal and then update the node pool from Terraform.

Actual Behaviour

There is no Terraform diff for updating the control plane, and the node pool upgrade causes a version incompatibility error.

Steps to Reproduce

No response

Important Factoids

No response

References

No response

dunefro added the bug label on Nov 1, 2023
@mkilchhofer
Contributor

mkilchhofer commented Jan 25, 2024

We at @swisspost have the exact same issue. I analyzed it with some of my colleagues (@hichem-belhocine will ping you, @zioproto, via MS Teams).

We have no auto-update mechanism on the AKS side in place (Auto Upgrade Type = Disabled); we specify the exact patch version for both:

  • control-plane (var.kubernetes_version)
  • nodepools (var.orchestrator_version)

With TF module version 6.8.0 and the upgrade from 1.25 -> 1.26 a plan looked like this:
(screenshot of the terraform plan)

Everything went smoothly. Then we upgraded to module version 7.5.0; the upgrade is now triggered via another TF provider (azapi) and the control plane version variable is ignored via a lifecycle { } block. Ref:

A plan to 1.27 now looks like this (w/ module version 7.5.0):
(screenshot of the terraform plan)

And of course this will fail, as you cannot update the node pool to a newer version than the control plane:

│ Error: updating Default Node Pool Agent Pool (Subscription: "<REDACTED>"
│ Resource Group Name: "rg-aks-dev-m98cn0001"
│ Managed Cluster Name: "aks-dev-m98cn0001"
│ Agent Pool Name: "system") performing CreateOrUpdate: agentpools.AgentPoolsClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: Code="NodePoolMcVersionIncompatible" Message="Node pool version 1.27.7 and control plane version 1.26.10 are incompatible. Minor version of node pool version 27 is bigger than control plane version 26. For more information, please check https://aka.ms/aks/UpgradeVersionRules"
│ 
│   with module.aks_post.module.aks.azurerm_kubernetes_cluster.main,
│   on .terraform/modules/aks_post.aks/main.tf line 17, in resource "azurerm_kubernetes_cluster" "main":
│   17: resource "azurerm_kubernetes_cluster" "main" {
│ 

@zioproto
Collaborator

@dunefro I understand you are using var.orchestrator_version = var.kubernetes_version

I see the problem: after PR #336, the cluster update is handled like this:

resource "azapi_update_resource" "aks_cluster_post_create" {
type = "Microsoft.ContainerService/managedClusters@2023-01-02-preview"
body = jsonencode({
properties = {
kubernetesVersion = var.kubernetes_version
}
})
resource_id = azurerm_kubernetes_cluster.main.id
lifecycle {
ignore_changes = all
replace_triggered_by = [null_resource.kubernetes_version_keeper.id]
}
}

so the Terraform plan will attempt to update just the node pools, because var.orchestrator_version is not ignored in the lifecycle block:

lifecycle {
  ignore_changes = [
    http_application_routing_enabled,
    http_proxy_config[0].no_proxy,
    kubernetes_version,
    public_network_access_enabled,
  ]

@dunefro to unblock your work you could set the following values:

kubernetes_version   = "1.27"
orchestrator_version = "1.26"

This will first update the control plane to the latest 1.27 patch version. Once the Terraform apply completes, you can update your values to:

kubernetes_version   = "1.27"
orchestrator_version = "1.27"

This will finish the upgrade by also updating the node pools.
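
For reference, a minimal sketch of the two-step change against the module call from this issue; only the two version inputs change, everything else stays as in the configuration above.

# Step 1: upgrade only the control plane; node pools stay on 1.26.
module "aks" {
  source              = "Azure/aks/azurerm"
  version             = "7.4.0"
  resource_group_name = var.resource_group_name
  prefix              = "prefix"
  # ... all other arguments as in the original configuration ...

  kubernetes_version   = "1.27"
  orchestrator_version = "1.26"

  # Step 2, after the first apply completes: bump the node pools as well.
  # kubernetes_version   = "1.27"
  # orchestrator_version = "1.27"
}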

@dunefro please confirm if this unblocks your work and you can upgrade successfully.

@lonegunmanb @nellyk for the long-term solution I see 2 options:

  1. Make it very visible in the module documentation that kubernetes_version and orchestrator_version are two distinct variables, because the control plane must be upgraded first and only then the node pools.

  2. Support the scenario where the user configures kubernetes_version = orchestrator_version during an upgrade. In this case we should also add var.orchestrator_version to the lifecycle ignore_changes, both in azurerm_kubernetes_cluster_node_pool (extra node pools) and in azurerm_kubernetes_cluster (system node pool), and handle those upgrades with a new azapi resource (a rough sketch follows below).

Please everyone give feedback on what is the best option for you.
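
A rough sketch of what option 2 could look like for the system node pool, mirroring the existing aks_cluster_post_create pattern. This is not the module's actual code: the keeper resource name and the agent-pool resource_id construction are illustrative assumptions.

# Hypothetical keeper, analogous to null_resource.kubernetes_version_keeper:
# changing var.orchestrator_version replaces it and re-triggers the update below.
resource "null_resource" "orchestrator_version_keeper" {
  triggers = {
    version = var.orchestrator_version
  }
}

resource "azapi_update_resource" "aks_system_pool_post_upgrade" {
  type        = "Microsoft.ContainerService/managedClusters/agentPools@2023-01-02-preview"
  resource_id = "${azurerm_kubernetes_cluster.main.id}/agentPools/${var.agents_pool_name}"
  body = jsonencode({
    properties = {
      orchestratorVersion = var.orchestrator_version
    }
  })

  lifecycle {
    ignore_changes       = all
    replace_triggered_by = [null_resource.orchestrator_version_keeper.id]
  }
}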

@zioproto
Collaborator

TODO:

test if the following code:

resource "azapi_update_resource" "aks_cluster_post_create" {
type = "Microsoft.ContainerService/managedClusters@2023-01-02-preview"
body = jsonencode({
properties = {
kubernetesVersion = var.kubernetes_version
}
})
resource_id = azurerm_kubernetes_cluster.main.id
lifecycle {
ignore_changes = all
replace_triggered_by = [null_resource.kubernetes_version_keeper.id]
}
}

will also upgrade the system node pool to var.kubernetes_version, regardless of the value of var.orchestrator_version.

@mkilchhofer
Contributor

Please everyone give feedback on what is the best option for you.

I am not a fan of the approach of upgrading everything with the azapi provider if we do hard pinning and disable autopatch.

The azapi approach has several caveats, the main one being its non-idempotent behavior.
If we roll out only the control plane upgrade, nothing seems to happen on the first run. The null_resource with the "version keeper" is updated, but it doesn't trigger the upgrade on that first run. Then, if you trigger another run without changing any code, the upgrade finally starts. IMHO non-idempotent behavior is a no-no in the Terraform world.
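
For context, the "version keeper" mentioned here is essentially a null_resource keyed on the control plane version, and the azapi_update_resource shown above is replace_triggered_by it. A simplified sketch of the wiring, not necessarily the module's exact code:

resource "null_resource" "kubernetes_version_keeper" {
  # Changing var.kubernetes_version replaces this resource, which in turn
  # replaces azapi_update_resource.aks_cluster_post_create and re-sends the
  # desired control plane version to the AKS API.
  triggers = {
    version = var.kubernetes_version
  }
}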

I was happy with the approach available in 6.8.

Maybe we need two things:

  • behavior of 6.8 for no autopatch
  • behavior of 7.5 in case autopatch is active

@zioproto
Collaborator

Maybe we need two things:

  • behavior of 6.8 for no autopatch
  • behavior of 7.5 in case autopatch is active

The challenge is that we cannot have a conditional lifecycle block that depends on the value of var.automatic_channel_upgrade.

We would have to create two independent azurerm_kubernetes_cluster resources and only create one of them conditionally. This code would be very hard to maintain, because every change would have to be duplicated across both azurerm_kubernetes_cluster resources.
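
To illustrate the duplication, a minimal standalone sketch (made-up names and values, not the module's code): Terraform does not allow variables inside lifecycle blocks, so supporting both behaviors means two near-identical resources selected with count.

variable "automatic_channel_upgrade" {
  type    = string
  default = null
}

# Variant used when auto-upgrade is enabled: the control plane version is
# ignored here and driven out-of-band (the azapi path of module >= 7.x).
resource "azurerm_kubernetes_cluster" "auto" {
  count               = var.automatic_channel_upgrade != null ? 1 : 0
  name                = "example-aks"
  location            = "westeurope"
  resource_group_name = "example-rg"
  dns_prefix          = "example"

  default_node_pool {
    name       = "system"
    vm_size    = "Standard_D2s_v5"
    node_count = 1
  }

  identity {
    type = "SystemAssigned"
  }

  lifecycle {
    ignore_changes = [kubernetes_version]
  }
}

# Variant used with hard pinning (the pre-7.x behavior): identical except for
# the lifecycle block, so every future change has to be made in both places.
resource "azurerm_kubernetes_cluster" "pinned" {
  count               = var.automatic_channel_upgrade == null ? 1 : 0
  name                = "example-aks"
  location            = "westeurope"
  resource_group_name = "example-rg"
  dns_prefix          = "example"

  default_node_pool {
    name       = "system"
    vm_size    = "Standard_D2s_v5"
    node_count = 1
  }

  identity {
    type = "SystemAssigned"
  }
}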

@zioproto
Collaborator

Next step:

  • @zioproto to provide a minimal repro example for the team.

@zioproto
Collaborator

The azapi approach has several caveats, the main one being its non-idempotent behavior. If we roll out only the control plane upgrade, nothing seems to happen on the first run. The null_resource with the "version keeper" is updated, but it doesn't trigger the upgrade on that first run. Then, if you trigger another run without changing any code, the upgrade finally starts. IMHO non-idempotent behavior is a no-no in the Terraform world.

@mkilchhofer I am not able to reproduce the non-idempotent behavior. I created PR #501, where I extend one of our current CI tests to run the upgrades. As soon as I change the value of var.kubernetes_version, the upgrade of the control plane is triggered. Could you please test? Or could you please provide the exact steps to reproduce the non-idempotent behavior? Thanks

@mkilchhofer
Contributor

Could you please test? Or could you please provide the exact steps to reproduce the non-idempotent behavior? Thanks

I can do some more testing, but I am on vacation for a week (⛷️🎿🏔️). I am back in the office on February 5th.

@dunefro
Author

dunefro commented Jan 30, 2024

please confirm if this unblocks your work and you can upgrade successfully.

Yes this works. Thanks @zioproto

@propyless

propyless commented Feb 21, 2024

@dunefro I understand you are using var.orchestrator_version = var.kubernetes_version […]

Please everyone give feedback on what is the best option for you.

Hello @zioproto. I think both ways of working make sense from a user perspective, since typically you upgrade the control plane before the workers.

From a usability point of view, I prefer being able to set both versions to the version I want and have the module handle the upgrade order, as it once worked in 6.8.0.

However, today there is no documentation stating that the module dropped support for a seamless upgrade of both the control plane and the worker node pools, which causes this confusion.

I think it makes sense to be able to upgrade some node pools to a newer version while others stay a version behind.

For those of us using the module, the added documentation and the intended upgrade procedure after the 6.8 to 7.5 change are enough for us to be satisfied.

Edit: For those of us who use PRs to peer review each other's changes, this means we need to make two changes: one for the control plane and a second for the node pools, which is a bit more of a hassle.

@zioproto
Collaborator

Could you please test? Or could you please provide the exact steps to reproduce the non-idempotent behavior? Thanks

I can do some more testing, but I am on vacation for a week (⛷️🎿🏔️). I am back in the office on February 5th.

@mkilchhofer friendly ping about this pending issue :) thanks

@fleetwoodstack

@mkilchhofer friendly ping about this issue :)

@mkilchhofer
Contributor

mkilchhofer commented May 30, 2024

@mkilchhofer friendly ping about this issue :)

I cannot test it anymore. We no longer use the module and manage the required TF resources ourselves.

/cc: @zioproto

@fleetwoodstack

Can anyone else test it?

@dunefro
Author

dunefro commented Jun 11, 2024

I was testing out the entire setup once again, with the current version being 1.28 and the new version being 1.29.
As mentioned by @propyless, I started out by updating the variable kubernetes_version to 1.29 first.

Somehow I am not able to detect any drift for upgrading the Kubernetes version. If I try to set orchestrator_version = 1.29, there is drift detection, but only for the node pools, not the control plane. This is my AKS Terraform file.

I have also tried upgrading to AKS module version 9.0.0 and azurerm version 3.107.0, with the same results.

One more weird thing I have observed in the drift is the node count of the AKS node pool. If I do not specify node_count but instead specify min_count and max_count for the node pool, then on every terraform apply, even if there is no manual change in Terraform on my side, Terraform detects a drift that would reduce the size of the node pool back to 0. This should not happen if I am using the cluster autoscaler.

It would be great if we could get some help in the official documentation on:

  • how to use the node pool
  • how to perform k8s upgrades

These should cover the scenarios where node pools are, and are not, managed by the cluster autoscaler.

@zioproto
Collaborator

  • how to use the node pool

Would this example, which we already test in the CI, help demonstrate how to use node pools?

https://github.com/Azure/terraform-azurerm-aks/tree/main/examples/multiple_node_pools

@zioproto
Collaborator

@lonegunmanb could you please double-check this with me? In the azurerm_kubernetes_cluster_node_pool resource we have an each.value.node_count value.

resource "azurerm_kubernetes_cluster_node_pool" "node_pool_create_before_destroy" {
for_each = local.node_pools_create_before_destroy
kubernetes_cluster_id = azurerm_kubernetes_cluster.main.id
name = "${each.value.name}${substr(md5(uuid()), 0, 4)}"
vm_size = each.value.vm_size
capacity_reservation_group_id = each.value.capacity_reservation_group_id
custom_ca_trust_enabled = each.value.custom_ca_trust_enabled
enable_auto_scaling = each.value.enable_auto_scaling
enable_host_encryption = each.value.enable_host_encryption
enable_node_public_ip = each.value.enable_node_public_ip
eviction_policy = each.value.eviction_policy
fips_enabled = each.value.fips_enabled
gpu_instance = each.value.gpu_instance
host_group_id = each.value.host_group_id
kubelet_disk_type = each.value.kubelet_disk_type
max_count = each.value.max_count
max_pods = each.value.max_pods
message_of_the_day = each.value.message_of_the_day
min_count = each.value.min_count
mode = each.value.mode
node_count = each.value.node_count
node_labels = each.value.node_labels

we mark node_count as optional:

variable "node_pools" {
type = map(object({
name = string
node_count = optional(number)
tags = optional(map(string))
vm_size = string

But in the example we have node_count = 1.

locals {
  nodes = {
    for i in range(3) : "worker${i}" => {
      name                  = substr("worker${i}${random_id.prefix.hex}", 0, 8)
      vm_size               = "Standard_D2s_v3"
      node_count            = 1
      vnet_subnet_id        = azurerm_subnet.test.id
      create_before_destroy = i % 2 == 0
    }
  }
}

When the cluster autoscaler is enabled, node_count should be null. Is it possible that we have an ambiguity here? I am afraid the value will be 0 instead of null unless it is explicitly set.

For the system node pool the variable is called agents_count and it must be explicitly set to null:

https://github.com/Azure/terraform-azurerm-aks/blob/2c364a6d206f6c546da70fb6b9bd0dd633a83764/variables.tf#L30-L34

@dunefro what happens if you set node_count to null in the node_pools object? Does it fix the state drift problem?
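
For reference, a minimal sketch of the spot pool from this issue's locals with node_count set to null so the cluster autoscaler owns the count; the remaining attributes are as in the original configuration.

locals {
  node_pools = {
    spot = {
      name                 = "spotpool"
      vm_size              = var.intial_node_pool_spot_instance_type
      enable_auto_scaling  = true
      min_count            = 1
      max_count            = 20
      node_count           = null # let the cluster autoscaler own the node count
      priority             = "Spot"
      eviction_policy      = "Delete"
      orchestrator_version = var.kubernetes_version
      vnet_subnet_id       = var.subnet_id
    }
  }
}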

@zioproto
Collaborator

@dunefro To try to reproduce your issue, I tested the examples/multiple_node_pools example in this repo with the following change:

 resource "azurerm_virtual_network" "test" {
   address_space       = ["10.52.0.0/16"]
-  location            = local.resource_group.location
+  location            = "eastus"
   name                = "${random_id.prefix.hex}-vn"
   resource_group_name = local.resource_group.name
 }
@@ -36,7 +36,10 @@ locals {
     for i in range(3) : "worker${i}" => {
       name                  = substr("worker${i}${random_id.prefix.hex}", 0, 8)
       vm_size               = "Standard_D2s_v3"
-      node_count            = 1
+      min_count             = 1
+      max_count             = 3
+      enable_auto_scaling   = true
       vnet_subnet_id        = azurerm_subnet.test.id
       create_before_destroy = i % 2 == 0
     }

But I don't get Terraform state drift, and I can run "terraform apply" over and over again without detecting any change.

Could you please provide a code example that shows the problem of Terraform trying to force the node count to 0? Thanks

@dunefro
Author

dunefro commented Jul 1, 2024

what happens if you set node_count to null in the node_pools object? Does it fix the state drift problem?

@zioproto Yes, this is solved when I set node_count to null; Terraform doesn't detect any drift.
