azurerm_kubernetes_cluster with multiple agent pools keeps getting re-created #4560
Comments
I have a theory: the state file shows the pools in alphabetical order by name. So in my template that was experiencing the same issue, I swapped the two pools so they were alphabetical, and the subsequent plan showed no changes. The example provided by mikkoc above aligns with this theory.
That was exactly the issue: I changed the order to alphabetical and the problem disappeared. Thanks @timio73 !!
No problem. I would still consider this a bug, as the pools should be indexed by name and not depend on their order in the template.
Also encountered this, thanks for figuring out the problem!
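To make the workaround concrete, here is a minimal sketch with static blocks, reusing the pool names and sizes from the plan output below; everything else is illustrative and other required cluster arguments are omitted. The blocks are declared in alphabetical order by name, matching the order they are stored in the state file:

```hcl
# Workaround sketch: keep agent_pool_profile blocks ordered alphabetically by name,
# matching the order they appear in the state file.
resource "azurerm_kubernetes_cluster" "cluster" {
  # ... other required arguments omitted for brevity ...

  agent_pool_profile {
    name            = "default" # "default" sorts before "nodepool"
    count           = 1
    vm_size         = "Standard_B2s"
    os_type         = "Linux"
    os_disk_size_gb = 30
  }

  agent_pool_profile {
    name            = "nodepool"
    count           = 2
    vm_size         = "Standard_F2"
    os_type         = "Linux"
    os_disk_size_gb = 50
  }
}
```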
It appears something similarly problematic occurs when you use a dynamic "agent_pool_profile" block:

```hcl
dynamic "agent_pool_profile" {
  for_each = var.agent_pools
  content {
    count           = agent_pool_profile.value.count
    name            = agent_pool_profile.value.name
    vm_size         = agent_pool_profile.value.vm_size # az vm list-sizes --location centralus
    os_type         = agent_pool_profile.value.os_type
    os_disk_size_gb = agent_pool_profile.value.os_disk_size_gb
  }
}
```

If you add a new agent_pool_profile to the list, the plan wants to replace the whole cluster:

```hcl
  ~ agent_pool_profile {
      - availability_zones  = [] -> null
        count               = 1
      + dns_prefix          = (known after apply)
      - enable_auto_scaling = false -> null
      - max_count           = 0 -> null
      ~ max_pods            = 110 -> (known after apply)
      - min_count           = 0 -> null
        name                = "default"
      - node_taints         = [] -> null
        os_disk_size_gb     = 30
        os_type             = "Linux"
        type                = "AvailabilitySet"
        vm_size             = "Standard_B2s"
    }

  ~ agent_pool_profile {
      - availability_zones  = [] -> null
        count               = 2
      + dns_prefix          = (known after apply)
      - enable_auto_scaling = false -> null
      - max_count           = 0 -> null
      ~ max_pods            = 110 -> (known after apply)
      - min_count           = 0 -> null
        name                = "nodepool"
      - node_taints         = [] -> null
        os_disk_size_gb     = 50
        os_type             = "Linux"
        type                = "AvailabilitySet"
        vm_size             = "Standard_F2"
    }

  + agent_pool_profile {
      + count           = 2
      + dns_prefix      = (known after apply)
      + fqdn            = (known after apply)
      + max_pods        = (known after apply)
      + name            = "nodepool2" # forces replacement
      + os_disk_size_gb = 50 # forces replacement
      + os_type         = "Linux" # forces replacement
      + type            = "AvailabilitySet" # forces replacement
      + vm_size         = "Standard_F2" # forces replacement
    }
```

EDIT: I now realize my specific issue was already reported in #3971
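If the input has to stay a list, a rough, untested sketch of one way to make the dynamic block above order-independent is to sort the list by pool name before handing it to `for_each` (the `sorted_agent_pools` local is hypothetical, not something the provider requires, and it assumes pool names are unique):

```hcl
# Hypothetical helper: rebuild var.agent_pools sorted alphabetically by name, so the
# generated agent_pool_profile blocks always match the ordering in the state file.
locals {
  sorted_agent_pools = [
    for name in sort(var.agent_pools[*].name) :
    [for pool in var.agent_pools : pool if pool.name == name][0]
  ]
}

# Then iterate over the sorted copy instead of the raw variable:
#   dynamic "agent_pool_profile" {
#     for_each = local.sorted_agent_pools
#     ...
#   }
```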
With this change, if you add a new "agent_pool_profile" block, it successfully creates the new "agent_pool_profile" without forcing a new cluster. It's important to note that by removing the `ForceNew` elements from the "agent_pool_profile" schema, a new behavior is introduced. Given a configuration that looks like this:

```hcl
variable "agent_pools" {
  type = list(object({
    count           = number,
    name            = string,
    os_disk_size_gb = number,
    os_type         = string,
    vm_size         = string,
  }))
  default = [
    {
      count           = 1,
      name            = "nodepool1",
      os_disk_size_gb = 30,
      os_type         = "Linux",
      vm_size         = "Standard_B2s",
    },
    {
      count           = 1,
      name            = "nodepool2",
      os_disk_size_gb = 30,
      os_type         = "Linux",
      vm_size         = "Standard_F2",
    },
  ]
}

resource "azurerm_kubernetes_cluster" "cluster" {
  // Other required fields skipped for brevity

  dynamic "agent_pool_profile" {
    for_each = var.agent_pools
    content {
      count           = agent_pool_profile.value.count
      name            = agent_pool_profile.value.name
      vm_size         = agent_pool_profile.value.vm_size # az vm list-sizes --location centralus
      os_type         = agent_pool_profile.value.os_type
      os_disk_size_gb = agent_pool_profile.value.os_disk_size_gb
    }
  }
}
```

The `terraform apply` is successful, but if you swap your list around so that `nodepool2` comes before `nodepool1`:

```hcl
variable "agent_pools" {
  type = list(object({
    count           = number,
    name            = string,
    os_disk_size_gb = number,
    os_type         = string,
    vm_size         = string,
  }))
  default = [
    {
      count           = 1,
      name            = "nodepool2",
      os_disk_size_gb = 30,
      os_type         = "Linux",
      vm_size         = "Standard_F2",
    },
    {
      count           = 1,
      name            = "nodepool1",
      os_disk_size_gb = 30,
      os_type         = "Linux",
      vm_size         = "Standard_B2s",
    },
  ]
}
```

subsequent `terraform plan`s work as expected, but you can't `terraform apply`, because the request sent to Azure Resource Manager tries to update each agent pool profile to a name that already exists, and agent pool profile names *must* be unique:

```hcl
  ~ agent_pool_profile {
        availability_zones  = []
        count               = 1
        enable_auto_scaling = false
        fqdn                = "dev-data-koreacentral-519d3aa3.hcp.koreacentral.azmk8s.io"
        max_count           = 0
        max_pods            = 110
        min_count           = 0
      ~ name                = "nodepool1" -> "nodepool2"
        node_taints         = []
        os_disk_size_gb     = 30
        os_type             = "Linux"
        type                = "AvailabilitySet"
        vm_size             = "Standard_B2s"
    }

  ~ agent_pool_profile {
        availability_zones  = []
        count               = 1
        enable_auto_scaling = false
        fqdn                = "dev-data-koreacentral-519d3aa3.hcp.koreacentral.azmk8s.io"
        max_count           = 0
        max_pods            = 110
        min_count           = 0
      ~ name                = "nodepool2" -> "nodepool1"
        node_taints         = []
        os_disk_size_gb     = 30
        os_type             = "Linux"
        type                = "AvailabilitySet"
        vm_size             = "Standard_B2s"
    }
```

For my use case, it's far more practical to deal with this behavior than it is to create a new cluster every time a new agent pool profile is added to my config. That said, if someone knows an obvious fix, I'm happy to implement it. Thank you for your review 🙏
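For reference, a hedged sketch of one way to avoid the rename-on-reorder problem described above: key the pools by name in a map instead of a list, since Terraform iterates map keys in lexical order regardless of how the entries are written. The `agent_pools_by_name` variable is hypothetical; values are borrowed from the example above:

```hcl
# Hypothetical alternative: a map keyed by pool name. Terraform iterates map keys in
# lexical order, so reordering entries in the source never changes the plan.
variable "agent_pools_by_name" {
  type = map(object({
    count           = number
    os_disk_size_gb = number
    os_type         = string
    vm_size         = string
  }))
  default = {
    nodepool1 = { count = 1, os_disk_size_gb = 30, os_type = "Linux", vm_size = "Standard_B2s" }
    nodepool2 = { count = 1, os_disk_size_gb = 30, os_type = "Linux", vm_size = "Standard_F2" }
  }
}

resource "azurerm_kubernetes_cluster" "cluster" {
  // Other required fields skipped for brevity

  dynamic "agent_pool_profile" {
    for_each = var.agent_pools_by_name
    content {
      name            = agent_pool_profile.key
      count           = agent_pool_profile.value.count
      vm_size         = agent_pool_profile.value.vm_size
      os_type         = agent_pool_profile.value.os_type
      os_disk_size_gb = agent_pool_profile.value.os_disk_size_gb
    }
  }
}
```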
This has been released in version 1.37.0 of the provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. As an example:

```hcl
provider "azurerm" {
  version = "~> 1.37.0"
}

# ... other configuration ...
```
Terraform (and AzureRM Provider) Version
Terraform 0.12.6
AzureRM provider 1.35.0
Affected Resource(s)
azurerm_kubernetes_cluster
Terraform Configuration Files
Debug Output
The first Terraform Apply is fine, the cluster is created, no issues whatsoever.
Panic Output
On a second Terraform run, without ANY code changes, Terraform wants to replace the whole cluster because it thinks some agent_pool_profile blocks have changed, which is false. It seems like the ordering of the agent_pool_profile blocks is messed up in the state file or something.
Expected Behavior
Terraform should not detect any changes.
Actual Behavior
Terraform detects changes in the agent_pools and attempts to re-create the AKS cluster.
Steps to Reproduce
terraform apply