
Kubernetes cluster must be replaced because of private_cluster_public_fqdn_enabled #13099

Closed
IndependerGerard opened this issue Aug 23, 2021 · 8 comments · Fixed by #13413

IndependerGerard commented Aug 23, 2021

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform (and AzureRM Provider) Version

Terraform - v1.0.5
azurerm - v2.73.0

Affected Resource(s)

  • azurerm_kubernetes_cluster

Terraform Configuration Files

terraform {
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
      version = "2.73.0"
    }
  }
}

provider "azurerm" {
  features {}
}

locals {
  aks_tags_dev = {
    Owner             = "__owner__"
    CreatedBy         = "__createdby__"
    EnvironmentType   = "dev"
    CostCenter        = "4600-9000"
    Website           = "__website__"
  }
}

resource "azurerm_kubernetes_cluster" "aks-web-dev-001" {
  name                                  = "aks-web-dev-001"
  location                              = "__location__"
  resource_group_name                   = "rg-web-dev-001"
  dns_prefix                            = "aks-web-dev-001-dns"
  kubernetes_version                    = "1.19.11"
  node_resource_group                   = "rg-web-dev-nodes-001"

  tags = local.aks_tags_dev

  identity {
    type             = "SystemAssigned"
  }

  default_node_pool {
    name                          = "sysb4ms001"
    vm_size                       = "Standard_B4ms"
    node_count                    = 1
    max_pods                      = 60
    only_critical_addons_enabled  = true
    orchestrator_version          = "1.19.11"
    vnet_subnet_id                = "__aksSubnetId__"

    tags = local.aks_tags_dev
  }

  linux_profile {
    admin_username = "azureuser"

    ssh_key {
      key_data = "__aksSshKey__"
    }
  }

  network_profile {
    dns_service_ip        = "10.201.0.10"
    docker_bridge_cidr    = "172.201.0.1/16"
    network_plugin        = "azure"
    network_policy        = "calico"
    service_cidr          = "10.201.0.0/16"
  }

  role_based_access_control {
    enabled = true
  }
}

Debug Output

The debug output contains secrets, so I would prefer a private channel for sharing it if necessary.

Expected Behaviour

The Costcenter tag gets replaced by the CostCenter tag in-place, without causing downtime on the cluster.

Output should be something like:

Terraform will perform the following actions:

  # azurerm_kubernetes_cluster.aks-web-dev-001 will be updated in-place
  ~ resource "azurerm_kubernetes_cluster" "aks-web-dev-001" {
    ...
      ~ tags                       = {
          + "CostCenter"      = "4600-9000"
          - "Costcenter"      = "4600-9000" -> null
            # (3 unchanged elements hidden)
        }
    }

Actual Behaviour

The whole cluster gets replaced.

Actual output is something like:

Terraform will perform the following actions:

  # azurerm_kubernetes_cluster.aks-web-dev-001 must be replaced
-/+ resource "azurerm_kubernetes_cluster" "aks-web-dev-001" {
      + private_cluster_public_fqdn_enabled = false # forces replacement
    }

The same configuration works as expected on azurerm 2.72.0.

Steps to Reproduce

  1. terraform plan

apeschel commented Sep 8, 2021

Here's the source of the problem: the schema gives private_cluster_public_fqdn_enabled a default value on a ForceNew field.

I think this could be fixed by simply removing the default value here?
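
For context, here is a minimal sketch in Go of how such a schema entry looks with the Terraform Plugin SDK. This is illustrative, not the provider's actual source (the function name is hypothetical): the point is that a Default combined with ForceNew means state written by 2.72.0, which lacks the attribute, diffs against the default on the first plan after upgrading, and that diff forces replacement.

package containers

import "github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"

// Illustrative sketch: a boolean attribute with both Default and ForceNew set,
// mirroring how private_cluster_public_fqdn_enabled behaves in azurerm 2.73.0.
func privateClusterPublicFqdnEnabledSchema() *schema.Schema {
	return &schema.Schema{
		Type:     schema.TypeBool,
		Optional: true,
		Default:  false, // injects a planned value even when the user sets nothing
		ForceNew: true,  // any diff on this attribute destroys and recreates the cluster
	}
}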

apeschel added a commit to apeschel/terraform-provider-azurerm that referenced this issue Sep 8, 2021
hashicorp#13099

This default value causes clusters to be rebuilt on existing
deployments. Simply removing the default value should be sufficient for
preventing the rebuilds from occurring.

If someone needs to explicitly set this to false for some reason, they
can still do that manually. This is a preferable situation to existing
users not being able to upgrade without a complete cluster rebuild.
hieumoscow added a commit to hieumoscow/terraform-provider-azurerm that referenced this issue Sep 20, 2021
Fix hashicorp#13099, to do in place update for `private_cluster_public_fqdn_enabled`
LaurentLesle (Contributor) commented

@apeschel I think the default must be kept at false, but I agree that ForceNew must be set to false.
That will match the az cli behavior:
az aks update -g $rg -n $CLUSTER --disable-public-fqdn

We have found that all private AKS clusters created with provider versions up to 2.72.0 have this feature enabled in the background, so upgrading to azurerm 2.73+ causes the cluster to be recreated.

At least if ForceNew is set to false, the change will only trigger an in-place update of the existing cluster, setting this value to false as per the default.

The provider documentation says "This requires that the Preview Feature Microsoft.ContainerService/EnablePrivateClusterPublicFQDN is enabled and the Resource Provider is re-registered". In our case the preview feature was not enabled, and we still observed the behavior described in this issue.
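
As a sketch of what is being proposed here (reusing the hypothetical helper and import from the earlier sketch; the actual patch in #13413 may differ), the default stays while ForceNew goes away:

// Illustrative sketch: Default kept at false to match the az cli behavior,
// ForceNew removed so a change becomes an in-place update.
func privateClusterPublicFqdnEnabledSchema() *schema.Schema {
	return &schema.Schema{
		Type:     schema.TypeBool,
		Optional: true,
		Default:  false,
		// ForceNew deliberately omitted: changes are handled by the
		// resource's Update function instead of replacing the cluster.
	}
}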

arnaudlh commented

Please fix this, as it is impacting our customers in APAC!

apeschel commented Sep 21, 2021

@LaurentLesle I had assumed ForceNew was set to true, because if the value is toggled from false to true, or vice versa, a cluster rebuild would be required. I might be mistaken though.

If this option does require a cluster rebuild, though, then the right solution would be to keep ForceNew = true and simply remove the default value, or to use some other method.

LaurentLesle (Contributor) commented

@apeschel I took the az cli as the base reference to confirm that this is an in-place update of the AKS cluster, so ForceNew should be set to false:
az aks update -g $rg -n $CLUSTER --disable-public-fqdn

apeschel commented

> @apeschel I took the az cli as the base reference to confirm that this is an in-place update of the AKS cluster, so ForceNew should be set to false:
> az aks update -g $rg -n $CLUSTER --disable-public-fqdn

Yes, your logic here makes sense, but you're not addressing the much more likely scenario: what if you are toggling the value from false to true, or from true to false?

ForceNew indicates that any change in this field requires the resource to be destroyed and recreated.

katbyte pushed a commit that referenced this issue Sep 23, 2021
…o longer force new (#13413)

Fix #13099, to do in place update for private_cluster_public_fqdn_enabled
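
For readers following along, here is a minimal sketch of what handling the field in the update path can look like, with hypothetical helper names rather than the code actually merged in #13413 (assuming the same schema import as the earlier sketches):

// Illustrative sketch: with ForceNew removed, the resource's Update function
// detects the change and applies it through the Azure API.
func resourceKubernetesClusterUpdate(d *schema.ResourceData, meta interface{}) error {
	if d.HasChange("private_cluster_public_fqdn_enabled") {
		enabled := d.Get("private_cluster_public_fqdn_enabled").(bool)
		if err := updateAPIServerAccessProfile(d.Id(), enabled); err != nil {
			return err
		}
	}
	return nil
}

// updateAPIServerAccessProfile is a hypothetical stand-in for the provider's
// call that patches the managed cluster's apiServerAccessProfile in Azure.
func updateAPIServerAccessProfile(clusterID string, enabled bool) error {
	// ... call the ContainerService API here ...
	return nil
}
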
IndependerGerard (Author) commented

I can confirm that the issue on our side is fixed with azurerm - v2.78.0. Thanks to everyone involved in fixing this issue.

github-actions bot commented

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators Oct 25, 2021