google_container_cluster with google_container_node_pool update causes 400 badRequest #3035

Open
jkamenik opened this issue Feb 12, 2019 · 3 comments

jkamenik commented Feb 12, 2019

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
  • If an issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to "hashibot", a community member has claimed the issue already.

Terraform Version

terraform -v
Terraform v0.11.8
+ provider.google-beta v1.20.0

Your version of Terraform is out of date! The latest version
is 0.11.11. You can update by downloading from www.terraform.io/downloads.html

Affected Resource(s)

  • google_container_cluster
  • google_container_node_pool

Terraform Configuration Files

# The main cluster
resource "google_container_cluster" "main" {
  provider = "google-beta"
  name = "${var.name}"
  description = "K8s cluster ${var.name}"
  zone = "${var.zone}"
  initial_node_count = 1

  min_master_version = "${local.kubernetes_version}"
  node_version = "${local.kubernetes_version}"

  remove_default_node_pool = true

  master_authorized_networks_config = "${var.master_authorized_networks_config}"

  enable_legacy_abac = false

  addons_config {
    horizontal_pod_autoscaling {
      disabled = false
    }

    kubernetes_dashboard {
      disabled = true
    }

    network_policy_config {
      disabled = true
    }
  }

  # Note: we intentionally don't enable master_auth, as it is less secure than
  # using IAM
  master_auth {
    username = ""
    password = ""
    client_certificate_config {
      issue_client_certificate = false
    }
  }

  maintenance_policy {
    daily_maintenance_window {
      start_time = "${var.maintenance_window}"
    }
  }

  resource_labels {
    chargeline = "${lower(var.chargeline_label)}"
    owner = "${lower(var.owner_label)}"
  }

  timeouts {
    create = "${var.create_timeout}"
    update = "${var.update_timeout}"
    delete = "${var.delete_timeout}"
  }
}

resource "google_container_node_pool" "main_pool" {
  provider = "google-beta"
  name = "${join("-",list(var.name,"main"))}"
  cluster = "${google_container_cluster.main.name}"
  zone = "${var.zone}"
  initial_node_count = 1

  # If enable_auto_upgrade is true, don't supply a version;
  # otherwise pin to the main cluster version.
  version = "${var.enable_auto_upgrade ? "" : local.kubernetes_version}"

  autoscaling {
    min_node_count = 1
    max_node_count = "${var.main_pool_max_node_count}"
  }

  management {
    auto_repair = "${var.enable_auto_repair}"
    auto_upgrade = "${var.enable_auto_upgrade}"
  }

  node_config {
    disk_size_gb = "${var.node_disk_size}"
    image_type = "${var.main_pool_image_type}"
    machine_type = "${var.main_pool_machine_type}"
    preemptible = "${var.main_pool_preemptible}"
  }

  depends_on = ["google_container_cluster.main"]
}

Debug Output

data.google_container_engine_versions.region: Refreshing state...
google_container_cluster.main: Refreshing state... (ID: johnk)
google_container_node_pool.main_pool: Refreshing state... (ID: us-central1-a/johnk/johnk-main)
module.cluster.google_container_node_pool.main_pool: Destroying... (ID: us-central1-a/johnk/johnk-main)
module.cluster.google_container_cluster.main: Modifying... (ID: johnk)
  min_master_version: "1.11.5-gke.5" => "1.11.6-gke.6"
  node_version:       "1.11.5-gke.5" => "1.11.6-gke.6"
...
module.cluster.google_container_cluster.main: Still modifying... (ID: johnk, 14m19s elapsed)
module.cluster.google_container_cluster.main: Still modifying... (ID: johnk, 14m29s elapsed)

Error: Error applying plan:

1 error(s) occurred:

* module.cluster.google_container_cluster.main: 1 error(s) occurred:

* google_container_cluster.main: googleapi: Error 400: Node_pool_id must be specified., badRequest

Full logs: https://gist.github.com/jkamenik/4fdeff4cb4341358f172910a1cfff3fd

Panic Output

N/A

Expected Behavior

Update the node-pool before updating the main cluster.

Actual Behavior

Both the node pool and the cluster are updated at the same time; as soon as the node pool is deleted, the cluster update fails with a 400 error.

Steps to Reproduce

  1. terraform apply
  2. Update the K8s version used
  3. Make the node-pool nodes preemptible (see the sketch after this list)
  4. terraform apply
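
For concreteness, a minimal sketch of the two edits between the applies (steps 2 and 3), assuming the module call is named cluster as in the debug output, and that kubernetes_version and main_pool_preemptible are hypothetical module inputs feeding local.kubernetes_version and var.main_pool_preemptible (the real input names are not shown above):

# Hypothetical module inputs; only the two changed arguments are shown.
module "cluster" {
  source                = "./modules/gke"  # illustrative path
  kubernetes_version    = "1.11.6-gke.6"   # was "1.11.5-gke.5": triggers the master/node upgrade
  main_pool_preemptible = true             # was false: forces the node pool to be destroyed and recreated
}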

Important Factoids

  • If the cluster has a second node-pool then it doesn't fail (see the sketch below)
  • This might be specific to K8s upgrades at the same time as node-pool destruction.
  • Applying each update individually works (order doesn't matter)
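
As an illustration of the second node-pool case, something along these lines (0.11-style syntax matching the config above; the pool name and sizing are arbitrary) keeps the cluster from ever being without a node pool while main_pool is replaced:

# Illustrative only: a small extra pool so the cluster always has at least one
# node pool while "main_pool" is destroyed and recreated.
resource "google_container_node_pool" "fallback_pool" {
  provider           = "google-beta"
  name               = "${join("-", list(var.name, "fallback"))}"
  cluster            = "${google_container_cluster.main.name}"
  zone               = "${var.zone}"
  initial_node_count = 1

  node_config {
    machine_type = "${var.main_pool_machine_type}"
    preemptible  = true
  }
}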

References

b/299442591

@ghost ghost added the bug label Feb 12, 2019
jkamenik (Author) commented

I was able to work around this issue via

terraform apply -target "google_container_cluster.main" && \
terraform apply

modular-magician added a commit to modular-magician/terraform-provider-google that referenced this issue Jan 29, 2020
modular-magician added a commit that referenced this issue Jan 29, 2020
danawillow pushed a commit that referenced this issue Jan 29, 2020
rileykarson (Collaborator) commented

I think this is due to the underlying API. Some operations are impossible on node-less clusters, which is the state you end up in when your final node pool is deleted. Unfortunately, there isn't anything the provider can do to mitigate this behaviour; apply ordering is chosen by Terraform Core.
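
One user-side pattern that may avoid the node-less window (a sketch only, not verified against this exact case, and assuming the provider release in use supports name_prefix on google_container_node_pool) is to let Terraform build the replacement pool before destroying the old one:

# Sketch (autoscaling/management blocks omitted): name_prefix gives the
# replacement pool a fresh name, which create_before_destroy requires, so
# Terraform creates the new pool before tearing down the old one and the
# cluster never has zero node pools.
resource "google_container_node_pool" "main_pool" {
  provider           = "google-beta"
  name_prefix        = "${var.name}-main-"
  cluster            = "${google_container_cluster.main.name}"
  zone               = "${var.zone}"
  initial_node_count = 1

  node_config {
    preemptible = "${var.main_pool_preemptible}"
  }

  lifecycle {
    create_before_destroy = true
  }
}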

Edwinhr716 commented
I wasn't able to reproduce this using Terraform 1.6.5. Here's what I attempted:

Code tested:


resource "google_container_cluster" "gke_cluster_2" {
    project = "project-1"
    provider = "google-beta"
    name = "test-cluster-5"
    location = "us-central1"

    min_master_version = "1.11.5-gke.5"
    initial_node_count = 1
    remove_default_node_pool = true

    //default is true, need to disable if the version is less than 1.13.0
    enable_shielded_nodes = false
}


resource "google_container_node_pool" "main_pool_2" {
    
    project = "project-1"
    provider = "google-beta"
    cluster = "test-cluster-5"
    location = "us-central1"
    initial_node_count = 1
    name = "test-node-pool-2"

    version = "1.11.5-gke.5"

    node_config {
        preemptible = false
    }

    depends_on = [ google_container_cluster.gke_cluster_2 ]

}

Steps followed:

  1. terraform apply to create a new cluster and nodepool
  2. Changed min_master_version to 1.11.6-gke.6 on the cluster and version to 1.11.6-gke.6 on the node pool
  3. Changed preemptible to true (only the changed lines are repeated after this list)
  4. terraform apply
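
For clarity, the changed lines relative to the config above were (everything else left untouched):

# Only the edited arguments are shown.
resource "google_container_cluster" "gke_cluster_2" {
    # ...
    min_master_version = "1.11.6-gke.6"  # was "1.11.5-gke.5"
}

resource "google_container_node_pool" "main_pool_2" {
    # ...
    version = "1.11.6-gke.6"  # was "1.11.5-gke.5"

    node_config {
        preemptible = true  # was false
    }
}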

Output log

google_container_node_pool.main_pool_2: Destroying... [id=projects/project-1/locations/us-central1/clusters/test-cluster-5/nodePools/test-node-pool-2]
google_container_node_pool.main_pool_2: Still destroying... [id=projects/project-1/locatio...t-cluster-5/nodePools/test-node-pool-2, 10s elapsed]
google_container_node_pool.main_pool_2: Still destroying... [id=projects/project-1/locatio...t-cluster-5/nodePools/test-node-pool-2, 20s elapsed]
google_container_node_pool.main_pool_2: Still destroying... [id=projects/project-1/locatio...t-cluster-5/nodePools/test-node-pool-2, 30s elapsed]
google_container_node_pool.main_pool_2: Still destroying... [id=projects/project-1/locatio...t-cluster-5/nodePools/test-node-pool-2, 40s elapsed]
google_container_node_pool.main_pool_2: Still destroying... [id=projects/project-1/locatio...t-cluster-5/nodePools/test-node-pool-2, 50s elapsed]
google_container_node_pool.main_pool_2: Still destroying... [id=projects/project-1/locatio...t-cluster-5/nodePools/test-node-pool-2, 1m0s elapsed]
google_container_node_pool.main_pool_2: Still destroying... [id=projects/project-1/locatio...t-cluster-5/nodePools/test-node-pool-2, 1m10s elapsed]
google_container_node_pool.main_pool_2: Still destroying... [id=projects/project-1/locatio...t-cluster-5/nodePools/test-node-pool-2, 1m20s elapsed]
google_container_node_pool.main_pool_2: Still destroying... [id=projects/project-1/locatio...t-cluster-5/nodePools/test-node-pool-2, 1m30s elapsed]
google_container_node_pool.main_pool_2: Destruction complete after 1m32s
google_container_cluster.gke_cluster_2: Modifying... [id=projects/project-1/locations/us-central1/clusters/test-cluster-5]
...
google_container_cluster.gke_cluster_2: Still modifying... [id=projects/project-1/locations/us-central1/clusters/test-cluster-5, 15m10s elapsed]
google_container_cluster.gke_cluster_2: Still modifying... [id=projects/project-1/locations/us-central1/clusters/test-cluster-5, 15m20s elapsed]
google_container_cluster.gke_cluster_2: Modifications complete after 15m21s [id=projects/project-1/locations/us-central1/clusters/test-cluster-5]
google_container_node_pool.main_pool_2: Creating...
google_container_node_pool.main_pool_2: Still creating... [10s elapsed]
google_container_node_pool.main_pool_2: Still creating... [20s elapsed]
google_container_node_pool.main_pool_2: Still creating... [30s elapsed]
google_container_node_pool.main_pool_2: Still creating... [40s elapsed]
google_container_node_pool.main_pool_2: Still creating... [50s elapsed]
google_container_node_pool.main_pool_2: Still creating... [1m0s elapsed]
google_container_node_pool.main_pool_2: Still creating... [1m10s elapsed]
google_container_node_pool.main_pool_2: Still creating... [1m20s elapsed]
google_container_node_pool.main_pool_2: Still creating... [1m30s elapsed]
google_container_node_pool.main_pool_2: Still creating... [1m40s elapsed]
google_container_node_pool.main_pool_2: Creation complete after 1m43s [id=projects/project-1/locations/us-central1/clusters/test-cluster-5/nodePools/test-node-pool-2]

Apply complete! Resources: 1 added, 1 changed, 1 destroyed.
