terraform, gcp: node pool creation followup issue diagnosis and resolution #2768

consideRatio · 2023-07-07T07:54:24Z

When creating a cluster today using an updated master branch (with the terraform logic changes from #2758; EDIT: I previously linked to the wrong PR here), the core node pool was successfully created, but the user notebook node pools were not.

Extract from terraform apply

  # google_container_node_pool.core will be created
  + resource "google_container_node_pool" "core" {
      + location                    = "southamerica-east1"
      + name                        = "core-pool"
      + node_locations              = (known after apply)

  # google_container_node_pool.notebook["large"] will be created
  + resource "google_container_node_pool" "notebook" {
      + location                    = (known after apply)
      + name                        = "nb-large"
      + node_locations              = [
          + "southamerica-east1-c",
        ]

google_container_node_pool.core: Creation complete after 1m52s [id=projects/catalystproject-392106/locations/southamerica-east1/clusters/latam-cluster/nodePools/core-pool]

│ Error: Cannot determine zone: set in this resource, or set provider-level zone.
│ 
│   with google_container_node_pool.notebook["large"],
│   on cluster.tf line 238, in resource "google_container_node_pool" "notebook":
│  238: resource "google_container_node_pool" "notebook" {
│ 
╵

Analysis

Core nodes and user nodes are configured differently: the core node pool gets location set, while the user node pools only get node_locations set. The plan above reflects this, since location is "(known after apply)" for nb-large, and this seems to be what makes creation of the user node pools fail.

I opened #2769 about the possibly non-optimal situation of core nodes not being configured for the locations where user nodes run.

Trial

What seems to have been needed was adding the following config to the user node pools, which is already set for the core pool:

location = google_container_cluster.cluster.location
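For context, a minimal sketch of where that line could sit in the user node pool resource. The resource type, resource name, pool name pattern, and zone are taken from the plan output above; the `notebook_nodes` variable, its shape, and the `cluster` argument value are assumptions for illustration and not necessarily what cluster.tf contains. It also assumes `google_container_cluster.cluster` is defined elsewhere in the configuration.

```hcl
# Hypothetical variable: a map of user node pool definitions keyed by
# short name (for example "large"). The real variable may differ.
variable "notebook_nodes" {
  type = map(any)
}

resource "google_container_node_pool" "notebook" {
  for_each = var.notebook_nodes

  # Produces names such as "nb-large", matching the plan output.
  name    = "nb-${each.key}"
  cluster = google_container_cluster.cluster.name

  # The line that was missing: reuse the cluster's location
  # (a region here), as the core pool already does.
  location = google_container_cluster.cluster.location

  # Zones within that location where the nodes are created.
  node_locations = ["southamerica-east1-c"]
}
```

With location set explicitly, the provider no longer has to fall back to a provider-level zone, which is what the "Cannot determine zone" error was complaining about.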


consideRatio · 2023-07-07T09:49:08Z

Will get this resolved with a PR soon.

pnasrat · 2023-07-07T09:56:03Z

I thought @yuvipanda fixed that yesterday

@consideRatio see #2758

consideRatio · 2023-07-07T10:08:11Z

@pnasrat I used a local branch with #2758 merged and still experienced this. I think applying terraform changes to an existing cluster can behave differently from creating everything from scratch, as I did now.

I assume that whenever node_locations (a list of zones) is specified, you also need to specify location (a region), and we are currently not doing that - we are only specifying node_locations (except for the core nodes, where we only specify location, tracked separately in #2769).

github-project-automation bot added this to DEPRECATED Engineering and Product Backlog Jul 7, 2023

github-project-automation bot moved this to Needs Shaping / Refinement in DEPRECATED Engineering and Product Backlog Jul 7, 2023

consideRatio self-assigned this Jul 7, 2023

consideRatio mentioned this issue Jul 7, 2023

terraform, gcp: set location to cluster location for user and dask-worker node pools #2771

Merged

consideRatio closed this as completed in #2771 Jul 7, 2023

github-project-automation bot moved this from Needs Shaping / Refinement to Complete in DEPRECATED Engineering and Product Backlog Jul 7, 2023

damianavila added this to Sprint Board Jul 11, 2023

damianavila moved this to Done 🎉 in Sprint Board Jul 11, 2023
