terraform, gcp: node pool creation followup issue diagnosis and resolution #2768

Closed
consideRatio opened this issue Jul 7, 2023 · 3 comments · Fixed by #2771

consideRatio commented Jul 7, 2023

When creating a cluster today from an up-to-date master branch (with the terraform changes from #2758; EDIT: previously linked to the wrong PR here), the core node pool was created successfully, but the user notebook node pools were not.

Extract from terraform apply

  # google_container_node_pool.core will be created
  + resource "google_container_node_pool" "core" {
      + location                    = "southamerica-east1"
      + name                        = "core-pool"
      + node_locations              = (known after apply)

  # google_container_node_pool.notebook["large"] will be created
  + resource "google_container_node_pool" "notebook" {
      + location                    = (known after apply)
      + name                        = "nb-large"
      + node_locations              = [
          + "southamerica-east1-c",
        ]

google_container_node_pool.core: Creation complete after 1m52s [id=projects/catalystproject-392106/locations/southamerica-east1/clusters/latam-cluster/nodePools/core-pool]

│ Error: Cannot determine zone: set in this resource, or set provider-level zone.
│ 
│   with google_container_node_pool.notebook["large"],
│   on cluster.tf line 238, in resource "google_container_node_pool" "notebook":
│  238: resource "google_container_node_pool" "notebook" {
│ 
╵

Analysis

The core and user node pools appear to be configured differently: the core pool gets location set, while the user pools only get node_locations set. That difference seems to be what makes creation of the user pools fail.
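
To make the difference concrete, here is a rough sketch of what the two resources presumably look like, reconstructed from the plan output above rather than copied from cluster.tf. The `var.notebook_nodes` variable is a hypothetical name, and all other arguments are left out.

```hcl
# Sketch inferred from the plan output; not the actual contents of cluster.tf.

resource "google_container_node_pool" "core" {
  name    = "core-pool"
  cluster = google_container_cluster.cluster.name

  # The region is set explicitly, so the plan shows "southamerica-east1".
  location = google_container_cluster.cluster.location
  # node_locations is not set, so the plan shows "(known after apply)".
}

resource "google_container_node_pool" "notebook" {
  for_each = var.notebook_nodes # hypothetical variable name

  name    = "nb-${each.key}" # yields "nb-large" for the "large" key
  cluster = google_container_cluster.cluster.name

  # Only the zones are set; location is missing, so the plan shows
  # "(known after apply)" and the apply fails with "Cannot determine zone".
  node_locations = ["southamerica-east1-c"]
}
```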

I opened #2769 about the possibly non-optimal situation of core nodes not being configured for the locations where user nodes run.

Trial

It seems that what was needed was to add the following config, which is already set for the core pool, to the user node pools:

location = google_container_cluster.cluster.location
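
In context, the change would look roughly like the sketch below. Only the `location` line comes from this issue; `var.notebook_nodes` and `var.zone` are hypothetical names used for illustration, and the remaining arguments of the real resource are omitted.

```hcl
# Sketch of the fix under the assumptions stated above.
resource "google_container_node_pool" "notebook" {
  for_each = var.notebook_nodes # hypothetical variable name

  name    = "nb-${each.key}"
  cluster = google_container_cluster.cluster.name

  # New: take the location from the cluster, as the core pool already does.
  location = google_container_cluster.cluster.location

  # Unchanged: the zone(s) within that location where nodes are created.
  node_locations = [var.zone] # hypothetical variable name
}
```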

Will get this resolved with a PR soon.

consideRatio self-assigned this Jul 7, 2023

pnasrat commented Jul 7, 2023

I thought @yuvipanda fixed that yesterday

@consideRatio see #2758

consideRatio commented Jul 7, 2023

@pnasrat I used a local branch with #2758 merged and still ran into this. I think applying terraform updates to an existing cluster can behave differently from creating everything from scratch, as I did now.

I assume that whenever node_locations (a list of zones) is specified, location (a region) must be specified too, and we are currently not doing that: we only specify node_locations (except for the core nodes, where we only specify location, tracked separately in #2769).
