
Cluster architecture - core node pools in any zone, or in the user zones? #2769

Closed
consideRatio opened this issue Jul 7, 2023 · 4 comments · Fixed by #2777
@consideRatio
Contributor

In the terraform/gcp configuration, we provide node_locations for the user nodes but not for the core nodes. That means they will start in any zone of the regional cluster, I think. In practice, this means we can end up with core nodes in a different zone than the user nodes.

I suspect this is a bit inefficient, but perhaps not a big deal either. Are there zone-to-zone networking costs etc. that we want to avoid?

Changing this in the common terraform config may require re-creating the core nodes or similar, so I figure pinning the core nodes to the zone(s) of the user nodes would have to be done cluster by cluster with a cluster-specific override (sketched below) until we can do it systematically in the common config.
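
For concreteness, a minimal sketch of what such a cluster-specific override could look like. The variable name core_node_locations, the resource names, and the surrounding structure are illustrative assumptions, not the repo's actual schema:

# Hypothetical per-cluster override: pin the core pool to the same zone(s)
# as the user pools for one cluster only, falling back to the cluster
# default when the variable is left empty.
variable "core_node_locations" {
  type        = list(string)
  default     = []
  description = "Zones to pin the core node pool to; empty means inherit the cluster default."
}

resource "google_container_node_pool" "core" {
  name     = "core-pool"
  cluster  = google_container_cluster.cluster.name
  location = google_container_cluster.cluster.location

  # null lets GKE fall back to the cluster's node_locations
  node_locations = length(var.core_node_locations) > 0 ? var.core_node_locations : null

  # ... rest of the core pool config unchanged ...
}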

@yuvipanda
Member

That means they will start in any zone of the regional cluster, I think.

This should not be true, as they should instead inherit the default from the cluster's node_locations:

node_locations = var.regional_cluster ? [var.zone] : null
I'll investigate the recent issues again to make sure this is the case.
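
To illustrate that inheritance, here is a minimal sketch; it is not the repo's exact config, and the resource and variable names are assumptions:

resource "google_container_cluster" "cluster" {
  name     = var.cluster_name
  location = var.regional_cluster ? var.region : var.zone

  # Regional clusters pin their default zones here
  node_locations = var.regional_cluster ? [var.zone] : null
  # ...
}

resource "google_container_node_pool" "core" {
  name     = "core-pool"
  cluster  = google_container_cluster.cluster.name
  location = google_container_cluster.cluster.location
  # node_locations is omitted, so GKE places this pool's nodes in the
  # cluster's node_locations above (i.e. [var.zone] for regional clusters).
  # ...
}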

@consideRatio
Contributor Author

@yuvipanda ah, looking at a newly created cluster where node_locations isn't explicitly configured for the core node pool, I conclude you are right.

[image: node-zones]

I drew the wrong conclusion from seeing that they were explicitly configured for the user nodes. Perhaps they are explicit there because they were changed over time?

yuvipanda added a commit to yuvipanda/pilot-hubs that referenced this issue Jul 7, 2023
If zones are not explicitly set for a node pool, it will inherit
whatever is set for the cluster itself. This makes the code
clearer so that this behavior is more obvious.

Fixes 2i2c-org#2769
@yuvipanda
Member

In the latam cluster just created, I see:

$  terraform state show google_container_node_pool.core
# google_container_node_pool.core:
resource "google_container_node_pool" "core" {
    cluster                     = "latam-cluster"
    id                          = "projects/catalystproject-392106/locations/southamerica-east1/clusters/latam-cluster/nodePools/core-pool"
    initial_node_count          = 1
    instance_group_urls         = [
        "https://www.googleapis.com/compute/v1/projects/catalystproject-392106/zones/southamerica-east1-c/instanceGroupManagers/gke-latam-cluster-core-pool-01cfc23e-grp",
    ]
    location                    = "southamerica-east1"
    managed_instance_group_urls = [
        "https://www.googleapis.com/compute/v1/projects/catalystproject-392106/zones/southamerica-east1-c/instanceGroups/gke-latam-cluster-core-pool-01cfc23e-grp",
    ]
    name                        = "core-pool"
    node_count                  = 2
    node_locations              = [
        "southamerica-east1-c",
    ]
    project                     = "catalystproject-392106"
    version                     = "1.27.2-gke.1200"

    autoscaling {
        location_policy      = "BALANCED"
        max_node_count       = 5
        min_node_count       = 1
        total_max_node_count = 0
        total_min_node_count = 0
    }

    management {
        auto_repair  = true
        auto_upgrade = false
    }

    network_config {
        create_pod_range     = false
        enable_private_nodes = false
    }

    node_config {
        disk_size_gb      = 30
        disk_type         = "pd-balanced"
        guest_accelerator = []
        image_type        = "COS_CONTAINERD"
        labels            = {
            "hub.jupyter.org/node-purpose" = "core"
            "k8s.dask.org/node-purpose"    = "core"
        }
        local_ssd_count   = 0
        logging_variant   = "DEFAULT"
        machine_type      = "n2-highmem-2"
        metadata          = {
            "disable-legacy-endpoints" = "true"
        }
        oauth_scopes      = [
            "https://www.googleapis.com/auth/cloud-platform",
        ]
        preemptible       = false
        resource_labels   = {}
        service_account   = "[email protected]"
        spot              = false
        tags              = []
        taint             = []

        shielded_instance_config {
            enable_integrity_monitoring = true
            enable_secure_boot          = false
        }

        workload_metadata_config {
            mode = "GKE_METADATA"
        }
    }

    upgrade_settings {
        max_surge       = 1
        max_unavailable = 0
        strategy        = "SURGE"
    }
}

@yuvipanda
Member

@consideRatio they're set that way so we could override them when necessary, as we do for GPU nodes in LEAP, where the single zone was often running out of GPUs.

I opened #2777 to make this clearer in the code.
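
For illustration, a hedged sketch of the kind of per-pool override described above; the pool name and zones are assumptions, not the actual LEAP config:

resource "google_container_node_pool" "notebook_gpu" {
  name     = "nb-gpu-t4"  # hypothetical pool name
  cluster  = google_container_cluster.cluster.name
  location = google_container_cluster.cluster.location

  # Explicitly setting node_locations on the pool overrides the cluster
  # default, letting a GPU pool draw capacity from several zones when a
  # single zone keeps running out of GPUs.
  node_locations = [
    "us-central1-b",
    "us-central1-c",
    "us-central1-f",
  ]
  # ...
}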

github-project-automation bot moved this from Needs Shaping / Refinement to Complete in DEPRECATED Engineering and Product Backlog Jul 8, 2023
damianavila moved this to Done 🎉 in Sprint Board Jul 11, 2023