Cluster architecture - core node pools in any zone, or in the user zones? #2769
Comments
This should not be true, as they should instead inherit the default from the cluster's node_locations (infrastructure/terraform/gcp/cluster.tf, line 55 in 1aac5c8).
@yuvipanda ah, looking at a newly created cluster without node_locations explicitly configured for the core node pool, I conclude you are right. I drew the wrong conclusion from seeing how they were explicitly configured for the user nodes, but perhaps they are explicit there because they were changed over time?
If zones are not explicitly set for node pools, they inherit whatever is set for the cluster itself. This makes the code clearer so that this is more obvious. Fixes 2i2c-org#2769
In the latam cluster just created, I see: (screenshot not preserved)
@consideRatio they're set that way so we can override them when necessary, as we do for GPU nodes in LEAP, where the single zone was often running out of GPUs. I opened #2777 to make this clearer in the code.
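For readers following along, here is a minimal sketch of the inheritance and override behaviour discussed above. It is not the repository's actual configuration: the resource names, region, and zones are hypothetical. It relies only on the `node_locations` argument of `google_container_cluster` and `google_container_node_pool`, where a node pool that leaves it unset uses the cluster-level value.

```hcl
# Hypothetical names, region, and zones; a sketch, not the 2i2c config.
resource "google_container_cluster" "cluster" {
  name     = "example-cluster"
  location = "us-central1" # a region, so this is a regional cluster

  # Cluster-level default zones for node pools.
  node_locations = ["us-central1-b"]

  remove_default_node_pool = true
  initial_node_count       = 1
}

# node_locations is not set here, so the core pool uses the
# cluster-level node_locations above.
resource "google_container_node_pool" "core" {
  name     = "core-pool"
  cluster  = google_container_cluster.cluster.name
  location = google_container_cluster.cluster.location

  node_count = 1 # per zone
}

# Explicit override, e.g. to let a GPU pool use more zones when a
# single zone runs out of GPUs.
resource "google_container_node_pool" "gpu" {
  name     = "gpu-pool"
  cluster  = google_container_cluster.cluster.name
  location = google_container_cluster.cluster.location

  node_locations = ["us-central1-a", "us-central1-c"]
  node_count     = 1 # per zone
}
```

With this layout, only pools that need different zones carry their own `node_locations`; everything else follows the cluster.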
In the terraform/gcp configuration, we provide node_locations for user nodes but not for the core nodes. I think that means the core nodes will start in any zone in the regional cluster. In practice, this means we can end up with core nodes in a different zone than the user nodes. I suspect this is a bit inefficient, but perhaps not a big deal either. Are there costs of zone-to-zone networking etc. that we want to avoid?
Changing this in the common terraform config may require re-creating the core nodes or similar, so I figure the only path towards pinning the core nodes to the zone(s) of the user nodes is to do it cluster by cluster, with a cluster-specific override, until we can do it systematically in the common config.