Reduce memory guarantee to avoid going beyond the allocatable value #804
Conversation
Each node has some allocatable resources [1]. If you try to schedule a pod with guarantee requirements above the allocatable values, it will fail to be spawned. With the "very large" option, we are requesting 110G at a minimum, and that is dangerously close to the theoretical allocatable memory for the n1-standard-32 node. So let's give that guarantee value some breathing room ;-)
[1] https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-architecture#memory_cpu
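For context, kubespawner turns mem_guarantee into the pod's Kubernetes memory request (which is what the scheduler checks against a node's allocatable memory) and mem_limit into the memory limit. Below is a minimal sketch of the "very large" profile in kubespawner's Python profile_list form; treat it as an illustration only, since the deployed change is the YAML in the helm chart values shown in the diff further down.

# Illustration only: the deployed change lives in the YAML helm values (see the
# diff below). In a jupyterhub_config.py, kubespawner exposes the same knobs:
# mem_guarantee becomes the pod's memory *request* (compared against node
# allocatable by the scheduler) and mem_limit becomes the memory *limit*.
c = get_config()  # provided automatically when JupyterHub loads this file

c.KubeSpawner.profile_list = [
    {
        "display_name": "Very large",
        "description": "~32 CPU, ~128G RAM",
        "kubespawner_override": {
            "mem_limit": "128G",
            # Stay comfortably below the ~114G allocatable on an
            # n1-standard-32 node, or the pod can never be scheduled.
            "mem_guarantee": "100G",
            "node_selector": {
                "node.kubernetes.io/instance-type": "n1-standard-32",
            },
        },
    },
]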
I have manually deployed this one to the meom staging hub and it seems to work as intended!
hmmm - I gave this a shot and ran into the "pod didn't trigger scale-up" issue :-/
Also a couple of quick comments!
@@ -75,7 +75,7 @@ hubs:
           description: "~32 CPU, ~128G RAM"
           kubespawner_override:
             mem_limit: 128G
-            mem_guarantee: 110G
+            mem_guarantee: 100G
             node_selector:
               node.kubernetes.io/instance-type: n1-standard-32
         - display_name: "Huge"
is there something similar we should do for "huge"?
Fetching info from the nodes:
small      >> 5758056Ki   >> 5.75 GB allocatable   (mem_guarantee: 5GB)
medium     >> 27155328Ki  >> 27.15 GB allocatable  (mem_guarantee: 25GB)
large      >> 56183024Ki  >> 56.18 GB allocatable  (mem_guarantee: 50GB)
very large >> 114336712Ki >> 114.33 GB allocatable (mem_guarantee: 100GB with this PR)
huge       >> 235367296Ki >> 235.37 GB allocatable (mem_guarantee: 220G)
I think we are OK with the huge one, but happy to discuss all the other values 😉.
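For reference, the allocatable figures above can be read straight off the cluster. Here is a minimal sketch using the official kubernetes Python client (assuming kubeconfig access to the cluster); it simply prints status.allocatable per node, which is the same information kubectl describe node reports.

# Sketch: print each node's allocatable memory, i.e. the ceiling that any
# mem_guarantee (pod memory request) has to stay under.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when run in-cluster
v1 = client.CoreV1Api()

for node in v1.list_node().items:
    labels = node.metadata.labels or {}
    instance_type = labels.get("node.kubernetes.io/instance-type", "unknown")
    allocatable_mem = node.status.allocatable["memory"]  # e.g. "114336712Ki"
    print(f"{node.metadata.name} ({instance_type}): allocatable memory {allocatable_mem}")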
Did you try with …?
Confirmed that this does work with staging!
+1 for updating the description in the spawn options to say ~110 GB of memory, but this LGTM and can be merged.
I will merge it as is, because this is what I validated.
Follow-up: #809
More details about the whole debugging live in the corresponding Freshdesk support thread.