
Debug node autoprovisioning: did not match Pod's node affinity #677

Closed · sgibson91 opened this issue Sep 13, 2021 · 7 comments
@sgibson91 (Member) commented Sep 13, 2021:

Description

In #670 we enabled node auto-provisioning. In practice, when we try to create pods we see the following events:

Events:
  Type     Reason             Age   From                 Message
  ----     ------             ----  ----                 -------
  Warning  FailedScheduling   38s   prod-user-scheduler  0/2 nodes are available: 2 node(s) didn't match node selector.
  Warning  FailedScheduling   38s   prod-user-scheduler  0/2 nodes are available: 2 node(s) didn't match node selector.
  Normal   NotTriggerScaleUp  38s   cluster-autoscaler   pod didn't trigger scale-up: 1 node(s) didn't match Pod's node affinity

And this is preventing any new node from coming up.
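
To see the mismatch directly, the selector on the stuck pod can be compared with the labels on the existing nodes (a sketch; jupyter-sgibson91 is the user pod used as an example later in this thread):

# What the stuck pod is asking for
kubectl get pod jupyter-sgibson91 -o jsonpath='{.spec.nodeSelector}'

# What labels the existing nodes actually carry
kubectl get nodes --show-labels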

Value / benefit

We need to spin nodes up!

Implementation details

No response

Tasks to complete

No response

Updates

  • 2021-09-13 - We've decided to manually provision the Pangeo cluster for now, so we have a bit more time to debug this one. Bumped this down to medium impact.
@sgibson91 (Member Author) commented Sep 13, 2021:

From my reading of these docs, the auto-provisioner should create nodes with the same tolerations/node selectors as the pod that is trying to spin up: https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-provisioning#workload_separation
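
For reference, the workload-separation pattern in those docs pairs a nodeSelector with a matching toleration on the pod, and the auto-provisioner is then supposed to create a node pool carrying that label and taint. A rough sketch of what that looks like on a user pod, using the usual z2jh label/taint keys (quoted from memory, not from our rendered config):

spec:
  nodeSelector:
    hub.jupyter.org/node-purpose: user
  tolerations:
    - key: hub.jupyter.org/dedicated
      operator: Equal
      value: user
      effect: NoSchedule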

@sgibson91 (Member Author):

I did the following:

kubectl get pod jupyter-sgibson91 -o yaml > mypod.yaml

Edited the YAML and removed the nodeSelector

kubectl apply -f mypod.yaml

And that seemed to spin up fine.
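
Roughly the same experiment as a one-liner, assuming yq v4 is available and the original Pending pod is deleted first (a pod's nodeSelector can't be changed in place):

kubectl get pod jupyter-sgibson91 -o yaml \
  | yq eval 'del(.spec.nodeSelector)' - \
  > mypod.yaml
kubectl delete pod jupyter-sgibson91
kubectl apply -f mypod.yaml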

@sgibson91 (Member Author) commented Sep 13, 2021:

So, nodes can't be spun up because the node auto-provisioner is expecting to create nodes with the label node-purpose: core (which it gets from the core pool) and the pods want to be scheduled to node-purpose: user.

In the pangeo hub config file, we've tried setting the following:

singleuser:
  nodeSelector:
    hub.jupyter.org/node-purpose: ""

singleuser:
  nodeSelector: {}

singleuser:
  nodeSelector: null

But none of those were successful at removing the node selector from the user pod.

Instead, I removed the following lines from our basehub chart, and that got us to a place where user pods could be scheduled and would start up, but they'd always be assigned to the core pool. It turns out that the core pool had enough free space even for our largest machines, so I've still not successfully triggered a node auto-provisioning event yet.

https://github.com/2i2c-org/pilot-hubs/blob/658ab0bf507ab35eedd95ac45147e0b0e1babf6e/hub-templates/basehub/values.yaml#L135-L136
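
A quick way to confirm where user pods are actually landing (and hence why auto-provisioning never needs to kick in) is to check the GKE node-pool label on each node, e.g.:

# Which node did the user pod land on?
kubectl get pod jupyter-sgibson91 -o wide

# Which pool does each node belong to?
kubectl get nodes -L cloud.google.com/gke-nodepool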

@tylerpotts:

@sgibson91 For what it's worth, in QHub we haven't tried node auto-provisioning. Instead we have explicitly defined node pools and pods get scheduled onto them. Wish we could be of more help here.

@sgibson91 (Member Author):

Thanks @tylerpotts - that is 2i2c's default too. But when I raised the question about appropriate machine sizes for those pools in #666, we found out Pangeo are using auto-provisioning, and we didn't really have any data to hand for optimising the machine sizes to the expected load.

@tylerpotts:

@sgibson91 We have also been struggling recently with the problem of matching workload to node size. For the most part we have been allocating a single node per user pod/dask pod, which has helped somewhat on the larger-scale clusters.

As for determining the allocatable resources available on the nodes, we have quite a bit of research detailed here that you may find useful: nebari-dev/nebari#792. Unfortunately there doesn't seem to be a linear formula, as Kubernetes reserves variable amounts of millicpu and RAM depending on the size of the node.
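
One way to see that gap per node is to compare capacity with allocatable directly, e.g. (just a sketch):

# Capacity = what the VM has; allocatable = what pods can actually request.
# The difference is what Kubernetes/GKE reserves for that node size.
kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU_CAP:.status.capacity.cpu,CPU_ALLOC:.status.allocatable.cpu,MEM_CAP:.status.capacity.memory,MEM_ALLOC:.status.allocatable.memory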

@choldgraf (Member):

I've added an update to the top comment, to reflect that we're manually provisioning the Pangeo hub for now!
