Getting the CA to play well with a custom scheduler #1406

consideRatio · 2018-11-15T14:22:05Z

I opened this issue: kubernetes/kubernetes#71070

As I understand it, the Cluster Autoscaler relies on hardcoded logic from the default scheduler. Would it be possible in the future to avoid hardcoding this and instead cooperate with whichever scheduler is responsible for the pod?

MaciekPytel · 2018-11-15T14:52:13Z

There were many discussions of this in the past. The problem is that CA doesn't actually know anything about scheduling. It's built around importing scheduler code and using it as a sort of black-box oracle. All CA decisions are based on simulations, performed by creating in-memory node objects and feeding them to the scheduler code to see whether currently pending pods would be able to schedule if a new node were added.
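To make the simulation idea concrete, here is a minimal sketch in Python. It is not CA's actual code (CA is written in Go and imports the real scheduler predicates); the `Pod` and `Node` types, the two predicates, and `pods_helped_by_new_node` are all hypothetical names made up for illustration. It only shows the general shape: copy a node template into an in-memory node, then ask the predicates whether each pending pod would fit on it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Pod:
    name: str
    cpu_millis: int
    node_selector: Dict[str, str] = field(default_factory=dict)


@dataclass
class Node:
    name: str
    cpu_millis: int
    labels: Dict[str, str] = field(default_factory=dict)
    pods: List[Pod] = field(default_factory=list)

    def free_cpu(self) -> int:
        return self.cpu_millis - sum(p.cpu_millis for p in self.pods)


# A predicate is a yes/no question: can this pod run on this node?
Predicate = Callable[[Pod, Node], bool]


def fits_resources(pod: Pod, node: Node) -> bool:
    return pod.cpu_millis <= node.free_cpu()


def matches_selector(pod: Pod, node: Node) -> bool:
    return all(node.labels.get(k) == v for k, v in pod.node_selector.items())


def pods_helped_by_new_node(
    pending: List[Pod], template: Node, predicates: List[Predicate]
) -> List[str]:
    """Return names of pending pods that would fit on one new node built from template."""
    # The simulated node exists only in memory; nothing is created in the cluster.
    simulated = Node(
        name=template.name + "-simulated",
        cpu_millis=template.cpu_millis,
        labels=dict(template.labels),
    )
    helped = []
    for pod in pending:
        if all(pred(pod, simulated) for pred in predicates):
            simulated.pods.append(pod)  # pretend the pod was placed there
            helped.append(pod.name)
    return helped


pending = [Pod("a", 1500), Pod("b", 1000), Pod("gpu", 500, {"gpu": "true"})]
template = Node("pool-1", 2000, {"gpu": "false"})
print(pods_helped_by_new_node(pending, template, [fits_resources, matches_selector]))
```

In this toy run only pod `a` is helped: `b` no longer fits once `a` is placed, and `gpu` is rejected by the selector predicate. A real loop would repeat this for many node groups and many hypothetical cluster states, which is why the number of predicate evaluations grows so quickly.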

There were some discussions about exposing a 'dry run' API in the scheduler. The problem is that we need to run a ton of those scenarios, using some imaginary nodes, pretending some nodes don't exist, or pretending some pods are running on different nodes than they really are. The scheduler doesn't support any of that, and even if it did, we believe the performance impact of serializing all those objects and sending them to the scheduler would be prohibitive (we would need to make thousands of such requests per loop).

To sum up, full support for custom scheduling would require massive changes in both the scheduler and CA, and we don't think it would work anyway. I think it's safe to say it's not likely to happen for the general case.

Now, some forms of custom scheduling are fine. If you only touch priority functions, not predicates, then there is no problem, as CA ignores those anyway. If all you do is change the scheduler config, it should be possible to expose analogous config options in CA that would be forwarded to the imported scheduler code. Finally, if you add a new predicate function, you should be able to import it into CA, build your own image, and use that.
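The distinction between priorities and predicates can be illustrated with a small Python sketch. Everything here is hypothetical (the dict-based pods and nodes, `zone_matches`, `team_owns_node`, and the two priority functions are invented for the example and are not scheduler or CA APIs). The point is only that a priority function changes which feasible node is picked, while a predicate changes which nodes are feasible at all, and feasibility is the part a simulation like CA's depends on.

```python
from typing import Callable, Dict, List, Optional

Pod = Dict[str, str]
Node = Dict[str, str]
Predicate = Callable[[Pod, Node], bool]
Priority = Callable[[Pod, Node], int]


def feasible(pod: Pod, nodes: List[Node], predicates: List[Predicate]) -> List[str]:
    """Names of nodes that pass every predicate (the only part a simulation needs)."""
    return [n["name"] for n in nodes if all(pred(pod, n) for pred in predicates)]


def pick(
    pod: Pod, nodes: List[Node], predicates: List[Predicate], priority: Priority
) -> Optional[str]:
    """Filter with predicates, then rank the survivors with the priority function."""
    names = feasible(pod, nodes, predicates)
    candidates = [n for n in nodes if n["name"] in names]
    if not candidates:
        return None
    return max(candidates, key=lambda n: priority(pod, n))["name"]


def zone_matches(pod: Pod, node: Node) -> bool:
    # Stand-in for a default predicate.
    return pod["zone"] == node["zone"]


def team_owns_node(pod: Pod, node: Node) -> bool:
    # Stand-in for a custom predicate added by a custom scheduler.
    return pod["team"] == node["team"]


def prefer_n1(pod: Pod, node: Node) -> int:
    return 1 if node["name"] == "n1" else 0


def prefer_n2(pod: Pod, node: Node) -> int:
    return 1 if node["name"] == "n2" else 0


nodes = [
    {"name": "n1", "zone": "a", "team": "ml"},
    {"name": "n2", "zone": "a", "team": "web"},
]
pod = {"name": "p", "zone": "a", "team": "web"}

# Swapping the priority changes the chosen node but not the feasible set.
print(pick(pod, nodes, [zone_matches], prefer_n1), feasible(pod, nodes, [zone_matches]))
print(pick(pod, nodes, [zone_matches], prefer_n2), feasible(pod, nodes, [zone_matches]))

# Adding a predicate shrinks the feasible set, so a simulation that lacks it disagrees.
print(feasible(pod, nodes, [zone_matches, team_owns_node]))
```

In this sketch, a simulation that only knows `zone_matches` would conclude the pod can run on `n1`, while a scheduler that also enforces `team_owns_node` would never place it there. That mismatch is the reason a new predicate has to be compiled into the autoscaler as well.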

consideRatio · 2018-11-15T14:58:35Z

@MaciekPytel this was an excellent write up! Thank you!!

consideRatio mentioned this issue Nov 15, 2018

Conflicting scale up events on pods scheduled with secondary scheduler kubernetes/kubernetes#71070

Closed

consideRatio closed this as completed Nov 15, 2018

consideRatio mentioned this issue Nov 15, 2018

WIP: A deployment story - Using GPUs on GKE jupyterhub/zero-to-jupyterhub-k8s#994

Open

This was referenced Mar 24, 2023

support overprovsioning without pending pods #5377

Closed

leave a buffer of underutilized nodes when scaling down #5611

Closed

yaroslava-serdiuk added a commit to yaroslava-serdiuk/autoscaler that referenced this issue Feb 22, 2024

Add feature gate for priorities sorting in a cohort (kubernetes#1406)

4cff47b
