-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Get capi targetsize from cache #5025
Get capi targetsize from cache #5025
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this makes sense to me, thanks for the update @enxebre
i just have a quick question about
Also note that the scale down implementation has changed https://github.com/kubernetes/autoscaler/commits/master/cluster-autoscaler/core/scaledown.
did you have something specific in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with a minor comment
return 0, errors.Wrap(err, "error getting replica count") | ||
} | ||
if !found { | ||
replicas = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we set replicas = ng.scalableResource.MinSize()
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that seems reasonable to me, i guess we still need to default to 0 if there is an issue with minsize though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually I made it an error, as I don't think it is a valid case for it to not be found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's probably better
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: arunmk, enxebre The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This ensured that access to replicas during scale down operations were never stale by accessing the API server kubernetes#3104. This honoured that behaviour while moving to unstructured client kubernetes#3312. This regressed that behaviour while trying to reduce the API server load kubernetes#4443. This put back the never stale replicas behaviour at the cost of loading back the API server kubernetes#4634. Currently on e.g a 48 minutes cluster it does 1.4k get request to the scale subresource. This PR tries to satisfy both non stale replicas during scale down and prevent the API server from being overloaded. To achieve that it lets targetSize which is called on every autoscaling cluster state loop from come from cache. Also note that the scale down implementation has changed https://github.com/kubernetes/autoscaler/commits/master/cluster-autoscaler/core/scaledown.
a80af8e
to
b2f1823
Compare
LGTM, i'll let @arunmk add the label |
/lgtm /approve |
/lgtm |
Get capi targetsize from cache
Which component this PR applies to?
Cluster autoscaler - cluster api.
What type of PR is this?
This ensured that access to replicas during scale down operations were never stale by accessing the API server #3104.
This honoured that behaviour while moving to unstructured client #3312.
This regressed that behaviour while trying to reduce the API server load #4443.
This put back the never stale replicas behaviour at the cost of loading back the API server #4634.
Currently on e.g a 48 minutes cluster with no scaling activity it does 1.4k get request to the scale subresource.
This PR tries to satisfy both non stale replicas during scale down and prevent the API server from being overloaded. To achieve that it lets TargetSize which is called on every autoscaling cluster state loop from come from cache while getting fresh replicas at the time of perform the operations.
Also note that the scale down implementation has changed https://github.com/kubernetes/autoscaler/commits/master/cluster-autoscaler/core/scaledown.
/area provider/cluster-api
/kind bug
/kind cleanup