
Add an option to trigger iterations more frequently #6589

Merged: 1 commit merged into kubernetes:master on Mar 15, 2024

Conversation

@kawych (Contributor) commented Mar 6, 2024

What type of PR is this?

What this PR does / why we need it:

Trigger new autoscaling iterations based on two additional criteria:

  1. There are new unschedulable pods - reduces autoscaling latency slightly
  2. Last iteration was productive (there was a scale-up or scale-down) - increases autoscaling throughput in cases where there are multiple heterogeneous workloads, each requiring a separate iteration.

This also avoids burning CPU unnecessarily, which would happen if we just reduced scanInterval to some value close to zero.

This functionality is flag-guarded, disabled by default.
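
For illustration, here is a minimal sketch of the wait logic the two criteria above imply (the names LoopTrigger, scalingTimesGetter, unschedulablePodChan and scanInterval are assumptions based on this discussion, not necessarily the exact code in this PR):

```go
package loop

import "time"

// scalingTimesGetter reports when the autoscaler last scaled up or down
// (assumed shape, satisfied by the static autoscaler).
type scalingTimesGetter interface {
	LastScaleUpTime() time.Time
	LastScaleDownDeleteTime() time.Time
}

// LoopTrigger decides when the next autoscaling iteration should start.
type LoopTrigger struct {
	autoscaler           scalingTimesGetter
	unschedulablePodChan chan struct{}
	scanInterval         time.Duration
}

// Wait blocks until the next iteration should run: immediately if the last
// iteration was productive, otherwise until a new unschedulable pod appears
// or the regular scan interval passes.
func (t *LoopTrigger) Wait(lastRun time.Time) {
	// Criterion 2: the previous iteration scaled up or down since lastRun,
	// so run again right away. Drain a pending pod signal so it doesn't
	// trigger an extra, unneeded loop later.
	if !t.autoscaler.LastScaleUpTime().Before(lastRun) ||
		!t.autoscaler.LastScaleDownDeleteTime().Before(lastRun) {
		select {
		case <-t.unschedulablePodChan:
		default:
		}
		return
	}
	// Criterion 1: a new unschedulable pod appeared, or fall back to the
	// regular scan interval.
	select {
	case <-t.unschedulablePodChan:
	case <-time.After(t.scanInterval):
	}
}
```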

Does this PR introduce a user-facing change?

NONE

@k8s-ci-robot added the cncf-cla: yes label (the PR's author has signed the CNCF CLA) on Mar 6, 2024
@kawych (Contributor, Author) commented Mar 6, 2024

CC @x13n

@k8s-ci-robot added the size/S label (PR changes 10-29 lines, ignoring generated files) on Mar 6, 2024
@gjtempleton (Member)

I take it we're planning on following this up with a separate PR to make use of this functionality?

@x13n (Member) commented Mar 11, 2024

I'm a bit hesitant to expose internal fields of the static autoscaler like this when there are already plenty of existing hooks in it. Could this instead be implemented as a scale up/down status processor?

@kawych (Contributor, Author) commented Mar 11, 2024

@gjtempleton I actually planned to do this in a forked Cluster Autoscaler implementation, assuming that exposing these fields won't hurt. But I can follow up with enabling the same in this repo if the repo owners are OK with it.

@x13n can you elaborate on what you are worried about? I agree that exposing all of the internal state would not be a good idea, but for these specific fields it's fairly natural to show them to the external world (read-only). Additionally, the "run" function where these values would actually be used [1] can easily access outputs from the cluster autoscaler, while processors don't have any natural way of passing the results there.

[1] https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/main.go#L574

@x13n (Member) commented Mar 12, 2024

This PR only exposes two fields, so it is a bit hard to discuss how exactly it will be used. Regarding the run function - we should limit the amount of logic in main, rather than keep adding to it.

But to Guy's point above: let's maybe start with the purpose of the change and then decide how best to fulfill that purpose, instead of starting by exposing internal fields of StaticAutoscaler without extra context? Maybe exposing these fields is the way to go, but it is hard to tell when that's the entire change we can see.

@k8s-ci-robot added the size/L label (PR changes 100-499 lines, ignoring generated files) and removed the size/S label on Mar 13, 2024
@kawych changed the title from "Expose autoscaler's recent activity times" to "Add an option to trigger iterations more frequently" on Mar 14, 2024
@kawych (Contributor, Author) commented Mar 14, 2024

We discussed this change offline with @x13n and decided to implement a more comprehensive (flag-guarded) feature of improving autoscaling throughput and latency in the same PR.


metrics.UpdateDurationFromStart(metrics.Main, loopStart)
}
runAutoscalerOnce := func(loopStart time.Time) {
Member:

Any reason not to make it a proper function, perhaps in the new loop module?

Contributor Author:

Done
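
For context, pulling the closure out into a named function in a loop package could look roughly like this (a sketch built on narrowed, assumed interfaces; the real signatures in the PR may differ):

```go
package loop

import "time"

// singleRunner is the narrow slice of the autoscaler this function needs
// (an assumption for the sketch; the real code passes the concrete autoscaler).
type singleRunner interface {
	RunOnce(currentTime time.Time) error
}

// RunAutoscalerOnce runs one autoscaling iteration and records its duration,
// mirroring what the inline runAutoscalerOnce closure in main.go did.
func RunAutoscalerOnce(a singleRunner, recordDuration func(start time.Time), loopStart time.Time) {
	defer recordDuration(loopStart)
	if err := a.RunOnce(loopStart); err != nil {
		// The real code logs and handles the error; kept minimal in this sketch.
		_ = err
	}
}
```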

)

// StaticAutoscaler exposes recent autoscaler activity
type StaticAutoscaler interface {
Member:

It doesn't need to be public; this interface is meant to be consumed locally in this module. Also, StaticAutoscaler is the current implementation of the interface, but it doesn't make a lot of sense as the name - with the current set of exposed functions, something along the lines of scalingTimesGetter would be more appropriate.

Contributor Author:

Done

// Wait waits for the next autoscaling iteration
func (t *LoopTrigger) Wait(lastRun time.Time) {
sleepStart := time.Now()
defer metrics.UpdateDurationFromStart("loopWait", sleepStart)
Member:

"loopWait" should be a constant in metrics.go, same as other function labels.

Contributor Author:

Done
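
A sketch of what this asks for, following the existing function-label pattern in metrics.go (the LoopWait constant name is an assumption):

```go
package metrics

// FunctionLabel and the Main label already exist in metrics.go; they are
// repeated here only so the sketch stands alone.
type FunctionLabel string

const (
	Main FunctionLabel = "main"
	// LoopWait labels the time spent waiting for the next iteration trigger
	// (the exact name used in the final code is an assumption here).
	LoopWait FunctionLabel = "loopWait"
)
```

The deferred metric call then becomes `defer metrics.UpdateDurationFromStart(metrics.LoopWait, sleepStart)` instead of passing the raw string.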

if !t.autoscaler.LastScaleUpTime().Before(lastRun) ||
!t.autoscaler.LastScaleDownDeleteTime().Before(lastRun) {
select {
case <-t.podObserver.unschedulablePodChan:
Member:

So you're clearing the signal about an unschedulable pod appearing here, but not the one about the scan interval passing. Why are you treating some triggers differently from others? This will lead to occasionally wasting a loop that isn't needed, which is not terrible, but definitely something we could avoid.

Contributor Author:

We're clearing just this one channel because it persists between loops, while the time.After() timer is re-created with each iteration. I don't think it will have the effect you're suggesting, but I'll actually test that; for now I'm publishing the other responses.

Member:

Ah, you're right, we don't make a call to After() at all in this branch, nvm my comment then :)
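
A tiny standalone example of the distinction settled here (generic Go, not code from this PR): a long-lived buffered channel keeps a pending signal until it is drained, while time.After creates a fresh timer on every call, so a previous wait's timer cannot leak into the next one.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Long-lived signal channel, analogous to unschedulablePodChan: a signal
	// sent while the autoscaler was busy stays pending across iterations.
	signal := make(chan struct{}, 1)
	signal <- struct{}{} // arrives during a "busy" iteration

	// Next iteration: the stale signal is still there; drain it, otherwise it
	// would trigger an extra loop.
	select {
	case <-signal:
		fmt.Println("drained stale signal")
	default:
		fmt.Println("no pending signal")
	}

	// time.After is recreated for each wait, so nothing carries over from a
	// previous call.
	select {
	case <-signal:
		fmt.Println("new unschedulable pod signal")
	case <-time.After(10 * time.Millisecond):
		fmt.Println("scan interval passed")
	}
}
```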

@x13n (Member) commented Mar 15, 2024

/assign

@kawych (Contributor, Author) left a comment

Thanks, these are great suggestions.

Putting it on hold for now since I'll want to re-test it when approved.

/hold

@k8s-ci-robot added the do-not-merge/hold label (a /hold command was issued) on Mar 15, 2024
more frequently: based on new unschedulable pods and every time a
previous iteration was productive.
@x13n (Member) commented Mar 15, 2024

/lgtm
/approve

@k8s-ci-robot added the lgtm label ("Looks good to me", the PR is ready to be merged) on Mar 15, 2024
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kawych, x13n

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label (the PR has been approved by an approver from all required OWNERS files) on Mar 15, 2024
@kawych (Contributor, Author) commented Mar 15, 2024

I re-tested it, submitting.

/unhold

@k8s-ci-robot removed the do-not-merge/hold label on Mar 15, 2024
@k8s-ci-robot merged commit 109998d into kubernetes:master on Mar 15, 2024
6 checks passed