AEP for support of in-place updates for VPA #5755
This PR may require API review. If so, when the changes are ready, complete the pre-review checklist and request an API review. Status of requested reviews is tracked in the API Review project. |
c7ebf6c to dd5f6b5
Awesome, thanks for getting this Enhancement Proposal rolling! I have a few suggestions for changing the text, but mostly questions around the two `UpdateMode`s and how we're handling errors and the fallback to eviction.
vertical-pod-autoscaler/enhancements/4016-in-place-updates-support/README.md
For VPAs in `InPlaceOnly` and `InPlaceOrRecreate` modes VPA Admission Controller will apply updates to starting pods,
like it does for VPAs in `Initial`, `Auto`, and `Recreate` modes.

### Applying Disruption-free Updates
Are we doing something about the case where an in-place update becomes infeasible, because the Pod doesn't fit on this Node anymore? Otherwise it could happen that using in-place updates creates more problems for people's workloads.
We could e.g. react to events on the Pod, which should be generated by kubelet: "kubelet will generate Events on the Pod whenever a resize is accepted or rejected".
I wasn't sure where/how to find which events should be generated in this case, though?
FWIW I think VPA should react to both Infeasible (this is rather easy) and also to long-lasting Deferred status: "it fits on this node" might require evicting all other Pods and it might take a long time to actually achieve this (IIUC there is no controller which will just do it for VPA).
I'd like to make sure that a long-lasting Deferred status doesn't block VPA from actuating the recommendation to other Pods in the workload.
Does the "VPA updater will consider that the update failed if" part I added answer your question? Or were you thinking about something else (scale down in place but then can't scale up in place to the old size)?
In `InPlaceOrRecreate` mode (but not in `InPlaceOnly` mode) VPA updater will evict pod to actuate a recommendation if it
attempted to apply the recommendation in place and failed.
I'd like to have the order of updater actions explicitly stated.
I expected it would attempt in-place updates first but a) it's only my assumption b) I'm not sure it always would be the case.
I can see at least the following possibilities:
- all resources can be updated in-place and the change matches eviction criteria: attempt in-place, evict on error?
- some resources can be updated in-place, some not and the change matches eviction criteria: would VPA evict already, w/o attempting in-place update (this seems contradictory to "if necessary execute only partial updates" earlier), does this depend on which resource matches eviction criteria, other?
- all or some resources can be updated in-place but the change doesn't match eviction criteria: in-place only?
- no possibility for in-place update: go to the existing updater flow.
We also have priority ordering in updater, maybe it's worth adding a sentence how it would be changed?
I'm not sure I understand what you mean by "order of updater actions". Do you mean the order of operations in the updater's `RunOnce`?
Or do you mean that, since there are now up to 3 different things we could attempt for any given pod, which of them we actually do?
The things we could do:
1. Apply full update by eviction,
2. Apply full update in-place, trigger some container restarts,
3. Apply partial update in-place, don't restart any containers.
We definitely need to try 2. before attempting 1. Attempt 1. only if 2. failed.
If both 2. and 3. are an option I think we should do 2. (there's some reason we didn't do the smaller disruption-free update earlier so let's move quickly).
I'm not updating AEP yet because I'm not sure what to put there.
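One possible reading of this ordering, as a sketch only. It covers the three options listed above (eviction, full in-place update, partial in-place update) plus the rule that only `InPlaceOrRecreate` falls back to eviction. All names here are hypothetical, and falling back to a partial update after a failed full in-place update is an assumption the discussion does not settle:

```go
package main

import "fmt"

type UpdateMode string

const (
	InPlaceOnly       UpdateMode = "InPlaceOnly"
	InPlaceOrRecreate UpdateMode = "InPlaceOrRecreate"
)

// Action names are hypothetical labels for the three options in the comment.
type Action string

const (
	ActionEvict          Action = "Evict"          // option 1
	ActionInPlaceFull    Action = "InPlaceFull"    // option 2, may restart containers
	ActionInPlacePartial Action = "InPlacePartial" // option 3, no restarts
	ActionNone           Action = "None"
)

// nextAction picks what to attempt for one pod.
func nextAction(mode UpdateMode, canFullInPlace, fullInPlaceFailed, canPartialInPlace bool) Action {
	if canFullInPlace && !fullInPlaceFailed {
		// Try option 2 before option 1, and prefer it over option 3.
		return ActionInPlaceFull
	}
	if fullInPlaceFailed && mode == InPlaceOrRecreate {
		// Attempt eviction only after the in-place attempt failed.
		return ActionEvict
	}
	if canPartialInPlace {
		// Assumption: a disruption-free partial update is the remaining fallback.
		return ActionInPlacePartial
	}
	return ActionNone
}

func main() {
	fmt.Println(nextAction(InPlaceOrRecreate, true, false, true)) // InPlaceFull
	fmt.Println(nextAction(InPlaceOrRecreate, true, true, false)) // Evict
	fmt.Println(nextAction(InPlaceOnly, true, true, true))        // InPlacePartial
	fmt.Println(nextAction(InPlaceOnly, true, true, false))       // None
}
```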
2. for `InPlaceOrRecreate` and 3. for `InPlaceOnly`?
2. for both `InPlaceOrRecreate` and `InPlaceOnly`
@jbartosik, thanks for starting this! I added some comments on top of @voelzmo's reviews (thanks!) - nothing major but I think it's worth discussing some finer points to avoid misunderstanding (or getting wrong expectations).
/lgtm
Thanks for addressing all the comments! I've just added a few comments regarding link formatting for your consideration, otherwise this looks good to me!
(like applying different recommendations during pod initialization) will be introduced as separate enhancement
proposals.

[in-place update feature](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1287-in-place-update-pod-resources)
[in-place update feature](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1287-in-place-update-pod-resources)
[in-place update feature]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1287-in-place-update-pod-resources
Done in #5877
proposals.

[in-place update feature](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1287-in-place-update-pod-resources)
[available in Kubernetes 1.27.](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.27.md#api-change-3)
[available in Kubernetes 1.27.](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.27.md#api-change-3)
[available in Kubernetes 1.27.]: https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.27.md#api-change-3
Done in #5877
[`UpdatePriorityCalculator.AddPod`]: https://github.com/kubernetes/autoscaler/blob/114a35961a85efdf3f36859350764e5e2c0c7013/vertical-pod-autoscaler/pkg/updater/priority/update_priority_calculator.go#L81
[by default 12h]: https://github.com/kubernetes/autoscaler/blob/114a35961a85efdf3f36859350764e5e2c0c7013/vertical-pod-autoscaler/pkg/updater/priority/update_priority_calculator.go#L35
[by default 10%]: https://github.com/kubernetes/autoscaler/blob/114a35961a85efdf3f36859350764e5e2c0c7013/vertical-pod-autoscaler/pkg/updater/priority/update_priority_calculator.go#L33
[Outside recommendation range]: https://github.com/kubernetes/autoscaler/blob/114a35961a85efdf3f36859350764e5e2c0c7013/vertical-pod-autoscaler/pkg/updater/priority/priority_processor.go#L73
[Outside recommendation range]: https://github.com/kubernetes/autoscaler/blob/114a35961a85efdf3f36859350764e5e2c0c7013/vertical-pod-autoscaler/pkg/updater/priority/priority_processor.go#L73
[Outside recommended range]: https://github.com/kubernetes/autoscaler/blob/114a35961a85efdf3f36859350764e5e2c0c7013/vertical-pod-autoscaler/pkg/updater/priority/priority_processor.go#L73
Done in #5877
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jbartosik, voelzmo The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Thank you all for your work on this 🤗 Excited to try it out!
/kind documentation
/kind feature
Add Autoscaling Enhancement Proposal to add basic support for in-place updates to VPA.
See also #5754 for changes I propose to the eviction control enhancement proposal to work better with in-place updates.
#4016
@voelzmo @kgolab @pbetkier @wangchen615 please take a look