KEP-1287: InPlacePodVerticalScaling changes for v1.33 #5089
/assign @dchen1107 @thockin
/assign @vinaykul
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: tallclair

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/cc
@@ -216,6 +216,8 @@ PodStatus is extended to show the resources applied to the Pod and its Container
* Pod.Status.ContainerStatuses[i].Resources (new field, type
v1.ResourceRequirements) shows the **actual** resources held by the Pod and
its Containers for running containers, and the allocated resources for non-running containers.
* Pod.Status.AllocatedResources (new field) reports the aggregate pod-level allocated resources,
Does introducing a new field and turning a feature on for beta conflict with normal API changes?
I would think that a new API would need to go through Alpha -> Beta -> GA.
Correct, the InPlacePodVerticalScalingAllocatedStatus feature gate was introduced in v1.32 for this reason. Please check https://github.com/kubernetes/kubernetes/blob/7140b4910c6c1179c9778a7f3bb8037356febd58/pkg/api/pod/util.go#L806
Yes, good point. I'll update the KEP with these details.
I read it and I think you did add it. I just missed it with this update.
It was mixed up between the pod-level field and the container-level field. I dropped the pod-level field for now, and I'm proposing adding back the container-level field. We'll still need the pod-level field for resizing pod-level resources, but that will be addressed in a separate KEP update.
* NotRequired - default value; resize the Container without restart, if possible.
* RestartContainer - the container requires a restart to apply new resource values.
* `PreferNoRestart` - default value; resize the Container without restart, if possible.
  * `NotRequired` - Equivalent to `PreferNoRestart`, deprecated with v1.33.
nit: Unnecessary whitespace
This was intentional, to group it under PreferNoRestart
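For readers following this thread, the relationship between the two values can be sketched in Go. This is only an illustration of the wording in the quoted diff (`NotRequired` is equivalent to `PreferNoRestart`, and `PreferNoRestart` is the default); the type, constants, and `normalizeRestartPolicy` function are hypothetical names, not the Kubernetes API types or defaulting code.

```go
package main

import "fmt"

// restartPolicy is a hypothetical stand-in for the resize restart policy value.
type restartPolicy string

const (
	preferNoRestart  restartPolicy = "PreferNoRestart"
	notRequired      restartPolicy = "NotRequired" // deprecated alias, per the quoted diff
	restartContainer restartPolicy = "RestartContainer"
)

// normalizeRestartPolicy maps the deprecated NotRequired value, and an unset
// value, to PreferNoRestart. Any other value is returned unchanged.
func normalizeRestartPolicy(p restartPolicy) restartPolicy {
	switch p {
	case "", notRequired:
		return preferNoRestart
	default:
		return p
	}
}

func main() {
	for _, p := range []restartPolicy{"", notRequired, preferNoRestart, restartContainer} {
		fmt.Printf("%q -> %q\n", p, normalizeRestartPolicy(p))
	}
}
```

Treating the two names as one value at a single normalization point is an assumption of this sketch; the KEP text quoted here only states that they are equivalent.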
`Status...Resources.Requests` values.
To compute the Node resources allocated to Pods, pending resizes must be factored in.
The scheduler will use the maximum of:
1. Desired resources, computed from container requests in the pod spec, unless the resize is marked as `Infeasible`
nit: The numbered list items all use 1. Is this on purpose, or was the intention to use 1, 2, 3 to indicate order?
These are rendered as 1,2,3 in markdown.
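The "maximum of" rule in the quoted KEP text can be illustrated with a small Go sketch. Only the first candidate (desired resources, skipped when the resize is marked `Infeasible`) is visible in the excerpt above, so the remaining candidates are passed in as an opaque list here. All names (`milliCPU`, `effectiveRequest`) are hypothetical and this is not the scheduler's actual code.

```go
package main

import "fmt"

// milliCPU is a hypothetical stand-in for a single resource quantity.
type milliCPU int64

// effectiveRequest returns the largest candidate value.
// desired is ignored when the resize has been marked Infeasible.
// others holds the remaining candidates from the KEP's list, which
// the excerpt in this thread does not show.
func effectiveRequest(desired milliCPU, infeasible bool, others ...milliCPU) milliCPU {
	var largest milliCPU
	if !infeasible {
		largest = desired
	}
	for _, v := range others {
		if v > largest {
			largest = v
		}
	}
	return largest
}

func main() {
	// Pending resize from 1000m to 1500m: the larger value is counted.
	fmt.Println(effectiveRequest(1500, false, 1000)) // 1500
	// Same resize marked Infeasible: the desired value is dropped.
	fmt.Println(effectiveRequest(1500, true, 1000)) // 1000
}
```

Taking the maximum means a pending scale-up reserves the larger amount on the node, while a pending scale-down does not free capacity until it is applied.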
- Metric name: `runtime_operations_duration_seconds{operation_type=container_update}`
- Components exposing the metric: kubelet
- Metric name: `runtime_operations_errors_total{operation_type=container_update}`
- Components exposing the metric: kubelet

* **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**

- Using `kubelet_container_resize_requests_total`, `completed + infeasible + canceled` request count
should approach `proposed` request count in steady state.
- Resize requests should succeed (`apiserver_request_total{resource=pods,subresource=resize}` with non-success `code` should be low))
Unnecessary close bracket.
nit: perhaps rephrase to:
> Resize requests should succeed (`apiserver_request_total{resource=pods,subresource=resize}` non-success `code` rate should be low)
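As a rough illustration of the first SLO in the quoted diff, the check that `completed + infeasible + canceled` approaches `proposed` could be expressed as a ratio over counter snapshots. The struct and function below are hypothetical; how the values of `kubelet_container_resize_requests_total` are collected, and which label distinguishes the states, is not shown in this excerpt and is left out of the sketch.

```go
package main

import "fmt"

// resizeCounts holds hypothetical snapshots of the resize request counter,
// one value per request state named in the SLO text.
type resizeCounts struct {
	Proposed   float64
	Completed  float64
	Infeasible float64
	Canceled   float64
}

// settledRatio returns (completed + infeasible + canceled) / proposed.
// A value close to 1 means nearly every proposed resize reached a final state.
// With no proposed requests there is nothing outstanding, so it returns 1.
func settledRatio(c resizeCounts) float64 {
	if c.Proposed == 0 {
		return 1
	}
	return (c.Completed + c.Infeasible + c.Canceled) / c.Proposed
}

func main() {
	c := resizeCounts{Proposed: 200, Completed: 180, Infeasible: 12, Canceled: 6}
	fmt.Printf("settled ratio: %.2f\n", settledRatio(c)) // settled ratio: 0.99
}
```

The threshold at which the ratio should alert is a policy choice the quoted text does not specify.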
If we need allocated resources & limits in the pod status API, the following options have been
If we need allocated limits in the pod status API, the following options have been
Since the intention is to state the motivation, should we rephrase to:
> If we need to track allocated requests and limits in the pod status API, the following options have been
InPlacePodVerticalScaling changes for v1.33, including:
- `Proposed` resize status (background)
- `NotRequired` to `PreferNoRestart`, and update CRI `UpdateContainerResources` contract
- `AllocatedResources` ([FG:InPlacePodVerticalScaling] Inconsistency between scheduler & kubelet admission logic kubernetes#129532)

/sig node
/milestone v1.33