
chart: Add option to install VPA #5558

Merged: 2 commits merged from add-vpa-to-chart into kubernetes:master on Mar 27, 2023

Conversation

avorima
Contributor

@avorima avorima commented Mar 2, 2023

What type of PR is this?

/kind feature

What this PR does / why we need it:

I currently install a VPA for the cluster-autoscaler alongside the chart, but it would be nice to get it all from the same source.

Does this PR introduce a user-facing change?

Add chart option to install VPA

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 2, 2023
@k8s-ci-robot k8s-ci-robot requested a review from gjtempleton March 2, 2023 14:18
@k8s-ci-robot k8s-ci-robot requested a review from MaciekPytel March 2, 2023 14:18
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 2, 2023
Member

@gjtempleton gjtempleton left a comment


/assign @gjtempleton

A couple of small requests.

@@ -11,4 +11,4 @@ name: cluster-autoscaler
 sources:
 - https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler
 type: application
-version: 9.25.0
+version: 9.25.1
Member

We're adding new functionality here, so can we make this a minor version bump please?

Comment on lines 314 to 315
The chart can install a VPA for the Deployment if needed. A VPA can help minimize wasted resources when usage spikes periodically.
The following example snippet can be used to install an `autoscaling.k8s.io/v1` VPA.
Member

Can we link out to the VPA docs, please? There's a reasonable chance users will be reading this README disconnected from this repo and may want to find out more about the VPA.
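For readers following this discussion away from the diff, a minimal `autoscaling.k8s.io/v1` VPA targeting a Deployment looks roughly like this (a sketch only; the metadata and target names are placeholders, not the chart's exact template output):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: cluster-autoscaler-vpa    # placeholder name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cluster-autoscaler      # placeholder; the chart derives this from the release name
  updatePolicy:
    updateMode: "Auto"
```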

@avorima avorima requested review from gjtempleton and removed request for MaciekPytel March 7, 2023 08:59
Contributor

@voelzmo voelzmo left a comment


Thanks!

Some small comments about VPA configuration settings; the API version is the only thing I'd really recommend changing – the other two remarks are just meant to start a discussion. I wouldn't argue against merging this even if you decide to keep those as-is.

@@ -0,0 +1,20 @@
+{{- if .Values.vpa.enabled -}}
+apiVersion: "autoscaling.k8s.io/{{ .Values.vpa.apiVersion }}"
Contributor

I'm not sure we need optional support for v1beta2 – do we have good reasons for letting people use v1beta2 instead of v1?

The recent VPA release 0.13.0 deprecated this API version, which means it will stay around for a short while but is on a clear track to removal.
The v1 API version has been around since 2019, so I think it's safe to hardcode this template to v1.
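Hardcoding the version would reduce the template's opening lines to something like this (a fragment sketched from the diff above, not the final template):

```yaml
{{- if .Values.vpa.enabled -}}
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
```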

Contributor Author

Oh sure. I was actually trying to find out when it got released, but I must've missed those release notes.

# memory: 150Mi
# maxAllowed:
# cpu: 100m
# memory: 300Mi
Contributor

I'm not sure how much we want this commented-out bit to be a suggestion that people use these values, or whether it's just here for illustration. In practice, this is probably what users will configure as a starting point if they have no indication to use different values.

Given that the deployment template uses cpu: 100m and memory: 300Mi as starting points, it seems pretty strict to use the same values for maxAllowed – it would mean the CA could never be scaled up by VPA. In practice, the main reason I see people set maxAllowed is to ensure a workload cannot grow to the point where it no longer fits on a node. If we don't have another scenario in mind, I think we could put higher values into maxAllowed, or leave it out entirely.

For minAllowed, there's a similar case to be made: in our systems I see CAs working fine with cpu: 20m and memory: 50Mi, whereas this configuration doesn't allow VPA to scale down below cpu: 50m and memory: 150Mi, which wastes resources. It's still an improvement on the Deployment's defaults, but I'm not familiar enough with CA performance characteristics to say whether there's a threshold below which we wouldn't want VPA to shrink the resources.

Contributor Author

The values here were meant to illustrate the possible contents of a containerPolicy. I'll add a link to the spec instead.

I just checked some deployments and the lowest usage I saw was 2m/22Mi. I'd say 20m/50Mi is a good compromise that still allows some headroom. I'll add these to the example in the README and leave out the maxAllowed.
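Under that proposal, the README example's resource policy would look roughly like this (a sketch; the container name is a placeholder, and maxAllowed is omitted as discussed):

```yaml
resourcePolicy:
  containerPolicies:
    - containerName: cluster-autoscaler   # placeholder container name
      minAllowed:
        cpu: 20m
        memory: 50Mi
      # maxAllowed intentionally left out so VPA can grow the Deployment freely
```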

kind: Deployment
name: {{ template "cluster-autoscaler.fullname" . }}
updatePolicy:
updateMode: "Auto"
Contributor

I see people run their VPA in updateMode: "Off" for a while to observe whether anything weird happens before switching to updateMode: "Auto" – so maybe this is an option you'd want to make configurable?
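One hypothetical shape for such an option in values.yaml (the vpa.updateMode key is assumed here for illustration, not confirmed by the diff):

```yaml
vpa:
  enabled: true
  updateMode: "Off"   # observe recommendations first; switch to "Auto" once confident
```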

Contributor Author

I didn't think about that. I'll add it.

@avorima avorima force-pushed the add-vpa-to-chart branch from 9ca4889 to 29e2554 Compare March 9, 2023 11:23
@avorima avorima requested review from voelzmo and gjtempleton and removed request for gjtempleton and voelzmo March 9, 2023 11:29
Contributor

@voelzmo voelzmo left a comment


/lgtm thanks!

@gjtempleton
Member

Thanks for addressing the feedback.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 27, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: avorima, gjtempleton, voelzmo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 27, 2023
@k8s-ci-robot k8s-ci-robot merged commit 28a1abf into kubernetes:master Mar 27, 2023
@avorima avorima deleted the add-vpa-to-chart branch March 13, 2024 16:51