Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VPA: Support vertical scaling of injected containers #5617

Open
jutley opened this issue Mar 24, 2023 · 7 comments
Open

VPA: Support vertical scaling of injected containers #5617

jutley opened this issue Mar 24, 2023 · 7 comments
Labels
area/vertical-pod-autoscaler kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@jutley
Copy link

jutley commented Mar 24, 2023

Which component are you using?:

Vertical Pod Autoscaler

Is your feature request designed to solve a problem? If so describe the problem this feature should solve.:

There are scenarios where we inject sidecar containers into many pods. There is not a one-size-fits-all resource allocation for these pods, so I really want to be able to start with a generic value, and then use a VPA to tune these to a more appropriate value for the workload. Currently, injected containers seem to be entirely ignored, and there is no way to explicit ask for them to be configured.

Describe the solution you'd like.:

I would like to be able to configure either the VPA pods or the VPA resources in a way that allows us to include these injected containers in the VPA scaling process. I think the best option may be in the VPA pods. Since container injection is generally a cluster-admin responsibility, it makes sense that we would want to introduce this configuration at a level maintained by cluster admins.

I can imagine a new flag being introduced for the recommender like --injected-sidecar-allowlist which enumerates the container names which we want to support VPA scaling for.

Describe any alternative solutions you've considered.:

We can augment the logic of our container injections to try to set more appropriate values for their resources, but this is very hard! After all, this is the point of the VPA existing in the first place.

@jutley jutley added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 24, 2023
@voelzmo
Copy link
Contributor

voelzmo commented Mar 29, 2023

Hey @jutley, thanks for the detailed description of what you're trying to achieve!

While I'm not exactly sure, why the feature to ignore injected sidecar containers was introduced in the first place, I'm sure there was a good reason for it. @jbartosik @krzysied, can you maybe remember more context about why calculating recommendations for injected sidecars can be problematic?

That being said, I think it should be possible to

  • either introduce a parameter for the admission-webhook and pass it into the ObservedContainersCalculator, so it can include the sidecar container's name when calculating the patch for the observedContainers annotation. This would lead to all VPAs having this container name in their annotation, even if the Pod doesn't have the injected sidecar container
  • or create a new annotation on the Pod or the VPA (like vpaObservedSidecarContainers) which contains a list of names to added when calculating the patch for the observedContainers annotation. With this solution, you'd have to add an additional annotation to all Pods that get the sidecar injected. That allows you to ensure only the Pods that actually have the sidecar will get an vpaObservedContainers annotation with the sidecar container's name in it – but it makes it also possible to forget adding the new annotation and end up wondering why the sidecar is there but doesn't get any resource updates.

If there's no general case to be made why we shouldn't update sidecar container resources at all, I think we can add something like sketched out above, WDYT @jbartosik?

@jbartosik
Copy link
Collaborator

jbartosik commented May 9, 2023

The reason for #2795 (VPA ignoring injected sidecars) is that some sidecars might be injected by webhooks acting after VPA admission controller. This is a problem because in that case VPA admission controller won't change requests of the injected side car. So when the pod starts VPA will evict the pod (because the injected side car has unadjusted request), causing an eviction loop.

I have a few ideas for how to handle this:

EDIT: reinvocationPolicy was beta in 1.15, but I can't find quickly:

  • Why we decided not to use it,
  • When it went GA / if it went GA (if it wasn't GA in 2020-02 maybe that's the reason we didn't use it?).

@jutley
Copy link
Author

jutley commented May 15, 2023

That reasoning makes a lot of sense. The reinvocationPolicy configuration seems like it should do the trick, and I don't see any reason that adding it would cause issues. The key requirements is that the mutation functionality must be idempotent, and this webhook just changes existing fields.

I can't find any reference to when this feature graduated from beta to GA, but the official docs on admission controllers reference this feature without any mention of it being in beta. I think it is reasonable to assume that it is GA.
https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/

@Mason-776
Copy link

For reference, it looks like reinvocationPolicy was added in v1.15.0-beta.2 and graduated from v1beta1 to v1 in v1.16.0-alpha.1

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 20, 2024
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 19, 2024
@voelzmo
Copy link
Contributor

voelzmo commented Feb 21, 2024

/remove-lifecycle rotten
/lifecycle frozen

This is still a thing. My understanding is that we should offer a switch to the admission-controller to disable adding the vpaObservedContainers annotation and setting reinvocationPolicy: IfNeeded.

@jutley On a sidenote: I think you can achieve the desired behavior already now, if you ensure that the vpa webhook will be called after the webhook that injects the container. This way, the container will be part of the vpaObservedContainers annotation and everything will behave as with the other containers.

Currently, the advice is the exact opposite to your situation: we want people to make sure that the vpa webhook is executed before all injecting webhooks, such that the feature of ignoring injected sidecars works properly.

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Feb 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/vertical-pod-autoscaler kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.
Projects
None yet
Development

No branches or pull requests

6 participants