The latest VPA app v5.2.1 is broken #3421
Comments
I think this has been fixed upstream (kubernetes/autoscaler#6763), but we need to test the update.
Unfortunately, VPA 1.1.1 does not fix this issue for us; we still see the same behaviour, with the vpa-updater pod crashing.
The issue is still there; it is this one: kubernetes/autoscaler#6808
The issue was fixed in upstream VPA 1.1.2, which was released with our VPA app v5.2.2. It is safe to upgrade VPA and the VPA CRDs to their latest versions as of this date (v5.2.2 and v3.1.0).
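For anyone checking clusters against this, here is a minimal sketch of a version check. It assumes app versions are plain `vMAJOR.MINOR.PATCH` strings and treats v5.2.2 as the first VPA app release with the fix, as stated above. The names `parse_version` and `includes_fix` are hypothetical helpers, not part of any Giant Swarm or VPA tooling.

```python
import re

# Assumption: per the comment above, VPA app v5.2.2 is the first release
# that ships the upstream fix (VPA 1.1.2).
FIXED_APP_VERSION = (5, 2, 2)


def parse_version(version: str) -> tuple[int, int, int]:
    """Parse 'v5.2.1' or '5.2.1' into a comparable (major, minor, patch) tuple."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def includes_fix(app_version: str) -> bool:
    """Return True if the given VPA app version is at or past the fixed release."""
    return parse_version(app_version) >= FIXED_APP_VERSION
```

Note that `includes_fix("v5.1.0")` returns `False` even though v5.1.0 was reported to work: that version predates the regression, so the helper only answers whether a version is at or beyond the fixed release, not whether it is affected.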
Summary
VPA app v5.2.1 pulls in upstream VPA v1.1.0, which contains this change, which I believe is not working properly (or it has uncovered some existing issues on our side).
I have tested this on the CAPA MC golem, where the VPA updater was crashlooping in clusters that use vertical-pod-autoscaler-app v5.2.1, and the error can be traced to the previously mentioned upstream VPA change. The test clusters were deployed with this cluster-aws PR, where the default apps are in cluster+cluster-aws and the VPA app is on the latest (I think broken) version. Using VPA app v5.1.0 worked without issues.
The VPA app has already been updated in default-apps-aws in giantswarm/default-apps-aws#455, but luckily that has not yet been released (so it is not yet used in e2e tests, which is why we have not seen the effects of the issue yet). I believe that this e2e test failure was a genuine one. The e2e tests eventually passed there because, although the VPA updater is crashlooping, each time it restarts it is ready and running for some time.
Logs
These are the vertical-pod-autoscaler-updater logs after creating the cluster (confirmed multiple times in different clusters):
Mitigation
Tasks
Fixing the issue
Tasks