
VPA Helm Chart: Updater error: fail to get pod controller, node is not a valid owner #656

Closed
sarg3nt opened this issue May 16, 2024 · 4 comments

sarg3nt commented May 16, 2024

I'm deploying the latest chart with no configuration changes and am finding that kube-system/vertical-pod-autoscaler-updater is throwing the following errors:

vertical-pod-autoscaler-updater-bdcd45465-qgdh4 E0516 17:03:33.841287       1 api.go:153] fail to get pod controller: pod=kube-proxy-lpul-vault-k8s-server-0.vault.ad.selinc.com err=Unhandled targetRef v1 / Node / lpul-vault-k8s-server-0.vault.ad.selinc.com, last error node is not a valid owner
vertical-pod-autoscaler-updater-bdcd45465-qgdh4 E0516 17:03:33.841304       1 api.go:153] fail to get pod controller: pod=cloud-controller-manager-lpul-vault-k8s-server-2.vault.ad.selinc.com err=Unhandled targetRef v1 / Node / lpul-vault-k8s-server-2.vault.ad.selinc.com, last error node is not a valid owner
vertical-pod-autoscaler-updater-bdcd45465-qgdh4 E0516 17:03:33.841316       1 api.go:153] fail to get pod controller: pod=kube-apiserver-lpul-vault-k8s-server-0.vault.ad.selinc.com err=Unhandled targetRef v1 / Node / lpul-vault-k8s-server-0.vault.ad.selinc.com, last error node is not a valid owner
vertical-pod-autoscaler-updater-bdcd45465-qgdh4 E0516 17:03:33.841325       1 api.go:153] fail to get pod controller: pod=kube-apiserver-lpul-vault-k8s-server-2.vault.ad.selinc.com err=Unhandled targetRef v1 / Node / lpul-vault-k8s-server-2.vault.ad.selinc.com, last error node is not a valid owner 
vertical-pod-autoscaler-updater-bdcd45465-qgdh4 E0516 17:03:33.841331       1 api.go:153] fail to get pod controller: pod=kube-proxy-lpul-vault-k8s-server-2.vault.ad.selinc.com err=Unhandled targetRef v1 / Node / lpul-vault-k8s-server-2.vault.ad.selinc.com, last error node is not a valid owner 
etc.
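
Every failing pod above has a targetRef of kind Node. One way to inspect the owner directly (a sketch, reusing a pod name from the logs):

kubectl -n kube-system get pod kube-proxy-lpul-vault-k8s-server-0.vault.ad.selinc.com \
  -o jsonpath='{.metadata.ownerReferences[0].kind}'
# static (mirror) pods print "Node" here, and VPA cannot treat a Node as a controller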

The only pods the updater does seem to process are in the kube-system namespace:

vertical-pod-autoscaler-updater-bdcd45465-qgdh4 I0516 17:03:33.841558       1 pods_eviction_restriction.go:226] too few replicas for ReplicaSet kube-system/rke2-snapshot-controller-59cc9cd8f4. Found 1 live pods, needs 2 (global 2) 
vertical-pod-autoscaler-updater-bdcd45465-qgdh4 I0516 17:03:33.841585       1 pods_eviction_restriction.go:226] too few replicas for ReplicaSet kube-system/rke2-snapshot-validation-webhook-54c5989b65. Found 1 live pods, needs 2 (global 2) 
vertical-pod-autoscaler-updater-bdcd45465-qgdh4 I0516 17:03:33.841604       1 pods_eviction_restriction.go:226] too few replicas for ReplicaSet kube-system/rke2-metrics-server-655477f655. Found 1 live pods, needs 2 (global 2)        

I have Terraform that deploys the raw manifests and that works fine, but I would like to switch to your Helm chart, which is not working.
I tried comparing the ClusterRoles the raw manifests deploy against the ones from the Helm chart, but they are so different that the comparison is difficult (one approach is sketched below).

In any case, this does not appear to work for us.
Could it be the Kubernetes version?
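
To sketch that comparison: list the VPA-related roles from each install, dump them to files, and diff (the file names are placeholders):

kubectl get clusterrole,clusterrolebinding -o name | grep -i vpa
# dump each matching object with kubectl get <name> -o yaml, once per install, then:
diff raw-manifest-roles.yaml helm-chart-roles.yaml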

Specs:

  • Kubernetes: v1.28.9+rke2r1
  • OS: Rocky Linux
  • Deployment: Terraform Helm Provider
@sebastien-prudhomme
Contributor

Hi @sarg3nt, it seems to be related to this bug in the latest version of the app: kubernetes/autoscaler#6808. Can you try the chart at version 9.7.0?
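
For example, to pin the chart version (a sketch; the repo alias and URL assume the cowboysysop charts repository that publishes this chart):

helm repo add cowboysysop https://cowboysysop.github.io/charts/
helm upgrade --install vertical-pod-autoscaler cowboysysop/vertical-pod-autoscaler \
  --namespace kube-system --version 9.7.0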

@sebastien-prudhomme
Contributor

It should be fixed by #657


sarg3nt commented May 17, 2024

I tried the new version and am getting the same error.
I confirmed the updater is now at 1.1.2:

autoscaling/vpa-updater:1.1.2

E0517 23:33:11.844076       1 api.go:153] fail to get pod controller: pod=cloud-controller-manager-lpul-vault-k8s-server-1.vault.ad.selinc.com err=Unhandled targetRef v1 / Node / lpul-vault-k8s-server-1.vault.ad.selinc.com, last error node is not a valid owner


sarg3nt commented May 17, 2024

Update:
I noticed my custom deployment logs those errors for the kube-system static pods as well, so I think that is normal; it makes sense given that static pods are owned by the Node rather than by a controller the updater can act on.
However, the Helm chart deployment only tries to update workloads in the kube-system namespace, whereas my custom deployment updates everything.

I'm not seeing a config option that would limit it to the kube-system namespace. Am I missing something?
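
One way to check whether the updater is being scoped (a sketch, assuming the deployment name from the logs; if I read the upstream flags correctly, vpa-updater accepts a --vpa-object-namespace flag that restricts it to a single namespace, so it's worth seeing whether the chart sets it):

kubectl -n kube-system get deployment vertical-pod-autoscaler-updater \
  -o jsonpath='{.spec.template.spec.containers[0].args}'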

Also, when I deploy the Helm chart with Terraform, the VPA resources fail to deploy. It's as if the Helm release finishes installing before the CRDs are fully registered. When I install my custom version this doesn't happen, and I have the same Terraform depends_on logic in place, so I'm not sure why it behaves differently.
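
One way to rule out that race is to wait for the CRDs to be established before creating any VPA objects (a sketch; these are the upstream VPA CRD names):

kubectl wait --for condition=established --timeout=60s \
  crd/verticalpodautoscalers.autoscaling.k8s.io \
  crd/verticalpodautoscalercheckpoints.autoscaling.k8s.io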

A question as well: how does the chart handle certificate renewal? Does it renew the certs automatically on chart upgrade, or are they going to expire?
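
For reference, one way to check when the admission controller's webhook certificate expires (a sketch; the secret name varies by chart, so <secret-name> is a placeholder):

kubectl -n kube-system get secret <secret-name> -o jsonpath='{.data.tls\.crt}' \
  | base64 -d | openssl x509 -noout -enddate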
