Scale up from 0 does not work with existing AWS EBS CSI PersistentVolume #3845
Comments
Same problem here. (Edit after realizing there is no relevant difference between my previous post and what you wrote.) After doing some spelunking, I think you are correct: it has something to do with scaling from 0 and usage of the
I can do some footwork in Terraform to get the tags set up. Not sure what you're using to provision your cluster. Though, it would be nice to have the labels generated from the list of AZs assigned to an ASG.
How do we resolve this issue for StatefulSet deployments with custom storage classes attached, on EKS?
So just set
for example? Like the OP mentions, how are we supposed to do this for multiple AZs?
I have this exact problem too. To add further info, the error I get on the pod that is unable to scale from zero is:
@FarhanSajid1 you should have one node group (and thus one ASG) for each AZ. The above tag needs to be applied to the ASG.
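For reference, a minimal sketch of applying such tags with the AWS CLI, assuming one ASG per AZ. The ASG name and zone below are placeholders; the keys shown are the standard zone label plus the EBS CSI zone label mentioned later in this thread, and the exact set you need depends on your PV's node affinity.

# Tag the ASG that lives in eu-west-1a (placeholder name and zone) so the
# cluster-autoscaler can infer the zone labels of a node it would create
# when scaling the group up from 0.
aws autoscaling create-or-update-tags --tags \
  "ResourceId=my-nodegroup-eu-west-1a,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/node-template/label/topology.kubernetes.io/zone,Value=eu-west-1a,PropagateAtLaunch=true" \
  "ResourceId=my-nodegroup-eu-west-1a,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/node-template/label/topology.ebs.csi.aws.com/zone,Value=eu-west-1a,PropagateAtLaunch=true"

The same two tags would be repeated on the ASGs for the other AZs, each with its own zone value.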
Hi Folks! Facing the same issue. Cluster-autoscaler logs: I0920 17:30:00.585954 1 scheduler_binder.go:803] Could not get a CSINode object for the node "ip-10-121-173-251.ap-south-1.compute.internal": csinode.storage.k8s.io "ip-10-121-173-251.ap-south-1.compute.internal" not found
I0920 17:30:00.586008 1 scheduler_binder.go:823] PersistentVolume "pvc-50c002d3-a5cc-4143-adf2-1362d18fc40e", Node "ip-10-121-173-251.ap-south-1.compute.internal" mismatch for Pod "kafka/kafka-0": no matching NodeSelectorTerms
I0920 17:30:00.586074 1 scheduler_binder.go:803] Could not get a CSINode object for the node "ip-10-121-68-79.ap-south-1.compute.internal": csinode.storage.k8s.io "ip-10-121-68-79.ap-south-1.compute.internal" not found
I0920 17:30:00.586107 1 scheduler_binder.go:823] PersistentVolume "pvc-31af46c4-0d27-4eea-8ef6-148bbb2b4f0b", Node "ip-10-121-68-79.ap-south-1.compute.internal" mismatch for Pod "kafka/kafka-0": no matching NodeSelectorTerms
I0920 17:30:00.586149 1 scheduler_binder.go:803] Could not get a CSINode object for the node "ip-10-121-162-179.ap-south-1.compute.internal": csinode.storage.k8s.io "ip-10-121-162-179.ap-south-1.compute.internal" not found
I0920 17:30:00.586172 1 scheduler_binder.go:823] PersistentVolume "pvc-50c002d3-a5cc-4143-adf2-1362d18fc40e", Node "ip-10-121-162-179.ap-south-1.compute.internal" mismatch for Pod "kafka/kafka-0": no matching NodeSelectorTerms
I0920 17:30:00.586247 1 scheduler_binder.go:803] Could not get a CSINode object for the node "ip-10-121-241-242.ap-south-1.compute.internal": csinode.storage.k8s.io "ip-10-121-241-242.ap-south-1.compute.internal" not found
I0920 17:30:00.586275 1 scheduler_binder.go:823] PersistentVolume "pvc-50c002d3-a5cc-4143-adf2-1362d18fc40e", Node "ip-10-121-241-242.ap-south-1.compute.internal" mismatch for Pod "kafka/kafka-0": no matching NodeSelectorTerms
I0920 17:30:00.586328 1 scheduler_binder.go:803] Could not get a CSINode object for the node "ip-10-121-5-204.ap-south-1.compute.internal": csinode.storage.k8s.io "ip-10-121-5-204.ap-south-1.compute.internal" not found
I0920 17:30:00.586350 1 scheduler_binder.go:823] PersistentVolume "pvc-31af46c4-0d27-4eea-8ef6-148bbb2b4f0b", Node "ip-10-121-5-204.ap-south-1.compute.internal" mismatch for Pod "kafka/kafka-0": no matching NodeSelectorTerms
I0920 17:30:00.586533 1 scheduler_binder.go:803] Could not get a CSINode object for the node "ip-10-121-173-251.ap-south-1.compute.internal": csinode.storage.k8s.io "ip-10-121-173-251.ap-south-1.compute.internal" not found
I0920 17:30:00.586572 1 scheduler_binder.go:823] PersistentVolume "pvc-0c9887c2-eea3-4ef7-baae-c4c0aca78699", Node "ip-10-121-173-251.ap-south-1.compute.internal" mismatch for Pod "kafka/kafka-1": no matching NodeSelectorTerms
I0920 17:30:00.586622 1 scheduler_binder.go:803] Could not get a CSINode object for the node "ip-10-121-68-79.ap-south-1.compute.internal": csinode.storage.k8s.io "ip-10-121-68-79.ap-south-1.compute.internal" not found
I0920 17:30:00.586663 1 scheduler_binder.go:823] PersistentVolume "pvc-df590cf4-a584-4842-9842-9629312c0e45", Node "ip-10-121-68-79.ap-south-1.compute.internal" mismatch for Pod "kafka/kafka-1": no matching NodeSelectorTerms
I0920 17:30:00.586711 1 scheduler_binder.go:803] Could not get a CSINode object for the node "ip-10-121-162-179.ap-south-1.compute.internal": csinode.storage.k8s.io "ip-10-121-162-179.ap-south-1.compute.internal" not found
I0920 17:30:00.586737 1 scheduler_binder.go:823] PersistentVolume "pvc-0c9887c2-eea3-4ef7-baae-c4c0aca78699", Node "ip-10-121-162-179.ap-south-1.compute.internal" mismatch for Pod "kafka/kafka-1": no matching NodeSelectorTerms
I0920 17:30:00.586802 1 scheduler_binder.go:803] Could not get a CSINode object for the node "ip-10-121-241-242.ap-south-1.compute.internal": csinode.storage.k8s.io "ip-10-121-241-242.ap-south-1.compute.internal" not found
I0920 17:30:00.586827 1 scheduler_binder.go:823] PersistentVolume "pvc-0c9887c2-eea3-4ef7-baae-c4c0aca78699", Node "ip-10-121-241-242.ap-south-1.compute.internal" mismatch for Pod "kafka/kafka-1": no matching NodeSelectorTerms
I0920 17:30:00.586869 1 scheduler_binder.go:803] Could not get a CSINode object for the node "ip-10-121-5-204.ap-south-1.compute.internal": csinode.storage.k8s.io "ip-10-121-5-204.ap-south-1.compute.internal" not found
I0920 17:30:00.586907 1 scheduler_binder.go:823] PersistentVolume "pvc-df590cf4-a584-4842-9842-9629312c0e45", Node "ip-10-121-5-204.ap-south-1.compute.internal" mismatch for Pod "kafka/kafka-1": no matching NodeSelectorTerms
I0920 17:30:00.586929 1 filter_out_schedulable.go:170] 0 pods were kept as unschedulable based on caching
I0920 17:30:00.586938 1 filter_out_schedulable.go:171] 0 pods marked as unschedulable can be scheduled.
I0920 17:30:00.586952 1 filter_out_schedulable.go:82] No schedulable pods
I0920 17:30:00.586966 1 klogx.go:86] Pod kafka/kafka-0 is unschedulable
I0920 17:30:00.586972 1 klogx.go:86] Pod kafka/kafka-1 is unschedulable
I0920 17:30:00.587014 1 scale_up.go:376] Upcoming 0 nodes
I0920 17:30:00.587153 1 scheduler_binder.go:803] Could not get a CSINode object for the node "template-node-for-eks-atlan-node-kafka-pod-spot-20220920151848482300000005-36c1adbd-7aef-51ce-830e-d848e9f27e09-6789034556239763083": csinode.storage.k8s.io "template-node-for-eks-atlan-node-kafka-pod-spot-20220920151848482300000005-36c1adbd-7aef-51ce-830e-d848e9f27e09-6789034556239763083" not found
I0920 17:30:00.587188 1 scheduler_binder.go:823] PersistentVolume "pvc-31af46c4-0d27-4eea-8ef6-148bbb2b4f0b", Node "template-node-for-eks-atlan-node-kafka-pod-spot-20220920151848482300000005-36c1adbd-7aef-51ce-830e-d848e9f27e09-6789034556239763083" mismatch for Pod "kafka/kafka-0": no matching NodeSelectorTerms
I0920 17:30:00.587210 1 scale_up.go:300] Pod kafka-0 can't be scheduled on eks-atlan-node-kafka-pod-spot-20220920151848482300000005-36c1adbd-7aef-51ce-830e-d848e9f27e09, predicate checking error: node(s) had volume node affinity conflict; predicateName=VolumeBinding; reasons: node(s) had volume node affinity conflict; debugInfo=
I0920 17:30:00.587316 1 scheduler_binder.go:803] Could not get a CSINode object for the node "template-node-for-eks-atlan-node-kafka-pod-spot-20220920151848482300000005-36c1adbd-7aef-51ce-830e-d848e9f27e09-6789034556239763083": csinode.storage.k8s.io "template-node-for-eks-atlan-node-kafka-pod-spot-20220920151848482300000005-36c1adbd-7aef-51ce-830e-d848e9f27e09-6789034556239763083" not found
I0920 17:30:00.587361 1 scheduler_binder.go:823] PersistentVolume "pvc-df590cf4-a584-4842-9842-9629312c0e45", Node "template-node-for-eks-atlan-node-kafka-pod-spot-20220920151848482300000005-36c1adbd-7aef-51ce-830e-d848e9f27e09-6789034556239763083" mismatch for Pod "kafka/kafka-1": no matching NodeSelectorTerms
I0920 17:30:00.587386 1 scale_up.go:300] Pod kafka-1 can't be scheduled on eks-atlan-node-kafka-pod-spot-20220920151848482300000005-36c1adbd-7aef-51ce-830e-d848e9f27e09, predicate checking error: node(s) had volume node affinity conflict; predicateName=VolumeBinding; reasons: node(s) had volume node affinity conflict; debugInfo=
I0920 17:30:00.587417 1 scale_up.go:449] No pod can fit to eks-atlan-node-kafka-pod-spot-20220920151848482300000005-36c1adbd-7aef-51ce-830e-d848e9f27e09
Our pods are in pending state due to a volume node affinity conflict. Describe output for the kafka pods:
LAST SEEN TYPE REASON OBJECT MESSAGE
6m52s Warning FailedScheduling pod/kafka-0 0/5 nodes are available: 5 node(s) had volume node affinity conflict.
6m52s Warning FailedScheduling pod/kafka-1 0/5 nodes are available: 5 node(s) had volume node affinity conflict.
73s Normal NotTriggerScaleUp pod/kafka-0 pod didn't trigger scale-up: 1 node(s) had volume node affinity conflict
73s Normal NotTriggerScaleUp pod/kafka-1 pod didn't trigger scale-up: 1 node(s) had volume node affinity conflict
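The "no matching NodeSelectorTerms" lines above mean that each PV is pinned to one zone through its node affinity. As an illustration (using one of the PV names from the log, and assuming kubectl access to the cluster), the affinity can be inspected with:

# Show the node selector terms the scheduler is trying to match for this PV.
kubectl get pv pvc-50c002d3-a5cc-4143-adf2-1362d18fc40e -o jsonpath='{.spec.nodeAffinity.required.nodeSelectorTerms}'

For a volume provisioned by the EBS CSI driver this typically contains a topology.ebs.csi.aws.com/zone term, which the template node built from an empty ASG will not satisfy unless the ASG carries the matching node-template tag.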
Hi @decipher27, could you show us the labels on your AWS ASG? My understanding of this issue is that you need the topology tags:
I've also added
When your ASG is at 0, there is no node to retrieve the topology from. You must have the topology tags on the ASG itself to allow CA and the CSI driver to retrieve the topology.
We don't have the tags mentioned above, and it was working earlier. Though, we found the issue was with the scheduler; we are using a custom scheduler.
Also, from your comment, what do you mean by
Exactly. When an ASG's desired value is set to 0 (for instance, after a downscale of all replicas with kube-downscaler, except those from CA itself), CA will not be able to read node labels, because there is no node.
Got the same issue: if a PVC and pod are created, and then the ASG is suspended and scaled down to 0 to save cost over the weekend, on Monday that pod is not able to start from 0. Other stateless pods are okay.
@debu99
My PV requires
But I believe this label is added automatically to all nodes? I didn't add it to the ASG tags, but all my nodes have it.
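For what it's worth, the zone labels on the nodes that do exist can be listed with (the -L flags just add label columns to the output):

kubectl get nodes -L topology.ebs.csi.aws.com/zone -L topology.kubernetes.io/zone

This only helps while nodes are actually running, as the next comment points out.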
Yes, but when the ASG is at 0, there are no nodes. cluster-autoscaler needs the labels tagged on the ASG to know what labels a node would have if it scaled the ASG up from 0.
We are facing the same issue with the volume node affinity error, and our ASG has nodes spun up across AZs. What is the best way for CA to spin up nodes in the right AZ? We use the priority expander.
The above error comes when there is enough room for CA to spin up new nodes in the node group, and there is also one more node group where CA could launch, but CA is not functioning as expected. CA version: 1.21
@KiranReddy230 if you read the comments above yours, the question has been answered three times already. You need to add the tags mentioned above to your ASG. In order for this to work properly, each node group (and thus each ASG) should have only one zone (this is the recommended architecture anyway).
I have this issue despite (I believe) having everything set up correctly. EKS 1.25, CA 1.25.2:
My 3 ASGs are tagged as follows (each of them covers a single AZ, a/b/c):
I'm running Prometheus as a StatefulSet with a PVC (affinity rules set to ensure replicas are spread across AZs and hosts):
Every night between 00:00 and 06:00 (I believe this is when AWS rebalancing happens), at least one of the Prometheus replicas gets stuck in
For now I had to set
This is closely related to issue #4739, which was fixed in cluster-autoscaler version 1.22 onward. If you look at the function that generates a hypothetical new node to satisfy the pending pod, the new label that is needed to satisfy volumes created by the EBS CSI driver is not part of that function, so it will not scale up unless you add the tag to the ASG manually. Current function:
The next function is why adding the labels to the ASG makes this work:
Since the label is widely used now, maybe we should update the buildGenericLabels function to also apply the label topology.ebs.csi.aws.com/zone to the new node when it is hypothetically being built.
I can take a stab at providing a PR with a fix.
Which component are you using?: cluster-autoscaler
What version of the component are you using?:
Cluster-Autoscaler Deployment YAML
Component version:
What k8s version are you using (kubectl version)?:
kubectl version Output
What environment is this in?:
What did you expect to happen?:
I do have an ASG dedicated to a single CronJob that gets triggered 6 times a day.
That ASG is pinned to a specific AWS AZ by its assigned subnet.
The CronJob is pinned to that specific ASG by affinity + toleration.
The Job uses a PV that will be provisioned (AWS EBS) on the first ever run and then subsequently reused on each run.
I expect the ASG to be scaled up to 1 after the Pod gets created and removed shortly after the Pod/Job has finished.
What happened instead?:
The ASG will not be scaled up by the cluster-autoscaler.
cluster-autoscaler log output after the Job is created and the Pod is pending
Anything else we need to know?:
Basically this works fine without the volume.
With the volume it works when the volume is not provisioned yet, but fails when it already has been provisioned.
The Job also gets scheduled right away when I manually scale up the ASG.
I noticed the volume affinity on the PVC:
That tag is probably set on the node by the "ebs-csi-node" DaemonSet and therefore is unknown to the cluster-autoscaler.
Am I expected to tag the ASG with k8s.io/cluster-autoscaler/node-template/label/topology.ebs.csi.aws.com/zone? If so, how am I supposed to set them for multi-AZ ASGs?
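For reference, the node-template tags currently applied to an ASG can be listed with the AWS CLI, which is a quick way to verify whether the zone tag discussed in the comments above is present; the ASG name below is a placeholder:

# List any cluster-autoscaler node-template tags on the ASG.
aws autoscaling describe-tags \
  --filters "Name=auto-scaling-group,Values=my-cronjob-asg" \
  --query "Tags[?starts_with(Key, 'k8s.io/cluster-autoscaler/node-template/')]"

As answered in the comments, the usual approach for multiple AZs is one ASG per AZ, each tagged with its own zone.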
Possibly related: #3230