After removing aws-ebs0 from the cluster deleting nodes fails #8121
Comments
Similar to #8100 |
Hi @analytically can you clarify for me the symptom you're seeing here:
Nomad nodes are being marked as ineligible for placing workloads? Or do you mean you can't schedule CSI workloads on those nodes?
That issue is about volume claim GC, which should be unrelated. |
No, this is also the volume claim GC (Nomad Nodes GC) |
There's a difference between "volume claim GC" (which is the cleanup of claims that an allocation has on a volume) and "node GC" (which is the cleanup of Nomad clients). I'm still not clear on what symptom you're seeing: are Nomad nodes being marked as ineligible for placing workloads? Or do you mean you can't schedule CSI workloads on those nodes? |
Some follow-up data which isn't quite the same thing but could be related. If we try to delete a job that hasn't registered its plugin, that results in the following error state:
|
Should be fixed with #8619 |
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues. |
Nomad version
Nomad v0.11.2 (807cfeb)
Operating system and Environment details
Amazon Linux 2, c5.4xlarge
Issue
After removing the aws-ebs0 plugin from our cluster, the client list of ineligible nodes grows without bound, and the nodes are never removed.
The logs have:
nomad.fsm: DeleteNode failed: error="csi plugin delete failed: csi_plugins missing plugin aws-ebs0"
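To illustrate the failure mode this log line suggests, here is a minimal Go sketch. It is a toy model, not Nomad's actual code: `Store`, `deletePluginRef`, `ErrPluginMissing`, and the `tolerant` flag are all hypothetical names made up for the example. It assumes a node delete first tries to drop the node's reference to each CSI plugin, and shows how a missing plugin record could abort the whole delete and leave the node in place.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrPluginMissing is a hypothetical sentinel error for this sketch.
var ErrPluginMissing = errors.New("csi_plugins missing plugin")

// Store is a toy stand-in for a state store; it is NOT Nomad's real type.
type Store struct {
	Nodes   map[string][]string // node ID -> plugin IDs the node reported
	Plugins map[string]int      // plugin ID -> number of nodes running it
}

// deletePluginRef decrements the plugin's node count and returns an error
// when the plugin record no longer exists.
func (s *Store) deletePluginRef(pluginID string) error {
	count, ok := s.Plugins[pluginID]
	if !ok {
		return fmt.Errorf("%w %s", ErrPluginMissing, pluginID)
	}
	if count <= 1 {
		delete(s.Plugins, pluginID)
	} else {
		s.Plugins[pluginID] = count - 1
	}
	return nil
}

// DeleteNode removes a node. With tolerant=false, a missing plugin record
// aborts the delete and the node stays in the store. With tolerant=true,
// a missing record is treated as already cleaned up.
func (s *Store) DeleteNode(nodeID string, tolerant bool) error {
	for _, p := range s.Nodes[nodeID] {
		err := s.deletePluginRef(p)
		if err != nil && !(tolerant && errors.Is(err, ErrPluginMissing)) {
			return fmt.Errorf("csi plugin delete failed: %w", err)
		}
	}
	delete(s.Nodes, nodeID)
	return nil
}

func main() {
	s := &Store{
		Nodes:   map[string][]string{"node-1": {"aws-ebs0"}},
		Plugins: map[string]int{}, // plugin record already removed
	}
	fmt.Println(s.DeleteNode("node-1", false)) // strict delete fails
	fmt.Println(len(s.Nodes))                  // node is still present
	fmt.Println(s.DeleteNode("node-1", true))  // tolerant delete succeeds
	fmt.Println(len(s.Nodes))                  // node is gone
}
```

In this model, the strict path leaves `node-1` in the store on every attempt, which would match a node list that keeps growing. Whether Nomad's real delete path behaves this way is an assumption of the sketch.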