Hanging node group after delete #1325
Comments
Hey, have you been able to fix this issue in any way? The same thing happened to me yesterday and I can't find a way to permanently delete the nodegroup from my EKS cluster.
I had this issue and, in my case, found a solution. For me it related to dangling ENIs left behind by instances being scaled up and down (spot instances in my case). These ENIs were still attached to the node group security group, so the security group could not be deleted when deleting the CloudFormation stack (initiated by eksctl). Deleting these ENIs (they have a status of Available, are not attached to an instance, and will have the node group security group listed) allowed CloudFormation to properly delete the stack for the node group, and it now appears completely deleted to eksctl. Deleting these dangling ENIs every so often (depending on how quickly they build up for you) is also good policy, as they have caused other issues for me (and others) as well. See:
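For anyone who wants to script that ENI cleanup, here is a rough sketch with the AWS CLI (the security group and ENI IDs are placeholders for your node group's values):

# Find detached ("available") ENIs that still reference the node group security group
aws ec2 describe-network-interfaces \
  --filters "Name=status,Values=available" "Name=group-id,Values=<nodegroup-sg-id>" \
  --query "NetworkInterfaces[].NetworkInterfaceId" --output text

# Delete each dangling ENI so CloudFormation can remove the security group
aws ec2 delete-network-interface --network-interface-id <eni-id>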
+1, faced the same issue
Facing the same issue here and not sure how to proceed. When trying to delete the cluster I see the following error:
I went to the AWS console and the EKS cluster is there, but trying to delete it manually fails with ResourceInUseException. Drilling into the node group, I see it listed there. Manually deleting the node group from the AWS console errored out as well with DELETE FAILED. With kubectl I no longer see the nodes, but I do see the following resources:
Any help is appreciated, as I don't know of a way to clean up and remove this cluster now. Thanks
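For a managed node group stuck in DELETE FAILED, the health issues EKS reports usually name the blocking resource; a quick check (cluster and node group names are placeholders):

aws eks describe-nodegroup --cluster-name <cluster-name> --nodegroup-name <nodegroup-name> \
  --query "nodegroup.health.issues"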
Hey @ddavtian, did you manage to delete the EKS cluster? How did you go about it?
@chidiebube I did. The issue is that there is a broken ConfigMap in the cluster, and that needs to be manually fixed first. Looking at my history of commands, try poking around with this to make sure the YAML is proper:
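Assuming the broken ConfigMap is the aws-auth ConfigMap in kube-system (an assumption; the exact command isn't shown above), a sketch of how to inspect and fix it:

# Dump the aws-auth ConfigMap and check the mapRoles YAML for malformed or stale entries
kubectl -n kube-system get configmap aws-auth -o yaml

# Edit it in place to correct the entries
kubectl -n kube-system edit configmap aws-auth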
Fix it and try to remove it again using the AWS Console.
+1, faced the same issue
Thanks, this worked for me.
+1
+1
+1 We also experienced this issue; what worked for us:
+1 I ran into the same problem :(( This step worked for me:
After that, I could delete the nodegroup and the cluster... Yeah!
I'll also say this is not an eksctl-specific issue. Our EKS cluster was not created or managed with eksctl, and we had the same problem with dangling ENIs.
Same issue here. Although eksctl said it deleted the node group, the CloudFormation stack had failed to delete it. The message "must detach all policies first" made me look at the node group's NodeInstanceRole in IAM. I removed the last remaining policy (CloudWatchLogsFullAccess) on that role, and that worked for me.
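A sketch of the same cleanup with the AWS CLI (the role name is a placeholder; the policy ARN is the managed CloudWatchLogsFullAccess policy):

# List policies still attached to the node instance role
aws iam list-attached-role-policies --role-name <NodeInstanceRole-name>

# Detach the remaining policy so CloudFormation can delete the role
aws iam detach-role-policy --role-name <NodeInstanceRole-name> \
  --policy-arn arn:aws:iam::aws:policy/CloudWatchLogsFullAccess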
Same issue. I deleted the autoscaling group, the NAT gateways, and the VPCs, thanks to the billing alerts. I couldn't find any cluster to delete.
There was another way: the CloudFormation stacks were still there, so I went ahead and deleted those. That worked the second time around!
@MG40 +1. I also deleted the hanging nodegroups by deleting the associated CloudFormation stack.
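If you want to do this from the CLI rather than the console, a sketch (eksctl names node group stacks eksctl-<cluster>-nodegroup-<name>; the names below are placeholders):

# Find node group stacks that failed to delete
aws cloudformation list-stacks --stack-status-filter DELETE_FAILED \
  --query "StackSummaries[?contains(StackName, 'nodegroup')].StackName" --output text

# Delete the hanging stack and wait for it to finish
aws cloudformation delete-stack --stack-name eksctl-<cluster>-nodegroup-<name>
aws cloudformation wait stack-delete-complete --stack-name eksctl-<cluster>-nodegroup-<name>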
In my case the problem was:
My issue was that the nodegroup was not getting deleted. I fixed it: the problem was under IAM > Roles. I removed all the roles that were not equal to the role I got when I executed:
I deleted all the roles which were not equal to: and after that I added a new node using: Then the second node was successfully attached.
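A sketch for comparing the role the cluster actually trusts against the leftover ones (assuming eksctl's default naming, where node instance role names contain NodeInstanceRole):

# See which instance role aws-auth maps (the one to keep)
kubectl -n kube-system get configmap aws-auth -o yaml

# List candidate node instance roles to compare against it
aws iam list-roles --query "Roles[?contains(RoleName, 'NodeInstanceRole')].RoleName" --output text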
I faced the same issue. I tried deleting the nodegroup both through the GUI and using a command, but it wouldn't delete. It seemed to get stuck. However, after waiting for 10 minutes, it finally got deleted.
What happened?
I performed a
eksctl delete nodegroup --cluster prod-eks --name ng-1
The drain failed because of existing DaemonSets and some local data. I drained the nodes manually with kubectl using
kubectl drain -l 'alpha.eksctl.io/nodegroup-name=ng-1' --force --ignore-daemonsets --delete-local-data
I ran
eksctl delete nodegroup --cluster prod-eks --name ng-1
and now got an error. The CloudFormation delete has also failed to run, with the following events:
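The failing events can also be pulled from CloudFormation directly; a sketch, assuming the stack follows eksctl's usual eksctl-<cluster>-nodegroup-<name> naming:

aws cloudformation describe-stack-events --stack-name eksctl-prod-eks-nodegroup-ng-1 \
  --query "StackEvents[?ResourceStatus=='DELETE_FAILED'].[LogicalResourceId,ResourceStatusReason]" \
  --output table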
All instances were terminated but performing a
eksctl get nodegroups --cluster prod-eks
I can see the node group still listed.
What you expected to happen?
eksctl
would no longer list the deleted node group.
How to reproduce it?
Not sure why it failed tbh
Anything else we need to know?
Very standard install
Versions
Please paste in the output of these commands:
Logs
Include the output of the command line when running eksctl. If possible, eksctl should be run with debug logs. For example:
eksctl get clusters -v 4
Make sure you redact any sensitive information before posting.
If the output is long, please consider a Gist.