
Available ENIs left dangling after node termination #608

Closed
krzysztof-bronk opened this issue Sep 4, 2019 · 34 comments
Labels
bug, priority/P1 (Must be staffed and worked currently or soon. Is a candidate for next release)

Comments

@krzysztof-bronk

Hello,

I have encountered an issue with aws-cni 1.5.1 (and possibly later) where, even in a single-node test cluster, if you terminate the node so that the ASG brings up a replacement, the terminated instance's ENI switches back to Available, still holding its IPs, and is seemingly never deleted.

Eventually one will exhaust the IP pool and pods will fail to be created.

This is a bit surprising as node recycling is the basis of autoscaling groups.
Is there some cleanup mechanism I am not aware of? Or is it a bug?

regards,
Krzysztof Bronk

mogren added the bug and priority/P1 (Must be staffed and worked currently or soon. Is a candidate for next release) labels on Sep 4, 2019
@mogren
Contributor

mogren commented Sep 4, 2019

Thanks @krzysztof-bronk for reporting, we will have to take a look at why this is happening. ENIs attached to the terminated node should be freed by EC2 automatically.

@krzysztof-bronk
Author

Thank you for acknowledging this. If your cluster has high node churn, or the IP pool is small, this can quickly become an issue. The current workaround, and an independent report of the issue, can be found here: #59 (comment)

@vipulsabhaya
Contributor

@krzysztof-bronk Do you happen to have the ipamd logs from one such instance?

@caiconkhicon

caiconkhicon commented Sep 6, 2019

> ENIs attached to the terminated node should be freed by EC2 automatically.

No, that's not true. Only the primary (eth0) ENI is cleaned up. The additional ENIs are not deleted; they are just detached and become Available.

@krzysztof-bronk
Author

Here's the situation:

Fresh test cluster, nothing fancy or custom (except external SNAT but I tested both and it's not relevant), a single m5.xlarge worker node.

Private IPs: 10.250.9.56 (eth0), 10.250.11.217 (eth1)
Secondary Private IPs: (a whole bunch of them)

ENIs: 3 total
  • 2 In-use (as confirmed by the Private IPs above). Interestingly, only 1 of them has a Description (aws-K8S-i-0da187e6c45d9d4d5); the other one has an empty Description.
  • 1 Available

aws-node logs are empty (even though they're in DEBUG mode and I've deployed a test nginx container):

===== Starting installing AWS-CNI =========
===== Starting amazon-k8s-agent ===========

Node terminated. ASG spun up a new one.

Private IPs: 10.250.19.53 (eth0), 10.250.16.192 (eth1)
Secondary Private IPs: (a whole bunch of them)

ENIs: 4 total
  • 2 In-use (as confirmed by the Private IPs above).
  • 2 Available - the 1 Available from the earlier instance is still there.

aws-node logs (with the successfully running nginx container):

===== Starting installing AWS-CNI =========
===== Starting amazon-k8s-agent ===========
ERROR: logging before flag.Parse: E0912 06:35:31.629160      13 memcache.go:138] couldn't get current server API group list; will keep using cached value. (Get https://172.20.0.1:443/api?timeout=32s: dial tcp 172.20.0.1:443: i/o timeout)

So it looks like aws-cni is leaking the warmup ENIs.

@mogren
Contributor

mogren commented Sep 27, 2019

The latest pre-release, v1.6.0-rc2, has some changes to mitigate this problem.

@Pluies

Pluies commented Oct 29, 2019

Good to know this is getting worked on, it's an issue for us as well. 👍

We have nightly infra testing jobs that bring up a cluster (with Terraform), run tests, and delete the cluster. We've noticed that the terraform destroy step fails fairly often because ENIs left in the "available" state prevent the security group from being deleted.
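
For anyone hitting the same terraform destroy failure, here is a minimal diagnostic sketch (not from this thread) that lists the ENIs still referencing a security group, which is what blocks its deletion. It assumes the AWS SDK for Go v1; the security group ID is a placeholder.

```go
// Hedged diagnostic sketch: list ENIs still referencing a security group.
// Placeholder SG ID; region/credentials come from the usual SDK configuration.
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	svc := ec2.New(session.Must(session.NewSession()))

	out, err := svc.DescribeNetworkInterfaces(&ec2.DescribeNetworkInterfacesInput{
		Filters: []*ec2.Filter{
			// Any ENI that still references the security group prevents it from being deleted.
			{Name: aws.String("group-id"), Values: []*string{aws.String("sg-0123456789abcdef0")}},
		},
	})
	if err != nil {
		log.Fatal(err)
	}
	for _, eni := range out.NetworkInterfaces {
		fmt.Printf("%s status=%s description=%q\n",
			aws.StringValue(eni.NetworkInterfaceId),
			aws.StringValue(eni.Status),
			aws.StringValue(eni.Description))
	}
}
```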

@robin-engineml

@Pluies We have a similar process, and the same problem. @mogren When can we look forward to a 1.6.0 release with the mitigating changes applied?

@robin-engineml

... or even a 1.6.0-rc4 (but without the problems of 1.5.4?)

@mogren
Contributor

mogren commented Nov 5, 2019

@robin-engineml Hey, please try v1.6.0-rc4, fresh out of the oven. 😄

@robin-engineml

@mogren The problem persists with v1.6.0-rc4, which we have been using for a couple of weeks. Anecdotally, it does seem to be less frequent.

@mogren
Contributor

mogren commented Nov 22, 2019

@robin-engineml Thanks for the update! Glad that it has improved a bit at least. There is still a small chance that a few ENIs will leak, but they should be cleaned up as long as there are at least some nodes running in the cluster.

@robin-engineml

@mogren This occurs upon cluster termination, for us. So, there are not "some nodes running in the cluster". Would this stop occurring if we were to allow some cluster nodes to remain alive longer? (We destroy the EKS cluster via the AWS API, via Terraform.)

@mogren
Contributor

mogren commented Dec 9, 2019

@robin-engineml The issue is that the EC2 API requires you to first detach an ENI, then wait 2-5 seconds before deleting it. If the instance gets terminated after the ENI is detached, but before we delete it, it will stay around. With the v1.6 branch, we try to clean up when the CNI starts up, or in the background once per hour.
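
For illustration, a minimal sketch of the detach-then-delete sequence described above, using the AWS SDK for Go v1 (not the CNI's actual code; the retry budget and the example IDs are assumptions). The gap between the two calls is exactly the window in which a terminating instance leaks the ENI.

```go
package main

import (
	"fmt"
	"time"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

// deleteENI detaches an ENI and then retries the delete until the detach has
// completed. If the instance is terminated inside this window, the ENI is left
// in the "available" state and leaks.
func deleteENI(svc *ec2.EC2, eniID, attachmentID string) error {
	if _, err := svc.DetachNetworkInterface(&ec2.DetachNetworkInterfaceInput{
		AttachmentId: aws.String(attachmentID),
	}); err != nil {
		return fmt.Errorf("detach %s: %w", eniID, err)
	}

	var err error
	for i := 0; i < 5; i++ {
		// The detach typically takes a few seconds to complete.
		time.Sleep(2 * time.Second)
		if _, err = svc.DeleteNetworkInterface(&ec2.DeleteNetworkInterfaceInput{
			NetworkInterfaceId: aws.String(eniID),
		}); err == nil {
			return nil
		}
	}
	return fmt.Errorf("delete %s: %w", eniID, err)
}

func main() {
	svc := ec2.New(session.Must(session.NewSession()))
	// Placeholder IDs for illustration only.
	if err := deleteENI(svc, "eni-0123456789abcdef0", "eni-attach-0123456789abcdef0"); err != nil {
		fmt.Println(err)
	}
}
```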

@jlforester

Could this be integrated with ASG Lifecycle hooks to allow the processes more time to clean up on instance termination? Simply adding a lifecycle hook to an ASG isn't enough.

@krzysztof-bronk
Author

I've tested 1.6.0-rc5 a bit and I don't see much progress. After terminating a couple of nodes, I saw available ENIs dangling, so I terminated all nodes and now I have:

  • 2 fresh m5.xlarge nodes total working just fine (few pods, 2 ENIs attached with WARM_ENI_TARGET=1)
  • 6 available ENIs doing nothing

@mogren
Contributor

mogren commented Jan 20, 2020

@krzysztof-bronk Did those ENIs stay around? They should have been cleaned up if they were created by the CNI. Not directly, but within five minutes of another worker node being started.

@steven-cherry

Hi, has there been any progress on this issue? This is really affecting us; we are even considering changing the CNI vendor we use.

@mogren
Contributor

mogren commented Feb 11, 2020

Hi @steven-cherry. The base issue is that the EC2 API requires clients to detach ENIs before they can be deleted. If the node (or the aws-node pod) gets restarted during the roughly 2-3 seconds we have to wait for the detach to complete, an ENI with status "available" will be left around.

The code that does the cleanup is here.

It filters for ENIs with the tag key node.k8s.amazonaws.com/instance_id and status available, in order to only get ENIs that the CNI has created.

I've done some more tests with v1.6.0 on spot instances that get randomly terminated, and the leaked ENIs do get cleaned up eventually. The only sure way to not leak any ENIs is to have this handled outside the node, as in our 2.0 CNI design.
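
To illustrate the filter described above, a minimal sketch (assuming the AWS SDK for Go v1; this is not the CNI's actual cleanup routine) that lists available ENIs carrying the node.k8s.amazonaws.com/instance_id tag key and deletes them:

```go
package main

import (
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	svc := ec2.New(session.Must(session.NewSession()))

	// Only detached ENIs that the CNI tagged at creation time are candidates.
	out, err := svc.DescribeNetworkInterfaces(&ec2.DescribeNetworkInterfacesInput{
		Filters: []*ec2.Filter{
			{Name: aws.String("status"), Values: []*string{aws.String("available")}},
			{Name: aws.String("tag-key"), Values: []*string{aws.String("node.k8s.amazonaws.com/instance_id")}},
		},
	})
	if err != nil {
		log.Fatalf("describe ENIs: %v", err)
	}

	for _, eni := range out.NetworkInterfaces {
		// Detached and CNI-tagged: safe to delete.
		if _, err := svc.DeleteNetworkInterface(&ec2.DeleteNetworkInterfaceInput{
			NetworkInterfaceId: eni.NetworkInterfaceId,
		}); err != nil {
			log.Printf("failed to delete %s: %v", aws.StringValue(eni.NetworkInterfaceId), err)
			continue
		}
		log.Printf("deleted leaked ENI %s", aws.StringValue(eni.NetworkInterfaceId))
	}
}
```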

@steven-cherry


Thanks @mogren. Any ETA regarding version 2.0 for production workloads?

@krzysztof-bronk
Author

I'll be getting back to this topic soon, so I will have a chance to test this once more.

@krzysztof-bronk
Author

@mogren
Setup:

  • CNI 1.6.0
  • k8s 1.14
  • 2x m5.xlarge nodes
  • pretty much no customisations introduced

instance 1 ENIs: 10.250.1.228, 10.250.5.254
instance 2 ENIs: 10.250.19.53, 10.250.21.204

The primary interfaces have a Description like "aws-K8S-i-0c841ac56fbadc9b3" indicating the node they belong to.
The secondary interfaces have no Description.

All 4 ENIs are Active.

Terminating instance 1. ASG kicks in a replacement.

Waiting 10 minutes.

There is a third ENI attached to the remaining instance; not sure why, since there were several pods running on the terminated instance, but not that many. However...

The primary interface of the terminated instance is now stuck in Available state.

Terminating instance 2 (the one with 3 ENIs). ASG kicks in a replacement.

Waiting 10 minutes.

Cluster now has 2 fresh nodes.
There are 4 total ENIs in Active state, 2 for each instance. However...

The primary interface of the second terminated instance is now stuck in Available state.

Maybe I'm triggering some special case but... the cleanup of Available ENIs simply does not happen.

errm added a commit to cookpad/terraform-aws-eks that referenced this issue Mar 9, 2020
This was removed in #49, but I think that change only fixed the issue
with EKS managed SG not being deleted. Stale ENIs are related to
this issue aws/amazon-vpc-cni-k8s#608
@krzysztof-bronk
Author

I'll do some further tests, because sometimes the interfaces do get cleaned up. How does the mechanism work exactly? Is only the instance that had the interfaces attached responsible for cleaning them up, with a race condition between instance termination and the cleanup code? Or is it that, as long as there is at least one node in the cluster, its aws-node pod will attempt to delete unused Available interfaces for the whole cluster?

@nickdgriffin

nickdgriffin commented Mar 23, 2020

Also noticed that the ENIs that appear to be leaking for us are missing the tags/description, which means they won't be picked up by the clean-up loop. We're not on 1.6 yet, but when we are I'll check if that's still the case.

EDIT: It is. We run our nodes in ASGs, and on scaling down a test cluster of 6 nodes it leaked all 6 ENIs and left them untagged, so they won't be cleaned up.

EDIT 2: It looks like it is the secondary ENIs that are getting leaked: they aren't being tagged or given the "special description" in the first place (i.e. even while "in-use"), so the clean-up can't catch them.

@nickdgriffin

We upgraded to 1.6.0 and fixed an oopsie and haven't had any ENIs leaking since.

@korjek

korjek commented Apr 28, 2020

We have upgraded to 1.6.1 and there is no issue with dangling ENIs anymore, thank you!

P.S. You should have "delete on termination" enabled for the primary interface so that it is cleaned up on the node's termination as well.
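
As an illustration of that P.S., a minimal sketch (AWS SDK for Go v1; the ENI and attachment IDs are placeholders) that enables delete-on-termination for an already attached interface. For the primary interface this is normally configured in the launch template or launch configuration, but the same attribute can be flipped after launch:

```go
package main

import (
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	svc := ec2.New(session.Must(session.NewSession()))

	// Mark the attachment so EC2 deletes the ENI when the instance terminates.
	_, err := svc.ModifyNetworkInterfaceAttribute(&ec2.ModifyNetworkInterfaceAttributeInput{
		NetworkInterfaceId: aws.String("eni-0123456789abcdef0"),
		Attachment: &ec2.NetworkInterfaceAttachmentChanges{
			AttachmentId:        aws.String("eni-attach-0123456789abcdef0"),
			DeleteOnTermination: aws.Bool(true),
		},
	})
	if err != nil {
		log.Fatalf("failed to enable delete-on-termination: %v", err)
	}
	log.Println("delete-on-termination enabled")
}
```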

@mogren
Contributor

mogren commented Jun 9, 2020

There is still a small chance that ENIs will leak, but they should be cleaned up pretty quickly if there are any nodes still in the cluster. Also, I have seen that pods creating ALBs might create ENIs in subnets that then don't get cleaned up. If anyone sees ENIs still around in a cluster using CNI v1.6.1 or later, please gather logs and open a new ticket.

@mogren mogren closed this as completed Jun 9, 2020
@Nuru

Nuru commented Sep 18, 2020

@mogren I am having a problem with dangling ENIs using amazon-k8s-cni:v1.6.3-eksbuild.1. Please give me details about "gathering logs".

The basic problem is that I am using Terraform and trying to destroy a node group and the security group that goes with it, but I cannot, because an ENI is left dangling after the node group is deleted, so the deletion of the security group hangs.

Note that the dangling ENI has the tag node.k8s.amazonaws.com/instance_id=<instance-id> and the instance is terminated.

@mogren
Contributor

mogren commented Sep 18, 2020

Hi @Nuru,

This has been an issue forever when pods are being scaled down and then suddenly the whole instance gets deleted. The issue triggering this is that there is no EC2 API call to "delete" an ENI that is attached, so instead it first has to be detached, which takes a few seconds, and then deleted. If the instance gets terminated after the ENI has been detached, but before it has been deleted, it will be leaked. We have tried to mitigate this by, for example, having a 10s termination grace period on the aws-node daemonset and never detaching any ENIs while the CNI is shutting down, but none of this helps when the instance goes away.

Is this a managed nodegroup, or do you handle it on your own using Terraform? If so, terminating all the aws-node pods first, before terminating the instances, might at least prevent them from detaching any ENIs in the last few seconds while the other pods are being deleted.

Another option would be a setting to never detach any ENIs, since then the ENIs would get deleted when the instance gets deleted. The reason we don't do this by default is that running out of ENIs is also a common problem.

@Nuru

Nuru commented Sep 18, 2020

@mogren wrote

Is this a managed nodegroup, or do you handle it on your own using Terraform? If so, terminating all the aws-node pods first, before terminating the instances might at least prevent them from detaching any ENIs in the last few seconds when the other pods are being deleted.

In my immediate case, I am using the AWS Terraform provider to create an aws_eks_node_group resource; in other words, Terraform is creating a managed node group. It is up to EKS to drain the pods from the nodes and then shut down the instances. There are other node groups (and therefore other nodes) in the cluster, so EKS should be able to move all the pods around, but there are always going to be critical pods (like kube-proxy) that need to stick around until the very end.

Prior to the node being shut down, it is cordoned off, meaning it will be marked as "unschedulable", meaning no new pods should be assigned to the node. You could surely arrange things such that any ENIs that are freed while the node is marked unschedulable are not detached. You do not need to worry about running out of ENIs at that point because there should be no new ENIs getting created. Then the ENIs can be deleted with the instance on termination, or, if the node is marked "schedulable" again without being terminated, a detach/delete loop could be run when the node returns to the schedulable state. This, of course, requires the "delete on termination" option be set for the ENIs, such that they are automatically deleted when the instance is deleted. I do not see any downside to that setting always being set, as it still leaves you the option of detaching and deleting the ENI when a pod is deleted but the instance is intended to remain.

Maybe the building blocks were not there earlier, but it looks like the pieces of the solution are now ready to be put together. Am I missing something?

@mogren
Contributor

mogren commented Sep 18, 2020

@Nuru I do think you are right; having the VPC CNI be aware of the eks.amazonaws.com/nodegroup=unschedulable:NoSchedule tag does seem feasible. All ENIs that the CNI attaches are marked with DeleteOnTermination. Do you want to open a feature request issue to add this?

(Btw, kube-proxy is using host-networking, just like the aws-node pod does, so it is independent of the CNI being up.)
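
A minimal sketch of that idea (using client-go; this is not existing CNI behaviour): before detaching a freed ENI, check whether the node has been cordoned and, if so, skip the detach so the ENI is removed together with the instance via DeleteOnTermination. For simplicity this checks the node's Unschedulable flag rather than the specific taint mentioned above; the MY_NODE_NAME env var and the shouldDetachENI helper are assumptions.

```go
package main

import (
	"context"
	"log"
	"os"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// shouldDetachENI reports whether a freed ENI should be detached now.
// A cordoned node is about to be drained and likely terminated, so we skip
// the detach and let the ENI be deleted along with the instance.
func shouldDetachENI(client kubernetes.Interface, nodeName string) (bool, error) {
	node, err := client.CoreV1().Nodes().Get(context.TODO(), nodeName, metav1.GetOptions{})
	if err != nil {
		return false, err
	}
	return !node.Spec.Unschedulable, nil
}

func main() {
	config, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	detach, err := shouldDetachENI(client, os.Getenv("MY_NODE_NAME"))
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("detach freed ENIs: %v", detach)
}
```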

@Nuru

Nuru commented Sep 18, 2020

@mogren I would be happy to have you open the feature request, as you would know better how to put the request together (what parts of the code should react to what, and how) and see it through, and I would also be happy to lend my support to your request. I don't need credit or recognition for the feature request; I just want this done as quickly and efficiently as possible, so I would prefer you do it if you have the time. If it won't happen unless I do it, let me know and I will do it.


@Nuru

Nuru commented Sep 18, 2020

By the way, @mogren

> The issue triggering this is that there is no EC2 API call to "delete" an ENI that is attached

Have you opened a feature request for this feature? That would be even better than my suggestion.

@mogren
Contributor

mogren commented Sep 18, 2020

@Nuru I have checked, and there doesn't seem to be a lot of support for simplifying the API. I would love to have a single call to create or delete ENIs.

For now, I created #1223 to track improving how the VPC CNI is handling this.

rafael-mendes-pereira added a commit to Azure/telescope that referenced this issue Oct 15, 2024
…destroy (#336)

This is a workaround for the known VPC CNI addon's "leaked ENIs" issue:
See aws/amazon-vpc-cni-k8s#608

Co-authored-by: Rafael Mendes Pereira <[email protected]>
rafael-mendes-pereira added a commit to Azure/telescope that referenced this issue Nov 14, 2024
…destroy (#336)

This is a workaround for the known VPC CNI addon's "leaked ENIs" issue:
See aws/amazon-vpc-cni-k8s#608

Co-authored-by: Rafael Mendes Pereira <[email protected]>