Failing to delete an already provisioned subnet if it was used for an Autoscaling Group (that created some EC2 instances) #9495

Closed
Dzhuneyt opened this issue Jul 25, 2019 · 21 comments
Assignees
Labels
bug Addresses a defect in current functionality. prioritized Part of the maintainer teams immediate focus. To be addressed within the current quarter. service/ec2 Issues and PRs that pertain to the ec2 service.

Comments

@Dzhuneyt
Contributor

Dzhuneyt commented Jul 25, 2019

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

Terraform v0.12.5

  • provider.aws v2.20.0
  • provider.template v2.1.2

Terraform Configuration Files

resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
  enable_dns_hostnames = true
}
resource "aws_subnet" "subnet1" {
  vpc_id = aws_vpc.main.id
  cidr_block = "10.0.1.0/24"
}
# Read all subnet ids for this vpc/region.
data "aws_subnet_ids" "all_subnets" {
  vpc_id = data.aws_vpc.default.id
  # Wait for the subnets to be actually created, not just the VPC
  depends_on = [
    aws_subnet.subnet1
  ]
}
resource "aws_autoscaling_group" "ecs_cluster_spot" {
  name_prefix = "ecs_cluster_spot"
  termination_policies = ["OldestInstance"]
  max_size = local.max_spot_instances
  min_size = local.min_spot_instances
  launch_configuration = aws_launch_configuration.ecs_config_launch_config_spot.name
  lifecycle {
    create_before_destroy = true
  }
  # This is the important part:
  # We attach the subnets of the VPC to the autoscaling group
  vpc_zone_identifier = data.aws_subnet_ids.all_subnets.ids
}

I've trimmed my configuration down to the bare minimum. I later add ECS task definitions and services to the ECS cluster, but I don't think these matter for the issue. I could just as well launch them from the AWS console instead of Terraform, and I assume the effect would be the same.

Debug Output

...
aws_subnet.subnet1: Still destroying... [id=subnet-0a7d3066014860a8e, 18m10s elapsed]
aws_subnet.subnet1: Still destroying... [id=subnet-0a7d3066014860a8e, 18m20s elapsed]
aws_subnet.subnet1: Still destroying... [id=subnet-0a7d3066014860a8e, 18m30s elapsed]
aws_subnet.subnet1: Still destroying... [id=subnet-0a7d3066014860a8e, 18m40s elapsed]
aws_subnet.subnet1: Still destroying... [id=subnet-0a7d3066014860a8e, 18m50s elapsed]
aws_subnet.subnet1: Still destroying... [id=subnet-0a7d3066014860a8e, 19m0s elapsed]
aws_subnet.subnet1: Still destroying... [id=subnet-0a7d3066014860a8e, 19m10s elapsed]
aws_subnet.subnet1: Still destroying... [id=subnet-0a7d3066014860a8e, 19m20s elapsed]

After 19 minutes, the subnet is still not destroyed.

Expected Behavior

subnet1 is destroyed

Actual Behavior

Destroying subnet1 hangs. If I attempt to manually remove the resource from the AWS console, I get this:
image

I assume this is the same reason why Terraform fails to delete the subnet and hangs.

Steps to Reproduce

  1. terraform apply
  2. Launch some services in your ECS cluster and instances (I don't think this makes a difference)

Important Factoids

I removed the "subnet1" definition from my terraform files and added another subnet definition, causing "subnet1" to be marked for destruction. On my attempt to "apply" the changes, I encountered this hang in deletion.

@github-actions github-actions bot added the needs-triage Waiting for first response or review from a maintainer. label Jul 25, 2019
@aeschright aeschright added the service/ec2 Issues and PRs that pertain to the ec2 service. label Aug 2, 2019
@emmeowzing

I've just experienced this same thing, believe it or not.

aws_subnet.subnet_ovpn: Still destroying... [id=subnet-0e676ec34db23e1d7, 1m40s elapsed]
aws_subnet.subnet_ovpn: Still destroying... [id=subnet-0e676ec34db23e1d7, 1m50s elapsed]
aws_subnet.subnet_ovpn: Still destroying... [id=subnet-0e676ec34db23e1d7, 2m0s elapsed]
aws_subnet.subnet_ovpn: Still destroying... [id=subnet-0e676ec34db23e1d7, 2m10s elapsed]
aws_subnet.subnet_ovpn: Still destroying... [id=subnet-0e676ec34db23e1d7, 2m20s elapsed]
aws_subnet.subnet_ovpn: Still destroying... [id=subnet-0e676ec34db23e1d7, 2m30s elapsed]
aws_subnet.subnet_ovpn: Still destroying... [id=subnet-0e676ec34db23e1d7, 2m40s elapsed]
aws_subnet.subnet_ovpn: Still destroying... [id=subnet-0e676ec34db23e1d7, 2m50s elapsed]
...

It's unfortunate that this is still an open issue.

@romafederico

Any updates on this issue? I bumped into https://aws.amazon.com/blogs/compute/update-issue-affecting-hashicorp-terraform-resource-deletions-after-the-vpc-improvements-to-aws-lambda/ but I can't delete subnets even using provider v2.41.

@bflad
Contributor

bflad commented Dec 9, 2019

Hi folks 👋 If you are seeing DependencyViolation errors on EC2 Subnet deletions or long delays in EC2 Subnet deletion, the causes for these will be very specific to your environment and sometimes caused by AWS not properly cleaning up its own infrastructure. Some pointers that may help:

  • Ensure no infrastructure creating ENIs within the Subnet exists outside Terraform
  • For AWS resources that manage ENIs automatically (including but not limited to Network Load Balancers, Lambda Functions, or EKS Node Groups that automatically provision/delete ENIs), if they require IAM Role permissions to perform these actions, ensure that the Terraform resource with the IAM Role reference also includes explicit depends_on to the aws_iam_role_policy/aws_iam_role_policy_attachment resources so those permissions remain until the AWS resource that needs those permissions is deleted properly first
  • Check for lingering ENIs in the Subnet either via the web console (EC2 > Network Interfaces) or via the AWS CLI, e.g. aws ec2 describe-network-interfaces --filters Name=subnet-id,Values=subnet-XXXXXXXXX -- these should help narrow down AWS/Terraform resources that are causing the long deletion delays or DependencyViolation errors.
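
The `depends_on` pointer above can be sketched roughly as follows, using a VPC-attached Lambda function as the example. This is an illustration, not configuration from this issue: every resource name, the `lambda.zip` file, the handler, and the runtime are hypothetical, and `aws_vpc.main` and `aws_subnet.subnet1` are assumed to be the resources from the original report.

```hcl
# Hypothetical example: all names below are placeholders.
resource "aws_iam_role" "lambda" {
  name = "example-lambda-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "lambda.amazonaws.com" }
    }]
  })
}

# Grants the permissions Lambda needs to create and delete its ENIs.
resource "aws_iam_role_policy_attachment" "lambda_vpc" {
  role       = aws_iam_role.lambda.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole"
}

resource "aws_security_group" "lambda" {
  name_prefix = "example-lambda-"
  vpc_id      = aws_vpc.main.id
}

resource "aws_lambda_function" "example" {
  function_name = "example"
  role          = aws_iam_role.lambda.arn
  handler       = "index.handler" # assumed handler
  runtime       = "python3.12"    # assumed runtime
  filename      = "lambda.zip"    # assumed local artifact

  vpc_config {
    subnet_ids         = [aws_subnet.subnet1.id]
    security_group_ids = [aws_security_group.lambda.id]
  }

  # Without this, Terraform may detach the policy while the function
  # still exists, leaving AWS unable to clean up the function's ENIs.
  depends_on = [aws_iam_role_policy_attachment.lambda_vpc]
}
```

Because destroy order is the reverse of the dependency order, the explicit `depends_on` means the function is deleted before the policy attachment, so the permissions are still in place while the ENIs are being removed.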

@jlforester

jlforester commented Jan 13, 2020

The orphaned ENI issue is also being worked here:

aws/amazon-vpc-cni-k8s#608 (comment)

@def324

def324 commented Feb 15, 2020

I'm having the same issue as the OP when trying to change the availability zone of a subnet. Terraform wanted to update the auto scaling group in place, instead of destroying and recreating it. This made the subnet deletion fail as the subnet still had resources in it. There seems to be similar behavior for load balancers and RDS instances which terraform also wants to update in place.

I ended up destroying pretty much the entire infrastructure and recreating from scratch, that was the only workaround I could find.

@forwardmeasure

I have the same issue. In my setup, I create a VPC, an EKS cluster, multiple ASGs, etc. The good thing is that Terraform destroys the ASGs (and the EC2 instances, which are the costly resources). The bad thing is that the Internet Gateway, subnets, and network interfaces are left dangling.

I have noticed that they eventually get cleaned up, likely by a background job that AWS runs to deallocate dangling resources.

@syerad

syerad commented Apr 27, 2020

Same problem for me. In my TF script I'm trying to remove one availability zone and all the resources belonging to it. This isn't possible because TF tries to remove the subnet, which can't be deleted while it still has resources in it. Any ideas on how to solve this problem? Any suggestions other than destroying everything?

@muditmn5

I tried refreshing my credentials (access key and secret key).
This helped resolve my issue.

@ghost

ghost commented Oct 16, 2020

For anyone who got here because of Jenkins X on EKS, I had this issue too. The terraform destroy was stuck and couldn't delete the subnets or the internet gateway.

I manually deleted the NLB that had been created, and then re-ran the terraform destroy and then the project was deleted successfully.

@justinretzolk
Member

Hey y'all 👋 Thank you for taking the time to file this issue and the ongoing discussion! Given that there's been a number of AWS provider releases since this was initially filed, can anyone confirm whether you're still experiencing this behavior?

@justinretzolk justinretzolk added waiting-response Maintainers are waiting on response from community or contributor. and removed needs-triage Waiting for first response or review from a maintainer. labels Dec 9, 2021
@Nordle

Nordle commented Jan 19, 2022

I still see this behavior:

module.network.aws_subnet.publics["eu-west-1c"]: Still destroying... [id=subnet-09cca311d14ed9867, 11m40s elapsed]
module.network.aws_subnet.publics["eu-west-1c"]: Still destroying... [id=subnet-09cca311d14ed9867, 11m50s elapsed]
module.network.aws_subnet.publics["eu-west-1c"]: Still destroying... [id=subnet-09cca311d14ed9867, 12m0s elapsed]
module.network.aws_subnet.publics["eu-west-1c"]: Still destroying... [id=subnet-09cca311d14ed9867, 12m10s elapsed]

versions

Terraform v1.0.4
on linux_amd64
+ provider registry.terraform.io/datadog/datadog v2.20.0
+ provider registry.terraform.io/hashicorp/archive v2.2.0
+ provider registry.terraform.io/hashicorp/aws v3.71.0
+ provider registry.terraform.io/hashicorp/external v2.2.0
+ provider registry.terraform.io/hashicorp/null v3.1.0
+ provider registry.terraform.io/hashicorp/template v2.2.0
+ provider registry.terraform.io/mongodb/mongodbatlas v0.7.0

@github-actions github-actions bot removed the waiting-response Maintainers are waiting on response from community or contributor. label Jan 19, 2022
@justinretzolk justinretzolk added the bug Addresses a defect in current functionality. label Jan 21, 2022
@johndanek

Still a problem for me in 2023:

aws_subnet.rds-sub-4: Still destroying... [id=subnet-0110394d48dab67a4, 15m0s elapsed]
aws_subnet.rds-sub-3: Still destroying... [id=subnet-0d59b7b57b38d3b93, 15m0s elapsed]
aws_subnet.rds-sub-2: Still destroying... [id=subnet-0932eeb4d560314aa, 15m0s elapsed]
aws_subnet.rds-sub-6: Still destroying... [id=subnet-03ec4ecc7b434ce08, 15m0s elapsed]
aws_subnet.rds-sub-1: Still destroying... [id=subnet-094d00d45f3a3772e, 15m0s elapsed]

@BVADY

BVADY commented Feb 6, 2023

Still a problem. Even via the AWS CLI and Console I'm not able to destroy them, because the subnets contain one or more network interfaces and cannot be deleted until those network interfaces have been deleted. And when I try to delete those network interfaces, I get this error:
The network interfaces can't be deleted, Reason: Network interface is currently in use.

My workaround: apply again and then destroy.

@prashil-g

I am seeing the same issue:
image

@JafarBadour

Still have the same issue
image

@tdaly61

tdaly61 commented Aug 28, 2023

Just want to add my voice to this issue; I am seeing similar behavior.

@AnrichVS

AnrichVS commented Sep 6, 2023

Also having this issue. In my case the subnets cannot be deleted because Global Accelerator and Application Load Balancer ENIs are attached to them. These ENIs are created automatically by AWS.

I tried making the Global Accelerator depend on the subnets that are being destroyed so that TF also destroys the Global Accelerator, and thus hopefully also the attached ENIs, but this doesn't work either (the plan output doesn't indicate that the Global Accelerator will be destroyed).

@vitalyrychkov

Having the same issue. In my case, Terraform-created subnets cannot be deleted by Terraform because of AWS-created GuardDuty endpoints.

@gdavison gdavison self-assigned this Apr 9, 2024
@terraform-aws-provider terraform-aws-provider bot added the prioritized Part of the maintainer teams immediate focus. To be addressed within the current quarter. label Apr 9, 2024
@gdavison
Contributor

Hi everyone,

Thank you to everyone who's participated in this discussion. This issue has become a bit of a grab bag of issues related to deleting a subnet when it has EC2 instances or ENIs attached to it. In essence, a subnet cannot be deleted while it has instances or ENIs attached, either using Terraform or the AWS Console.

@bflad's comment (#9495 (comment)) gives good advice on troubleshooting which ENIs are still attached.
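
As an alternative to the CLI command in that comment, the same lookup can be sketched in Terraform with the `aws_network_interfaces` and `aws_network_interface` data sources. This is a sketch under assumptions: the variable and output names are hypothetical, it is meant to be run from a separate scratch configuration, and it assumes an AWS provider version recent enough to include both data sources.

```hcl
# Hypothetical input: the ID of the subnet that refuses to delete.
variable "stuck_subnet_id" {
  type        = string
  description = "ID of the subnet whose deletion hangs"
}

# Find every ENI that still lives in that subnet.
data "aws_network_interfaces" "lingering" {
  filter {
    name   = "subnet-id"
    values = [var.stuck_subnet_id]
  }
}

# Look up details for each ENI found above.
data "aws_network_interface" "detail" {
  for_each = toset(data.aws_network_interfaces.lingering.ids)
  id       = each.value
}

# The description and requester usually hint at which service owns the ENI.
output "lingering_enis" {
  value = {
    for id, eni in data.aws_network_interface.detail :
    id => {
      description    = eni.description
      interface_type = eni.interface_type
      requester_id   = eni.requester_id
    }
  }
}
```

Running `terraform apply -var 'stuck_subnet_id=subnet-XXXXXXXXX'` in that scratch directory prints the remaining ENIs, if any.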

Since this issue was opened, we have added special handling for a number of ENI types that are not immediately cleaned up when the associated resource is deleted, including those created by Lambda, Comprehend, and DMS. There may be other lingering ENI types that the provider does not yet handle.

In the originally reported issue, a subnet was in use by the autoscaling group and it was then removed from the configuration. If the subnet had been directly assigned to the ASG instead of being assigned via the data source, Terraform would have been aware of the association and should have handled removal appropriately. Unfortunately, there is no way to address this the way that the OP's configuration was written.
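
A minimal sketch of the direct assignment described above, reusing the resource names from the original report. The `locals` and the launch configuration are assumed to exist as in the OP's full configuration, which was not posted in its entirety.

```hcl
resource "aws_autoscaling_group" "ecs_cluster_spot" {
  name_prefix          = "ecs_cluster_spot"
  termination_policies = ["OldestInstance"]
  max_size             = local.max_spot_instances
  min_size             = local.min_spot_instances
  launch_configuration = aws_launch_configuration.ecs_config_launch_config_spot.name

  lifecycle {
    create_before_destroy = true
  }

  # Reference the subnet resource directly instead of going through
  # data.aws_subnet_ids.all_subnets.ids, so the dependency graph contains
  # an edge from the ASG to the subnet.
  vpc_zone_identifier = [aws_subnet.subnet1.id]
}
```

With the direct reference, removing `aws_subnet.subnet1` from the configuration forces the `vpc_zone_identifier` expression to change as well, so Terraform can plan the ASG update ahead of the subnet deletion. Instances already running in the old subnet still have to terminate before the subnet itself can be removed.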

It seems like some of the other commenters are similarly trying to delete only the subnet without first removing the resources attached to the subnet. This won't work in either Terraform or the AWS Console.

For other commenters, we'll need more information so that we can look into the problem that you're having. I'm going to close this issue so that we can focus specifically on individual problems instead of one issue that tries to capture all problems with deleting a subnet.

If you're encountering this problem, please open a new issue. If you'd like, you can reference this issue from the new issue to link them together. In your new issue, please include:

@gdavison gdavison closed this as not planned Apr 10, 2024

Warning

This issue has been closed, meaning that any additional comments are hard for our team to see. Please assume that the maintainers will not see them.

Ongoing conversations amongst community members are welcome, however, the issue will be locked after 30 days. Moving conversations to another venue, such as the AWS Provider forum, is recommended. If you have additional concerns, please open a new issue, referencing this one where needed.

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 11, 2024