
Terraform destroy of helm_release resources. #593

Closed
Ragib95 opened this issue Sep 29, 2020 · 13 comments

@Ragib95

Ragib95 commented Sep 29, 2020

Terraform Version and Provider Version

  • Terraform v0.12.26
  • provider.aws v2.65.0
  • provider.helm v1.3.0
  • provider.kubernetes v1.11.3

Provider Version

  • provider.helm v1.3.0

Affected Resource(s)

  • helm_release

Terraform Configuration Files


provider "helm" {
  kubernetes {
    load_config_file = false
    host             = "${aws_eks_cluster.aws_eks.endpoint}"

    cluster_ca_certificate = "${base64decode(aws_eks_cluster.aws_eks.certificate_authority.0.data)}"

    token = data.aws_eks_cluster_auth.main.token

  }
}

resource "helm_release" "nginx-ingress" {
  name             = "nginx-ingress"
  chart            = "/nginx-ingress/"
  namespace        = "opsera"
  create_namespace = true
  timeout          = 600

  values = [
    "${file("value.yaml")}"
  ]

  depends_on = [
    "aws_eks_node_group.node",
    "helm_release.cluster-autoscaler",
    "aws_acm_certificate.public_cert"
  ]
}

Debug Output

helm_release.nginx-ingress: Destroying... [id=nginx-ingress]
helm_release.nginx-ingress: Destruction complete after 8s
aws_eks_node_group.node: Destroying... [id=*****node]

Expected Behavior

Destruction of the helm_release should wait for all of its resources (pods, services, and ingress) to be fully destroyed before Terraform reports it as "Destruction complete".

Actual Behavior

The release reaches the "Destruction complete" state within 7-8 seconds, before its pods and services are fully destroyed. As a result, destruction of the EKS nodes starts while the ELB is still attached to the service.

Reason: before Helm has finished removing the pods and services, Terraform starts deleting the node group and cluster, leaving the pods stuck in the Terminating state.


Steps to Reproduce

  1. Create an EKS cluster with an Nginx ingress.
  2. Destroy the resources using terraform destroy.
  3. The destroy times out because the ELB attached to the Nginx service is never destroyed.

@Ragib95 Ragib95 added the bug label Sep 29, 2020
@alexsomesan
Member

@Ragib95 This is expected behaviour due to a limitation in Terraform that causes it to not recognise the implicit dependency between the Helm resource and the EKS cluster resource. Terraform tries to parallelise the destroy operations when no dependency is known between the resources. This can lead to the EKS cluster being destroyed before the Helm release itself.

I'd suggest setting an explicit dependency on the EKS cluster resource in the helm_release resource, like this:

depends_on = [
  aws_eks_cluster.aws_eks,
]
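
Applied to the configuration above, that would look roughly like the following (a sketch only, reusing the resource names from the reporter's config rather than a verified fix):

resource "helm_release" "nginx-ingress" {
  name             = "nginx-ingress"
  chart            = "/nginx-ingress/"
  namespace        = "opsera"
  create_namespace = true
  timeout          = 600

  values = [
    file("value.yaml")
  ]

  # Explicit dependency on the EKS cluster (and node group) so Terraform only
  # starts tearing down the cluster after the release has been destroyed.
  depends_on = [
    aws_eks_cluster.aws_eks,
    aws_eks_node_group.node,
  ]
}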

@aareet
Contributor

aareet commented Jan 6, 2021

We currently don't have a way to know what resources are created. We will have to wait for helm/helm#2378 to be implemented.

@aareet aareet added the upstream-helm Issue is in Helm not the provider label Jan 6, 2021
@devurandom

I am unable to terraform destroy -target=... a helm_release resource:

Error: uninstall: Release not loaded: metrics-server: release: not found

Is this another manifestation of this issue, or should I open a separate one?

@jocutajar

We currently don't have a way to know what resources are created. We will have to wait for helm/helm#2378 to be implemented.

Issue closed, but not fixed.

@visla-xugeng

I got the same error when I tried to destroy resources with Terraform. The Helm release was deleted, but the pods were left in "Terminating" status. I found that all of our Helm chart resources have this issue.

My Terraform structure:

  • dev: calls modules
  • prod: calls modules
  • modules: all resources (including Helm charts) are defined in the module directory

Any solutions or ideas?

@FearlessHyena

We currently don't have a way to know what resources are created. We will have to wait for helm/helm#2378 to be implemented.

Issue closed, but not fixed.

It looks like the referenced Helm issue has been fixed by helm/helm#9702.
Would that make it easier to solve this issue?

@avinashpancham

@alexsomesan as mentioned earlier in this thread helm/helm#9702 seems to solve this issue from within Helm.

I think it could then be solved in the Terraform Helm provider by adding a new wait_for_destroy argument that is passed through to the Helm uninstall command.

I don't know exactly how to do it, but if you could point me in the right direction I could give it a try.
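
For illustration only, a rough sketch of how such an argument might be used if it were added to the provider (wait_for_destroy is the hypothetical name proposed above; it does not exist in the provider today):

resource "helm_release" "metrics_server" {
  name  = "metrics-server"
  chart = "metrics-server"

  # Hypothetical argument: ask the provider to wait for the release's resources
  # to be removed during uninstall, building on the wait support added in helm/helm#9702.
  wait_for_destroy = true
}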

@ClenchPaign

Any update on the Terraform side for helm/helm#9702?

@jferris

jferris commented Feb 26, 2022

I believe this was resolved by #786. After upgrading the Helm provider to 2.4, the 'wait' attribute of the helm_release is respected during terraform destroy.
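
For anyone trying this, a minimal sketch of the relevant pieces (assuming the hashicorp/helm source address; wait already defaults to true and is shown explicitly only for clarity):

terraform {
  required_providers {
    helm = {
      source  = "hashicorp/helm"
      version = ">= 2.4.0" # per the comment above, wait is respected on destroy from 2.4
    }
  }
}

resource "helm_release" "nginx-ingress" {
  name  = "nginx-ingress"
  chart = "/nginx-ingress/"

  wait = true
}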

@RicoToothless

I believe this was resolved by #786. After upgrading the Helm provider to 2.4, the 'wait' attribute of the helm_release is respected during terraform destroy.

I don't think so.

I tried 2.4.1, 2.5.0 and 2.5.1.

wait didn't fix the issue for me (and its default value is already true, by the way).

@jocutajar

Hi, #786 is an impressive MR (to say the least)! I'm not brave enough to go dig into it. Do we need a test scenario for the wait on destroy?

@WillerWasTaken

Our current workaround, which isn't great, but... yeah...

resource "helm_release" "nginx_ingress_controller" {
  name       = local.service_name_ingress-nginx
  namespace  = var.namespace
  repository = "https://kubernetes.github.io/ingress-nginx"
  chart      = "ingress-nginx"
  version    = "4.2.1"

  values = [
    yamlencode(local.helm_chart_ingress-nginx_values)
  ]
  
  max_history = 3
  depends_on = [
    helm_release.aws_load_balancer_controller,
    time_sleep.wait_nginx_termination
  ]
}

# Helm chart destruction will return immediately, we need to wait until the pods are fully evicted
# https://github.com/hashicorp/terraform-provider-helm/issues/593
resource "time_sleep" "wait_nginx_termination" {
  destroy_duration = "${local.ingress_nginx_terminationGracePeriodSeconds}s"
}

A fixed sleep timer does the job; it waits longer than necessary, but it works for now :/

@github-actions

github-actions bot commented Sep 3, 2023

Marking this issue as stale due to inactivity. If this issue receives no comments in the next 30 days it will automatically be closed. If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. This helps our maintainers find and focus on the active issues. Maintainers may also remove the stale label at their discretion. Thank you!

@github-actions github-actions bot added the stale label Sep 3, 2023
@github-actions github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Oct 4, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 4, 2024