CLIP-1701: Wait until pods in Helm release are terminated before destroying nfs module #282
Conversation
Out of curiosity, where can I find the part that waits an extra 5 seconds?
@jjeongatl oops, my bad. That was the original plan, but just the termination_grace_period is enough, since it takes some time to delete the Helm release itself and the pod already starts terminating in the meantime.
modules/products/bamboo/helm.tf
Outdated
-  depends_on = [kubernetes_job.import_dataset]
+  depends_on = [
+    kubernetes_job.import_dataset,
+    module.nfs,
Do we need to have module.nfs here? time_sleep already depends on nfs, so we can remove it here since the dependency is embedded.
Nice catch. Yes, the dependency on the nfs module is implied here. It is a leftover from trying to get away with just a dependency on nfs (without the sleep). Removed this redundant dependency.
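For illustration only (the resource names below are assumptions, not the exact code in this PR), the release's depends_on after removing the redundant entry could look like this:

```hcl
resource "helm_release" "bamboo" {
  # ... chart, repository, values, etc. omitted ...

  # module.nfs is reached transitively through the time_sleep resource,
  # so it does not need to be listed here explicitly.
  depends_on = [
    kubernetes_job.import_dataset,
    time_sleep.wait_for_pod_termination, # this resource itself depends on module.nfs
  ]
}
```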
This PR fixes an issue with pods stuck in a Terminating state when destroying infrastructure. The problem was with the deletion of the shared-home PVC, which is stuck in pending as long as its pod is stuck in Terminating.
Helm release server pods often get stuck in Terminating (unless the termination grace period is set to 0) for the following reason:
Destruction of the nfs module (which includes the nfs Helm release and the EBS volume with its PV and PVCs) and of the product Helm releases happens almost simultaneously. As a result, while, say, the Confluence pod is still in Terminating (the preStop hook can take some time), the EBS volume that backs the underlying PV and PVCs gets destroyed as well. The Confluence container then enters a broken state, and kubelet cannot kill the pod, reporting an error.
Trying to delete the Docker container directly from the node confirms that the container is indeed unresponsive.
As a result, the Confluence pod is stuck in Terminating, the shared-home PVC is stuck too and stays in this state until the Confluence pod exits, and Terraform eventually gives up waiting for the PVC deletion.
Having investigated the issue, and not being able to reproduce it manually with helm delete, it became obvious that helm_release destruction needs to wait for all pods to be wiped out, or else there is a chance that critical pieces of infrastructure are destroyed while a pod is still being terminated. Unfortunately, the Helm provider cannot wait for pods to be deleted; it only waits for the release itself to be deleted. See: hashicorp/terraform-provider-helm#593 and helm/helm#2378
The workaround is to introduce a destroy-time wait between the product Helm releases and the nfs module: a time_sleep resource that depends on the nfs module and whose destroy duration matches the pods' termination grace period, with the product helm_release resources depending on that time_sleep.
This way the following deletion order is achieved: product Helm releases first, then the wait, and only then the nfs module.
We're giving the Helm release pods time to be terminated, and only then are the EBS volume, PV and PVCs deleted.
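As a rough sketch of this pattern (the resource name, duration, and surrounding configuration are assumptions, not the exact code from this PR), a time_sleep resource from the hashicorp/time provider sits between the product releases and the nfs module:

```hcl
# Hypothetical sketch: delay destruction of module.nfs until the Helm release
# pods have had their termination grace period to shut down cleanly.
# Requires the hashicorp/time provider.
resource "time_sleep" "wait_for_pod_termination" {
  depends_on = [module.nfs]

  # Assumed to match the pods' termination grace period.
  destroy_duration = "30s"
}
```

Because Terraform destroys resources in reverse dependency order, each product helm_release (which depends on the time_sleep) is deleted first, then the destroy_duration wait runs, and only afterwards is module.nfs, with its EBS volume, PV and PVCs, destroyed.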
Checklist