Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Terraform apply destroys resources that are still referenced from other resources #31309

Open
prashantv opened this issue Jun 23, 2022 · 1 comment
Labels
core enhancement explained a Terraform Core team member has described the root cause of this issue in code

Comments

@prashantv
Copy link

Terraform Version

Terraform v1.2.3
on darwin_arm64

Terraform Configuration Files

This configuration relies on a dummy provider, see github/prashantv/terraform-dep-order-repro

resource "tftest_notifier" "n1" {
  email = "[email protected]"
}

resource "tftest_notifier" "n2" {
  email = "[email protected]"
}

resource "tftest_policy" "p1" {
  notifier_ids = [
    tftest_notifier.n1.id,
    tftest_notifier.n2.id,
  ]
}

ref

After applying the above, remove "n2" and the reference to it,

resource "tftest_notifier" "n1" {
  email = "[email protected]"
}

resource "tftest_policy" "p1" {
  notifier_ids = [
    tftest_notifier.n1.id,
  ]
}

Applying this will fail as n2 is destroyed first, while there is still a policy referencing it.

Debug Output

https://gist.github.com/prashantv/45859f8607e690ff22990c310381f8c8

Expected Behavior

When the config removing "n2" is applied, the reference is removed first, and then n2 is removed successfully.

Actual Behavior

Terraform tries to remove "n2" before updating the policy reference, and the remove fails since the notifier is still referenced by a policy.

│ Error: failed to delete Notifier: cannot delete notifier, as policies [policy-174441.453993] still refer to it

Steps to Reproduce

Full steps for reproduction are in the repo.

Summary is:

  • Use a dummy provider with 2 resources: notifiers/policies. policies can reference multiple notifiers.
  • Apply: notifiers "n1" & "n2", and policy "p1" which references "n1" & "n2"
  • Remove "n2" and the reference "n2" in "p2"
--- a/repro/main.tf
+++ b/repro/main.tf
@@ -11,13 +11,13 @@ resource "tftest_notifier" "n1" {
   email = "[email protected]"
 }

-resource "tftest_notifier" "n2" {
-  email = "[email protected]"
-}
+# resource "tftest_notifier" "n2" {
+# email = "[email protected]"
+# }

 resource "tftest_policy" "p1" {
   notifier_ids = [
     tftest_notifier.n1.id,
-    tftest_notifier.n2.id,
+    # tftest_notifier.n2.id,
   ]
 }
  • Run terraform apply, which will try to destroy "n2" first. (which is still referenced, and hence fails).

Additional Context

There is a workaround which causes the destroy to happen after the update: setting the lifecycle meta-argument create_before_destroy. This has to be set on the resoruce and applied before the delete operation. However, this workaround has some issues:

  • Every user of the provider would need to set this meta lifecycle argument as part of every single resource which is a poor user experience (noisy, duplicated across possibly hundreds of resources, and not very obvious that they need to do this). There's no way for a provider to specify a default create_before_destroy.
  • create_before_destroy is not appropriate for some resources. If a resource has a field that must be unique (e.g., name), and a separate attribute that has ForceNew, then create_before_destroy is not compatible with these resources, so the workaround does not work. We acutally have this combination -- the notifier resource has a name that must be unique, and a user-selected immutable ID field that uses ForceNew, so create_before_destroy shifts the problem to users being unable to change names.

Ideally the solution would be:

  • A provider configuration to indicate that destroys of a resource cannot happen till all references to the reference have been updated
  • The configuration doesn't cause creates to happen before destroys (so it's compatible with resources that have fields that must be unique + ForceNew fields).
  • The user doesn't need to do anything special.

References

There are a few existing issues which are almost all closed:

create_before_destroy is recommended in many of the above, but as mentioned in "Additional Context", it's a workaround with drawbacks rather than a solution to this problem.

@prashantv prashantv added bug new new issue not yet triaged labels Jun 23, 2022
@jbardin
Copy link
Member

jbardin commented Jun 23, 2022

Hi @prashantv,

Thanks for filing the issue, and the extensive references you've collected! As you've mentioned, the only solution presented by Terraform to obtain this order of operations is create_before_destroy. The additional enhancements proposed here are not new ideas, but have not been implemented for a couple of primary reasons:

  • Inverting the order of tftest_notifier.n2.id(destroy) and tftest_policy.p1(update) only works in isolation, in a larger graph it introduces cycles (see https://github.com/hashicorp/terraform/blob/main/docs/destroying.md for some notes on replacement and destroy ordering). If tftest_notifier.n2.id were being replaced rather than destroyed the order of operations would always cause a cycle even in the simple case of 2 resources. If we ignore the cycles in the simple case of replacement, then tftest_policy.p1(update) would need to happen twice in a single apply, once to remove the old tftest_notifier.n2.id, then again to add the new tftest_notifier.n2.id, which isn't possible within the current architecture of Terraform.
  • If a provider were to be able to specify resource ordering (hashicorp/terraform-plugin-sdk#585) it would affect other resources in the graph, which may be from other providers not expecting that ordering. For example, because create_before_destroy inverts the edges between the create and destroy subgraphs of resource dependencies, that inversion must be cascaded upstream to prevent cycles during replacement. We've reserved this ability to only come from the user creating the configuration, because only that user has the context to understand the changes as a whole.

Unfortunately this means as of now the problem falls mostly on the providers. Any resource which requires this sort of "registration pattern" should be designed to allow create_before_destroy, at least optionally, by using auto-generated identifiers. The need to put create_before_destroy in the configuration is up to the provider to document, but also the user to discover, which I also agree is not optimal here.

Even in the case that a usable provider-forced option for a new ordering were possible, it would still be up to the providers to implement. Seeing the number of cases where create_before_destroy wasn't documented or allowed by resources which require it, it might not meaningfully reduce the number of overall bugs encountered.

While we're not opposed to the possibility of such a feature, there haven't yet been any proposals which could meat the two fundamental requirements of; not imposing changes on resources outside of that provider's control, not causing cycles in arbitrarily large or complex graphs. Since Terraform is working as designed here, I'm going to re-label this as an enhancement proposal. This can more or less serve as the core equivalent of hashicorp/terraform-plugin-sdk#585, which is meant to cover the same usability aspects.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core enhancement explained a Terraform Core team member has described the root cause of this issue in code
Projects
None yet
Development

No branches or pull requests

2 participants