Terraform passes incorrect provider through when provider depends on resources which need replacing #31520
Comments
Hi @gtmtech, I'm not sure exactly what's going on here from the description alone. Would it be possible to supply a trace log of the failed plan? Thanks!
@jbardin - sorry just coming back to this. So when you have the following in a dependency chain:
.. and you have successfully terraformed the lot, but then a change to 2 requires 3 to be recreated (e.g. via triggers{} on 3 referencing an attribute from 2)... If 3 needs to be recreated, terraform can't create provider Y for the refresh phase (as Y depends on 3, but 3 now needs recreating). During the refresh phase, terraform incorrectly assigns provider X to try to refresh 5, rather than provider Y, because Y is no longer available. It should error or do something else, not assign the wrong provider.
Hi @gtmtech, I don't think the problem here is the use of the incorrect provider (at least there has been no evidence yet that that could happen, though the logs would verify it), but rather that the provider's configuration is incomplete when it depends on resources that have not yet been created.
If the provider requires the full configuration in order to complete the plan operation, you won't be able to plan and apply a configuration like this without the use of `-target`. When given a partially known configuration, most providers opt to attempt whatever operations they need during plan, but generally act as if the unknown value were unset, so they may fail outright or produce unexpected results.
Thanks @jbardin - I certainly did not realise that basing a provider on attributes of a managed resource was not supported behaviour, as in most cases it works, but I appreciate the docs that you mention. However, if it's not supported behaviour, I would argue an error needs to be thrown. Assigning the wrong provider to a resource is, in my view, of critical importance and should not be tolerated under any circumstances - for example, it may lead to API requests being made against the wrong AWS account, with all sorts of implications.
I don't see any evidence here (yet) of the incorrect provider being assigned within Terraform, but an incomplete configuration could make it act like another provider configuration with the same defaults. If that's the case, there's not much we can do within the current protocol, because that is also a perfectly valid mode of operation for some providers.
I will try and record a video sometime to show you this error @jbardin 👍
I have seen exactly the same problem with Databricks workspaces.
If you make a change that causes the workspace resource to change, then Terraform isn't able to refresh the databricks_cluster resource because the provider doesn't exist yet. Often the error will be something that looks authentication-related and causes an enormous amount of confusion. Kubernetes providers tend to have the same problem. I will make a feature request for a warning or more meaningful error -- the docs say not to do it, but the initial TF apply does work, and the error on a subsequent apply is completely non-obvious. It would be better to report a warning during the parse / validation phase.
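For reference, the Databricks case looks roughly like the sketch below. This is a minimal illustration only: the resource and argument names are taken from the azurerm and databricks providers as I understand them, not from the original report.

```hcl
# Sketch only - names and values are illustrative.
resource "azurerm_databricks_workspace" "this" {
  name                = "example"
  resource_group_name = "example-rg"
  location            = "westeurope"
  sku                 = "standard"
}

# The provider can only be configured once the workspace exists; if a change
# forces the workspace to be replaced, the provider cannot be configured
# during refresh and downstream resources fail with confusing auth errors.
provider "databricks" {
  host = "https://${azurerm_databricks_workspace.this.workspace_url}"
}

resource "databricks_cluster" "example" {
  cluster_name  = "example"
  spark_version = "11.3.x-scala2.12"
  node_type_id  = "Standard_DS3_v2"
  num_workers   = 1
}
```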
Since we don't have any evidence of an incorrect provider instance being used, I'm going to close this under the assumption that the behaviour was caused by an incomplete configuration for a provider. Thanks!
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
Terraform Version
Terraform 1.2.5
AWS Provider 4.23.0
Description
I found an obscure but critical case where terraform does not use the correct provider when refreshing a resource. It is quite convoluted, but since terraform should NEVER mix providers up, I definitely feel it's worth raising.
The problem happens with so-called dynamic providers, that is, providers which are not instantiated immediately but part-way through a terraform run.
Recall that a provider without an alias is instantiated at the beginning of the terraform run, but a provider with an alias can be made to instantiate part-way through the run if the provider definition depends on resources that do not yet exist.
This is quite a cool feature: it allows us to create a role, and then a provider based on that role, to use in creating further resources as that role. This is a very useful way of provisioning multiple accounts, for example, all in one terraform run.
However, I experienced a critical issue where, all of a sudden, terraform switched back to using the initial provider to try to refresh the later resources that are supposed to be provisioned using the midway dynamic provider.
Here is some code to better illustrate.
Terraform Configuration Files
All of this works the first time you use it: provider foo creates role bar, and provider bar - based off the newly created role bar - creates policy1. Great!
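The original configuration files aren't reproduced here, so below is a minimal sketch of the shape being described. The provider aliases match the description; the role name, region, account ID, policies, and the exact expression tying the provider to the time_sleep resource are assumptions.

```hcl
provider "aws" {
  alias  = "foo"
  region = "eu-west-1"
}

# Role created by provider aws.foo (trust policy and account ID are placeholders).
resource "aws_iam_role" "bar" {
  provider = aws.foo
  name     = "bar"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { AWS = "arn:aws:iam::111111111111:root" }
      Action    = "sts:AssumeRole"
    }]
  })
}

# Give IAM time to propagate; the triggers map carries the role ARN, so the
# aws.bar provider below depends on this resource.
resource "time_sleep" "wait" {
  create_duration = "30s"
  triggers = {
    role_arn = aws_iam_role.bar.arn
  }
}

# "Dynamic" provider that assumes the freshly created role.
provider "aws" {
  alias  = "bar"
  region = "eu-west-1"
  assume_role {
    role_arn = time_sleep.wait.triggers["role_arn"]
  }
}

# Resource provisioned as the new role.
resource "aws_iam_policy" "policy1" {
  provider = aws.bar
  name     = "policy1"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:ListAllMyBuckets"]
      Resource = "*"
    }]
  })
}
```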
But now I realised my triggers were wrong, so I changed the time_sleep resource like this:
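The exact edit isn't shown in the report; the sketch below illustrates the kind of change meant, assuming a key was added to the triggers map. Any change to triggers forces the time_sleep resource to be replaced.

```hcl
resource "time_sleep" "wait" {
  create_duration = "30s"
  triggers = {
    role_arn  = aws_iam_role.bar.arn
    role_name = aws_iam_role.bar.name # hypothetical new key -> forces replacement
  }
}
```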
Now when I run terraform, I get very strange behaviour like this:
That's odd, I think - the code for policy1 hasn't changed at all, and the IAM policy policy1 is clearly set to use the aws.bar provider, not the aws.foo provider, so why is terraform trying to use the foo provider to refresh it?
When I turned on TF_LOG=debug, I saw absolutely no requests to the STS endpoint trying to assume the bar role.
What is happening here is that provider bar depends on the time_sleep resource, and in this case the time_sleep resource needs to be recreated, so the aws.bar provider, which depends on that resource, cannot be evaluated until that recreation happens.
But when refreshing the whole plan, terraform has to try to refresh the IAM policy policy1. Without an available aws.bar provider at this point (because the time_sleep must be applied before policy1 can be refreshed), terraform decided to unilaterally use the original provider foo.
This warrants a fix of some sort - it took absolutely ages to track down, and I would like to have seen a message that terraform couldn't refresh the policy1 resource because the underlying provider couldn't be created. That would have been much more helpful than assigning the wrong provider to the refresh task.
When I finally figured out what was going wrong, I was able to target-apply the time_sleep resource to fix the new triggers, and this "freed up" the internal terraform logic to create the aws.bar provider and refresh policy1 with the correct provider again.
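A sketch of that workaround, assuming the resource address from the example configuration above (the `-target` flag is the standard Terraform CLI option for applying a subset of resources):

```sh
# Apply only the time_sleep so the aws.bar provider can be configured again,
# then run a normal apply.
terraform apply -target=time_sleep.wait
terraform apply
```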