Feature: Allow circular dependencies in resources #27188
Comments
Hi @dansimau! Thanks for sharing this use-case. As you've noted, the typical way to deal with this today is for the provider to explain to Terraform that "update cognito user pool to add lambda trigger" is a separate operation by representing it as a separate resource. That creates a relatively easy to explain execution model: there is only one action for each resource per plan (with the special exception of "replace", which is internally a combined destroy/create), and the ordering of those actions is derived from the dependencies between those resources. Off the top of my head I'm not able to imagine a general solution to this which doesn't require the provider to give Terraform enough information to understand that, in your case, it's allowable and reasonable to create a cognito user pool without a lambda trigger at first and then update it later. Any design that requires additional information in the provider schema would not meet the use-case as you framed it, where additional work in the provider was your criteria for failing the current design as a suitable solution. Since your request here explicitly excludes the current design as a possible answer, but there isn't yet a candidate new design to evaluate, I'm going to leave this open for the moment but I want to be explicit that it will likely be closed unless someone suggests a concrete technical design for further discussion, because we (the Terraform team at HashiCorp) consider this problem already "solved" in the sense that there is a way for a provider to represent the sequence of three operations you described. 
In our conception of Terraform's architecture, we consider it the provider's primary responsibility to map from the concepts of the remote API onto Terraform's workflow, and so although it would be nice to find some way to "automate away" this design problem, architecturally there is no particular need to do so, and if the AWS provider doesn't offer a way to associate a lambda trigger with a cognito pool as a separate operation then I expect it will be far more expedient to work on a specific technical design for the AWS provider to address that than to try to design a generalized solution for hypothetical additional problems that we are not yet aware of. At the very least, we'll need several more examples of similar problems in order to start to analyze what they all have in common and thus how the problem might generalize. |
Thanks for the considered reply @apparentlymart.
Indeed, I'd be interested to know how often this comes up. Judging by the fact that nobody filed an issue before, maybe not as much as I originally assumed when I hit this use case. |
I don't have a proposed solution, but I can provide another example. This circular dependency scenario happens in the AzureRM provider with app services and Azure-managed SSL certificates. The azurerm_app_service resource can be given a custom hostname and SSL certificate via the azurerm_app_service_custom_hostname_binding resource. You normally specify the SSL certificate to use via the 'fingerprint' attribute, which is the SSL fingerprint of the desired certificate. If you wish to use a free Azure Managed Certificate via the azurerm_app_service_managed_certificate resource, a circular dependency is created: azurerm_app_service_managed_certificate requires an azurerm_app_service_custom_hostname_binding, but azurerm_app_service_custom_hostname_binding requires the fingerprint from azurerm_app_service_managed_certificate in order to attach the certificate. (I work around the problems with a judicious use of a local-exec provisioner and ignoring changes to some attributes, so I bring this up just to provide another example use-case) Edit: Ironically, as I was typing this out a new release of the AzureRM provider eliminated this particular circular dependency. 🤣 |
There's also the more general use-case of wanting to build resources that communicate with each other in Terraform. From a what-does-a-solution-look-like perspective, perhaps a configuration block similar in structure/functionality to a provisioner that fires after resource creation, but used to provide for a delayed attribute update/change instead? As a vague example for such a post_create block:
The end-result being that each resource gets created first without SOME_OTHER_KEY being present in app_settings{}, then updated post-creation in the same plan to add it. Referencing the other resource like this would allow for appropriate dependency ordering, hopefully? And after successful creation the results of the post_create can be merged into the regular state so future plans work normally? This would solve almost all of the use-cases for circular dependencies I've run into, I think, including the original Cognito-/Lambda-oriented presented here, and would also allow for more natively-Terraform workarounds for cases where providers haven't caught up with addressing circular resource dependencies like the one I mentioned in my previous comment. |
That's an interesting idea, @okaros, and reminds me a bit of functional reactive programming where programs react to events by merging the event data in with a previous value. It does seem like an idea worth researching in some more detail. Some initial thoughts I have for questions to consider would be:
It does seem like a promising direction to investigate, but also not an easy thing to prototype with Terraform as it exists today. 🤔 I would like to consider it more though, so thanks for suggesting it. |
I don't have full answers, @apparentlymart, but some thoughts from my end-user perspective: An initial or even final implementation might simply say "You can't do that" with regards to nested blocks that aren't addressable, or even nested blocks altogether. Other Terraform functionality has limitations on what can be interacted with ("destroy" provisioners come immediately to mind as an example of something with heavy restrictions), so such limitations wouldn't be unprecedented. A solution that worked with everything would of course be ideal, but for me, at least, even a limited solution would be a welcome improvement. Multiple post_create updates would be interesting, although I'm not sure I can see a use-case where they'd be needed (at least, not without introducing additional layers and the concept of post_post_create, which strikes me as being... too much). But if they are, perhaps they could be handled in the same fashion as provisioners, with multiple blocks simply being handled serially both in the written HCL blocks and in planning/execution? I'd envisioned the post_create block as being limited to addressing attributes on the attached resource and not being capable of adjusting other resources, and two different resources with post_create blocks would only be able to reference attribute values available during the initial creation. I.e., if my example2 app service from above tries to reference example1.app_settings, it only sees SOME_KEY and not SOME_OTHER_KEY (but once the resources were created, SOME_OTHER_KEY would be available and indistinguishable from SOME_KEY). Currently provisioners aren't shown at plan-time at all, and those are my closest analogue to this idea, so.... 🤣 |
@okaros I know it doesn't solve this particular issue but I feel it needs pointing out that the event data passed to the lambda trigger does include the Cognito user pool id. |
I have two Okta orgs managed via https://registry.terraform.io/providers/oktadeveloper/okta and I want to set up a SAML based trust between them. Creating the resources in step 1 and 2 generates unique identifiers that must be exchanged, and cannot be known in advance. Steps:
Using the
# idp is created first, with placeholder for argB
resource "okta_saml_idp" "idp" {
  argA = "value"
  argB = "placeholder"
  post_create {
    argB = okta_app_saml.sp.some.value
  }
}
# sp is created later, due to dependency on idp.somevalue
resource "okta_app_saml" "sp" {
  argA = okta_saml_idp.idp.someother.value
  argB = "value"
}
# finally, post_create can execute as its dependency is satisfied now.
Note, I simplified for brevity (removed the needed multiple providers, used example arg/attrib names). I see three alternatives to solving this via
|
AWS Transit Gateway provides routing between multiple VPCs, replacing VPC Peering. Setting this up involves circular dependencies because the TGW must be explicitly attached to the VPCs (requiring knowledge of the It makes a lot of sense to manage the (many) VPCs with their TGW routes independently (note1) of the (one) TGW with its VPC attachments. However, if you break the dependency cycle by setting up the VPC route tables after the VPCs and TGW exist, then you can't manage the VPC because the "new" routes are discovered in subsequent plans. On the other hand, if you set up the TGW first without any attachments, then manage the attachments and route tables from inside each VPC, then that undermines the value of using TGW to centrally administer routes between VPCs. I'm not sure how I could use the proposed note1: By "independently", I mean, "resources managed in distinct tfstate files". |
I've been working around this issue using blue/green and dev environments for my app, but then I ran into an AWS issue that left AppSync domains in a state where they were unusable for hours (separate Terraform issue regarding custom domain disassociation). These were the original stacks:
This worked well as I could change a config var and point the Route 53 domain name to either the blue or green stack easily, but to help illustrate the point below, notice that the global resources and datastack resources are required by other resources in the stacks downstream. So now we're at the key problem. I wanted to safeguard against this aforementioned issue (and potential others like downed resources between regions, etc.) by creating a "region" stack layer, so that I could replicate the LiveDataStack, DevDataStack, Blue/Green/Dev AppStacks into another/multiple regions like this:
But because of the circular dependencies, this is not possible (or at least I have not been able to find a way to do this). |
I tried to use Terraform to set up a Snowflake "Storage Integration" object that links to an AWS S3 bucket using the "chanzuckerberg" Snowflake provider from the Terraform registry in addition to the standard AWS provider. Part of the process to create the integration requires the following sequence of actions:
Hence there is a circular dependency between the IAM Role and Storage Integration. Steps 1 and 2 are straightforward but step 3 involves modifying an object's state after it has been created. The IAM Role access policy cannot be modified separately from the role itself. |
I've run into another use case: trying to manage content with a series of messages, each with a "back to top" link which would link to the table of contents. The table of contents, of course, needs to be able to link to all the other posts. This is another instance of "I need mutually referential identifiers". My alternative suggestion is that two-phase creation could be explicitly supported at the platform level, as there are APIs that allow reservation of resources much more cheaply than full creation. Something like:
This design could be extended to multiple phases, but it's not immediately clear you'd want that. |
I'm running into this in OCI. Creating a custom route table and assigning that route table to the subnet works without issue. The problem comes when I also want to create route rules in that route table. For example, I have a subnet defined and that subnet will have an Ubuntu instance in it along with a Palo Alto firewall instance. I need a route table assigned to the subnet that makes the trust interface IP of the firewall the default gateway for the subnet. Here are the components that need to work together: The problem is the circular dependencies. The route rule depends on the network_entity_id of the private IP. That private IP depends on the subnet. The subnet depends on the route table at creation. Everything works until a route rule is specified that includes the id of the private IP as that creates the circular reference. The subnet can't use the route table because the route table has a rule in it that points to the private IP which can't be created before the subnet is created. |
hashicorp/terraform-provider-aws#1824 Another valid use case is cycles in AWS security groups or prefix lists. |
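For the security group case at least, a cycle between two groups can usually be avoided by declaring the rules as separate resources, so that each group is created empty first and the cross-references live only in the rules. The following is a minimal sketch, not a tested configuration; the group names, the port, and the `vpc_id` variable are assumptions made for illustration.

```hcl
# Assumed input: the VPC that both groups live in.
variable "vpc_id" {
  type = string
}

# Both groups are created without inline rules, so neither refers to the other.
resource "aws_security_group" "app" {
  name   = "example-app"
  vpc_id = var.vpc_id
}

resource "aws_security_group" "db" {
  name   = "example-db"
  vpc_id = var.vpc_id
}

# The mutual references are expressed as standalone rules that depend on
# both groups, which keeps the dependency graph acyclic.
resource "aws_security_group_rule" "app_to_db" {
  type                     = "egress"
  from_port                = 5432
  to_port                  = 5432
  protocol                 = "tcp"
  security_group_id        = aws_security_group.app.id
  source_security_group_id = aws_security_group.db.id
}

resource "aws_security_group_rule" "db_from_app" {
  type                     = "ingress"
  from_port                = 5432
  to_port                  = 5432
  protocol                 = "tcp"
  security_group_id        = aws_security_group.db.id
  source_security_group_id = aws_security_group.app.id
}
```

Writing the same rules as inline `ingress`/`egress` blocks on the two groups would make each group depend on the other, which is the cycle Terraform rejects.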
We bumped into this while trying to pass the invokeURL of an AWS gateway resource to a lambda as an environment variable because the gateway has endpoints that route to the lambda. |
I have the same use case that jeffg-hpe posted above, and to expand on it a little, the provider can't cleanly handle this one because building it requires multiple instances of the provider, one targeting the IdP tenant, the other targeting the SP tenant each with distinct API endpoints and auth tokens. So the typical approach of adding a virtual resource to the provider that manages multiple resources under the hood doesn't work here because those resources exist in disparate environments. His third alternative approach, while novel, seems sloppy for a provider. It'd require something like:
So we're left with local-exec or multi-stage deployment unless this can be handled as a feature of Terraform. |
Do you have an example of this typical way? I am researching and not sure how two separate operations (to the same resource I assume?) are modeled as separate resources. Our use case is configuring snowflake. The manual process is:
Some possible mechanisms I've heard referenced in my research are dynamic data sources, dynamic variables, and now this multiple operations. But I haven't worked out yet how to implement any of them. |
One example of this pattern that I can think of quickly is in the There are separate resource types for |
Oh, is that why so many aws features have separate resource types? Does that mean that in places where there are blocks that could be separate resources, the direction of travel is towards the latter? |
There is a separate team responsible for the I suspect you're recalling that earlier versions of the provider just had a single From discussions from the provider teams my understanding is that their modern design approach is to closely match the structure of the underlying API to avoid this sort of design inconsistency in the fine details. That goal might explain other API changes where certain single resource types were split into many separate resource types in later releases, but I'm not involved with the detailed planning of that and I only know about the S3 example because I've previously helped folks in the community who had problems caused by the old design. If you'd like to discuss more about how the Thanks! |
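As an illustration of why splitting one object across several resource types helps with cycles, here is a minimal sketch assuming the S3 case refers to a bucket whose policy is managed by a separate `aws_s3_bucket_policy` resource. The bucket name, role name, and policy contents are hypothetical, and the snippet has not been applied against a real account.

```hcl
# The bucket itself carries no policy, so it has no dependencies.
resource "aws_s3_bucket" "data" {
  bucket = "example-data-bucket"
}

# A role whose inline policy needs the bucket ARN.
resource "aws_iam_role" "reader" {
  name = "example-reader"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy" "reader_access" {
  name = "read-bucket"
  role = aws_iam_role.reader.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject"]
      Resource = "${aws_s3_bucket.data.arn}/*"
    }]
  })
}

# The bucket policy needs the role ARN. Because it is its own resource,
# it is simply applied last, after both the bucket and the role exist.
resource "aws_s3_bucket_policy" "data" {
  bucket = aws_s3_bucket.data.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { AWS = aws_iam_role.reader.arn }
      Action    = ["s3:GetObject"]
      Resource  = "${aws_s3_bucket.data.arn}/*"
    }]
  })
}
```

If the policy were an argument of the bucket resource itself, the bucket would depend on the role while the role's policy depends on the bucket, and the configuration would not plan.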
Another use case analogous to @apparentlymart 's S3 case above: locking KMS keys to any resource that uses them with a
with a key policy for the KMS key of:
|
3 years later. Any updates here? |
@nibblesnbits Based on scanning @apparentlymart's comments, I would not expect this behavior to change in Terraform v1.x. This is the type of issue the team likes to leave open to generate ideas and use cases for a "hypothetical v2." For future viewers, if you are viewing this issue and would like to indicate your interest, please use the 👍 reaction on the issue description to upvote this issue. Thanks! |
I'll add my use case to the pile. Using the respective hashicorp AWS modules for lambda and eventbridge. A dependency cycle exists when creating a lambda with an eventbridge rule as a trigger. The lambda needs the ARN of the rule to add the required resource permissions to the lambda so it can be invoked by the event rule. The event rule also needs the lambda function ARN to be able to create the lambda target |
Hi @stevemckenney! Thanks for sharing that feedback. Could you link to the specific modules you're referring to? I'm not aware of any HashiCorp-maintained modules for either Lambda or EventBridge, and I wasn't able to find any relevant-seeming modules in the partner-maintained AWS modules. I'd like to be able to see exactly how those modules are configuring Lambda and EventBridge to understand how the circular dependency arises. In the
That sequence therefore avoids any circular dependency because the permission is modeled as a separate resource. I'm guessing that the modules you are trying to use make it hard or impossible to declare that sequence of events. Therefore I'd like to study those modules to understand why that is, and thus what specific changes we might potentially make to the Terraform language to avoid that problem. Thanks again! |
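To make that concrete, here is a minimal sketch of such a sequence written directly against AWS provider resources rather than the modules in question. The function settings, the schedule, and the `lambda_role_arn` variable are assumptions for illustration only.

```hcl
# Assumed input: an existing execution role for the function.
variable "lambda_role_arn" {
  type = string
}

# 1. The function does not refer to the rule at all.
resource "aws_lambda_function" "job" {
  function_name = "example-job"
  role          = var.lambda_role_arn
  runtime       = "python3.12"
  handler       = "main.handler"
  filename      = "job.zip"
}

# 2. The rule does not refer to the function either.
resource "aws_cloudwatch_event_rule" "schedule" {
  name                = "example-schedule"
  schedule_expression = "rate(5 minutes)"
}

# 3. The target connects the rule to the function.
resource "aws_cloudwatch_event_target" "job" {
  rule = aws_cloudwatch_event_rule.schedule.name
  arn  = aws_lambda_function.job.arn
}

# 4. The permission is its own resource, so it can depend on both
#    the function and the rule without creating a cycle.
resource "aws_lambda_permission" "allow_events" {
  statement_id  = "AllowExecutionFromEventBridge"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.job.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.schedule.arn
}
```

Here the function and the rule are independent of each other, and only the target and the permission refer to both.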
Thanks for those links, @stevemckenney. I'm not super familiar with these modules, but from peeping in the source code for a little while it seems like something like this might work:
module "lambda" {
  source = "terraform-aws-modules/lambda/aws"
  # ...
  runtime     = "..."
  source_path = "..."
  # ...
  allowed_triggers = {
    for k, arn in module.events.eventbridge_rule_arns : k => {
      service    = "events"
      source_arn = arn
      # ...
    }
  }
}
module "events" {
  source = "terraform-aws-modules/eventbridge/aws"
  # ...
  targets = {
    crons = [
      {
        name  = "lambda-cron"
        arn   = module.lambda.lambda_function_arn
        input = jsonencode({ "job" : "cron-by-rate" })
      }
    ]
  }
  # ...
}
This relies on the fact that the parts of the Lambda module that configure the function itself don't refer to I don't have an AWS account handy with which to test this right now, but I notice that the Of course, if that doesn't work then putting a permission resource separately outside both of the modules ought to work, as you said. |
I'm having an issue where a lambda function is adding the ARN of a stepfunction as its environment variable
However, the stepfunction also needs the ARN of this lambda function
How can I go about this? Is it possible to set the environment variable after the stepfunction has been created? |
Hi @esirK, For situations like that a typical strategy would be to add some sort of indirection. That means that instead of passing the step function ARN directly to the Lambda function, you'd instead pass some information that the Lambda function can use to find the step function dynamically at runtime. Of course, you will need to be able to tolerate there being a brief period at the start of the Lambda function's life when the step function doesn't exist yet. Another possibility would be to split your function into two functions, where one is triggered by the step function and the other uses the step function itself. Unfortunately, the AWS Lambda API expects environment variables to be set in the same API call that creates the Lambda function and so the |
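One possible form of that indirection is to give the state machine a name chosen in configuration and let the function derive the ARN from it, instead of reading the ARN from the state machine resource. The sketch below assumes the standard commercial `aws` partition; the variable names, function settings, and state machine definition are hypothetical.

```hcl
# Assumed inputs, not part of the original question.
variable "aws_region" {
  type = string
}

variable "lambda_role_arn" {
  type = string
}

variable "sfn_role_arn" {
  type = string
}

data "aws_caller_identity" "current" {}

locals {
  # The name is decided up front, so it is known before either resource exists.
  state_machine_name = "example-workflow"
  state_machine_arn  = "arn:aws:states:${var.aws_region}:${data.aws_caller_identity.current.account_id}:stateMachine:${local.state_machine_name}"
}

resource "aws_lambda_function" "starter" {
  function_name = "example-starter"
  role          = var.lambda_role_arn
  runtime       = "python3.12"
  handler       = "main.handler"
  filename      = "starter.zip"

  environment {
    variables = {
      # Built from the name, so there is no reference to the state machine resource.
      STATE_MACHINE_ARN = local.state_machine_arn
    }
  }
}

resource "aws_sfn_state_machine" "workflow" {
  name     = local.state_machine_name
  role_arn = var.sfn_role_arn
  definition = jsonencode({
    StartAt = "Invoke"
    States = {
      Invoke = {
        Type     = "Task"
        Resource = aws_lambda_function.starter.arn
        End      = true
      }
    }
  })
}
```

As noted above, the function has to tolerate the short window in which it already exists but the state machine with that name does not.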
Thanks @apparentlymart |
Another use case: |
Current Terraform Version
Use-cases
When I configure this in Terraform, it obviously doesn't work. I get:
However, the use case above is a real-world circular dependency that is legitimate. Outside of Terraform, it would be a 3-step process to configure this, e.g. one of the ways would be:
(You could also create the other resource first, but the steps are the same: Create resource A; Create resource B; Update resource A).
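The "create A, create B, update A" sequence can be approximated today with the local-exec workaround mentioned elsewhere in this thread. The sketch below assumes the Lambda function needs the user pool ID in its environment; `attach-trigger.sh` is a hypothetical script that would call the Cognito API to add the trigger, and `lambda_role_arn` is an assumed input. It is an illustration of the workaround, not a recommended design.

```hcl
variable "lambda_role_arn" {
  type = string
}

# Step 1: create resource A (the user pool) without the trigger.
resource "aws_cognito_user_pool" "pool" {
  name = "example-pool"

  lifecycle {
    # The trigger is attached outside Terraform's view in step 3,
    # so later plans should not try to remove it.
    ignore_changes = [lambda_config]
  }
}

# Step 2: create resource B, which refers to A.
resource "aws_lambda_function" "trigger" {
  function_name = "example-trigger"
  role          = var.lambda_role_arn
  runtime       = "python3.12"
  handler       = "main.handler"
  filename      = "trigger.zip"

  environment {
    variables = {
      USER_POOL_ID = aws_cognito_user_pool.pool.id
    }
  }
}

# Step 3: update A to point at B, via a provisioner.
resource "terraform_data" "attach_trigger" {
  # Re-run the script whenever the function ARN changes.
  triggers_replace = [aws_lambda_function.trigger.arn]

  provisioner "local-exec" {
    command = "./attach-trigger.sh" # hypothetical script
    environment = {
      USER_POOL_ID = aws_cognito_user_pool.pool.id
      LAMBDA_ARN   = aws_lambda_function.trigger.arn
    }
  }
}
```

The cost of this approach is that the third step is invisible to the plan and to state, which is part of why the request is for Terraform to handle the update itself.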
Attempted Solutions
Proposal
Is there a way in which Terraform could attempt to resolve cycles automatically by doing a create A, create B, then update A?
Sorry if this suggestion seems naïve, I admit I'm not familiar with Terraform internals. However, I imagine it to require:
The idea here is that this would be a general solution. The observation here is that resource cycles are a legitimate and real-world use case that need to be dealt with in a general way.
References
I did a search to try and find prior discussions on this but I couldn't find any specific feature request around representing or allowing resource dependency cycles.