Module-aware explicit dependencies #17101

Closed
apparentlymart opened this issue Jan 13, 2018 · 16 comments · Fixed by #25005

@apparentlymart
Contributor

apparentlymart commented Jan 13, 2018

Terraform currently allows the declaration of explicit inter-resource dependencies using depends_on:

resource "example" "example1" {
}

resource "example" "example2" {
  depends_on = ["example.example1"]
}

The presence of the depends_on in the above example causes the graph builder to create a dependency edge from example2 to example1, which ensures that example1 is visited first during any graph traversal.

This mechanism does not generalize to other constructs within Terraform. In particular, it doesn't generalize to modules, since a module is not represented as a single node in the graph. Instead, each individual variable and output in a module is its own graph node, which allows us to optimize our parallelism by getting started on some aspects of a module before all of the input variables are ready, and to begin processing resources that depend on a module before all of its outputs are complete. Even though variables and outputs are in the graph, we do not currently support referring to them in depends_on.

The following proposal describes a generalization of the depends_on mechanism to apply to both resources and modules, with the goal of satisfying the use-cases discussed in #10462, allowing explicit dependencies on module variables and outputs, along with a syntax that creates the effect of an entire-module dependency.


New addressing forms for depends_on

We currently allow references to managed and data resources in depends_on. To support dependencies with modules, we must extend this to support the following forms:

  • aws_instance.example - managed resource dependency, as today
  • aws_instance.another_example[2] - a particular instance of a managed resource with count set
  • data.template_file.example - data resource dependency, as today
  • var.foo - dependency on an input variable passed by a parent module
  • module.example.foo - dependency on an output of a named child module
  • module.example - dependency on an entire module

Our improved configuration language parser (which, at the time of writing, is in the process of being integrated into Terraform Core) allows us to improve the depends_on syntax through direct use of expressions, rather than requiring these references to be inside quoted strings:

# DESIGN SKETCH: not yet implemented and may change before release

resource "example" "example2" {
  depends_on = [
    aws_instance.example,
    aws_instance.another_example[2],
    data.template_file.example,
    var.foo,
    module.example.foo,
    module.example,
  ]
}

This syntax will be used for the examples in the remainder of this proposal.

Support depends_on as a module block argument

The above allows modules to be used as explicit dependencies, but we need to additionally support depends_on inside module blocks in order to allow modules to have dependencies:

# DESIGN SKETCH: not yet implemented and may change before release

module "example" {
  depends_on = [
    aws_instance.example,
  ]
}

Depending on a Module Variable

At first glance, an explicit dependency on a var.foo expression feels a little strange: variables don't have externally-visible side-effects, so it's strange to want to depend on them without using their result.

However, allowing explicit dependencies on variables creates a mechanism for the author of a more-complex reusable module to create custom depends_on-like attributes that serve to block subsets of the functionality of the module. For example:

# DESIGN SKETCH: not yet implemented and may change before release

### in root module

module "database" {
}

module "app" {
  ami_id = "ami-1234"
  app_server_depends_on = [
    module.database,
  ]
}

### in module "app"

variable "app_server_depends_on" {
  default = []
}

resource "aws_security_group" "foo" {
  # Work on _this_ resource can begin immediately
  # ...
}


resource "aws_instance" "app_server" {
  ami = var.ami_id
  # ...

  # We can't create this resource until the caller tells us that it's
  # prepared some hidden dependencies.
  depends_on = [
    var.app_server_depends_on,
  ]
}

This makes it possible to create a re-usable module for deploying arbitrary applications (parameterized by an AMI to deploy, etc), which can immediately create supporting resources like the security group in this example, but defer creating the actual compute resources until some arbitrary, caller-defined dependencies have been dealt with. The caller knows that ami-1234 expects to have a database available to it on boot, while the re-usable module has no direct knowledge of that database.

The value of app_server_depends_on in the above example is not itself significant. Instead, we effectively pass the dependencies of that expression through to the module by creating a transitive dependency relationship in the graph.

Depending on a Whole Module

As noted above, modules are not represented directly by graph nodes today, so whole-module dependencies (either as dependencies or dependents) require some new graph-building functionality.

The most likely user intent for a dependency of the form module.example is to wait until everything in the module has completed before continuing. This behavior would have a severe impact on Terraform's ability to achieve parallelism though, and so this proposal suggests a compromise for when depends_on references a whole module: treat this as an alias for depending on each of the module's outputs, but not on any resources or nested modules.

[Figure: Terraform graph where a nested module called "example" has two resources, example1 and example2, where only example1 is a dependency of the module's outputs]

The biggest consequence of this compromise is that in the above example null_resource.example will block until module.example.null_resource.example2 is complete, but will not wait for module.example.null_resource.example3 because none of the module's outputs depend on that resource.

This consequence gives a measure of flexibility and control for the module author, however: if the author knows that the module performs a time-consuming operation but that this operation does not block access to the objects that the caller will depend on then this can be expressed by making that operation not be a dependency of the outputs. From the module caller's perspective, the module can still be thought of as a black box, with the module author designing it such that all significant effects of the module are referenced in an output. In effect, the module author uses output blocks to define what it means for the module to be considered "complete".
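As a rough sketch, the configuration this example describes might look something like the following. The output name `done` and the empty resource bodies are assumptions made for illustration, and the syntax follows the design-sketch form used elsewhere in this proposal:

```hcl
# DESIGN SKETCH: hypothetical illustration, not yet implemented and may change

### root module

module "example" {
}

resource "null_resource" "example" {
  # Treated as a dependency on every output of module.example
  depends_on = [
    module.example,
  ]
}

### module "example"

resource "null_resource" "example2" {
}

resource "null_resource" "example3" {
  # No output refers to this resource, so callers do not wait for it
}

output "done" {
  # Hypothetical output; referencing example2 makes it part of what
  # "complete" means for this module
  value = null_resource.example2.id
}
```

Under the proposed rule, null_resource.example in the root module would wait for the `done` output, and therefore for example2, while example3 could still be in progress.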

The improved configuration language, whose integration is in progress as we write this, allows passing the result of an entire module as a value into another module:

# DESIGN SKETCH: not yet implemented and may change before release

### root module
module "example1" {
}
module "example2" {
  example1 = module.example1
}

### module example1

output "id" {
  value = "placeholder-id"
}

### module example2

variable "example1" {
}

resource "null_resource" "example" {
  triggers = {
    example1_id = var.example1.id
  }
}

This new usage creates an implicit dependency between module.example2.var.example1 and all of the outputs of module.example1, since they must all be complete before the language runtime can construct the value of module.example1 to assign. This implicit usage further reinforces the idea that only the outputs are dependencies in this case, because that is what is necessary to construct the object value returned by module.example1.

Whole-module depends_on

Using depends_on in a module block will also limit parallelism, but the impact is less severe in this case because the effect is under the direct control of the caller module, and so its author can make a tradeoff to decide at what point the limited parallelism hurts enough to warrant more precise dependency handling:

# DESIGN SKETCH: not yet implemented and may change before release

### root module

variable "baz" {
}

resource "null_resource" "example1" {
  triggers = {
    example = "hello"
  }
}

module "example" {
  foo = var.baz

  depends_on = [
    null_resource.example1,
  ]
}

### module "example"

variable "foo" {
}

resource "null_resource" "example2" {
  triggers = {
    foo = var.foo
  }
}

resource "null_resource" "example3" {
}

module "example2" {
}

### module "example2"

resource "null_resource" "example4" {
}

Dependencies pointing away from the module (that is, the module itself depending on something else) require the creation of a new "begin" graph node for the module that declares depends_on. That node must then be a dependency of every resource in the module and of any downstream modules. To reduce the number of graph edges, a "begin" node will be created for each of the downstream modules too, so that only one additional edge needs to be added between the modules (to connect the "begin" nodes).

A "begin" graph node takes no action when visited during a walk and so just serves as an aggregation point to reduce the number of dependency edges. For a module block without depends_on the "begin" graph node can be safely optimized away, along with its incoming dependency edges, during graph construction.

depends_on in other contexts

depends_on can be useful for any Terraform construct that causes externally-visible side-effects, as a means to influence the ordering of those side-effects.

Provider initialization also sometimes has side effects, such as reaching out to an external network service to begin a session or to validate credentials. depends_on could therefore also be useful in provider blocks, as described in #2430. However, providers are special in that they need to be instantiated in all phases of Terraform's operation, and thus it is not always possible to force an ordering for provider initialization relative to resource creation as described in #4149. Implementation of depends_on for modules should not block on the implementation of "partial apply", but we should reserve the depends_on argument for provider blocks as part of implementing this proposal to minimize the risk that a provider in the wild will introduce its own depends_on configuration argument that would then be in conflict.

output, variable and locals blocks do not have any externally-visible side-effects and so depends_on would not serve any useful purpose for these blocks; it is always safe to evaluate the corresponding graph nodes as soon as their implicit dependencies become ready.

provisioner blocks within managed resources are not currently represented as separate graph nodes, and so they are processed as part of a create action for their parent resource node.

@apparentlymart
Contributor Author

At the time of writing the Terraform Core team at HashiCorp is focused on integrating the improved configuration language parser/interpreter, so we are not yet able to begin prototyping and implementation of this proposal.

However, I'm sharing this now because some parts of this proposal overlap with the configuration language improvements and so we'd like to lay some groundwork during our current project to ease the later implementation of this feature.

@herry13

herry13 commented Oct 15, 2018

Hi, I’m interested in this proposal. Is there any update on this idea?

@jayudhandha

Hi Folks,

Any update on this feature?

@apparentlymart
Contributor Author

Hi all,

We've been laying some groundwork for this during the v0.12 release development cycle, but won't be able to get it all done before v0.12.0 final due to scope.

The portions of this that should be included in v0.12.0 will be dependency edges from resources to modules, as opposed to the other way around or between modules. In other words, it will be possible to write something like this, with the behavior described in the proposal above:

resource "example" "example" {
  depends_on = [module.foo]

  # ...
}

This comes along with the ability to use an object value representing all of the outputs of a module together as an expression, which builds on the same mechanism:

module "a" {
  # ...
}
module "b" {
  # ...

  # Pass an object representing _all_ of the outputs of module a, which
  # then implicitly depends on all of those outputs.
  a_result = module.a
}

The main thing we were not able to include for v0.12 was the ability for a module as a whole to depend on something else, as opposed to individual variables of that module. This is more complex because it requires Terraform to generate a different shape of graph than it traditionally has and so we want to introduce that change separately from the various other configuration language changes in v0.12 so it can be more effectively tested in isolation. Since we use an iterative planning style we don't know yet when that follow on work would begin, but we'll post more updates here when we have them.

@jayudhandha

jayudhandha commented Oct 25, 2018

@apparentlymart
Appreciate your effort! 👍

@flmmartins

Any update on this?

@carct

carct commented Jun 12, 2019

Still no updates? :( I'm in pain due to this lack of determinism. :(

@mildwonkey
Contributor

Hi folks!

I know this is a long-awaited feature, but "+1" and "any update" comments create noise for others watching the issue and ultimately don't influence our prioritization.

Instead, please react to the original issue comment with 👍, which we can and do report on during prioritization.

@tscully49

tscully49 commented Aug 23, 2019

Would it help push this feature if an official feature request was logged for this functionality?

Edit: Or is this a feature request? 👀

@mildwonkey
Contributor

Hi @tscully49, good question! This issue is labeled "enhancement", which we use to track feature requests.

@guitarmanvt

It took me a while to figure out Terraform resource dependencies. My first instinct was to try module dependencies as this feature request suggests. If that had worked, I wouldn't have had to struggle so much getting something like module dependencies to work.

Since this was such a pain, I documented everything I learned about Terraform dependencies. This includes:

  • Resource dependencies (with depends_on)
  • Module dependencies (by hooking outputs into inputs)
  • A more explicit work-around using a special module_dependency input and module_complete output.
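A minimal sketch of what the third workaround could look like, assuming Terraform 0.12 syntax. This is not copied from the linked repository: the module path, the resource names, and the way the input is threaded through `triggers` are all assumptions made for illustration.

```hcl
### root module

resource "null_resource" "prerequisite" {
}

module "first" {
  source = "./modules/step" # hypothetical local module path

  # Referencing the resource makes var.module_dependency depend on it
  module_dependency = null_resource.prerequisite.id
}

module "second" {
  source = "./modules/step" # same hypothetical module, used a second time

  # Chains module.second behind everything module_complete depends on
  module_dependency = module.first.module_complete
}

### module "step"

variable "module_dependency" {
  type    = string
  default = ""
}

resource "null_resource" "work" {
  # Using the variable creates an implicit dependency on whatever the
  # caller passed in
  triggers = {
    dependency = var.module_dependency
  }
}

output "module_complete" {
  # Only known once null_resource.work has been created
  value = null_resource.work.id
}
```

The ordering here comes entirely from ordinary implicit references, so it should not need any module-level depends_on support.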

So far, so good. I'm able to make deterministic dependencies between modules and resources.

The docs and working code examples can be found here:

https://github.com/guitarmanvt/terraform-dependencies-explained

I hope this helps someone. :)

@peturgq

peturgq commented Feb 10, 2020

No news on this?

jsf9k added a commit to cisagov/cool-sharedservices-networking that referenced this issue Feb 20, 2020
Since Terraform does not yet support depends_on for modules, it is
necessary to run an initial partial apply (to attach the
ProvisionNetworking policy to the ProvisionAccount role) before
running a full terraform apply.

This is something that will be fixed in the future.  See
hashicorp/terraform#17101 for details.
@martincastrocm

martincastrocm commented Mar 25, 2020

Hi everyone! any update here?

@danieldreier
Contributor

@martincastrocm @peturgq my update to "depends_on cannot be used in a module" is relevant here. We are planning on adding module depends_on during the 0.13.x lifecycle. We're not 100% sure yet if this will make it into 0.13.0 or a later 0.13.x release.

@danieldreier
Contributor

I'm very excited to announce that beta 1 of Terraform 0.13.0 will be available on June 3rd, and will include module depends_on. I've pinned an issue with more details about the beta program, and posted a discuss thread for folks who want to talk about it more.
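For reference, a minimal sketch of a module-level depends_on in Terraform 0.13 syntax; the module path and resource names are hypothetical:

```hcl
resource "null_resource" "prerequisite" {
}

module "example" {
  source = "./modules/example" # hypothetical local module path

  # Everything inside the module waits for null_resource.prerequisite
  depends_on = [
    null_resource.prerequisite,
  ]
}
```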

@ghost

ghost commented Jun 28, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Jun 28, 2020