Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

validate: ignore providers with no configuration #24896

Merged
merged 14 commits into from
Jan 7, 2021

Conversation

bendrucker
Copy link
Contributor

@bendrucker bendrucker commented May 8, 2020

This PR changes the behavior of terraform validate so that provider schema validation is bypassed if the user does not supply provider configuration. The strategy is pretty simple: if the provider config validates against any empty schema, consider it valid. In other words, if the configuration is completely empty, skip validation against the provider schema. Otherwise, proceed to fetch the provider schema, build the configuration against it, and report errors as usual.

I could particularly use guidance on any side effects this might have, particularly if validation is used ahead of other steps that might depend on current behavior.

What follows is a bit of background on the different variations on provider configuration I'm expecting.


The below provider configurations, many of which @apparentlymart described in #21408 (comment), can become valid at runtime but will currently raise errors during a terraform validate.

Child Module

When authoring a child module, it's useful to terraform validate, particularly because it interacts with provider plugins. It's Terraform's "compile" step—we might still have runtime errors, but we're using types correctly.

  1. Inherited providers

No provider declared, the default instance of this provider will be inherited from the parent

resource "aws_instance" "foo" {
  instance_type = "t3.nano"
  ami = "abc-123"
}
  1. Proxy provider blocks
provider "aws" {
  alias = "foo"
}

resource "aws_instance" "foo" {
  provider = aws.foo
  instance_type = "t3.nano"
  ami = "abc-123"
}

foo is declared but not configured, and we expect the caller to pass providers = { aws.foo = ... }.

  1. Fully configured provider blocks
resource "aws_organizations_account" "account" {
  name = "my_new_account"
  email = "[email protected]"
  role_name = "OrganizationAccountAccessRole"
}

provider "aws" {
  alias = "account"

  assume_role {
    role_arn = "arn:aws:iam:${aws_organizations_account.account.id}:role/${aws_organizations_account.account.role_name}"
  }
}

resource "aws_instance" "foo" {
  provider = aws.foo
  instance_type = "t3.nano"
  ami = "abc-123"
}

The module inherits a provider in order to dynamically configure another provider (here in a new AWS account). Or perhaps the module just accepts variables and then configures the provider, though this should be uncommon.

  1. Unnecessarily declared provider blocks

I included this as an edit, because my change will end up handling them.

provider "aws" {
  version = "~> 2"
}

resource "aws_instance" "foo" {
  instance_type = "t3.nano"
  ami = "abc-123"
}

In a child module, this is unnecessary and will lead to weird bugs when the user inevitably doesn't set providers = { aws = aws } once and the child provider has none of the configuration the module is supposed to inherit.

Terraform could warn on version-only provider blocks in child modules, probably even in validate, I'd be up for working on that in a separate PR.

Root Module

All the same uses of terraform validate apply to root modules. The provider configuration is more likely to be populated, but isn't necessarily so.

  1. Implicit provider
resource "aws_instance" "foo" {
  instance_type = "t3.nano"
  ami = "abc-123"
}

The configuration doesn't declare a provider.

  1. Declared empty provider
provider "aws" {}

resource "aws_instance" "foo" {
  instance_type = "t3.nano"
  ami = "abc-123"
}

The provider is declared but has no configuration. This configuration is valid at runtime if environment variables satisfy the required provider configuration. And I believe it is also required to enable provider passing to a module, e.g. providers = { aws.foo = aws } as in the case 2 for child modules.

My experience as a Terraform user has been that writing provider configuration is a rarity and I'm able to rely on environment variables almost exclusively. This has translated nicely to Terraform Cloud (Enterprise in my case) where we've wired up the tfe_workspace and tfe_variable resources in a module that distributes credentials for ~10 providers to the workspaces that need them such that the Terraform configuration repos/directories have zero provider configuration required.

  1. Configured providers
provider "aws" {
  region = "us-west-2"
}

resource "aws_instance" "foo" {
  instance_type = "t3.nano"
  ami = "abc-123"
}

The provider supplies at least partial configuration. At this point I think we have to validate what we're given. If provider authors want to have required non-credential fields, they'll need to make the credentials fields optional, and validate in the ConfigureFunc. And even then, there are good reasons to make credentials optional (https://github.com/terraform-providers/terraform-provider-datadog/pull/474).

Even the somewhat odd required features {} block in the Azure provider would seem to be compatible with this stance:

https://www.terraform.io/docs/providers/azurerm/index.html#example-usage

This may not entirely satisfy users who'd like all required fields to be optional, but I don't see how we can accomplish that without throwing away legitimate errors. Azure is a good example there, features {} is required, it would be wrong for terraform validate to preempt or discard an error. Required fields are often credentials which probably use schema.EnvDefaultFunc, but there's no simple/safe heuristic Terraform could use to ignore those errors.

Closes #21408

@codecov
Copy link

codecov bot commented May 8, 2020

Codecov Report

Merging #24896 (a39273c) into master (9ac8e3c) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

Impacted Files Coverage Δ
terraform/node_provider.go 88.63% <100.00%> (-0.62%) ⬇️
dag/marshal.go 61.90% <0.00%> (-1.59%) ⬇️

@jsf9k
Copy link

jsf9k commented May 13, 2020

I had to turn take the terraform validate guardrails off my CI configuration because of this very bug. Very glad to see that it is getting fixed! Thanks @bendrucker!

@bendrucker
Copy link
Contributor Author

Willing and eager to help this land as part of 0.13 if I can!

@pselle
Copy link
Contributor

pselle commented Nov 5, 2020

@bendrucker Thank you for this PR, and apologies for the wait! If you could update this PR on top of the latest main branch, I'll review!

We are currently in the 0.14 release cycle, so this would be going into 0.15, for transparency (the main branch is targeting 0.15)

@pselle pselle self-assigned this Nov 5, 2020
@bendrucker
Copy link
Contributor Author

Hi! Happy to update, thrilled to hear this might make 0.15.

@bendrucker
Copy link
Contributor Author

Hey @pselle, merged with master! Relocated this logic to its new location, but the change itself remains the same.

To my note above:

I could particularly use guidance on any side effects this might have, particularly if validation is used ahead of other steps that might depend on current behavior.

I noticed that ConfigureProvider makes no assumptions that the configuration has already been validated, since it implicitly re-validates as part of evaluating the config:

https://github.com/hashicorp/terraform/blob/master/terraform/node_provider.go#L95-L100

So this wouldn't have a side effect of causing a ConfigureFunc to panic because a required attribute is missing. Or in other words, it should only affect the validate command.

@pselle pselle self-requested a review November 12, 2020 22:08
backend/local/backend_refresh_test.go Show resolved Hide resolved
@@ -595,7 +595,7 @@ func TestContext2Validate_providerConfig_bad(t *testing.T) {
}
}

func TestContext2Validate_providerConfig_badEmpty(t *testing.T) {
func TestContext2Validate_providerConfig_skippedEmpty(t *testing.T) {
m := testModule(t, "validate-bad-pc-empty")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To match things up, it could be nice to rename this file to match "skipped" versus "bad"

@@ -48,6 +48,13 @@ func (n *NodeApplyableProvider) ValidateProvider(ctx EvalContext, provider provi

configBody := buildProviderConfig(ctx, n.Addr, n.ProviderConfig())

// if provider config is empty, return early
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More explanation here in the comment would be nice -- why are we returning early? (explain this)

Suggestion, mashed up from the suggestions that led to this PR:

// if a provider config is empty (only an alias), return early and don't continue
// validation. validate doesn't need to fully configure the provider itself, so 
// skipping a provider with an implied configuration won't prevent other validation from completing.

@@ -48,6 +48,13 @@ func (n *NodeApplyableProvider) ValidateProvider(ctx EvalContext, provider provi

configBody := buildProviderConfig(ctx, n.Addr, n.ProviderConfig())

// if provider config is empty, return early
emptySchema := &configschema.Block{}
_, _, evalDiags := ctx.EvaluateBlock(configBody, emptySchema, nil, EvalDataForNoInstanceKey)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silly Terraform tricks: another way to do this (the approach here is totally valid!)

_, noConfigDiags := configBody.Content(&hcl.BodySchema{})
if !noConfigDiags.HasErrors() {
	return nil
}

This is more to share, and offer if you like it, but your way works absolutely fine.

@pselle
Copy link
Contributor

pselle commented Nov 17, 2020

I noticed that ConfigureProvider makes no assumptions that the configuration has already been validated, since it implicitly re-validates as part of evaluating the config:

The section you've highlighted shows that there's no check to ensure that resp.Provider.Block isn't nil. The validation request still happens here: https://github.com/hashicorp/terraform/blob/master/terraform/node_provider.go#L122-L128 within the ConfigureProvider method.

To say, I've read through to make sure the validation is still happening -- the thing that is not happening is the nil check, which is in the ValidateProvider function with a note that "it should never happen in real code" so I'm inclined, if we didn't run into any issues in the tests, to add that nil check when it's warranted by a real use case.

With that, this is looking merge-able to me, provided you rebase the PR (sorry about that 😅 ).

@bendrucker
Copy link
Contributor Author

Appreciate all the notes! Will address everything and fix the conflict this week.

@bendrucker
Copy link
Contributor Author

Ok, took me a bit longer than expected to get back to this, but should be ready! All comments addressed, and the conflict is resolved.

@bendrucker
Copy link
Contributor Author

There was a meaningful conflict with #27261, which added improved error handling when a provider block is not defined. I've run into that before, particularly in Terraform Cloud, so I was glad to see that improvement.

In order to resolve the conflict, I made the following changes here:

  • Removed the conditional sourceless diagnostic from ValidateProvider, since the earlier check against an empty schema added here makes that an unreachable branch
  • Modified the validate tests so that rather than a missing required attribute there is a type mismatch
  • Added test with a missing required attribute which should not result in errors

ConfigureProvider still retains all the new behavior.

@bendrucker bendrucker requested a review from pselle December 14, 2020 22:49
Copy link
Contributor

@pselle pselle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates and rebases (and patience!!) here @bendrucker! Happy to get this merged now!

@pselle pselle merged commit c9f372a into hashicorp:master Jan 7, 2021
@bendrucker
Copy link
Contributor Author

Awesome, thrilled to see it land!

@bendrucker bendrucker deleted the validate-ignore-empty-provider branch January 7, 2021 23:18
@drdamour
Copy link

drdamour commented Jan 8, 2021

@bendrucker so what does the azurerm provider need to do so that it can take advantage of this? i didn't understand your early explanation unfortunately

@bendrucker
Copy link
Contributor Author

Providers don't necessarily need to do anything. Sorry but I'm not going to provide support for your setup, please don't snipe pull requests like this.

@drdamour
Copy link

drdamour commented Jan 8, 2021

@bendrucker not sure what you mean by snipe, this pr closed an issue the azure rm folks had pointed us to a few times regarding validate requiring the functions block. Sorry my ignorance offended you some way

@drdamour
Copy link

drdamour commented Jan 8, 2021

@pselle
Copy link
Contributor

pselle commented Jan 8, 2021

@drdamour To clarify "when can I use this" from the Core side, you'll see this behavior in the 0.15 version of Terraform. This code will go out in the next alpha release, and it would be very helpful for you to test this with the AzureRM intersection when that's available (about over two and a half weeks from now is when we expect to ship the next alpha). You won't need to upgrade the provider/change the provider to get access to this behavior.

@drdamour
Copy link

drdamour commented Jan 8, 2021

@pselle my question was more around your original comments which seemed to indicate further steps are needed

“ If provider authors want to have required non-credential fields, they'll need to make the credentials fields optional, and validate in the ConfigureFunc. And even then, there are good reasons to make credentials optional (https://github.com/terraform-providers/terraform-provider-datadog/pull/474).

Even the somewhat odd required features {} block in the Azure provider would seem to be compatible with this stance”

That seemed like a call to action for the azurerm provider

@pselle
Copy link
Contributor

pselle commented Jan 8, 2021

@drdamour Yes, I indeed misinterpreted. Not being familiar with the AzureRM provider, I'd recommend raising this issue with them, but I'd really recommend only doing so after you've tested with the next alpha to see if it's relevant.

@ghost
Copy link

ghost commented Feb 7, 2021

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked as resolved and limited conversation to collaborators Feb 7, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

"terraform validate" should not fail on implicit provider config when provider arguments are required
4 participants