Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cross sub creation failure on keyvault embedded resources #14047

Closed

Conversation

magodo
Copy link
Collaborator

@magodo magodo commented Nov 5, 2021

The existance check on the Keyvault in the Keyvault embedded resources are redundent here.

Even worse, it will cause failures to create a a embedded resource (e.g. a secret) from subB in a Keyvault that is created in the subA, since the Resource list API used will default to assume the scope is subB in the Read function of the embedded resource, which causes issues described in #11059.

Also, removing the existance check on the Keyvault also eliminates the potential issues caused by the resource list API as discussed in #13719.

Fixes #11059.

@tombuildsstuff
Copy link
Contributor

hey @magodo

This code it's used to workaround an issue where the Storage Account has been deleted outside of Terraform, which means that any calls to the Data Plane API return a DNS failure rather than an error.

We intentionally use the Key Vault ID rather than the Key Vault Data Plane URI on these nested resources so that we can detect that these nested resources should be deleted when the parent resource requires changes in Terraform. In addition caches and locks within the Azure Provider assume that we're scoped to a Subscription, rather than attempting to manage cross-subscription - as such we intentionally require that users use a provider alias for secondary subscriptions for that purpose.

From what I can see in #11059 this is similar to the problem discussed in #13409 (comment), where it appears the Azure SDK for Go (or rather the base library Azure/go-autorest) isn't correctly handling 429's on Data Plane API's - so rather than trying to work that, it'd be better to fix that problem in Azure/go-autorest so that this is solved for all Data Plane resources. As such whilst I'd like to thank you for this contribution, since we're looking to take a different approach here, I'm going to close this PR for the moment.

Thanks!

@magodo
Copy link
Collaborator Author

magodo commented Nov 18, 2021

@tombuildsstuff I've tested locally that given a vault and a secret, if I remove the vault via the DELETE API. Then the terraform plan can correctly zero the ID for both the vault and the secret. Also from the API sequence, the GET on the secret returns 404 rather than a DNS failure.

Also worth mentioning, this PR is not only meant to support the cross subscription issue. But to remove the unnecessary query on the vault resource, which in this case causes intermittent issues, probably due to the Azure/go-autorest.

@tombuildsstuff
Copy link
Contributor

@magodo fwiw this issue highlights the scenario I'm referring too (albeit this is retrying): Azure/azure-sdk-for-go#16371 - the related problem here being that the Azure DNS is cached for extended periods depending on where it's accessed from

Since there's no confirmation that this fixes the issue (as we have no reliable repro for this issue) - this needs to be fixed in the base layer of the Azure SDK here unfortunately.

@magodo
Copy link
Collaborator Author

magodo commented Dec 2, 2021

@tombuildsstuff Thanks for providing the issue, that clarifies a lot! Do you think it makes sense that we use a context with small intervals when reading the secret, and gracefully handle the DNSError, e.g. resetting the ID in Read()?

BTW, there is one confirmation that this PR works in the issue: #11059 (comment).

@tombuildsstuff
Copy link
Contributor

@magodo if I'm honest based on this comment it sounds like the source of this is an upstream bug in the API which has since been resolved, so I don't think this is something we need to fix in Terraform at all, the Service Team has fixed the broken API?

@tiwood
Copy link
Contributor

tiwood commented Dec 3, 2021

I've just faced this issue and wanted to chime in to discuss this.

We're creating enterprise applications (azuread applications and service principals) for various workloads with a custom Terraform module. Each workload has its own Azure subscription and Key vault.

We want to create the Key vault secrets in our central Pipeline where we also create the principals. I assumed this would work, as the Key vault data plane is not related to the subscription.

It turns out the azurerm_key_vault_resource can create the secrets but fails the Read operation due to the reasons described here (remember we want to create the secrets in multiple Key vaults spread across multiple subscriptions).

So, it would work if the additional check (removed in this PR) would be gone, but right now I'm blocked without any real solution.

@tombuildsstuff, @magodo, maybe we can look further into this, because the upstream issue described in this comment is not realated to my issue.

@github-actions
Copy link

github-actions bot commented Jan 3, 2022

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 3, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Azure KeyVault Error: Provider produced inconsistent result after apply
3 participants