Announcement: Upcoming Changes in Version 2.0 of the Azure Provider #2807

tombuildsstuff · 2019-01-30T22:22:25Z

Terraform initially shipped support for the AzureRM Provider back in December 2015. Since then we've added support for 191 Resources, 58 Data Sources and have launched a couple of related Providers in the form of the Azure Active Directory Provider and the Azure Stack Provider.

Version 2.0 of the AzureRM Provider will be a Major Release - in that it will include some larger-scale changes not seen in a regular release. A summary of these changes is outlined below - however a full breakdown will be available on the Terraform Website after the release of v1.22.

Summary

Existing Resources will be required to be imported
Custom Timeouts will be available on Resources - this will allow you to specify a custom timeout for provisioning the resource in your Terraform Configuration using the timeouts block.
New resources for Virtual Machines and Virtual Machine Scale Sets
Removing Fields, Data Sources and Resources which have been deprecated

A brief summary of each item can be found below - more details will be available in the Azure Provider 2.0 upgrade guide on the Terraform Website once v1.22 has been released.

Existing Resources will be required to be Imported

Terraform allows for existing resources which have been created outside of Terraform to be Imported into Terraform's State. Once a resource is imported into the state, it's possible for Terraform to track changes and manage this resource. The Azure Provider allows Importing existing resources into the state (using terraform import) for (almost) every resource.

Version 2.0 of the Azure Provider aims to solve an issue where it's possible to unintentionally import resources into the state by running terraform apply. To explain this further, the majority of Azure's APIs are Upserts - which means that a resource will be updated if it exists, otherwise it'll be created.

Since the unique identifier for (most) Azure resources is the name (rather than, for example, an aws_instance where AWS generates a different unique identifier), it's possible that users may have unintentionally imported resources into Terraform when running terraform apply on an existing resource.

Whilst this may allow resources to work in some cases, it leads to hard-to-diagnose bugs in others (which could have been caught during terraform plan).

In order to match the behaviour of other Terraform Providers, version 2.0 of the AzureRM Provider will require that existing resources are imported into the state prior to use. This means that Terraform will check for the presence of an existing resource prior to creating it - and will return an error similar to the one below:

A resource with the ID /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/group1 already exists - to be managed via Terraform this resource needs to be imported into the State. Please see the resource documentation for `azurerm_resource_group` for more information.

You can opt into this behaviour in version 1.22 of the AzureRM Provider by setting the Environment Variable ARM_PROVIDER_STRICT to true.
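
To make the import requirement concrete, here is a minimal sketch of bringing the Resource Group from the error message above under Terraform's management. The resource label `example` and the location are assumptions for illustration; the ID is the placeholder from the error message.

```hcl
# Hypothetical configuration for a Resource Group that already exists in Azure.
# The label "example" and the location are illustrative assumptions.
resource "azurerm_resource_group" "example" {
  name     = "group1"
  location = "West Europe"
}

# With the new behaviour, `terraform apply` returns the "already exists" error
# for this block until the resource has been imported into the State, e.g.:
#
#   terraform import azurerm_resource_group.example /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/group1
#
# In version 1.22 the check only applies when the environment variable
# ARM_PROVIDER_STRICT is set to true.
```

Once the import has run, subsequent plans compare the configuration against the real resource instead of silently adopting it.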

Custom Timeouts for Resources

Resources can optionally support a timeouts block - which allows users to specify a Custom Timeout for resource creation/deletion as part of the Terraform Configuration.

Prior to version 2.0 the Azure Provider uses a default timeout of one hour for every resource - which cannot be overridden. This works for the most part, but there are certain scenarios where it'd be helpful to override this.

This is useful for resources which can take a long time to delete - for example deleting the azurerm_resource_group resource will delete any resources within it, which can take time. Within your Terraform Configuration this could be represented like so:

resource "azurerm_resource_group" "test" {
  name     = "example-resource-group"
  location = "West Europe"

  timeouts {
    create = "10m"
    delete = "30m"
  }
}

We intend to support the timeouts block in version 2.0 of the Azure Provider - which will allow timeouts to be specified on resources (as shown above). This feature request is being tracked here and will form part of the 2.0 release of the AzureRM Provider.

New Resources for Virtual Machines and Virtual Machine Scale Sets

We originally shipped support for the azurerm_virtual_machine and azurerm_virtual_machine_scale_set resources back in March 2016.

Over time new features have been added to these resources by Azure, such as Managed Disks and Managed Service Identity which these resources support. Since these resources first launched Azure's also changed the behaviour of some fields, so that it's now possible to update them where this wasn't previously possible - for example the Custom Data for a Virtual Machine.

We've spent some time thinking about how we can accommodate these changes and about how we can improve the user experience of both resources.
In particular we've wanted to give better validation during terraform plan, rather than bailing out with an Azure API error during terraform apply. However this isn't possible with the current resource structure, since these resources are very generic. The validation requirements also vary substantially based on the fields provided - for example the name field can be up to 63 characters for a Linux Virtual Machine but only 15 characters for a Windows Virtual Machine.
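
To illustrate the kind of check that can only be expressed once the operating system is known up front, here is a rough sketch using a module-level validation rule. This is not how the Provider implements its validation; the variable name is an assumption, and custom validation rules assume Terraform 0.13 or later.

```hcl
# Hypothetical module input; the variable name is an illustrative assumption.
variable "windows_vm_name" {
  type        = string
  description = "Name of the Windows Virtual Machine."

  # Custom validation rules need Terraform 0.13 or later.
  validation {
    condition     = length(var.windows_vm_name) <= 15
    error_message = "The name of a Windows Virtual Machine can be at most 15 characters."
  }
}
```

A rule like this fails before any Azure API call is made, which is the user experience the OS-specific resources aim to provide out of the box.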

As such after spending some time reading through bug reports and thinking/prototyping some potential solutions to this - we believe the best path forward here is to split these resources out, so that we would have:

a Linux Virtual Machine Resource (working name: azurerm_linux_virtual_machine)
a Windows Virtual Machine Resource (working name: azurerm_windows_virtual_machine)
updating the Data Disk Attachment Resource to support Unmanaged Disks
a Linux Virtual Machine Scale Set Resource (working name: azurerm_linux_virtual_machine_scale_set)
a Windows Virtual Machine Scale Set Resource (working name: azurerm_windows_virtual_machine_scale_set)
a separate resource for Virtual Machine Scale Set Extensions (working name azurerm_virtual_machine_scale_set_extension)

Please Note: all of the resources mentioned above currently do not exist but will form part of the 2.0 release.

Whilst we're aware that this isn't ideal since users will eventually have to update their code/import an existing resource - we believe this approach gives us a good footing for the future. In particular this allows us to re-consider the schema design so that we can both support these new use-cases, fix some bugs and improve the user experience with these resources.

The existing azurerm_virtual_machine and azurerm_virtual_machine_scale_set resources will continue to be available throughout the 2.x releases - but over time we'd end up deprecating these in favour of the new resources.

Removing Deprecated Fields, Data Sources and Resources

As v2.0 of the AzureRM Provider is a major release - we'll be taking the opportunity to remove Fields, Data Sources and Resources which have been previously deprecated.

A detailed breakdown will be available on the Terraform Website once v1.22 has been released - and we'll update this issue with a link to that once it's live.

We've spent the past few months laying the groundwork for these changes - and whilst we appreciate that your Terraform Configurations may require code changes to upgrade to 2.0 - we take Semantic Versioning seriously and so try our best to limit these changes to major versions.

Pinning your Provider Version

We recommend pinning the version of each Provider you use in Terraform - you can do this using the version attribute in the provider block, either to a specific version of the AzureRM Provider, like so:

provider "azurerm" {
  version = "=1.22.0"
}

.. or to any 1.x release:

provider "azurerm" {
  version = "~> 1.0"
}

More information on how to pin the version of a Terraform Provider being used can be found on the Terraform Website.
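
Version constraints can also be combined with a comma. As a small sketch (the exact bounds here are an assumption, chosen to match the releases discussed in this issue), the following accepts 1.22.0 or any later 1.x release but never a 2.x release:

```hcl
provider "azurerm" {
  # At least 1.22.0, but never a 2.x release.
  version = ">= 1.22.0, < 2.0.0"
}
```

This keeps a configuration on the 1.x series until you deliberately raise the upper bound.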

Once version 2.0 of the AzureRM Provider is released - you can then upgrade to it by updating the version specified in the Provider block, like so:

provider "azurerm" {
  version = "=2.0.0"
}

You can follow along with the work in the 2.0 release in this GitHub Milestone - we'll also post updates in this issue and publish the v2.0 upgrade guide on the Terraform Website once the v1.22 release is out.

There's a summary/update in the thread below: #2807 (comment)

InterStateNomad · 2019-02-01T03:07:50Z

Regarding the "Existing Resources will be required to be Imported" change. Will we still be able to make Azure resources in an existing environment without having to import all of that existing environment into state?

For example, say I just want to make an Azure VM using Terraform in an existing environment. Today I can just provide the name of the resource group I want the VM to be created in (resource_group_name) and use the azurerm_subnet data source to specify the networking details for the VM NIC.

Will this functionality still be possible or would I have to import the VNET and RSG into a new state file before I can make my VM? I know it sounds odd but lots of times I just like building VMs with Terraform because I prefer it and not really concerned with state management.

Thanks

tombuildsstuff · 2019-02-04T09:56:04Z

@InterStateNomad

Regarding the "Existing Resources will be required to be Imported" change. Will we still be able to make Azure resources in an existing environment without having to import all of that existing environment into state?
For example, say I just want to make an Azure VM using Terraform in an existing environment. Today I can just provide the name of the resource group I want the VM to be created in (resource_group_name) and use the azurerm_subnet data source to specify the networking details for the VM NIC.
Will this functionality still be possible or would I have to import the VNET and RSG into a new state file before I can make my VM? I know it sounds odd but lots of times I just like building VMs with Terraform because I prefer it and not really concerned with state management.

Yes this will still work - only (existing) Resources need to be Imported into the State via terraform import - Data Sources can be used without importing and so will function as they do today.
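
A minimal sketch of that scenario, assuming Terraform 0.12 syntax; every name below is an illustrative assumption for an environment that was created outside of Terraform:

```hcl
# Looked up via Data Sources, so nothing here needs to be imported.
data "azurerm_resource_group" "existing" {
  name = "existing-resources"
}

data "azurerm_subnet" "existing" {
  name                 = "internal"
  virtual_network_name = "existing-network"
  resource_group_name  = data.azurerm_resource_group.existing.name
}

# Only this new Network Interface is created and tracked in the State.
resource "azurerm_network_interface" "example" {
  name                = "example-nic"
  location            = data.azurerm_resource_group.existing.location
  resource_group_name = data.azurerm_resource_group.existing.name

  ip_configuration {
    name                          = "internal"
    subnet_id                     = data.azurerm_subnet.existing.id
    private_ip_address_allocation = "Dynamic"
  }
}
```

The Resource Group, Virtual Network and Subnet are only read, so the import check never applies to them.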

TraGicCode · 2019-02-26T04:16:42Z

Hey @tombuildsstuff ,

Is there an ETA for v2? Is there a workaround we can use in the meantime to prevent our CI/CD pipelines in azure release pipelines?

I'm really surprised the timeout issue with resources taking about an hour has been hanging around so long, as a lot of important resources in Azure have about the same spin-up and spin-down times.

tombuildsstuff · 2019-02-28T11:22:33Z

@TraGicCode

Is there an ETA for v2? Is there a workaround we can use in the meantime to prevent our CI/CD pipelines in azure release pipelines?

Not at this time, unfortunately - we'll post more when we have that information - originally we intended to focus on v2.0 after the v1.22 release - however we plan to do a few more v1.x releases before starting work on v2.0.

I'm really surprised the timeout issue with resources taking about an hour has been hanging around so long, as a lot of important resources in Azure have about the same spin-up and spin-down times.

Unfortunately the upstream bugs which were needed to support this have only recently been fixed, so whilst we could add this to a handful of resources today - we feel there's more value in doing this for all resources at once, since we'll also be taking the opportunity to update the default timeout for other resources to be more realistic (for example, it doesn't take an hour to create a Virtual Network, so setting the timeout to an hour [as we do today, since it's shared by all resources] is probably overkill).

Thanks!

Lachlan-White · 2019-05-07T03:37:25Z

@tombuildsstuff Is the Custom timeout available in another version of the provider? Currently this is going to block me for deployment of Azure Service environments?

tombuildsstuff · 2019-05-10T05:15:41Z

@a138076 unfortunately this is a change which needs to be made across every resource in the same release - so this won't land until 2.0.

AdamCoulterOz · 2019-06-01T09:08:11Z

@tombuildsstuff - I'm from Vibrato in Australia, HashiCorp APAC services provider, I work with many clients using Terraform for Azure every week. Moving the resources model away from mapping to the Azure RM model for VM and VMSS to Windows and Linux based specific versions is going to make life very difficult for us. I agree with splitting out VMSS extensions though.

I understand what you're trying to achieve by moving to less generic versions, but can't the design goals still be met while retaining the generic nature of the resource types? There are many different resource types which have different validation rules based on provided attributes (in this case OS), but if you were to create separate resource types for every time validation rules had to vary based on attribute values there would need to be 1000x+ as many resources maintained and the provider would become completely unusable. This seems inconsistent with the general approach for writing providers.

I also think you are conflating 2 different problems: one, the maintainability of lots of state migration code (hence the need to break to completely new resource implementations), and two, complex attribute dependency validation logic.

Problem 1 - State Migration Maintenance
Since provider version 2.0.0 is marked as breaking you don't need to maintain any compatibility with the existing implementations of VM or VMSS anyway, this is expected if you are following SemVer.
You could remove resource state migrations prior to the new "version 2" SchemaVersion, and include only 1 SchemaVersion prior (provider version 1.xx or higher), and instruct them to upgrade by running against that provider version, then switch to version 2? Alternatively just break if they don't have a SchemaVersion of provider version 2.0.0 or higher, and tell them they need to reimport, or, the resource just reimports at that version with an overwrite of the state file as a 1 time event with an explicit warning message?

Problem 2 - Attribute Validation Logic
I understand that maintaining the validation logic may be slightly more complex (kept within 1 resource) but I fundamentally believe you should be keeping them aligned to the AzureRM model. You still need to maintain the validation logic for both models regardless anyway if you want to provide relevant feedback during the plan phase. Also, couldn't Microsoft provide some kind of validation via their Azure Go API client library?

I'm happy to have feedback if any of my assumptions or reasoning is incorrect. Would like to make sure we get this one right. Thanks!

tombuildsstuff · 2020-02-04T11:20:21Z

hey @AdamCoulterOz

Firstly - apologies it's taken so long to reply to this, I saw this shortly after you posted it but didn't have a chance to reply; and it's been hidden in the "more comments" section until recently.

To go through each of your points in turn:

I understand what you're trying to achieve by moving to less generic versions, but can't the design goals still be met while retaining the generic nature of the resource types?

There are many different resource types which have different validation rules based on provided attributes (in this case OS), but if you were to create separate resource types for every time validation rules had to vary based on attribute values there would need to be 1000x+ as many resources maintained and would become completely unusable

Early versions of this Provider focused on matching the Azure APIs exactly - the result being that many users ended up with unclear error messages from the API that we're unable to catch during terraform plan/validate.

Whilst it's unfortunate this is the case - the root cause is that the same Azure API can provision multiple things - which leads to them being generic. To go through a couple of examples in turn: HDInsight and App Services.

The HDInsight API looks like a good candidate for a single generic resource, since the only thing that's really different between cluster types is the number of blocks (e.g. worker_node, zookeeper_node and edge_node). We initially looked to ship the HDInsight resources as one resource - but quickly realised this wouldn't work due to the API behaving differently depending on the kind of cluster being provisioned (e.g. a number of the SKUs get mutated from their original value to Medium or Large; which SKUs are affected varies by cluster type, so we couldn't map this back).

In addition the API flat-out rejected changes to some fields for some configurations but allowed them for others. The result being that whilst we /could/ make this one generic resource - it would have been a pretty poor user experience. Ultimately we ended up splitting this into 8 different resources for HDInsight which behave as they should. Whilst this is a little more work for us to maintain (and there are things we've done to alleviate this), it allows these resources to behave the way that users expect.

The App Service API follows a similar pattern - where an App Service (Web App, Function App, API App, Mobile App etc) can be provisioned within an App Service Plan of different kinds (Linux/Windows/Function/Consumption). Whilst at first glance this sounds fine, there are some pretty severe limitations to this resource which have ended up causing issues - namely that the API itself returns an empty HTTP 400 "Bad Request" when the configuration is wrong, with no details about what's gone wrong to be able to debug it.

The end result is that the more generic the resource, the more we end up pushing this complexity onto users, which in turn leads to errors we're unable to catch during a plan/validate and ultimately leads to more (GitHub/support) issues.

As such whilst it does deviate from the Azure API - I fully expect that we'll introduce more "specialized" resources in the future to provide a better user experience than directly-mapping the API - due to the issues caused by the design of the Azure APIs.

It's not something we plan to do anytime soon, but a good candidate for this is the azurerm_app_service resource which wants splitting into ~4 sub resources (Web App, Function App [which has already been done], Mobile App & API Apps).

Since provider version 2.0.0 is marked as breaking you don't need to maintain any compatibility with the existing implementations of VM or VMSS anyway, this is expected if you are following SemVer.

Unfortunately this'd mean we'd end up leaving some users on older versions of the Provider unable to access new functionality - which isn't ideal (and is why the azurerm_virtual_machine and azurerm_virtual_machine_scale_set resources are feature-frozen rather than deprecated in 2.0).

In addition - based on the experiences we've had with both the HDInsight and App Service APIs - I think this'd be the wrong call since we'd be pushing the complexity onto end-users rather than handling it within the Azure Provider.

Alternatively just break if they don't have a SchemaVersion of provider version 2.0.0 or higher, and tell them they need to reimport, or, the resource just reimports at that version with an overwrite of the state file as a 1 time event with an explicit warning message?

Terraform expects that all of the Resources in the Statefile can be represented by the Providers being used - as such users would be unable to change the state when using this newer version of the Provider, which would mean ultimately starting afresh with state, which again means we'll end up having a set of users who can't upgrade.

I understand that maintaining the validation logic may be slightly more complex (kept within 1 resource) but I fundamentally believe you should be keeping them aligned to the AzureRM model. You still need to maintain the validation logic for both models regardless anyway if you want to provide relevant feedback during the plan phase. Also, couldn't Microsoft provide some kind of validation via their Azure Go API client library?

Due to the way the Azure SDKs are built (they're generated from Swagger, which doesn't contain a means of expressing these conditional validation functions) unfortunately this logic can't easily be added automatically to the Azure SDK and thus it'd need to be some kind of manual mix-in. My concern with this approach is that the Swagger is already an afterthought to some API teams - and if I'm being honest, based on prior experience I don't think these mix-ins would be maintained if they got written, particularly in languages which aren't .NET (which is what most of the Azure APIs are written in).

Whilst I appreciate splitting these resources out does make creating generic modules harder - it's still possible to achieve that if you need that; to use a hypothetical example:

variable "linux" {
  default = "yes"
}

resource "azurerm_linux_virtual_machine" "example" {
  count = var.linux == "yes" ? 1 : 0
}

resource "azurerm_windows_virtual_machine" "example" {
  count = var.linux == "yes" ? 0 : 1
}

output "virtual_machine_id" {
  value = element(concat(azurerm_linux_virtual_machine.example.*.id, azurerm_windows_virtual_machine.example.*.id), 0)
}

(It's worth noting that whilst it's possible to create a generic module which configures every option on a resource - we'd recommend having multiple more specialized modules instead.)
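
As a usage sketch, a caller of such a module might look like the following. The module path and the module label are assumptions, and the module itself would still need the full set of Virtual Machine arguments that the hypothetical example above leaves out.

```hcl
# Hypothetical call to a module wrapping the example above; the source path is an assumption.
module "virtual_machine" {
  source = "./modules/virtual-machine"

  # Any value other than "yes" selects the Windows resource.
  linux = "no"
}

output "virtual_machine_id" {
  value = module.virtual_machine.virtual_machine_id
}
```

Callers only see the single virtual_machine_id output, so they don't need to know which of the two resources was actually created.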

Overall whilst we're trying to match the schema design used by Azure where that makes sense - when the Azure API is so generic as to be unhelpful to end-users we'd rather diverge from this by creating more specialized resources to be able to provide a better user experience.

From our side we'll shortly be releasing an opt-in Beta for the new resources in the upcoming version 1.43 of the Azure Provider, which I'd encourage you to try if you have time (also worth noting there are a couple of minor known issues in the documentation). Whilst they are separate resources - overall the resources are (intentionally) pretty similar (albeit with more specific validation) but match the behaviour of the updated APIs - and as such we believe they should fit most use-cases.

Thanks!

tombuildsstuff · 2020-02-04T12:09:54Z

👋

Over the past few months we've been working on the functionality coming in version 2.0 of the Azure Provider (outlined above).

We've just released version 1.43 of the Azure Provider which allows you to opt in to the Beta of these upcoming features. Rather than detailing this in multiple issues, more information can be found in this GitHub issue (which is pinned for visibility).

Thanks!

berney · 2020-02-16T23:23:58Z

It sounds like more pressure needs to be applied on Azure to make their API and SDK better.

tombuildsstuff · 2020-02-24T09:41:04Z

👋

Thanks for all of the input here - we've finished up the major changes needed for version 2.0 of the Azure Provider - and as such I'm going to close this meta issue for the moment. As this meta issue is assigned to the 2.0 GitHub milestone - @hashibot will comment when the 2.0 release of the Azure Provider is available.

Thanks!

ghost · 2020-02-24T16:51:02Z

This has been released in version 2.0.0 of the provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. As an example:

provider "azurerm" {
    version = "~> 2.0.0"
}
# ... other configuration ...
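
One thing to keep in mind when upgrading: version 2.0 of the Azure Provider expects a `features` block in the provider configuration, even if it's empty. A minimal sketch of a 2.0 provider block, reusing the version constraint from the example above:

```hcl
provider "azurerm" {
  version = "~> 2.0.0"

  # Required from 2.0 onwards; an empty block keeps the default behaviour.
  features {}
}
```

Without this block the Provider reports that the required "features" field is not set.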

ghost · 2020-03-28T14:54:26Z

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

