Enhancement: azurerm_databricks_workspace - Add DBFS root Blob storage URL Attribute #7799

Closed
jonmaestas opened this issue Jul 17, 2020 · 3 comments · Fixed by #12543

Comments

@jonmaestas

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Description

When deploying Azure Databricks into a private VNet, you must add user-defined routes (UDRs) so that network traffic is routed correctly for your workspace.

In the UDR documentation linked in the configuration below, every DNS entry is constant except the DBFS root Blob storage one, which is generated at workspace creation.

Here is what I'm using to generate the Route Table and associate it to the Databricks subnets.

# https://docs.microsoft.com/en-us/azure/databricks/administration-guide/cloud-configurations/azure/udr
# https://www.terraform.io/docs/providers/dns/d/dns_a_record_set.html
data "dns_a_record_set" "metastore" {
  host = "consolidated-northcentralus-prod-metastore.mysql.database.azure.com"
}

# https://www.terraform.io/docs/providers/dns/d/dns_a_record_set.html
data "dns_a_record_set" "artifact_blob_storage_primary" {
  host = "dbartifactsprodncus.blob.core.windows.net"
}

# https://www.terraform.io/docs/providers/dns/d/dns_a_record_set.html
data "dns_a_record_set" "log_blob_storage" {
  host = "dblogprodncentralus.blob.core.windows.net"
}

# https://www.terraform.io/docs/providers/dns/d/dns_a_record_set.html
data "dns_a_record_set" "event_hub_endpoint" {
  host = "prod-westus-observabilityEventHubs.servicebus.windows.net"
}

# https://www.terraform.io/docs/providers/dns/d/dns_a_record_set.html
data "dns_a_record_set" "dbfs_blob_storage" {
  # https://docs.microsoft.com/en-us/azure/databricks/administration-guide/cloud-configurations/azure/udr#dbfs-root-blob-storage-ip-address
  host = "dbstoragexxxxxxx.blob.core.windows.net" # TODO: Need a way to capture this from Databricks resource
  # host = module.databricks_default.default.dbfs_root_url
}

# https://www.terraform.io/docs/providers/azurerm/r/route_table.html
# https://docs.microsoft.com/en-us/azure/databricks/administration-guide/cloud-configurations/azure/on-prem-network#--step-3-create-user-defined-routes-and-associate-them-with-your-azure-databricks-virtual-network-subnets
# https://docs.microsoft.com/en-us/azure/databricks/administration-guide/cloud-configurations/azure/udr
resource "azurerm_route_table" "databricks" {
  name                          = lower(join("-", ["rt", var.NAME, "adb", local.metadata.stage, local.metadata.location]))
  location                      = data.azurerm_resource_group.default.location
  resource_group_name           = data.azurerm_resource_group.default.name
  disable_bgp_route_propagation = true

  route {
    name           = "ControlPlaneNAT"
    address_prefix = "23.101.152.95/32"
    next_hop_type  = "Internet"
  }

  route {
    name           = "Webapp"
    address_prefix = "40.70.58.221/32"
    next_hop_type  = "Internet"
  }

  route {
    name           = "Metastore"
    address_prefix = join("/", [data.dns_a_record_set.metastore.addrs[0], "32"])
    next_hop_type  = "Internet"
  }

  route {
    name           = "ArtifactBlobStorage"
    address_prefix = join("/", [data.dns_a_record_set.artifact_blob_storage_primary.addrs[0], "32"])
    next_hop_type  = "Internet"
  }

  route {
    name           = "LogBlobStorage"
    address_prefix = join("/", [data.dns_a_record_set.log_blob_storage.addrs[0], "32"])
    next_hop_type  = "Internet"
  }

  route {
    name           = "DBFSRootBlobStorage"
    address_prefix = join("/", [data.dns_a_record_set.dbfs_blob_storage.addrs[0], "32"])
    next_hop_type  = "Internet"
  }
}

# https://www.terraform.io/docs/providers/azurerm/r/subnet_route_table_association.html
resource "azurerm_subnet_route_table_association" "databricks_private" {
  subnet_id      = azurerm_subnet.databricks_private.id
  route_table_id = azurerm_route_table.databricks.id
}

# https://www.terraform.io/docs/providers/azurerm/r/subnet_route_table_association.html
resource "azurerm_subnet_route_table_association" "databricks_public" {
  subnet_id      = azurerm_subnet.databricks_public.id
  route_table_id = azurerm_route_table.databricks.id
}

New or Affected Resource(s)

  • azurerm_databricks_workspace

Potential Terraform Configuration

output "dbfs_root_url" {
  value = azurerm_databricks_workspace.default.dbfs_root_url
}

References

@WodansSon
Collaborator

WodansSon commented Jul 13, 2021

In PR #12543 I expose a new field which is settable if you prefer a given name, else it will be generated by the Databricks RP. You can now get this value from the field storage_account_name. I cannot actually return the full URL as that is not currently returned in the GET call.
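
For readers following along, here is a minimal sketch of how the DBFS root host from the original request could be derived from that field. It rests on several assumptions that are not confirmed in this thread: that storage_account_name is read from the workspace's custom_parameters block, that the workspace resource is named "default" as in the proposed output above, and that the public-cloud suffix blob.core.windows.net applies. The local names are hypothetical.

```hcl
locals {
  # Assumption: the field is exposed under custom_parameters (a single-item block).
  dbfs_storage_account_name = azurerm_databricks_workspace.default.custom_parameters[0].storage_account_name

  # Assumption: Azure public cloud. The full URL is not returned by the API,
  # so the host is assembled from the account name.
  dbfs_root_blob_host = "${local.dbfs_storage_account_name}.blob.core.windows.net"
}

# Resolves the DBFS root Blob storage host instead of using a hard-coded placeholder.
data "dns_a_record_set" "dbfs_blob_storage" {
  host = local.dbfs_root_blob_host
}
```

The DBFSRootBlobStorage route can then keep reading data.dns_a_record_set.dbfs_blob_storage.addrs[0]. Because the host is only known after the workspace exists, the lookup will likely be deferred to apply time on the first run.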

@WodansSon WodansSon added this to the v2.70.0 milestone Jul 20, 2021
@katbyte katbyte modified the milestones: v2.70.0, v2.71.0 Jul 30, 2021
WodansSon added a commit that referenced this issue Jul 30, 2021
* Upgrade databricks API to 2021-04-01-preview

* Fix documentation lint error

* Expose new fields in API

* Fix lint error

* Fix new line lint error...

* Stable removes depends_on requirement

* add nil checks incase resource is in failed state

* Stable new validation checks

* Update azurerm/internal/services/databricks/databricks_workspace_resource.go

Co-authored-by: kt <[email protected]>

* Update azurerm/internal/services/databricks/databricks_workspace_resource.go

Co-authored-by: kt <[email protected]>

* Fully working private link

* Add datasource, example, and documentation

* Ignore custom params not returned by the ARM GET

* Update names in example

* Update require_network_security_group_rules name

* Update name in example

* PR feedback, added examples and updated docs

* rename test to remove test case ambiguity

* Update tests and documentation

* Fix misspelling in documentation

Co-authored-by: kt <[email protected]>
@github-actions

github-actions bot commented Aug 6, 2021

This functionality has been released in v2.71.0 of the Terraform Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

@github-actions

github-actions bot commented Sep 6, 2021

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 6, 2021