Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for File paths (and ACLs) in ADLS Gen 2 storage accounts #7118

Closed
stuartleeks opened this issue May 28, 2020 · 10 comments
Closed

Support for File paths (and ACLs) in ADLS Gen 2 storage accounts #7118

stuartleeks opened this issue May 28, 2020 · 10 comments

Comments

@stuartleeks
Copy link
Contributor

stuartleeks commented May 28, 2020

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Description

We currently have the azurerm_storage_data_lake_gen2_filesystem resource for initialising ADLS Gen2 filesystems, but lack the ability to manage paths and ACLs with the provider.

An example of where this would be useful is provisioning ADLS Gen 2 accounts and Azure Databricks via azurerm provider and then configuring Databricks mounts using the databricks provider from databrickslabs. For this scenario we need to be able to create the paths in the ADLS account and set the appropriate ACLs to allow access.

New or Affected Resource(s)

  • azurerm_storage_data_lake_gen2_path
  • azurerm_storage_data_lake_gen2_path_acl

Potential Terraform Configuration

Some initial thoughts on what the new resources might look like are below.

#
# Existing resource types for context
#

resource "azurerm_resource_group" "example" {
  name     = "example-resources"
  location = "West Europe"
}

resource "azurerm_storage_account" "example" {
  name                     = "examplestorageacc"
  resource_group_name      = azurerm_resource_group.example.name
  location                 = azurerm_resource_group.example.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
  account_kind             = "StorageV2"
  is_hns_enabled           = "true"
}

resource "azurerm_storage_data_lake_gen2_filesystem" "example" {
  name               = "example"
  storage_account_id = azurerm_storage_account.example.id
}

#
# New resource types
#

# create a path item (folder)
# future - allow create files by uploading contents?
resource "azurerm_storage_data_lake_gen2_path" "example" {
  storage_account_id = azurerm_storage_account.example.id
  filesystem_name               = azurerm_storage_data_lake_gen2_filesystem.name
  path = "my-path"
}

# set ACLs on a path
resource "azurerm_storage_data_lake_gen2_path_acl" "example" {
  storage_account_id = azurerm_storage_account.example.id
  filesystem_name               = "example"
  path = azurerm_storage_data_lake_gen2_path.example.path
  ace {
    scope = "default"
    type = "user"
    id = "[email protected]"
    permissions  = "rwx"
  }
  ace {
    type = "user"
    id = "stuart@contosocom"
    permissions  = "rwx"
  }
  ace {
    scope = "default"
    type = "group"
    id = "[email protected]"
    permissions  = "r--"
  }
  ace {
    type = "group"
    id = "[email protected]"
    permissions  = "r--"
  }
  ace {
    type = "other"
    permissions  = "--"
  }
  ace {
    type = "mask"
    permissions  = "rwx"
  }
 # resulting ACL string: "default:user:[email protected]:rwx,user:[email protected]:rwx,default:group:[email protected]:r--,group:[email protected]:other::---;mask:rwx"
}

References

@tombuildsstuff tombuildsstuff added new-resource sdk/requires-upgrade This is dependent upon upgrading an SDK service/storage labels May 28, 2020
@stuartleeks
Copy link
Contributor Author

UPDATE: After discussing with @tombuildsstuff the plan is to combine the path and ACL resources into a single resource:

resource "azurerm_storage_data_lake_gen2_path" "example" {
  storage_account_id = azurerm_storage_account.example.id
  filesystem_name               = azurerm_storage_data_lake_gen2_filesystem.name
  path = "my-path"

  ace {
    scope = "default"
    type = "user"
    id = "[email protected]"
    permissions  = "rwx"
  }
  ace {
    type = "user"
    id = "stuart@contosocom"
    permissions  = "rwx"
  }
  ace {
    scope = "default"
    type = "group"
    id = "[email protected]"
    permissions  = "r--"
  }
  ace {
    type = "group"
    id = "[email protected]"
    permissions  = "r--"
  }
  ace {
    type = "other"
    permissions  = "--"
  }
  ace {
    type = "mask"
    permissions  = "rwx"
  }
 # resulting ACL string: "default:user:[email protected]:rwx,user:[email protected]:rwx,default:group:[email protected]:r--,group:[email protected]:other::---;mask:rwx"
}

The plan is to make the ace list optional and computed to allow you to specify just a path that should be created and have it inherit the permissions from the parent folder's permissions.

@stuartleeks
Copy link
Contributor Author

Quick note to say that I've started work on this...

@stuartleeks
Copy link
Contributor Author

This is turning out to be a bigger task than I first thought (and I've had a couple of things that have taken some time away from working on it). My plan is to split it up into smaller pieces of work that are potentially mergeable as I'm keen not to end up with a load of WIP that goes stale if I haven't managed to finish before my team's next engagement starts

My plan is:

  1. directory support only (I'm close to having the resource done for this with associated giovanni changes)
  2. add support for ACLs
  3. add support for files (this brings in uploading potentially large files, handling content type properties etc)

Once 1 is done it opens up a scenario where you can create a Data Lake and set up directories with appropriate permissions using RBAC at the file system level for Databricks and other compute to be able to access and feels like a starting scenario to enable. Adding in 2 would allow you to set permissions with ACLs at a more granular level (folders within the file system) and feels like a fairly compelling scenario.

@alex-goncharov
Copy link
Contributor

alex-goncharov commented Jun 29, 2020

@stuartleeks thanks for this!

Directories on ACL (including default ACL) should cover 99.99% of use-cases that I see in our environments, all of which are the ACL + default ACL on the top-level directory (not /, but one below it).

@stuartleeks
Copy link
Contributor Author

stuartleeks commented Jun 29, 2020

Thanks @alex-goncharov - good feedback to get!

I've got an initial PR on the storage SDK that is used and corresponding WIP PR on the provider that delivers part 1 (directory creation). If they are both ok and can be merged then I might get time to do part 2 (the ACL support for directories) 🤞

@stuartleeks
Copy link
Contributor Author

I've pushed an update to the PR that adds part 2, i.e. support for ACLs :-)

@stuartleeks
Copy link
Contributor Author

I see that this PR in go-autorest is merged and looks like it has been released. Does that mean that we can proceed with updating Giovanni to use the latest version of go-autorest and then update the Giovanni version in azurerm? 🤞

@tombuildsstuff tombuildsstuff added this to the v2.37.0 milestone Nov 16, 2020
@tombuildsstuff tombuildsstuff removed the sdk/requires-upgrade This is dependent upon upgrading an SDK label Nov 16, 2020
@tombuildsstuff
Copy link
Contributor

Fixed via #7521

@ghost
Copy link

ghost commented Nov 20, 2020

This has been released in version 2.37.0 of the provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. As an example:

provider "azurerm" {
    version = "~> 2.37.0"
}
# ... other configuration ...

@ghost
Copy link

ghost commented Dec 19, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked as resolved and limited conversation to collaborators Dec 19, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants