New Resource: Azure Data Explorer (kusto) #3856

Closed
subesokun opened this issue Jul 16, 2019 · 15 comments


@subesokun

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Description

Request for new resource: Azure Data Explorer (kusto)

References

@r0bnet
Contributor

r0bnet commented Aug 7, 2019

Hey @subesokun,
I had a look at the Go SDK for Azure and it seems this could be implemented. Could you propose what you want the resource(s) to look like? That makes it much easier to implement later, since we would already have a resource design we agreed upon.

I'd suggest to split the data explorer into two resources:

  • azurerm_kusto_cluster
  • azurerm_kusto_cluster_database

Thanks in advance!

@subesokun
Author

Hi @r0bnet, splitting up the resources into azurerm_kusto_cluster and azurerm_kusto_cluster_database sounds good to me :) Unfortunately, I don't know what the Go implementation looks like, so I can't give detailed feedback on how the resources should be structured. But it would be great if there were a way to roll out and update the (multiple) database schemas via Terraform.

@r0bnet
Contributor

r0bnet commented Aug 7, 2019

Okay, then I'll propose a structure for both, and one of the main contributors can probably review it. Most of the time you can find out which attributes are available from the ARM templates: https://docs.microsoft.com/de-de/azure/templates/microsoft.kusto/2019-01-21/clusters and https://docs.microsoft.com/de-de/azure/templates/microsoft.kusto/2019-01-21/clusters/databases

@r0bnet
Contributor

r0bnet commented Aug 8, 2019

My resource design proposal:

Cluster:

resource "azurerm_kusto_cluster" "cluster" {
  name                = "my-kusto-cluster"
  resource_group_name = "my-kusto-rg"
  location            = "northeurope"

  sku {
    name     = "Standard_D13_v2" # required; possible values: Standard_D13_v2, Standard_D14_v2, Standard_L8s, Standard_L16s, Standard_D11_v2, Standard_D12_v2, Standard_L4s
    capacity = 2                 # required
  }

  tags = {
    env = "PRODUCTION"
  }
}

Database:

resource "azurerm_kusto_database" "database" {
  name                = "my-kusto-database"
  resource_group_name = "my-kusto-rg"
  location            = "northeurope"
  cluster_name        = azurerm_kusto_cluster.cluster.name

  soft_delete_period = "P365D" # optional
  hot_cache_period   = "P31D" # optional
}

EventHub Data Connection:

resource "azurerm_kusto_eventhub_data_connection" "eventhub" {
  name                = "my-kusto-eventhub-data-connection"
  resource_group_name = "my-kusto-rg"
  location            = "northeurope"
  cluster_name        = azurerm_kusto_cluster.cluster.name
  database_name       = azurerm_kusto_database.database.name

  eventhub_id    = var.eventhub_id
  consumer_group = var.consumer_group

  table_name        = "my-table"
  mapping_rule_name = "MyMapping"
  data_format       = "JSON" // valid: 'MULTIJSON', 'JSON', 'CSV', 'TSV', 'SCSV', 'SOHSV', 'PSV', 'TXT', 'RAW', 'SINGLEJSON', 'AVRO'
}

EventGrid Data Connection:

resource "azurerm_kusto_eventgrid_data_connection" "eventgrid" {
  name                = "my-kusto-eventgrid-data-connection"
  resource_group_name = "my-kusto-rg"
  location            = "northeurope"
  cluster_name        = azurerm_kusto_cluster.cluster.name
  database_name       = azurerm_kusto_database.database.name

  storage_account_id = var.storage_account_id
  eventhub_id        = var.eventhub_id // should this be renamed to eventgrid_id?
  consumer_group     = var.consumer_group

  table_name        = "my-table"
  mapping_rule_name = "MyMapping"
  data_format       = "JSON" // valid: 'MULTIJSON', 'JSON', 'CSV', 'TSV', 'SCSV', 'SOHSV', 'PSV', 'TXT', 'RAW', 'SINGLEJSON', 'AVRO'
}

If I understood correctly, table_name, mapping_rule_name and data_format are optional in both connections and can be omitted if the messages already contain routing information?
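
If those three arguments do turn out to be optional, a connection that relies on routing information carried by the messages could shrink to something like the sketch below. It reuses the resource and variable names from the proposal above; the connection name is made up, and whether the API accepts the omission is exactly the open question, so this is an illustration and not confirmed behavior.

```hcl
# Sketch only: assumes table_name, mapping_rule_name and data_format are optional
# and that each message carries its own routing information.
resource "azurerm_kusto_eventhub_data_connection" "routed" {
  name                = "my-kusto-routed-connection" # hypothetical name
  resource_group_name = "my-kusto-rg"
  location            = "northeurope"
  cluster_name        = azurerm_kusto_cluster.cluster.name
  database_name       = azurerm_kusto_database.database.name

  eventhub_id    = var.eventhub_id
  consumer_group = var.consumer_group
}
```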

@ilayrn

ilayrn commented Aug 13, 2019

Hi @r0bnet, the design looks good - a couple of comments:

  • There's a new api version (2019-05-15)

  • Database resource doesn't support tags

  • The size property in a database resource is read only

@r0bnet
Contributor

r0bnet commented Aug 14, 2019

> Hi @r0bnet, the design looks good - a couple of comments:
>
>   • There's a new api version (2019-05-15)
>   • Database resource doesn't support tags
>   • The size property in a database resource is read only

  • API Version: Where did you find it? I only found the kusto resources in the 2019-01-21 api version. Looked into ARM specs, REST API specs and the Go SDK
  • Tags: you're right, removed it
  • Size: also right, removed it. Can probably be added as output.
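
Exposing the size as an output could look like the sketch below. It assumes the database resource ends up exporting a read-only attribute named size; the attribute name and the output name are assumptions, not something settled in this thread.

```hcl
# Sketch only: assumes azurerm_kusto_database exports a computed "size" attribute.
output "kusto_database_size" {
  description = "Size of the Kusto database as reported by the API"
  value       = azurerm_kusto_database.database.size
}
```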

@jrauschenbusch
Contributor

Hi @r0bnet, maybe the following features could also be considered in your design:

  • Option to assign Database Permissions (Role & Principal)
  • Option to define Data Connections (EventGrid and EventHub):
    • Downside: ADX Table and Column Mapping Rule must exist in an ADX Database before a Data Connection can be set up

@r0bnet
Contributor

r0bnet commented Aug 16, 2019

>   • Option to assign Database Permissions (Role & Principal)
>   • Option to define Data Connections (EventGrid and EventHub):
>     • Downside: ADX Table and Column Mapping Rule must exist in an ADX Database before a Data Connection can be set up

  • Permissions: I didn't find anything in that direction in the API specs. Any example(s)?
  • Data Connections: yes, they should definitely be added. I revisited the Go SDK and saw that you can specify both. At first I was only looking at the ARM reference. Sorry for that.

@jrauschenbusch I updated my resource design above and added a comment regarding the downside you mentioned. Can you approve the design and comment?

@jrauschenbusch
Contributor

Hi @r0bnet

>   • Permissions: didn't find anything in that direction in the API specs. Any example(s)?

The operation you're looking for is AddPrincipals(...):

> Data Connections: ...

The design looks good to me. One open question is whether a data connection can be established before the table and table mapping exist. If not, the ADX .create commands have to be executed before the data connection resources can be created. ADX schema deployments and updates are an open point anyway.

> // should this be renamed to eventgrid_id?

As far as I understand, the eventhub_id property for the event grid data connection is not an error in their API. You can create an Event Grid subscription to a blob store in a storage account. The sink for such events must be one of the following types: Web Hook, Storage Queues, Event Hub, Hybrid Connections, or Service Bus Queue. Apparently, the only type currently supported by ADX data connections is an Event Hub sink.
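
The chain described above (blob events from a storage account, delivered through an Event Grid subscription into an Event Hub) could be wired up roughly as below. This is a sketch: azurerm_eventgrid_event_subscription is an existing resource, but the argument names are written from memory of the current provider documentation and may differ in the provider version discussed here. The subscription name is made up, and var.storage_account_id and var.eventhub_id are the same variables used in the proposal.

```hcl
# Sketch only: argument names are assumptions, check them against your provider version.
resource "azurerm_eventgrid_event_subscription" "blob_created" {
  name  = "kusto-blob-created"   # hypothetical name
  scope = var.storage_account_id # the storage account that emits the blob events

  # Event Hub sink; this is the hub the Event Grid data connection would read from.
  eventhub_endpoint_id = var.eventhub_id

  # Only forward blob creation events.
  included_event_types = ["Microsoft.Storage.BlobCreated"]
}
```

With this layout the data connection needs both IDs: storage_account_id identifies the source of the blobs and eventhub_id identifies where the notifications arrive.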

> If i understood it correctly then table_name, mapping_rule_name and data_format are optional in both connections and can be omitted if the messages already contain routing information?

Where did you get this information from?

@jrauschenbusch
Contributor

Some notes on the cluster resource:

  • Missing ability to define Availability Zones of a specified Location?
  • Missing security feature Disk Encryption

@r0bnet
Contributor

r0bnet commented Aug 19, 2019

> Some notes to the cluster resource:
>
>   • Missing ability to define Availability Zones of a specified Location?
>   • Missing security feature Disk Encryption

I saw that Disk Encryption exists, but I guess only in a newer API version (2019-05-15 or so). For now we can only use 2019-01-21, as the newer one is not available in the Go SDK or in the API specs.

@jrauschenbusch
Contributor

jrauschenbusch commented Aug 19, 2019

> I saw at least that Disk Encryption exists but i guess only in a newer api version (2019-05-15 or something) but we can only use 2019-01-21 for now as the newer one is not available in the go sdk or in the api specs.

True. I created an issue inside the Go SDK repo: Azure/azure-sdk-for-go#5558

Regarding the schema and mapping deployment: Do you think this should be realized via dedicated TF resource types or via scripts and a provisioner like local-exec?

@r0bnet
Contributor

r0bnet commented Aug 19, 2019

> True. I created an issue inside the Go SDK repo: Azure/azure-sdk-for-go#555
>
> Regarding the schema and mapping deployment: Do you think this should be realized via dedicated TF resource types or via scripts and a provisioner like local-exec?

I think you linked the wrong issue, but I found it anyway. The issue has to be addressed in a different repository, as the SDK is autogenerated from this one: https://github.com/Azure/azure-rest-api-specs

The new API version (2019-05-15) should be added there so that it lands in the Go SDK later on.

Regarding your other questions:

  • Database principals: Currently unsure whether this is easily doable via Terraform, as there are no CRUD methods, only methods to add and remove multiple principals. I will have to investigate.
  • Schema and mapping deployment: I'm not a fan of the local-exec approach. I'd rather put the responsibility for creating those "resources" into a different kind of deployment, as it is more a part of the application logic. The downside is that the creation of the data connections would then have to move outside of Terraform as well, because it depends on those mappings.

We faced similar situations for things like databases. Solution was to ONLY create real infrastructure in TF and leave the other stuff to development teams with their own tools. But also not ideal.
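
For comparison, the local-exec variant under discussion would look roughly like the sketch below. The script deploy-schema.sh and the file schema.kql are hypothetical placeholders for whatever tool runs the ADX .create commands; neither is part of any provider. The data connection arguments are copied from the proposal above.

```hcl
# Sketch only: deploy-schema.sh and schema.kql are hypothetical placeholders.
resource "null_resource" "kusto_schema" {
  # Re-run the provisioner whenever the schema file changes.
  triggers = {
    schema_hash = filemd5("${path.module}/schema.kql")
  }

  provisioner "local-exec" {
    command = "./deploy-schema.sh ${azurerm_kusto_cluster.cluster.name} ${azurerm_kusto_database.database.name} ${path.module}/schema.kql"
  }
}

resource "azurerm_kusto_eventhub_data_connection" "after_schema" {
  name                = "my-kusto-eventhub-data-connection"
  resource_group_name = "my-kusto-rg"
  location            = "northeurope"
  cluster_name        = azurerm_kusto_cluster.cluster.name
  database_name       = azurerm_kusto_database.database.name

  eventhub_id    = var.eventhub_id
  consumer_group = var.consumer_group

  table_name        = "my-table"
  mapping_rule_name = "MyMapping"
  data_format       = "JSON"

  # Create the connection only after the table and mapping have been deployed.
  depends_on = [null_resource.kusto_schema]
}
```

The explicit depends_on is what keeps the data connection inside Terraform; the cost is that Terraform cannot detect schema drift, because it only tracks the hash of the local file.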

@jrauschenbusch
Contributor

@subesokun @r0bnet I think this ticket can now be closed. All resources listed here are now available. Support requests for new features of the specific resource types should be handled in dedicated tickets.

@ghost

ghost commented Mar 28, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked and limited conversation to collaborators Mar 28, 2020