Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

azurerm_machine_learning_inference_cluster support identity, azurerm_machine_learning_compute_cluster support ssh, ssh_public_access_enabled enhancement, identity #12833

Conversation

ms-henglu
Copy link
Contributor

@ms-henglu ms-henglu commented Aug 4, 2021

image

@ms-henglu ms-henglu force-pushed the ticket-1859724-enhancement-for-ml-compute-part branch from 0e3e748 to a600ebe Compare August 4, 2021 07:10
@ms-henglu ms-henglu force-pushed the ticket-1859724-enhancement-for-ml-compute-part branch from a600ebe to e736766 Compare August 12, 2021 08:14
katbyte added a commit that referenced this pull request Aug 17, 2021
@ms-henglu ms-henglu force-pushed the ticket-1859724-enhancement-for-ml-compute-part branch from e736766 to df102ad Compare August 17, 2021 08:31
@ms-henglu
Copy link
Contributor Author

@katbyte , Hi, I've renamed these fields, please take another look, thanks!

Copy link
Collaborator

@katbyte katbyte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ms-henglu - LGTM

Copy link

@denniseik denniseik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all changes and getting this out! We requires some updates in terms of property naming to be aligned with our upcoming V2 API's and for clarity.

Type: pluginsdk.TypeList,
Required: true,
ForceNew: true,
MaxItems: 1,
Elem: &pluginsdk.Resource{
Schema: map[string]*pluginsdk.Schema{
"type": {
Type: pluginsdk.TypeString,
"max_node_count": {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requires renaming to 'max_instances' to align with upcoming V2 API for Azure ML: https://github.com/Azure/azureml_run_specification/blob/master/schemas/AmlCompute.json.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This property is not introduced in this PR, so we can't change it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please elaborate why we are unable to change the name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can change the name, but in this PR.

Because this property is already released, so if it's really necessary to change the name, we must add a new property max_instances and deprecate the old one.

IMO, we can handle v1-to-v2 upgrade inside the tf resource, we don't have to expose them to end users, so they can have a smooth upgrade.

"principal_id": {
Type: pluginsdk.TypeString,
Computed: true,
"min_node_count": {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requires renaming to 'min_instances' to align with upcoming V2 API for Azure ML: https://github.com/Azure/azureml_run_specification/blob/master/schemas/AmlCompute.json.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This property is not introduced in this PR, so we can't change it.

},
"tenant_id": {
"scale_down_nodes_after_idle_duration": {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requires renaming to 'idle_time_before_scale_down' to align with upcoming V2 API for Azure ML: https://github.com/Azure/azureml_run_specification/blob/master/schemas/AmlCompute.json.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This property is not introduced in this PR, so we can't change it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please elaborate why we are unable to change the name?

Copy link
Collaborator

@katbyte katbyte Sep 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be a breaking change. Once a property has been released to users we do not change it until a major version change - otherwise we would break users terraform configurations during a .x release which we do not do.

additionally, like the portal, we do not always match the API naming as we want tp be clear on what properties do as well as provide the best ux for terraform users as many users do not use IDEs that pull in property descriptions. ie something like like cluster.ssh_setting.ssh_value has redundant ssh in it as its already has ssh in the path nor is it as clear as clear as cluster.ssh.key_value.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, thanks for clarifying @katbyte

@@ -130,10 +130,19 @@ The following arguments are supported:

* `description` - (Optional) The description of the Machine Learning compute. Changing this forces a new Machine Learning Inference Cluster to be created.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whenever we speak about "Machine Learning Inference Cluster", could we say "Azure Kubernetes Service cluster" instead? Azure Machine Learning merely helps create an Azure Kubernetes Service cluster programmatically as described on this page: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-attach-kubernetes?tabs=python. The name Machine Learning Inference Cluster may create the impression that Inference Cluster is a managed offering although it is not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We actually are deploying https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/kubernetes_cluster it, and then register it to the machine learning workspace. Customers can also register an existing AKS cluster. To our end users, its the same thing.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the description assigned to this resource? and this resource is a Machine Learning Inference Cluster regardless of what it is behind the scenes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@katbyte We never expected Machine Learning Inference Cluster by name as its concept is not existing in our service.

@@ -130,10 +130,19 @@ The following arguments are supported:

* `description` - (Optional) The description of the Machine Learning compute. Changing this forces a new Machine Learning Inference Cluster to be created.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the description assigned to this resource? and this resource is a Machine Learning Inference Cluster regardless of what it is behind the scenes.

@katbyte
Copy link
Collaborator

katbyte commented Sep 14, 2021

looks like we have some test failures:
image

@ms-henglu
Copy link
Contributor Author

@katbyte , I've fixed the tests. please check.

image

Copy link
Collaborator

@katbyte katbyte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THanks @ms-henglu - LGTM 🚀

@katbyte katbyte changed the title enhancement for compute_cluster and inference_cluster azurerm_machine_learning_inference_cluster support identity, azurerm_machine_learning_compute_cluster support ssh, ssh_public_access_enabled enhancement, identity Sep 16, 2021
@katbyte katbyte merged commit 447d7ea into hashicorp:main Sep 16, 2021
katbyte added a commit that referenced this pull request Sep 16, 2021
@github-actions
Copy link

This functionality has been released in v2.77.0 of the Terraform Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

@github-actions
Copy link

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 18, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants