This repository contains additional notes, checklists, and (eventually) tools to help configure and validate fully private Azure ML deployments.
- Because Azure ML creates and manages compute instances on the user's behalf, I highly recommend not using HOSTS files at any point while evaluating or implementing Azure ML private networking. Creating compute instances and compute clusters requires that private link addresses resolve from within the virtual network, so local HOSTS files cannot be relied on beyond the initial portal configuration.
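One way to confirm that private link names resolve to private addresses is a small script run from a machine inside the ML virtual network. The following is a minimal Python sketch rather than an official tool: the function names are my own, and it assumes that "resolves privately" can be approximated by every returned address falling in a private range. Because it uses the system resolver, it will also pick up local HOSTS entries, so run it on a machine that has none.

```python
import ipaddress
import socket


def resolve_ips(hostname):
    """Return the sorted, de-duplicated IP addresses the system resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def check_private_resolution(hostname, resolver=resolve_ips):
    """Report whether every address returned for hostname is in a private range.

    Assumption: ipaddress's is_private flag is a good enough proxy for
    "this name resolved to a private endpoint". It is also True for
    loopback and link-local addresses, so inspect the returned IPs as well.
    """
    try:
        ips = resolver(hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return {"host": hostname, "ips": [], "private": False, "error": str(exc)}
    private = bool(ips) and all(ipaddress.ip_address(ip).is_private for ip in ips)
    return {"host": hostname, "ips": ips, "private": private, "error": None}
```

For example, `check_private_resolution("<vault-name>.vault.azure.net")` returns a dictionary with the resolved addresses and a `private` flag, where `<vault-name>` is a placeholder for your own Key Vault. A result with `private` set to `False` suggests that the name is resolving to a public address or not resolving at all.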
Microsoft Docs contains a tutorial that walks through creating a secure workspace, along with the requisite Azure resources, connectivity, and configuration. Given the number of steps involved, it is easy to miss a step or leave one incomplete. During the initial configuration, as well as during any troubleshooting, it can be helpful to work through a checklist to ensure that nothing was missed or misconfigured.
- Configuration Checklist - an empty markdown file that can be used alongside the official documentation to validate that each component is properly deployed and configured.
- Configuration Checklist Example - a completed copy of the checklist that serves as a representative example.
The following are common issues I've seen in deploying Azure ML privately.
- Provisioning a compute instance fails with the message "The specified Azure ML Compute Instance (name) setup failed with error "Failed to get workspace secrets. Details - Root cause: ". Please delete and try to recreate. If the problem persists, please follow up with Azure Support."
  - More than likely, private link names (specifically the Key Vault URIs) cannot be resolved from the ML VNet. Work through the checklist to determine why vault.azure.net is not resolving to a private address.
- When loading the portal, an error is displayed that "Your administrator has disabled connectivity to your workspace instance from the public internet..."
  - If there is a proxy between your browser and the ML portal, the proxy may be routing traffic for the private link ML workspace externally, across the internet. Make sure that proxy bypasses are in place for `*notebooks.azure.net;*api.azureml.ms;*instances.azureml.ms;*aznbcontent.net;*files.core.windows.net`.
- When attempting to 'bring my own data' using Azure Data Lake Storage Gen 2 (or likely other data sources), I'm unable to access the data from the portal over the private link connection.
  - The Managed Identity for the ML Workspace needs to be granted Reader access to the service's private endpoints. This is documented for the initial setup, but it is also required when adding data sources to the environment later. Additionally, for ADLS Gen 2, both the DFS and the Blob private endpoints need to be created.
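
For the proxy issue above, it can help to check which hostnames the bypass list actually covers before changing browser or proxy settings. The following is a rough Python sketch that assumes the proxy treats `*` as a simple wildcard and `;` as a separator. Real proxies and browsers differ in their matching rules, so treat the result as a hint rather than proof. The helper names and the example hostnames in the usage note are hypothetical.

```python
from fnmatch import fnmatch

# Bypass entries from the list above, de-duplicated.
BYPASS_LIST = (
    "*notebooks.azure.net;*api.azureml.ms;*instances.azureml.ms;"
    "*aznbcontent.net;*files.core.windows.net"
)


def parse_bypass_list(raw):
    """Split a semicolon-separated bypass string into lowercase patterns."""
    return [part.strip().lower() for part in raw.split(";") if part.strip()]


def is_bypassed(hostname, patterns):
    """Return True if hostname matches any wildcard bypass pattern.

    Assumption: shell-style wildcard matching approximates the proxy's rules.
    """
    host = hostname.lower().rstrip(".")
    return any(fnmatch(host, pattern) for pattern in patterns)
```

With `patterns = parse_bypass_list(BYPASS_LIST)`, a call such as `is_bypassed("myworkspace.westus2.api.azureml.ms", patterns)` returns `True`, while `is_bypassed("portal.azure.com", patterns)` returns `False`.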