- Overview
- Data Flow
- Access Control
- Networking and Security Configuration
- Customer Managed Keys
- Secrets
- Logging
- Testing
- Azure Deployment
Teams can request subscriptions from the CloudOps team with up to Owner permissions for Healthcare workloads, thus democratizing access to deploy, configure, and manage their applications with limited involvement from the CloudOps team. The CloudOps team can limit permissions using custom roles as deemed appropriate based on risk and requirements.
Azure Policies provide governance, compliance and protection while enabling teams to use their preferred toolset with Azure services.
The CloudOps team will be required for:
- Establishing connectivity to Hub virtual network (required for egress traffic flow & Azure Bastion).
- Creating App Registrations (required for service principal accounts). This is optional based on whether App Registrations are disabled for all users or not.
Workflow
- A new subscription is created through existing process (either via ea.azure.com or Azure Portal).
- The subscription will automatically be assigned to the pubsecSandbox management group.
- CloudOps will create a Service Principal Account (via App Registration) that will be used for future DevOps automation.
- CloudOps will scaffold the subscription with baseline configuration.
- CloudOps will hand over the subscription to requesting team.
Subscription Move
A subscription can be moved to a target Management Group through Azure ARM Templates/Bicep. The move has been incorporated into the landing zone Azure DevOps Pipeline automation.
Capabilities
Capability | Description |
---|---|
Service Health Alerts | Configures Service Health alerts such as Security, Incident, Maintenance. Alerts are configured with email, sms and voice notifications. |
Microsoft Defender for Cloud | Configures security contact information (email and phone). |
Subscription Role Assignments | Configures subscription scoped role assignments. Roles can be built-in or custom. |
Subscription Budget | Configures monthly subscription budget with email notification. By default, the budget is configured for 10 years; the amount is set through the deployment parameters. |
Subscription Tags | A set of tags that are assigned to the subscription. |
Resource Tags | A set of tags that are assigned to the resource group and resources. These tags must include all required tags as defined in the Tag Governance policy. |
Automation | Deploys an Azure Automation Account in each subscription. |
Hub Networking | Configures virtual network peering to Hub Network which is required for egress traffic flow and hub-managed DNS resolution (on-premises or other spokes, private endpoints). |
Networking | A spoke virtual network with minimum 4 zones: oz (Operational Zone), paz (Public Access Zone), rz (Restricted Zone), hrz (Highly Restricted Zone). Additional subnets can be configured at deployment time using configuration (see below). |
Key Vault | Deploys a spoke managed Azure Key Vault instance that is used for key and secret management. |
SQL Database | Deploys Azure SQL Database. Optional. |
Azure Data Lake Store Gen 2 | Deploys an Azure Data Lake Gen 2 instance with hierarchical namespace. There aren't any parameters for customization. |
Synapse Analytics | Deploys Synapse Analytics instance. |
Azure Machine Learning | Deploys Azure Machine Learning Service. There aren't any parameters for customization. |
Azure Databricks | Deploys an Azure Databricks instance. There aren't any parameters for customization. |
Azure Data Factory | Deploys an Azure Data Factory instance with Managed Virtual Network and Managed Integrated Runtime. There aren't any parameters for customization. |
Azure Container Registry | Deploys an Azure Container Registry to store machine learning models as container images. ACR is used when deploying pods to AKS. There aren't any parameters for customization. |
Azure API for FHIR | Deploys Azure API for FHIR with FHIR-R4. There aren't any parameters for customization. |
Azure Functions | Deploys Azure Functions. There aren't any parameters for customization. |
Azure Stream Analytics | Deploys Stream Analytics instance for streaming scenarios. There aren't any parameters for customization. |
Azure Event Hub | Deploys Azure Event Hub for stream scenarios. There aren't any parameters for customization. |
Application Insights | Deploys an Application Insights instance that is used by Azure Machine Learning instance. There aren't any parameters for customization. |
Azure Services circled on the diagram are deployed in this archetype.
Category | Service | Configuration | Reference |
---|---|---|---|
Storage | Azure Data Lake Gen 2 - Cloud storage enabling big data analytics. | Hierarchical namespace enabled. Optional – Customer Managed Keys. | Azure Docs |
Compute | Azure Databricks - Managed Spark cloud platform for data analytics and data science | Premium tier; Secured Cluster Connectivity enabled with load balancer for egress. | Azure Docs |
Compute | Azure Synapse - End-to-end cloud analytics and data warehousing platform. | Disabled public network access by default. Managed Private Endpoints for Compute & Synapse Studio. Optional – Customer Managed Keys. | Managed Private Endpoints / Connect to Synapse Studio with private links |
Compute | FHIR API - Fast Healthcare Interoperability Resources for healthcare medical exchange. | Private endpoint by default. | Azure Docs |
Compute | Azure Stream Analytics - Real-time analytics and event-processing engine for processing high volumes of fast streaming data from multiple sources simultaneously. | N/A | Azure Docs |
Compute | Azure Function App - Serverless computing service | Virtual Network Integration for accessing resources in virtual network. | Azure Docs |
Ingestion | Azure Data Factory - Managed cloud service for data integration and orchestration | Managed virtual network. Optional – Customer Managed Keys | Azure Docs |
Ingestion | Event Hub - Data streaming platform and event ingestion service | N/A | Azure Docs |
Machine learning and deployment | Azure Machine Learning - Cloud platform for end-to-end machine learning workflows | Optional – Customer Managed Keys, High Business Impact Workspace | Azure Docs |
Machine learning and deployment | Azure Container Registry - Managed private Docker cloud registry | Premium SKU. Optional – Customer Managed Keys | Azure Docs |
SQL Storage | Azure SQL Database - Fully managed cloud database engine | Optional – Customer Managed Keys | Azure Docs |
Key Management | Azure Key Vault - Centralized cloud storage of secrets and keys | Private Endpoint | Azure Docs |
Monitoring | Application Insights - Application performance and monitoring cloud service | - | Azure Docs |
The intended cloud service workflows and data movements for this archetype include:
- Data can be ingested from data sources using Data Factory with managed virtual network for its Azure hosted integration runtime
- Streaming data can be ingested using Event Hub and Stream Analytics
- The data would be stored in Azure Data Lake Gen 2.
- Healthcare providers can connect to existing data sources with FHIR API.
- Data engineering and transformation tasks can be done with Spark using Azure Databricks. Transformed data would be stored back in the data lake.
- End to end analytics and data warehousing can be done with Azure Synapse Analytics.
- Machine learning would be done using Azure Machine Learning.
- Monitoring and logging would be through Application Insights.
Once the machine learning archetype is deployed and available to use, access control best practices should be applied. Below is the recommended set of security groups and their respective Azure role assignments. This is not an exhaustive list and can be updated as required.
Replace the PROJECT_NAME placeholder in the security group names with the appropriate project name for the workload.
Security Group | Azure Role | Notes |
---|---|---|
SG_PROJECT_NAME_ADMIN | Subscription with Owner role. | Admin group for subscription. |
SG_PROJECT_NAME_READ | Subscription with Reader role. | Reader group for subscription. |
SG_PROJECT_NAME_DATA_PROVIDER | Data Lake (main storage account) service with Storage Blob Data Contributor role. Key Vault service with Key Vault Secrets User role. | Data group with access to data as well as key vault secrets usage. |
SG_PROJECT_NAME_DATA_SCIENCE | Azure ML service with Contributor role. Azure Databricks service with Contributor role. Key Vault service with Key Vault Secrets User role. | Data science group with compute access as well as key vault secrets usage. |
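The placeholder substitution described above can be sketched in Python. This is only an illustration: the group templates and roles are copied from the table, while the function name, the upper-casing of the project name, and the replacement of spaces and hyphens with underscores are assumptions rather than conventions stated by this archetype.

```python
# Security group templates and their role assignments, copied from the table above.
GROUP_TEMPLATES = {
    "SG_PROJECT_NAME_ADMIN": [("Subscription", "Owner")],
    "SG_PROJECT_NAME_READ": [("Subscription", "Reader")],
    "SG_PROJECT_NAME_DATA_PROVIDER": [
        ("Data Lake (main storage account)", "Storage Blob Data Contributor"),
        ("Key Vault", "Key Vault Secrets User"),
    ],
    "SG_PROJECT_NAME_DATA_SCIENCE": [
        ("Azure ML", "Contributor"),
        ("Azure Databricks", "Contributor"),
        ("Key Vault", "Key Vault Secrets User"),
    ],
}


def security_groups(project_name: str) -> dict:
    """Return {group name: [(scope, role), ...]} for one project.

    Assumption: the project name is upper-cased and spaces/hyphens become
    underscores. Adjust this to your own naming standard.
    """
    token = project_name.strip().upper().replace(" ", "_").replace("-", "_")
    if not token:
        raise ValueError("project_name must not be empty")
    return {
        template.replace("PROJECT_NAME", token): roles
        for template, roles in GROUP_TEMPLATES.items()
    }


if __name__ == "__main__":
    # "fhir-lab" is a hypothetical project name.
    for group, roles in security_groups("fhir-lab").items():
        print(group, roles)
```

The output is a plain mapping, so it can be reviewed before any groups or role assignments are created.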
Service Name | Settings | Private Endpoints / DNS | Subnet(s) |
---|---|---|---|
Azure Key Vault | Network ACL Deny | Private endpoint on vault + DNS registration to either hub or spoke | privateEndpoints |
SQL Database | Deny public network access | Private endpoint on sqlserver + DNS registration to either hub or spoke | privateEndpoints |
Azure Data Lake Gen 2 | Network ACL deny | Private endpoint on blob, dfs + DNS registration to either hub or spoke | privateEndpoints |
Synapse | Disabled public network access; managed virtual network; Data exfiltration protection enabled | Managed Private Endpoints & Synapse Studio Private Link Hub. Private endpoint DNS registration. | privateEndpoints |
Synapse | Support Azure AD only authentication | Synapse service supports 3 modes: SQL only authentication, Azure AD only authentication and mixed authentication. The preferred mode is Azure AD only authentication. | privateEndpoints |
Azure Databricks | No public IP enabled (secure cluster connectivity), load balancer for egress with IP and outbound rules, virtual network injection | N/A | databricksPrivate, databricksPublic |
Azure Machine Learning | No public workspace access | Private endpoint on amlWorkspace + DNS registration to either hub or spoke | privateEndpoints |
Azure Storage Account for Azure ML | Network ACL deny | Private endpoint on blob, file + DNS registration to either hub or spoke | privateEndpoints |
Azure Data Factory | Public network access disabled, Azure integration runtime with managed virtual network | Private endpoint on dataFactory + DNS registration to either hub or spoke | privateEndpoints |
FHIR API | N/A | Private endpoint on fhir + DNS registration to either hub or spoke | privateEndpoints |
Event Hub | N/A | Private endpoint on namespace + DNS registration to either hub or spoke | privateEndpoints |
Function App | Virtual Network Integration | N/A | web |
Azure Container Registry | Network ACL deny, public network access disabled | Private endpoint on registry + DNS registration to either hub or spoke | privateEndpoints |
Azure Application Insights | N/A | N/A | N/A |
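One way to spot-check the private endpoint and DNS configuration in the table above is to confirm, from a machine inside the spoke virtual network, that a service hostname resolves to a private address. The Python sketch below is an illustration under assumptions: the function names are hypothetical, it only recognises RFC 1918 ranges (adjust the list if your private endpoints use other address space), and it is not part of the provided test scripts.

```python
import ipaddress
import socket

# RFC 1918 private address ranges.
RFC1918_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def resolve_ipv4(hostname: str) -> list:
    """Resolve a hostname to its IPv4 addresses using the system resolver."""
    infos = socket.getaddrinfo(
        hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def is_rfc1918(address: str) -> bool:
    """Return True when the address falls inside an RFC 1918 range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


def resolves_privately(hostname: str, resolver=resolve_ipv4) -> bool:
    """Return True when every resolved address is an RFC 1918 address.

    The resolver is injectable so the check can be exercised without DNS.
    """
    addresses = resolver(hostname)
    return bool(addresses) and all(is_rfc1918(a) for a in addresses)
```

For example, calling `resolves_privately("<vault-name>.vault.azure.net")` from a virtual machine in the spoke would be expected to return `True` when the private endpoint and its DNS registration are in place; the vault name is a placeholder you supply.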
This archetype also has the following security features as options for deployment:
- Customer managed keys for encryption at rest, including Azure ML, storage, Container Registry, Data Factory, SQL Database, Synapse Analytics and Kubernetes Service.
- Azure ML can enable a high-business impact workspace, which controls the amount of data Microsoft collects for diagnostic purposes.
To enable customer-managed key scenarios, some services including Azure Storage Account and Azure Container Registry require deployment scripts to run with a user-assigned identity to enable encryption key on the respective instances.
Therefore, when the useCMK parameter is true, a deployment identity is created and assigned Owner role to the compute and storage resource groups to run the deployment scripts as needed. Once the services are provisioned with customer-managed keys, the role assignments are automatically deleted.
If customer-managed key is required for the FHIR API, a separate Key Vault with access policy permission model is required.
The artifacts created by the deployment script such as Azure Container Instance and Storage accounts will be automatically deleted 1 hour after completion.
Temporary passwords are autogenerated, and connection strings are automatically stored as secrets in Key Vault. They include:
- SQL Database username, password, and connection string when SQL authentication is chosen. With Azure AD authentication, no secrets are needed.
- Synapse username and password when SQL authentication is used. The recommended approach is Azure AD only authentication. See April 18, 2022 update on /schemas/latest/readme.md
Azure Policy will enable diagnostic settings for all PaaS components in the machine learning archetype and the logs will be sent to the centralized log analytics workspace. These policies are configured at the management group scope and are not explicitly deployed.
Test scripts are provided to verify end to end integration. These tests are not automated so minor modifications are needed to set up and run.
The test scripts are located in tests/landingzones/lz-healthcare/e2e-flow-tests
The scripts are:
- Azure ML Key Vault integration test
- Azure ML terminal connection to ACR test
- Databricks integration with Key Vault Data Lake test
- Synapse integration tests for SQL Serverless, Spark, and SQL DW (dedicated) Pools
Considerations for testing Azure Data Factory and Synapse using managed virtual networks
- Data Factory: to test connectivity to the data lake, ensure a managed private endpoint is set up and interactive authoring is enabled
- Synapse Analytics
  - Pipeline: to test connectivity to the data lake, ensure a managed private endpoint is set up and interactive authoring is enabled
  - SQL Serverless connectivity to data lake:
    - The default connectivity uses user identity passthrough, so the user should have the Storage Blob Data Contributor role on the data lake
    - Alternatively, the managed Synapse identity can be used; the landing zone deployment automatically grants the MSI the Storage Blob Data Contributor role on the data lake
    - Upload some data to the default ADLS Gen 2 of Synapse
    - Run the integration tests for Synapse SQL Serverless Pool
  - Spark pool connectivity to data lake
    - Ensure the user has the Storage Blob Data Contributor role on the data lake
    - Upload some data to the default ADLS Gen 2 of Synapse
    - Run the integration tests for Synapse Spark Pool
  - Dedicated SQL (SQL Data Warehouse)
    - Ensure the user identity has a SQL Login (e.g. the admin user could be assigned the SQL AD admin)
    - Upload some data to the default ADLS Gen 2 of Synapse
    - Run the integration tests for Synapse SQL Dedicated Pool (DW)
Azure ML SQL / Key vault test
- Access the ML landing zone network and log into Azure ML through https://ml.azure.com
- Set up a compute instance and create a new Python notebook
- Use the provided test script to test connection to Key Vault by retrieving the SQL password
- Create a datastore connecting to SQL DB
- Create a dataset connecting to a table in SQL DB
- Use the provided dataset consume code to verify connectivity to SQL DB
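For orientation, the Key Vault step in the list above might look like the following Python sketch. It is not the provided test script. It assumes the `azure-identity` and `azure-keyvault-secrets` packages are installed on the compute instance, that the signed-in identity holds a role such as Key Vault Secrets User, and that the vault uses the public cloud DNS suffix. The vault and secret names are placeholders.

```python
from typing import Optional


def vault_url(vault_name: str) -> str:
    """Build the Key Vault URL (assumes the Azure public cloud DNS suffix)."""
    return f"https://{vault_name}.vault.azure.net"


def read_secret(vault_name: str, secret_name: str) -> Optional[str]:
    """Read one secret value from Key Vault.

    The Azure SDK is imported lazily so this module can be loaded on a
    machine where the SDK packages are not installed.
    """
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    client = SecretClient(
        vault_url=vault_url(vault_name),
        credential=DefaultAzureCredential(),
    )
    return client.get_secret(secret_name).value


if __name__ == "__main__":
    # Both names are hypothetical; use the vault and secret names
    # from your own deployment.
    password = read_secret("my-health-kv", "my-sql-password-secret")
    print("secret retrieved:", password is not None)
```

Because the vault is deployed with a private endpoint and network ACL deny, the call is expected to succeed only from inside the landing zone network.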
Azure ML terminal connection to ACR test
- Access the ML landing zone network and log into Azure ML through https://ml.azure.com
- Set up a compute instance and use its built-in terminal
- Use the provided test script to pull a hello-world Docker image and push to ACR
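The pull-and-push step can be approximated with the Python sketch below, which builds the commands and optionally runs them. It is not the provided test script and rests on assumptions: the Azure CLI and Docker are available in the compute instance terminal, the CLI session is already signed in, the registry uses the public cloud `azurecr.io` suffix, and the registry name is a placeholder.

```python
import subprocess


def acr_push_commands(registry_name: str, image: str = "hello-world") -> list:
    """Return the commands that pull an image and push it to the registry."""
    login_server = f"{registry_name}.azurecr.io"  # public cloud suffix assumed
    target = f"{login_server}/{image}:latest"
    return [
        ["az", "acr", "login", "--name", registry_name],
        ["docker", "pull", image],
        ["docker", "tag", image, target],
        ["docker", "push", target],
    ]


def run_all(commands: list) -> None:
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # "myhealthacr" is a hypothetical registry name.
    run_all(acr_push_commands("myhealthacr"))
```

Keeping command construction separate from execution makes it easy to print and review the commands before running them.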
Databricks integration tests
- Access Azure Databricks workspace
- Create a new compute cluster
- Create a new Databricks notebook in the workspace and copy in the integration test script
- Run the test script to verify connectivity to Key Vault, SQL DB/MI, and data lake
Azure ML deployment test
- Access the ML network and log into Azure ML through https://ml.azure.com
- Set up a compute instance and import the provided tests to the workspace
- Run the test script, which will build a Docker Azure ML model image, push it to ACR, and then have AKS pull and run the ML model
The reference implementation uses parameter files with object parameters to consolidate parameters based on their context. The schema types are:
- Schema (version: latest)
  - Common types
  - Spoke types
As an administrator, you can lock a subscription, resource group, or resource to prevent other users in your organization from accidentally deleting or modifying critical resources. The lock overrides any permissions the user might have. You can set the lock level to CanNotDelete or ReadOnly. Please see Azure Docs for more information.
This archetype does not use CanNotDelete or ReadOnly locks as part of the deployment. You may customize the deployment templates when it's required for your environment.
Service health notifications are published by Azure, and contain information about the resources under your subscription. Service health notifications can be informational or actionable, depending on the category.
Our examples configure service health alerts for Security and Incident. However, these categories can be customized based on your need. Please review the possible options in Azure Docs.
Sample deployment scenarios are based on the latest JSON parameters file schema definition. If you have an older version of this repository, please use the examples from your repository.
Scenario | Example JSON Parameters | Notes |
---|---|---|
Deployment with Hub Virtual Network | tests/schemas/lz-healthcare/FullDeployment-With-Hub.json | - |
Deployment with Location | tests/schemas/lz-healthcare/FullDeployment-With-Location.json | parameters.location.value is canadacentral |
Deployment without Hub Virtual Network | tests/schemas/lz-healthcare/FullDeployment-Without-Hub.json | parameters.hubNetwork.value.* fields are empty & parameters.network.value.peerToHubVirtualNetwork is false. |
Deployment with optional subnets | tests/schemas/lz-healthcare/FullDeployment-With-OptionalSubnets.json | parameters.network.subnets.optional array is set with optional subnets. |
Deployment with subscription budget | tests/schemas/lz-healthcare/BudgetIsTrue.json | parameters.subscriptionBudget.value.createBudget is set to true and budget information filled in. |
Deployment without subscription budget | tests/schemas/lz-healthcare/BudgetIsFalse.json | parameters.subscriptionBudget.value.createBudget is set to false and budget information removed. |
Deployment without resource tags | tests/schemas/lz-healthcare/EmptyResourceTags.json | parameters.resourceTags.value is an empty object. |
Deployment without subscription tags | tests/schemas/lz-healthcare/EmptySubscriptionTags.json | parameters.subscriptionTags.value is an empty object. |
Deployment without SQL DB | tests/schemas/lz-healthcare/SQLDBIsFalse.json | parameters.sqldb.value.enabled is false. |
Deployment with SQL DB using AAD only authentication | tests/schemas/lz-healthcare/SQLDB-aadAuthOnly.json | parameters.sqldb.value.aadAuthenticationOnly is true, parameters.sqldb.value.aad* fields filled in. |
Deployment with SQL DB using SQL authentication | tests/schemas/lz-healthcare/SQLDB-sqlAuth.json | parameters.sqldb.value.aadAuthenticationOnly is false & parameters.sqldb.value.sqlAuthenticationUsername filled in. |
Deployment with SQL DB using mixed mode authentication | tests/schemas/lz-healthcare/SQLDB-mixedAuth.json | parameters.sqldb.value.aadAuthenticationOnly is false, parameters.sqldb.value.aad* fields filled in & parameters.sqldb.value.sqlAuthenticationUsername filled in. |
Deployment with Synapse using Azure AD only authentication | tests/schemas/lz-healthcare/Synapse-aadAuthOnly.json | parameters.synapse.value.aadAuthenticationOnly is true, parameters.synapse.value.aad* fields filled in. |
Deployment with Synapse using SQL only authentication | tests/schemas/lz-healthcare/Synapse-sqlAuth.json | parameters.synapse.value.aadAuthenticationOnly is false & parameters.synapse.value.sqlAuthenticationUsername filled in. |
Deployment with Synapse using mixed authentication | tests/schemas/lz-healthcare/Synapse-mixedAuth.json | parameters.synapse.value.aadAuthenticationOnly is false, parameters.synapse.value.aad* fields filled in & parameters.synapse.value.sqlAuthenticationUsername filled in. |
Deployment without customer managed keys | tests/schemas/lz-healthcare/WithoutCMK.json | parameters.useCMK.value is false. |
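Several scenarios in the table differ from a full deployment by a single parameter. The Python sketch below shows one way to derive such variants from a base parameters object when experimenting locally. The override values are inferred from the Notes column, and the assumption that an unused object can be reduced to just its flag (for example `{"enabled": false}`) should be checked against the referenced example files. The function and dictionary names are illustrative only.

```python
import copy

# Overrides inferred from the Notes column of the scenario table.
# Assumption: each listed parameter value can be replaced wholesale.
SCENARIO_OVERRIDES = {
    "BudgetIsFalse": {"subscriptionBudget": {"createBudget": False}},
    "SQLDBIsFalse": {"sqldb": {"enabled": False}},
    "WithoutCMK": {"useCMK": False},
}


def derive_scenario(base: dict, scenario: str) -> dict:
    """Return a copy of the base parameters with one scenario applied.

    The base object is left untouched so several variants can be derived
    from the same starting point.
    """
    derived = copy.deepcopy(base)
    for name, value in SCENARIO_OVERRIDES[scenario].items():
        derived["parameters"][name] = {"value": copy.deepcopy(value)}
    return derived
```

A variant produced this way can be written out with `json.dump` and compared against the matching file under tests/schemas/lz-healthcare before it is used for a deployment.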
The following example parameters file configures:
- Service Health Alerts
- Microsoft Defender for Cloud
- Subscription Role Assignments using built-in and custom roles
- Subscription Budget with $1000
- Subscription Tags
- Resource Tags (aligned to the default tags defined in Policies)
- Log Analytics Workspace integration through Azure Defender for Cloud
- Automation Account
- Spoke Virtual Network with Hub-managed DNS, Hub-managed private endpoint DNS Zones, Virtual Network Peering and all required subnets and 2 optional subnets.
- Deploys Azure resources with Customer Managed Keys.
Note 1: Azure Automation Account is not deployed with Customer Managed Key as it requires an Azure Key Vault instance with public network access.
Note 2: All secrets stored in Azure Key Vault will have 10 year expiration (configurable) & all RSA Keys (used for CMK) will not have an expiration.
{
"$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
"contentVersion": "1.0.0.0",
"parameters": {
"serviceHealthAlerts": {
"value": {
"resourceGroupName": "service-health",
"incidentTypes": [ "Incident", "Security" ],
"regions": [ "Global", "Canada East", "Canada Central" ],
"receivers": {
"app": [ "[email protected]" ],
"email": [ "[email protected]" ],
"sms": [ { "countryCode": "1", "phoneNumber": "6045555555" } ],
"voice": [ { "countryCode": "1", "phoneNumber": "6045555555" } ]
},
"actionGroupName": "Service health action group",
"actionGroupShortName": "health-alert",
"alertRuleName": "Incidents and Security",
"alertRuleDescription": "Service Health: Incidents and Security"
}
},
"securityCenter": {
"value": {
"email": "[email protected]",
"phone": "6045555555"
}
},
"subscriptionRoleAssignments": {
"value": [
{
"comments": "Built-in Role: Contributor",
"roleDefinitionId": "b24988ac-6180-42a0-ab88-20f7382dd24c",
"securityGroupObjectIds": [
"38f33f7e-a471-4630-8ce9-c6653495a2ee"
]
},
{
"comments": "Custom Role: Landing Zone Application Owner",
"roleDefinitionId": "b4c87314-c1a1-5320-9c43-779585186bcc",
"securityGroupObjectIds": [
"38f33f7e-a471-4630-8ce9-c6653495a2ee"
]
}
]
},
"subscriptionBudget": {
"value": {
"createBudget": true,
"name": "MonthlySubscriptionBudget",
"amount": 1000,
"timeGrain": "Monthly",
"contactEmails": [
"[email protected]"
]
}
},
"subscriptionTags": {
"value": {
"ISSO": "isso-tag"
}
},
"resourceTags": {
"value": {
"ClientOrganization": "client-organization-tag",
"CostCenter": "cost-center-tag",
"DataSensitivity": "data-sensitivity-tag",
"ProjectContact": "project-contact-tag",
"ProjectName": "project-name-tag",
"TechnicalContact": "technical-contact-tag"
}
},
"logAnalyticsWorkspaceResourceId": {
"value": "/subscriptions/bc0a4f9f-07fa-4284-b1bd-fbad38578d3a/resourcegroups/pubsec-central-logging/providers/microsoft.operationalinsights/workspaces/log-analytics-workspace"
},
"resourceGroups": {
"value": {
"automation": "health-automation",
"compute": "health-compute",
"monitor": "health-monitor",
"networking": "health-network",
"networkWatcher": "NetworkWatcherRG",
"security": "health-security",
"storage": "health-storage"
}
},
"useCMK": {
"value": true
},
"keyVault": {
"value": {
"secretExpiryInDays": 3650
}
},
"automation": {
"value": {
"name": "automation"
}
},
"sqldb": {
"value": {
"enabled": true,
"aadAuthenticationOnly": true,
"aadLoginName": "DBA Group",
"aadLoginObjectID": "4e4ea47c-ee21-4add-ad2f-a75d0d8014e0",
"aadLoginType": "Group"
}
},
"synapse": {
"value": {
"aadAuthenticationOnly": true,
"aadLoginName": "synapse.admins",
"aadLoginObjectID": "e0357d81-55d8-44e9-9d9c-ab09dc710785",
"aadLoginType": "Group"
}
},
"hubNetwork": {
"value": {
"virtualNetworkId": "/subscriptions/ed7f4eed-9010-4227-b115-2a5e37728f27/resourceGroups/pubsec-hub-networking/providers/Microsoft.Network/virtualNetworks/hub-vnet",
"rfc1918IPRange": "10.18.0.0/22",
"rfc6598IPRange": "100.60.0.0/16",
"egressVirtualApplianceIp": "10.18.1.4",
"privateDnsManagedByHub": true,
"privateDnsManagedByHubSubscriptionId": "ed7f4eed-9010-4227-b115-2a5e37728f27",
"privateDnsManagedByHubResourceGroupName": "pubsec-dns"
}
},
"network": {
"value": {
"peerToHubVirtualNetwork": true,
"useRemoteGateway": false,
"name": "health-vnet",
"dnsServers": [
"10.18.1.4"
],
"addressPrefixes": [
"10.5.0.0/16"
],
"subnets": {
"databricksPublic": {
"comments": "Databricks Public Delegated Subnet",
"name": "databrickspublic",
"addressPrefix": "10.5.5.0/25"
},
"databricksPrivate": {
"comments": "Databricks Private Delegated Subnet",
"name": "databricksprivate",
"addressPrefix": "10.5.6.0/25"
},
"privateEndpoints": {
"comments": "Private Endpoints Subnet",
"name": "privateendpoints",
"addressPrefix": "10.5.7.0/25"
},
"web": {
"comments": "Azure Web App Delegated Subnet",
"name": "webapp",
"addressPrefix": "10.5.8.0/25"
},
"optional": [
{
"comments": "Optional Subnet 1",
"name": "virtualMachines",
"addressPrefix": "10.5.9.0/25",
"nsg": {
"enabled": true
},
"udr": {
"enabled": true
}
},
{
"comments": "Optional Subnet 2 with delegation for NetApp Volumes",
"name": "NetappVolumes",
"addressPrefix": "10.5.10.0/25",
"nsg": {
"enabled": false
},
"udr": {
"enabled": false
},
"delegations": {
"serviceName": "Microsoft.NetApp/volumes"
}
}
]
}
}
}
}
}
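Before deploying, it can help to confirm that every subnet in the `network` parameter sits inside the virtual network address space and that no two subnets overlap. The Python sketch below is a hypothetical pre-flight check, not part of the reference implementation. It reads the same field names as the example above (`addressPrefixes`, `subnets`, `optional`, `addressPrefix`), and the sample data is an abbreviated copy of that example.

```python
import ipaddress
from itertools import combinations


def subnet_prefixes(network_value: dict) -> dict:
    """Collect {subnet name: network} for required and optional subnets."""
    found = {}
    for key, subnet in network_value["subnets"].items():
        # "optional" holds a list of subnets; every other key holds one subnet.
        entries = subnet if key == "optional" else [subnet]
        for entry in entries:
            found[entry["name"]] = ipaddress.ip_network(entry["addressPrefix"])
    return found


def check_subnets(network_value: dict) -> list:
    """Return a list of human-readable problems (empty when none are found)."""
    problems = []
    spaces = [ipaddress.ip_network(p) for p in network_value["addressPrefixes"]]
    prefixes = subnet_prefixes(network_value)
    for name, prefix in prefixes.items():
        if not any(prefix.subnet_of(space) for space in spaces):
            problems.append(f"{name} ({prefix}) is outside the address space")
    for (a, pa), (b, pb) in combinations(prefixes.items(), 2):
        if pa.overlaps(pb):
            problems.append(f"{a} ({pa}) overlaps {b} ({pb})")
    return problems


# Abbreviated copy of the "network" value from the example above.
sample = {
    "addressPrefixes": ["10.5.0.0/16"],
    "subnets": {
        "databricksPublic": {"name": "databrickspublic", "addressPrefix": "10.5.5.0/25"},
        "databricksPrivate": {"name": "databricksprivate", "addressPrefix": "10.5.6.0/25"},
        "optional": [
            {"name": "virtualMachines", "addressPrefix": "10.5.9.0/25"},
        ],
    },
}

print(check_subnets(sample))  # prints []
```

To check a real parameters file, load it with `json.load` and pass `params["parameters"]["network"]["value"]` to `check_subnets`.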
Please see archetype authoring guide for deployment instructions.