Azure Cosmos DB is a fully managed NoSQL database for modern multitenant application development. You can build applications fast with open-source APIs, multiple SDKs, schemaless data, and no-ETL analytics over operational data. Single-digit millisecond response times and instant scalability guarantee speed at any scale, while 99.999% availability and enterprise-grade security guarantee business continuity for every application. It offers end-to-end database management, with serverless and automatic scaling that matches your application and TCO needs. It supports multiple database APIs, including the native API for NoSQL and APIs for MongoDB, Apache Cassandra, Apache Gremlin, and Table. It also supports PostgreSQL extended with open-source Citus, which is useful for highly scalable relational apps.
To begin using Azure Cosmos DB, create an Azure Cosmos DB account in an Azure resource group in your subscription. You then create databases and containers within the account.
A single Azure Cosmos DB account can manage a virtually unlimited amount of data and provisioned throughput. To manage your data and provisioned throughput, you create one or more databases within your account, then one or more containers to store your data.
Azure Cosmos DB normalizes the cost of database operations and expresses it in Request Units (RUs), a performance currency that abstracts the system resources, such as CPU, IOPS, and memory, needed to perform the database operations Azure Cosmos DB supports. You can examine the response header to track the number of RUs consumed by any database operation.
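As a concrete illustration, here is a minimal .NET SDK (Microsoft.Azure.Cosmos) sketch that reads the request charge reported for a point read; the endpoint, key, database, container, item id, and partition key value are all placeholders, not values from this workshop's deployment.

```csharp
using System;
using Microsoft.Azure.Cosmos;

// Minimal sketch: inspect the RU charge of a single operation.
// <account-endpoint>, <account-key>, and the ids below are placeholders.
CosmosClient client = new CosmosClient("<account-endpoint>", "<account-key>");
Container container = client.GetContainer("<database-id>", "<container-id>");

ItemResponse<dynamic> response = await container.ReadItemAsync<dynamic>(
    id: "<item-id>",
    partitionKey: new PartitionKey("<partition-key-value>"));

// RequestCharge surfaces the x-ms-request-charge response header as a double.
Console.WriteLine($"This read consumed {response.RequestCharge} RUs");
```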
Azure Cosmos DB is a schema-agnostic database that allows you to iterate on your application without having to deal with schema or index management. By default, Azure Cosmos DB automatically indexes every property of every item in your container without requiring you to define any schema or configure secondary indexes. When an item is written, Azure Cosmos DB effectively indexes each property's path and its corresponding value. In some situations, you may want to override this automatic behavior to better suit your requirements. You can customize a container's indexing policy by setting its indexing mode and by including or excluding property paths.
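For example, here is a hedged sketch of customizing an indexing policy with the .NET SDK; the container name and the excluded path (/guestNotes/*) are illustrative assumptions, not part of the workshop schema.

```csharp
using Microsoft.Azure.Cosmos;

// Sketch: create a container whose indexing policy excludes one property path.
// The container name and the /guestNotes/* path are illustrative assumptions.
CosmosClient client = new CosmosClient("<account-endpoint>", "<account-key>");
Database database = client.GetDatabase("SharedThroughputDB");

ContainerProperties props = new ContainerProperties(id: "Reservations", partitionKeyPath: "/tenantId")
{
    IndexingPolicy = new IndexingPolicy
    {
        IndexingMode = IndexingMode.Consistent,  // index synchronously on write (the default)
        Automatic = true
    }
};

// Index every path except a large free-text property that is never filtered or sorted on.
props.IndexingPolicy.IncludedPaths.Add(new IncludedPath { Path = "/*" });
props.IndexingPolicy.ExcludedPaths.Add(new ExcludedPath { Path = "/guestNotes/*" });

Container container = await database.CreateContainerIfNotExistsAsync(props);
```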
Access Azure Cosmos DB Documentation for more details and training.
Here are the scenarios where Azure Cosmos DB can help:
- Modernizing monolithic on-premises applications into SaaS applications.
- Scaling up to a maximum throughput to address unpredictable workloads and scaling down automatically.
- Expanding globally with low latency and highly scalable throughput.
- Reducing costs while supporting multiple customers with fluctuating throughput requirements.
- Supporting multiple businesses with a flexible schema.
- Struggling to meet performance SLA requirements and reaching maximum storage limits as data grows.
All of the above use cases need a new mindset and special database features. This workshop will show you how Azure Cosmos DB is the best option.
- Challenge-1: Deploy Azure Cosmos DB Service
- Challenge-2: Model data to build SaaS applications
- Challenge-3: Design Cosmos DB Account to serve small, medium and large customers
- Challenge-4: Validate Cosmos DB features Auto Failover, Autoscale and Low Latency
- Challenge-5: Build an application using Cosmos DB Emulator at no cost
By using partitions with Azure Cosmos DB containers, you can create containers that are shared across multiple tenants. With large containers, Azure Cosmos DB spreads your tenants across multiple physical nodes to achieve a high degree of scale.
Typically, you provision a defined number of request units per second for your workload, which is referred to as throughput. You can assign throughput at the container level or at the database level to share among its containers, automatically scale up to a maximum throughput to address unpredictable workloads, and scale down to 10% of the maximum to save costs.
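To make the two options concrete, here is a hedged .NET SDK sketch that assumes autoscale throughput for both; the endpoint and key are placeholders, and the database and container names mirror the ones used later in this workshop.

```csharp
using Microsoft.Azure.Cosmos;

// Sketch: database-level (shared) vs. container-level (dedicated) autoscale throughput.
// <account-endpoint> and <account-key> are placeholders.
CosmosClient client = new CosmosClient("<account-endpoint>", "<account-key>");

// Shared throughput: up to 2,000 RU/s split across every container in the database.
Database sharedDb = await client.CreateDatabaseIfNotExistsAsync(
    "SharedThroughputDB",
    ThroughputProperties.CreateAutoscaleThroughput(2000));

// Dedicated throughput: this container gets its own 2,000 RU/s autoscale budget,
// even though it lives inside the shared-throughput database.
Container hikingHotel = await sharedDb.CreateContainerIfNotExistsAsync(
    new ContainerProperties("HikingHotel", "/tenantId"),
    ThroughputProperties.CreateAutoscaleThroughput(2000));
```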
A fictitious ISV company called "Smart Booking Inc" has built an online reservation application called "EasyReserveApp", currently deployed as an on-premises application for 4 hotel chains. The application is a big hit in the industry, and they want to convert it into a SaaS application to meet global demand. They are looking for a database that can handle unpredictable volume, maintain low-latency response times for users in any part of the world, and maintain high availability and business continuity, with cost optimized based on usage.
It currently has the following clients in the Hotel Industry:
This workshop gives you hands-on experience designing Azure Cosmos DB for small, medium, and large multi-tenant customers using this use case.
We have developed an Azure Deployment script to provision the required Azure Services used in the above architecture diagram.
1.1 Open a new InPrivate window from your Microsoft Edge browser.
1.2 Enter the following Workshop github link in the browser.
https://github.com/microsoft/CosmosDB_Multi-Tenant
1.3 Click Challenge-1 from Workshop Challenge List.
1.4 Click the "Deploy to Azure" button
1.5 Enter your Workshop login Id and Password provided by the instructor.
1.6 You will see the "Welcome to SDP Innovation!" screen. Select Yes.
1.7 Enter the following options on the custom deployment screen.
- Select your allocated resource group from the dropdown.
- Select your region from the dropdown list for the Cosmos DB Location selection.
Best practice is to keep the resource group and Cosmos DB account in the same region. We had to set the resource group to "West US" to automate the deployment script; don't follow this practice in your environment.
1.8 Click on "Review+create" button.
1.9 Validation completes as the next step; click the 'Create' button.
It will provision the Azure Cosmos DB account in your subscription:
It may take 2 to 5 minutes to create the services.
1.10 Click on "Go to resource group" when the deployment is complete.
It will take you to your resource group showing the installed Azure Cosmos DB services.
You have successfully deployed the required services to Azure. Congratulations on completing your first challenge.
Let us review the object model for this application and plan the data model for SaaS application.
- Business_Entity object contains all the business entity data.
- Tenant object contains the tenant location address and contact info.
- Hotel_Room object contains the catalog of rooms with type, view, number of beds, etc.
- Customers object contains all the customer profile data.
- Room_Inventory object contains all the room numbers with room type in each tenant location.
- Room_Availability object contains available dates with rate info for each tenant location.
- Hotel_Reservations object contains all the hotel reservations for each tenant location.
- Provide Hotel Room availability to support customer search based on location, dates and room types.
- Need to update availability after customer completes the reservation.
- Customer will access the reservation to review, cancel or update.
- Business owners and the support team at each tenant location will access the availability to book/change reservation as per their customer request.
You would want to keep all the relevant data in one object based on the highly frequent access patterns to write and read data.
As per the above diagram, it makes sense to keep all the business entity information, such as customers and room types, along with tenant-related data, such as room inventory, availability, and reservations, in one Cosmos DB container.
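To make that concrete, here is a hedged sketch of what the single-container document shapes might look like in C#; the property names and sample values are illustrative assumptions, not the workshop's exact schema.

```csharp
using System;

// Sketch: every document carries the tenantId partition key plus a "type" discriminator,
// so one container can hold inventory, availability, and reservation data per tenant.
// Property names and sample values are illustrative, not the workshop's exact schema.
public record RoomInventoryDoc(string id, string tenantId, string type, string roomType, int roomCount);

public record ReservationDoc(string id, string tenantId, string type, string guestName,
                             string roomType, DateTime checkIn, DateTime checkOut);

public static class SampleDocuments
{
    // Both documents share the same tenantId, so they land in the same logical partition.
    public static readonly RoomInventoryDoc Inventory =
        new("inv-101", "casinohotel-lasvegas", "RoomInventory", "DeluxeKing", 40);

    public static readonly ReservationDoc Booking =
        new("res-2001", "casinohotel-lasvegas", "Reservation", "Jane Doe",
            "DeluxeKing", new DateTime(2024, 5, 1), new DateTime(2024, 5, 4));
}
```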
This challenge demonstrates how a software object model transforms into a NoSQL database design model, which is completely different from SQL-based databases. Cosmos DB data modeling requires a different mindset and knowledge of the most frequent access patterns.
Apply the same methodology to migrate your legacy applications or to build new greenfield applications. You have successfully completed challenge 2 by creating a Cosmos DB data model based on the most frequent access patterns!!
Evaluate options to keep relevant data in one logical partition using partitioning key.
It would be better to create one container per business entity and share the throughput at the database level. This avoids creating one database per customer and saves a lot of money. Keep in mind that it may cause a noisy neighbor problem. This approach tends to work well when the amount of data stored for each tenant is small.
It can be a good choice for building a pricing model that includes a free tier, and for business-to-consumer (B2C) solutions. In general, by using shared containers, you achieve the highest density of tenants and therefore the lowest price per tenant.
- Access the Cosmos DB service in the Azure portal.
- Select Data Explorer from the left panel.
- Select the SharedThroughputDB database.
- Expand the CasinoHotel container.
- Select Settings.
- You will see tenantId as the partition key. It creates a logical partition for each tenant location.
- Verify the partition key for the FamilyFunHotel container.
- Select the Scale property under the SharedThroughputDB database.
- It is set to use autoscale up to 2,000 RU/s, shared across all the containers (business entities).
Autoscale falls back to 200 RU/s (10% of the max RU/s) when there is no activity. This saves a lot of money, since you don't have to allocate the maximum capacity all the time.
This database model serves multiple customers with a multi-tenant data model and autoscale capability, lowering costs and avoiding the need to create a separate database per customer.
Hiking Hotel is a medium-size business entity, and you can avoid the noisy neighbor issue by providing dedicated throughput at the container level.
Follow the steps below to provision a new container with dedicated throughput in a shared throughput database.
- Select New Container from the top bar inside Data Explorer.
- Select SharedThroughputDB under the Use existing database dropdown.
- Type HikingHotel as the container name.
- Type /tenantId as the partition key.
- Select the Provision dedicated throughput for this container option.
- Set the container Max RU/s to 2000.
- Click OK.
You can provision dedicated containers for each business entity. This works well for isolating large customers with higher throughput requirements and for providing dedicated capacity. It provides a guaranteed level of performance and serves medium-size customers well.
- Select DedicatedThroughputDB database
- Expand GoodFellas container.
- Select Scale & Settings
- It shows 1000 RUs as the Maximum RUs with Autoscale throughput option.
You can provision separate database accounts for each tenant, which provides the highest level of isolation, but the lowest density. A single database account is dedicated to a tenant, which means they are not subject to the noisy neighbor problem. You can also configure the location of the container data via replication according to the tenant's requirements, and you can tune the configuration of Azure Cosmos DB features, network access, backup policy, geo-replication and customer-managed encryption keys, to suit each tenant's requirements.
You can consider a combination of the above approaches to suit different tenants' requirements and your pricing model. For example:
- Provision all free trial customers within a shared container, and use the tenant ID or a synthetic key as the partition key.
- Offer a higher Silver tier that provisions dedicated throughput for the tenant's container.
- Offer the highest Gold tier, and provide a dedicated database account for the tenant, which also allows tenants to select the geography for their deployment.
To store multi-tenant data in a single container, Azure Cosmos DB provides a partition key to distribute the data into logical partitions. By using tenantId as the partition key, Cosmos DB keeps the data for each tenant in one logical partition and performs faster queries at lower cost.
The partition key plays a major role in saving costs and providing single-digit-millisecond response times. Make sure the partition key is part of your most frequent queries. You can also use a document type property to keep related data together in one container.
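As an illustration, here is a hedged .NET SDK sketch of a query that filters on both the partition key and the document type; the endpoint, key, and tenantId value are placeholders.

```csharp
using System;
using Microsoft.Azure.Cosmos;

// Sketch: include the partition key (tenantId) and the document type in your most frequent
// queries so they are served from a single logical partition at low RU cost.
// <account-endpoint>, <account-key>, and the tenantId value are placeholders.
CosmosClient client = new CosmosClient("<account-endpoint>", "<account-key>");
Container container = client.GetContainer("SharedThroughputDB", "CasinoHotel");

QueryDefinition query = new QueryDefinition(
        "SELECT * FROM c WHERE c.tenantId = @tenantId AND c.type = @type")
    .WithParameter("@tenantId", "casinohotel-lasvegas")
    .WithParameter("@type", "Reservation");

FeedIterator<dynamic> iterator = container.GetItemQueryIterator<dynamic>(query);
while (iterator.HasMoreResults)
{
    FeedResponse<dynamic> page = await iterator.ReadNextAsync();
    Console.WriteLine($"Fetched {page.Count} reservations for {page.RequestCharge} RUs");
}
```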
Download the workshop data zip file (Multi-Tenant_CosmosDB_Workshop_data.zip) from the data folder at the provided GitHub link. Unzip the file into a local folder, and you should see the following files.
Multi-tenant databases have a unique set of data per tenant; in our use case, the number of rooms, availability, and reservations. It makes sense to use TenantID as the partition key and to group room availability and reservation info using a document type.
You can also keep reference data such as Guest info and room type definitions in the same container.
- Expand Casino Hotels container under SharedThroughputDB database
- Select the Items section.
- Select Upload to load the data files from the local folder.
- Select CasinoHotel_RoomInventory.json file from the local folder.
- Select Open
- Click Upload button.
- Load the reservation data into the same container by selecting the CasinoHotel_Reservation.json file and following the same upload steps.
- Repeat the upload steps for the remaining data files.
You can query data using the APIs, and you can also use Data Explorer for a quick check.
- Select CasinoHotel container.
- Open Query window by selecting 'Folder with Search' icon from the top bar.
- Type the following Query
SELECT count(1) FROM c where c.type='Reservation'
- Select 'Execute Selection' from the top bar to execute the query
- The Results tab displays the results at the bottom.
- The Query Stats tab next to the Results tab shows the RU charge, document size, and query execution time.
With this challenge, you have gained hands-on experience creating a multi-tenant Cosmos DB database to support small, medium, and large customers. Congratulations!!
Azure Cosmos DB is designed to provide multiple features and configuration options to achieve the high availability that mission-critical enterprise applications require.
Replica outages refer to outages of individual nodes in an Azure Cosmos DB cluster deployed in an Azure region. Azure Cosmos DB automatically mitigates replica outages by guaranteeing at least three replicas of your data in each Azure region for your account within a four replica quorum.
In many Azure regions, it is possible to distribute your Azure Cosmos DB cluster across availability zones, which results in increased SLAs, as availability zones are physically separate and provide distinct power sources, networks, and cooling. See Availability Zones. When an Azure Cosmos DB account is deployed using availability zones, Azure Cosmos DB provides RTO = 0 and RPO = 0 even during a zone outage.
Select 'Replicate data globally' under the 'Settings' section in the left pane. It shows all the available regions for Cosmos DB deployment. The Availability Zone option for the write region can be enabled at the time of account creation.
Select "+ Add region" to add a read region. Check the box for 'Availability Zone'. No save action is needed for this lab.
Region outages refer to outages that affect all Azure Cosmos DB nodes in an Azure region, across all availability zones. In the rare case of a region outage, Azure Cosmos DB can be configured to support various outcomes of durability and availability.
To protect against complete data loss that may result from catastrophic disasters in a region, Azure Cosmos DB provides continuous and periodic backup modes.
Service-managed failover allows Azure Cosmos DB to fail over the write region of a multi-region account. Region failovers are detected and handled by Azure and do not require any changes from the application.
Select the "Service-Managed Failover" option to fail over the database to the read region at the time of a region outage.
Select the "On" button under "Enable Service-Managed Failover".
No action is needed for this lab. It will take time to enable the failover option.
Autoscale allows you to scale the throughput (RU/s) of your database or container automatically and instantly. The throughput is scaled based on usage, without impacting the availability, latency, throughput, or performance of the workload.
Autoscale provisioned throughput is well suited for mission-critical workloads that have variable or unpredictable traffic patterns, and require SLAs on high performance and scale.
Select 'Data Explorer' from the left pane and expand 'DedicatedThroughputDB' database.
Expand the GoodFellasHotel container. Select Scale & Settings.
Change Max RU/s back to '2000' and select save button on the top bar.
It will change the throughput instantly without impacting the current workloads.
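The same change can also be made programmatically; here is a hedged .NET SDK sketch, where the endpoint and key are placeholders.

```csharp
using System;
using Microsoft.Azure.Cosmos;

// Sketch: read and then replace a container's autoscale max RU/s,
// equivalent to the portal steps above. <account-endpoint>/<account-key> are placeholders.
CosmosClient client = new CosmosClient("<account-endpoint>", "<account-key>");
Container container = client.GetContainer("DedicatedThroughputDB", "GoodFellasHotel");

ThroughputResponse current = await container.ReadThroughputAsync(requestOptions: null);
Console.WriteLine($"Current autoscale max: {current.Resource.AutoscaleMaxThroughput} RU/s");

// Applies instantly, without taking the container offline.
await container.ReplaceThroughputAsync(ThroughputProperties.CreateAutoscaleThroughput(2000));
```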
Select 'Data Explorer' from the left pane and expand the 'SharedThroughputDB' database. Hover over the 'CasinoHotel' container and select the three dots. It provides options to create a SQL query, stored procedure, UDF, and triggers. Select the 'New SQL Query' option.
Type the following Query:
SELECT * FROM c where c.type='Reservation'
select "Query Stats" tab and check the Query execution time. It shows the sub millisecond response time.
Cosmos DB is a developer-friendly database and supports SaaS applications with no schema or indexes to manage. It also provides a built-in cache for improved performance, and the Azure Cosmos DB Emulator lets you build applications against Cosmos DB in your development environment at no cost.
You can test building an application from the Cosmos DB service portal itself. Select 'Quick start' from the left panel. It gives you programming language options (.NET, Xamarin, Java, Node.js, and Python) to choose from.
Use the default .NET option.
5.1 Select create 'Items' container button.
It creates an Items container in the "ToDoList" database with 400 RU/s of throughput capacity.
5.2 Select Download button to download .NET app to your laptop.
5.3 Extract all files from 'DocumentDB-Quickstart-DotNet.zip' and open "CosmosGettingStarted.sln" in the sql-dotnet folder with Visual Studio 2022.
5.4 Clean and rebuild the solution.
5.5 Put a breakpoint in GetStartedDemoAsync method at
this.ReplaceFamilyItemAsync();
5.6 Run the program in debug mode by selecting the green Start button.
This application creates documents in the Cosmos DB Items container and stops at the breakpoint.
5.7 Go back to Cosmos DB in Azure Portal and verify the data the application has created.
5.8 Come back to Visual Studio and continue the execution by selecting 'Continue' button. It will delete all the items it created in the Cosmos DB database.
5.9 Go back to Cosmos DB in Azure Portal and verify if the application has deleted the database, container and items.
You have successfully built an application that accesses the Cosmos DB service, creates a database and container, and populates it with items. Congratulations!!
The Azure Cosmos DB Emulator provides a local environment that emulates the Azure Cosmos DB service for development purposes. Using the Azure Cosmos DB Emulator, you can develop and test your application locally, without creating an Azure subscription or incurring any costs. When you're satisfied with how your application is working in the Azure Cosmos DB Emulator, you can switch to using an Azure Cosmos DB account in the cloud.
- Currently Windows Server 2016, 2019 or Windows 10 host OS are supported. The host OS with Active Directory enabled is currently not supported.
- 64-bit operating system
- 2-GB RAM
- 10-GB available hard disk space
- Administrative privileges on the computer. The emulator will add a certificate and set firewall rules in order to run its services, so admin rights are necessary for it to perform those operations.
Access Azure Cosmos DB Emulator for local development and testing for more details.
5.10 Download Azure Cosmos DB Emulator
Download and install the latest version of the Azure Cosmos DB Emulator on your local computer.
You will download azure-cosmosdb-emulator-2.14.9-3c8bff92.msi file to your local environment.
Run a Command Prompt window as an administrator and run the installer by entering the full file name at the prompt.
5.11 Installer launches the emulator in a browser with the following screen.
5.12 Copy the URI and Primary key to a notepad.
5.13 Open the Visual Studio Quick Start application and use Solution Explorer to navigate to App.config. Update EndpointUri and PrimaryKey with the values you copied from the Cosmos DB Emulator.
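If you prefer to construct the client in code rather than via App.config, a minimal sketch follows; https://localhost:8081/ is the emulator's default endpoint, the key placeholder stands in for the primary key you copied in step 5.12, and the partition key path is an illustrative assumption.

```csharp
using System;
using Microsoft.Azure.Cosmos;

// Sketch: point the same application code at the local emulator instead of a cloud account.
// https://localhost:8081/ is the emulator's default endpoint; paste the primary key you
// copied from the emulator in place of <emulator-primary-key>.
CosmosClient emulatorClient = new CosmosClient(
    "https://localhost:8081/",
    "<emulator-primary-key>");

Database db = await emulatorClient.CreateDatabaseIfNotExistsAsync("ToDoList");
// "/partitionKey" is an illustrative partition key path, not necessarily the quickstart's.
Container items = await db.CreateContainerIfNotExistsAsync("Items", "/partitionKey");
Console.WriteLine("Connected to the local Cosmos DB Emulator.");
```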
5.14 Execute the application from Visual Studio with a breakpoint
5.15 Verify the data using the Explorer in the local Cosmos DB emulator tool
5.16 You can also run SQL queries to analyze the data using the SQL Query window similar to the tool you used with Azure Cosmos DB Service.
This is the best way to estimate your query costs and optimize your queries to save costs.
You have built a local environment for developing applications on Cosmos DB at no cost until you are ready to deploy to Azure. Congratulations!!
This workshop provided you hands-on experience modeling data in Cosmos DB and optimizing throughput for small, medium, and enterprise customers. It also showed you how to build applications using Cosmos DB in your development environment at no cost.
Please send your feedback and suggestions to us!!