This article is part of the series Learn NoSQL in Azure, in which we explore Azure Cosmos DB, a non-relational database service used widely for a variety of applications. Azure Cosmos DB is Microsoft's managed NoSQL database service on Azure; it is highly scalable and can be distributed across the regions in which Azure operates. It is offered as a platform as a service (PaaS), and you can build databases with very high throughput and very low latency. Using Azure Cosmos DB, customers can replicate their data across multiple regions around the globe and also across multiple locations within the same region. This makes Cosmos DB a highly available database service, with availability of up to 99.999% for reads and writes on multi-region accounts and up to 99.99% on single-region accounts.
In this article, we will focus on how Azure Cosmos DB works behind the scenes and how you can get started with it using the Azure Portal. We will also explore how Cosmos DB is priced and look at the pricing model in detail.
How Azure Cosmos DB works
As already mentioned, Azure Cosmos DB is a multi-model NoSQL database service that is geographically distributed across multiple Azure regions. Customers can deploy their databases in several regions around the globe, which reduces read latency because users are served from a region close to them.
Figure 1 – Azure Cosmos DB Global Distribution (Source)
As you can see in the figure above, Azure Cosmos DB is distributed across the globe. Let's suppose you have a web application that is hosted in India. In that case, the NoSQL database in India acts as the primary database for writes, and the databases in all other regions act as read replicas. Whenever new data is generated, it is written to the database in India first and then synchronized to the other databases.
When data is maintained across multiple regions, the most common challenge is the delay before newly written data becomes available in the other regions. For example, when data is written to the database in India, users in India will see that data sooner than users in the US, because synchronization between the two regions takes time. To manage this, customers can choose how soon written data must be visible in the other regions. Azure Cosmos DB offers five levels of consistency, which are as follows:
- Strong
- Bounded staleness
- Session
- Consistent prefix
- Eventual
Figure 2 – Consistency Levels in Azure Cosmos DB (Source)
Many NoSQL databases offer only two levels, Strong and Eventual. Strong is the most consistent level, while Eventual is the least. As we move from Strong to Eventual, consistency decreases but availability and throughput increase. This is a trade-off that customers need to make based on the criticality of their applications. If you want to read about the consistency levels in more detail, the official guide from Microsoft is the easiest to understand. You can refer to it here.
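To make the trade-off concrete, here is a minimal sketch of how three of the levels differ when a read replica lags behind the write region. It is a toy model written for this article, not the Cosmos DB API: the `ToyStore` class, its methods, and the `max_lag` parameter are all hypothetical, and staleness is counted in number of writes only.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Hypothetical single-write-region store; NOT the Cosmos DB API."""
    log: list = field(default_factory=list)  # ordered writes at the primary
    applied: int = 0                         # writes the replica has applied so far

    def write(self, value):
        # The primary region accepts the write first.
        self.log.append(value)

    def replicate(self, count=1):
        # Ship up to `count` pending writes to the replica, in order.
        self.applied = min(len(self.log), self.applied + count)

    def read(self, level, max_lag=1):
        """Return the newest value visible at the given consistency level."""
        if level == "strong":
            visible = len(self.log)          # always the latest committed write
        elif level == "bounded_staleness":
            # The read may be stale, but by at most `max_lag` writes.
            visible = max(self.applied, len(self.log) - max_lag)
        elif level == "eventual":
            visible = self.applied           # whatever has reached the replica
        else:
            raise ValueError(f"unknown level: {level}")
        return self.log[visible - 1] if visible else None


store = ToyStore()
for value in ["v1", "v2", "v3"]:
    store.write(value)
store.replicate(1)  # only the first write has reached the replica

print(store.read("strong"))             # v3
print(store.read("bounded_staleness"))  # v2
print(store.read("eventual"))           # v1
```

The stronger the level, the fresher the value returned, but a real service has to coordinate across regions to provide it, which is where the cost in latency and throughput comes from.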
Azure Cosmos DB Pricing Model
Now that we have some idea of how Azure Cosmos DB works, let us try to understand how the database is priced. When working with any cloud-based service, it is essential to know how the service is charged; otherwise, you might end up paying much more than you expected.
If you browse to the pricing page of Azure Cosmos DB, you can see that the database service is billed on two components.
- Database Operations – Whenever you run queries against your NoSQL database, some resources are used. Azure measures this usage in Request Units (RUs). The number of RUs consumed per second is aggregated and billed
- Consumed Storage – The data you store in your database takes up space. This storage is billed at the standard rate for SSD-based storage in any Azure region
Let’s learn about this in more detail.
When we interact with a database, we send requests to the database service to read or write data. Each of these operations is billed, and the unit in which the charge is calculated is the Request Unit (RU).
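As a rough illustration of how RUs add up, the sketch below estimates the RU/s a workload needs from its mix of operations. The per-operation charges are assumptions: Microsoft documents a point read of a 1 KB item as costing 1 RU, but the write and query figures here are placeholders, as is the 100 RU/s rounding increment. In practice you would use the request charge that the service reports for your own operations.

```python
import math

# Assumed per-operation RU charges. Only the 1 KB point read (1 RU) comes from
# Microsoft's documentation; the other figures are illustrative placeholders.
ASSUMED_RU_COST = {
    "point_read_1kb": 1.0,
    "write_1kb": 5.0,   # placeholder
    "query": 10.0,      # placeholder; real query cost varies widely
}


def required_ru_per_second(ops_per_second):
    """Sum the RU/s needed for a mix of operations (name -> operations per second)."""
    return sum(ASSUMED_RU_COST[op] * rate for op, rate in ops_per_second.items())


def round_up_to_increment(ru_per_second, increment=100):
    """Round a throughput figure up to the next multiple of `increment` RU/s."""
    return math.ceil(ru_per_second / increment) * increment


# Hypothetical workload: 200 reads, 20 writes and 5 queries per second.
workload = {"point_read_1kb": 200, "write_1kb": 20, "query": 5}
needed = required_ru_per_second(workload)

print(needed)                         # 350.0
print(round_up_to_increment(needed))  # 400
```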
Figure 3 – Request Unit Calculation for Azure Cosmos DB (Source)
The above figure gives a high-level idea of how Request Units are consumed by the operations performed on the database. There are three modes in which you can be billed while using Azure Cosmos DB, which are as follows.
- Provisioned throughput mode – In this mode, you provision a fixed number of RUs per second, which you can raise or lower in increments of 100 RU/s. You are charged on a per-hour basis for the highest throughput provisioned in that hour, whether or not your application consumes it. This is helpful if your databases receive consistent traffic all the time
- Autoscale mode – If your traffic is inconsistent, the provisioned throughput mode is not ideal, as you pay for resources that remain unutilized for much of the time. In the Autoscale mode, you set a maximum RU/s and the service instantly scales the RUs up and down below that maximum based on the requests being made to the database. This can reduce the bill considerably for workloads with variable traffic
- Serverless mode – This is a newly added pricing model for Azure Cosmos DB in which you do not need to provision any RUs up front. At the end of each monthly billing cycle, you receive a bill based on the number of RUs consumed in that month
Figure 4 – Pricing Detail for Azure Cosmos DB
In the above figure, you can see how the provisioning was done (red line) in order to keep the service ready and available. In the provisioned throughput mode, some of the resources will always be underutilized, as the number of RUs consumed varies from time to time. In the Autoscale mode, you define the maximum RU/s, and the service automatically scales the RUs between a lower bound and that maximum based on the traffic. The serverless mode is the latest addition to the billing modes, and its charge is calculated from the number of RUs consumed in the month.
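The difference between the three modes is easier to see with numbers. The following is a minimal sketch that bills the same day of traffic under each mode. All unit prices, the autoscale multiplier, and the 10% autoscale floor are assumptions made for illustration; the function names are hypothetical, and the real rates depend on region and configuration, so check the pricing page before relying on any figure.

```python
# Assumed unit prices, for illustration only; check the pricing page for real rates.
PRICE_PER_100_RU_HOUR = 0.008   # provisioned throughput, per 100 RU/s per hour
AUTOSCALE_MULTIPLIER = 1.5      # assumed premium on autoscale RU/s
PRICE_PER_MILLION_RU = 0.25     # serverless, per million RUs consumed


def provisioned_cost(hourly_peak_ru_s, provisioned_ru_s):
    """Pay for the provisioned RU/s in every hour, regardless of actual usage."""
    return len(hourly_peak_ru_s) * provisioned_ru_s / 100 * PRICE_PER_100_RU_HOUR


def autoscale_cost(hourly_peak_ru_s, max_ru_s, floor_fraction=0.1):
    """Pay each hour for the highest RU/s reached, never below an assumed floor."""
    floor = max_ru_s * floor_fraction
    total = 0.0
    for peak in hourly_peak_ru_s:
        billed = min(max(peak, floor), max_ru_s)
        total += billed / 100 * PRICE_PER_100_RU_HOUR * AUTOSCALE_MULTIPLIER
    return total


def serverless_cost(total_ru_consumed):
    """Pay only for the RUs actually consumed over the billing period."""
    return total_ru_consumed / 1_000_000 * PRICE_PER_MILLION_RU


# Hypothetical day: 20 quiet hours peaking at 100 RU/s, 4 busy hours at 1000 RU/s.
peaks = [100] * 20 + [1000] * 4
# Hypothetical total consumption: 50 RU/s on average when quiet, 600 RU/s when busy.
consumed = 20 * 3600 * 50 + 4 * 3600 * 600

print(round(provisioned_cost(peaks, provisioned_ru_s=1000), 2))  # 1.92
print(round(autoscale_cost(peaks, max_ru_s=1000), 2))            # 0.72
print(round(serverless_cost(consumed), 2))                       # 3.06
```

With these assumed numbers, autoscale is cheapest because the traffic is spiky, while serverless costs more because the workload is busy enough to consume many RUs. A different traffic pattern or price list can reverse that ordering.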
Getting started with Azure Cosmos DB using the Azure Portal
Now that you have some idea about how Azure Cosmos DB works and the pricing model as well, let us take a look at the Azure portal and get started with it.
Navigate to https://portal.azure.com and log in with your account credentials. Go to the Databases section and select Azure Cosmos DB.
Figure 5 – Getting started with Azure Cosmos DB
You will be taken to the "Create Azure Cosmos DB account" page. We will learn more about it in my upcoming articles, so stay tuned!
In this article, we continued talking about NoSQL in Azure. Azure, as a leading cloud service provider, offers Azure Cosmos DB, a database as a service that is multi-model, meaning that different types of database APIs can be used within the Cosmos DB service. Cosmos DB supports a wide range of applications and has a rich set of features that make it highly scalable. Because of this, Cosmos DB is mostly used in applications that require very low latency and a high volume of reads. Additionally, we have seen how Azure Cosmos DB is priced and what the billing units are. In the upcoming articles in the series, I will provide more information about each of the database APIs, along with hands-on walkthroughs in the Azure portal.
Table of contents
- Learn NoSQL in Azure: An overview of Azure Cosmos DB
- Learn NoSQL in Azure: Diving Deeper into Azure Cosmos DB
- Learn NoSQL in Azure: Getting started with DocumentDB SQL API