Continuous Backups in Azure Cosmos DB

December 20, 2021 by Gauri Mahajan

In this article, we will learn what continuous backup is and how to configure it on an Azure Cosmos DB account.

Introduction

Databases come in a variety of flavors, such as relational, NoSQL, document, and graph databases. Some databases expose several APIs and/or driver interfaces over the same underlying database; these are called multi-model databases. Azure Cosmos DB is one of them, and the API is selected when an Azure Cosmos DB account is created. Because it can store and serve data in a variety of formats, a Cosmos DB account often hosts vast amounts of data, and with a large volume of data comes the need to back it up. Some use cases need data backed up at regular intervals, while others need a continuous backup so that the data can be restored to any point in time, either to recover it or to extract a historic version for a specific purpose. Azure Cosmos DB recently introduced a continuous backup option, which is now generally available.

Continuous Backup

To follow the exercise in this article, we need an Azure account with administrative privileges to operate the Azure Cosmos DB service. It is assumed that such an account is already available and ready for use. The continuous backup option is not available for every type of Cosmos DB account and works only under certain configurations. The focus of this exercise is to understand the use cases and options under which continuous backup works in Cosmos DB.

There are four restrictions (as of the writing of this article) to keep in mind when considering continuous backup.

  • The account type should be SQL API or API for MongoDB only
  • The account should not have multi-region writes enabled on it
  • The account should be using service-managed keys, as customer-managed keys are not supported yet
  • The analytical store, i.e., the HTAP option, should not be enabled, as continuous backup does not support backing up the analytical store data
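
The four restrictions above can be expressed as a simple pre-flight check. The following is a minimal sketch, not part of any Azure SDK: the `AccountPlan` class, its field names, and the `continuous_backup_blockers` function are hypothetical names introduced only for illustration, and the rules reflect the restrictions as listed in this article at the time of writing.

```python
from dataclasses import dataclass
from typing import List

# API kinds that this article lists as supported for continuous backup.
SUPPORTED_APIS = {"sql", "mongodb"}


@dataclass
class AccountPlan:
    """Hypothetical description of a planned account (not an Azure SDK type)."""
    api: str                    # e.g. "sql", "mongodb", "cassandra"
    multi_region_writes: bool   # True if multi-region writes would be enabled
    customer_managed_key: bool  # True if a customer-managed key would be used
    analytical_store: bool      # True if the analytical store would be enabled


def continuous_backup_blockers(plan: AccountPlan) -> List[str]:
    """Return the restrictions the plan violates; an empty list means eligible."""
    blockers = []
    if plan.api.lower() not in SUPPORTED_APIS:
        blockers.append("API must be SQL API or API for MongoDB")
    if plan.multi_region_writes:
        blockers.append("multi-region writes must be disabled")
    if plan.customer_managed_key:
        blockers.append("customer-managed keys are not supported")
    if plan.analytical_store:
        blockers.append("analytical store must be disabled")
    return blockers


if __name__ == "__main__":
    plan = AccountPlan(api="sql", multi_region_writes=True,
                       customer_managed_key=False, analytical_store=False)
    for blocker in continuous_backup_blockers(plan):
        print("Blocked:", blocker)
```

A check like this could run before provisioning so that an unsupported combination is caught early rather than in the portal's validation step.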

Use cases that cannot fit within the above restrictions should not consider continuous backup in Azure Cosmos DB for as long as these restrictions exist. Cosmos DB continuously backs up data from multiple regions into locally redundant storage blobs without consuming the provisioned capacity of the database. The below diagram from the Microsoft Azure Cosmos DB documentation shows an illustrative view of continuous backup.

Continuous Backup

Now that we understand the concept of continuous backup, let's learn how to create an Azure Cosmos DB account with continuous backup. Navigate to the Cosmos DB dashboard and click the Create button to open the new account creation wizard. Select the Core (SQL) API, as we will be using it in this exercise. This brings up a page as shown below. Start with the basic details: subscription name, resource group, new account name, and the location in which the account will be hosted. For the capacity mode, we can choose either provisioned throughput or serverless; continuous backup is supported in both cases. Once done, move to the next step.

Core SQL Account

In this step, we need to configure the global distribution-related settings. The Geo-Redundancy setting can be enabled, as we may want the data to be geo-redundant for high availability and durability. We can also enable availability zones. However, if we enable multi-region writes, we won't be able to use continuous backup, as that violates the restrictions laid out above.

Global Distribution Settings

In this step, we need to configure the networking settings. By default, the “All networks” option is selected which will work for our use case. For production environments, one would typically use a private endpoint, but for lower environments, one can use all networks as the starting point.

Networking Settings

In this step, we need to configure the backup policy. The periodic option works for cases where we need to back up the data at scheduled intervals. For use cases that need continuous backup, select the continuous option as shown below.

Backup Policy

In this step, we need to select the data encryption-related settings. Continuous backup requires a service-managed key because customer-managed keys are not supported, so we will continue with the default option of a service-managed key.

Encryption Settings

We can optionally add tags to this account if required. When using a specific configuration or a new feature like continuous backup, it can be helpful to record that in tags so that administrators can easily identify how the account is set up.

Tags

In the final step, the wizard validates the configuration. If the settings were chosen as explained above, a validation success message appears on the review page. Click the Create button to create an Azure Cosmos DB account with continuous backup.

Configuration Review

Once the account is created, navigate to the account's features page, which confirms that continuous backup has been successfully configured and turned on for the account.
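
The same confirmation can be done outside the portal by inspecting the account's resource JSON. The sketch below assumes the account has already been fetched in the Azure Resource Manager resource shape, where the backup policy sits under `properties.backupPolicy.type`. The sample document, the account name in it, and the helper names `backup_mode` and `has_continuous_backup` are illustrative assumptions, not output captured from a real account.

```python
import json

# Illustrative, trimmed-down account resource; not real output.
SAMPLE_RESPONSE = """
{
  "name": "example-account",
  "properties": {
    "backupPolicy": {"type": "Continuous"},
    "enableMultipleWriteLocations": false
  }
}
"""


def backup_mode(account: dict) -> str:
    """Return the backup policy type, or "Unknown" if it is not present."""
    policy = account.get("properties", {}).get("backupPolicy", {})
    return policy.get("type", "Unknown")


def has_continuous_backup(account: dict) -> bool:
    """True when the account resource reports a continuous backup policy."""
    return backup_mode(account).lower() == "continuous"


if __name__ == "__main__":
    account = json.loads(SAMPLE_RESPONSE)
    print("Backup mode:", backup_mode(account))
    print("Continuous backup enabled:", has_continuous_backup(account))
```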

Cosmos DB Features

In this way, we can create a Cosmos DB account with continuous backup.
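
For readers who prefer infrastructure as code over the portal, the choices made in the wizard can be sketched as an ARM-style resource definition. This is a hedged illustration rather than a verified deployment template: the function name `build_account_definition`, the account name, the region, the tag, and the API version passed in the example are all assumptions, and the property names should be checked against the current `Microsoft.DocumentDB/databaseAccounts` template reference before use.

```python
import json
from typing import Dict, Optional


def build_account_definition(name: str, location: str, api_version: str,
                             serverless: bool = False,
                             tags: Optional[Dict[str, str]] = None) -> dict:
    """Build an ARM-style definition for a Core (SQL) API account
    that mirrors the settings chosen in the wizard."""
    properties = {
        "databaseAccountOfferType": "Standard",
        "locations": [{"locationName": location, "failoverPriority": 0}],
        # Multi-region writes and the analytical store stay off,
        # as required by the continuous backup restrictions.
        "enableMultipleWriteLocations": False,
        "enableAnalyticalStorage": False,
        "backupPolicy": {"type": "Continuous"},
        # No key vault key is set, so the service-managed key is used.
    }
    if serverless:
        properties["capabilities"] = [{"name": "EnableServerless"}]
    return {
        "type": "Microsoft.DocumentDB/databaseAccounts",
        "apiVersion": api_version,
        "name": name,
        "location": location,
        "kind": "GlobalDocumentDB",  # Core (SQL) API account kind
        "tags": dict(tags or {}),
        "properties": properties,
    }


if __name__ == "__main__":
    definition = build_account_definition(
        name="example-account",    # hypothetical account name
        location="West US",        # hypothetical region
        api_version="2021-10-15",  # assumed API version; verify before use
        tags={"backup": "continuous"},
    )
    print(json.dumps(definition, indent=2))
```

Keeping the restricted settings explicit in the definition makes it easier to see at review time why the account qualifies for continuous backup.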

Conclusion

In this article, we learned the concept of continuous backup in Cosmos DB, the use-cases and restrictions under which it works, and the settings we need to configure to create a Cosmos DB account with continuous backup.

About Gauri Mahajan

Gauri is a SQL Server professional with 6+ years of experience working with global multinational consulting and technology organizations. She is very passionate about working on SQL Server topics like Azure SQL Database, SQL Server Reporting Services, R, Python, Power BI, Database engine, etc. She has years of experience in technical documentation and is fond of technology authoring. She has deep experience in designing data and analytics solutions and ensuring their stability, reliability, and performance. She is also certified in SQL Server and has passed certifications like 70-463: Implementing Data Warehouses with Microsoft SQL Server.