Daniel Calbimonte

What is Azure SQL Cosmos DB?

May 10, 2016 by

Introduction

In the Azure Portal, you will find the option to install Azure SQL Cosmos DB. So, what is Cosmos DB, exactly?

Figure 1. Azure Cosmos DB in the Azure Portal

This article will clarify what Azure SQL Cosmos DB is, how much it costs, how to integrate it with SQL Server, how to scale the database and other related information.

What is Azure SQL Cosmos DB?

It sounds like a Database to store documents, but it is a NoSQL Database used to store non-structured data in Azure. It was created by Microsoft as a high performance non-normalized database. It is flexible, with high availability and automatic scaling.

Cosmos DB, is not used to store documents like MS Words or PDF documents. It is used to store JSON documents.

What is JSON?

JSON (Java Script Object Notation) is a standard object notation used to transmit data objects. It is very popular in Web Services where it is replacing XML. It started on Java Script, but now it is a standard supported in several programming languages.

Is it hard to learn JSON?

No, if you are a programmer, it will be a straightforward to learn JSON, because the object notation is very easy. If you are a DBA, it might be a little bit more difficult. In SQL Server 2016, JSON support is included, so, it looks like it is something that a DBA needs to eventually learn. In SQL Server if you are familiar with XML, using JSON will be very easy because the sentences are similar.

JSON syntax example

Here is an example for learning JSON:

It is not hard.. The customer in the sample is the object and the firstName, lastName and email are the names with their values. The syntax is easy to read and very flexible. The data is separated by commas. Curly braces contain the objects and the square brackets contain arrays.

If you have worked with C# and Java, you know that it not so easy to parse elements from XML (at least not the first time). In JSON it is easier. That is why XML is dying and JSON is replacing it (however, in SQL Server using JSON is not easier than XML).

JSON is very common also in Big Data (Azure HDInsight for example).

Figure 2. The JSON logo

Q: OK, everything is stored in JSON documents, but how can I query the Azure SQL Cosmos DB data?

A: You can use traditional SQL. It may sound weird because Cosmos DB is for non-structured data, but Microsoft tried to make our life easier. That is why you can easily learn how to query Cosmos DB.

Q: Cosmos DB is not related to Azure SQL Server. In which scenarios should I use Azure SQL Server and in which scenarios Cosmos DB?

A: Cosmos DB is used specially in gaming, Social media and IoT (Internet Of Things) applications. If you have an application and it need to scale around the world and you have millions of users, Cosmos DB in Azure is a simple and inexpensive solution. Azure SQL for several millions of users is harder to scale. In general, a traditional Relational Database cannot scale as easy as simple JSON documents. The other problem is the data. Some applications require images and videos. It is really hard to store and handle these types of data in a relational database like Azure SQL.

A SQL Azure database is a great solution when you do not need to scale and deal with millions of users at the same time and the relational model is also a good idea.

What customers are using Azure SQL Cosmos DB?

One important customer is MSN.com, a web portal with half a billion monthly users and 20 TB of document storage.

Next games, a gaming company required a low latency solution to store for millions of users. They required to store millions of profiles from Facebook, Google, GameCenter and store the scores, chat messages, challenges, etc.

Halo 5, a popular XBOX game with more than 65 millions of users.

As you can see, usually we talk about several millions of users, low latency and a scalable solution.

Sounds fine. What is the price?

There are two ways:

Pre-defined and User defined billing.

Pre-defined performance billing that depended on the data storage and the request units.

A S1 collection with 10 GB of storage and 250 request units per second costs $25 per month and a S3 with the same storage and 2500 request units per second costs $100 per month.

The User-defined performance billing offers a storage of $0.25 per GB per month and each 100 request units reserved per $6 per month.

What are the request units?

A Request Unit is the throughput to get 1 KB document.

What is a collection?

It is a container of JSON documents. You can have several collections in the same database. The size of a collection is unlimited because it can be compound of multiple disks from multiple servers.

Figure 3. A Cosmos DB collection

How can I migrate from SQL Server to Azure SQL Cosmos DB?

You can use the Cosmos DB Data Migration Tool. It is an open source to import data from SQL Server and other sources like CSV files, MongoDB, Azure Table Storage and other sources.

How does the Cosmos DB scale?

The collections can be stored in multiple partitions. The information is stored in multiple solid state disks. If you need more disks, the database grows. The scaling is unlimited. For the partition it is used a partition key. The number specifies the partition used.

How can I retrieve the Azure SQL Cosmos DB information?

Cosmos DB has a REST API that allows it to be called by any language capable of handling HTTP and HTTPS requests. There are libraries for Java, .NET, Python.

Conclusion

Azure SQL Cosmos DB is a new NoSQL database offered in Azure. It is a simple and inexpensive solution in the cloud used specially in Online Games, IoT applications and Social Media and other application with millions of users. It is possible to migrate data from SQL Server to Cosmos DB. The information is stored in JSON documents.

The use of the NoSQL databases is growing because of the volume of data required to store them.

Daniel Calbimonte

Daniel Calbimonte

Daniel Calbimonte is a Microsoft Most Valuable Professional, Microsoft Certified Trainer and Microsoft Certified IT Professional for SQL Server. He is an accomplished SSIS author, teacher at IT Academies and has over 13 years of experience working with different databases.

He has worked for the government, oil companies, web sites, magazines and universities around the world. Daniel also regularly speaks at SQL Servers conferences and blogs. He writes SQL Server training materials for certification exams.

He also helps with translating SQLShack articles to Spanish

View all posts by Daniel Calbimonte
Daniel Calbimonte

Latest posts by Daniel Calbimonte (see all)

168 Views