Graph database implementation with Azure Cosmos DB using the API

November 5, 2018 by Prashanth Jayaram

In my previous article, I discussed the Graph database implementation with SQL Server 2017 in detail. In this article, we’ll walk through Graph API integration with Azure Cosmos DB.

Before we jump into the concepts, though, let’s take a high-level look at NoSQL databases. A NoSQL database is designed to be distributed from the start, so no extra effort is needed to distribute it.

Note: In one of my previous articles, I talked about the differences between SQL and NoSQL. I would recommend reading it for a better understanding of NoSQL concepts.

What is Azure Cosmos DB?

Having explained the basic characteristics of a NoSQL database, we can now take a look at what Azure Cosmos DB is all about. In short, it is an extension of DocumentDB, which has been Microsoft’s NoSQL document database running on Microsoft Azure since 2015. Cosmos DB’s capabilities started with DocumentDB, which was already delivering low latency and high availability for schema-free JSON documents.

Azure Cosmos DB is Microsoft’s globally distributed, fully managed, multi-model database service. It offers global distribution of data, elastic scaling, automatic tuning, and querying capabilities, and it also supports the Gremlin standard. You can quickly create and query document, key/value, and graph databases, all of which benefit from the global distribution and horizontal scale capabilities at the core of Azure Cosmos DB. It also provides the ability to use multiple models, such as document and graph, over the same data. For example, you can use a document collection to store graph data side by side with documents, and use both SQL queries over JSON and Gremlin queries to query the collection.

Prerequisites

  1. An Azure subscription (if you don’t have one, create a free Azure subscription account)
  2. Basic knowledge of Graph databases
  3. Basic knowledge of Gremlin queries
  4. Basic understanding of JSON documents

Getting started

Let’s jump in and get started.

The following step-by-step details give you the information required to understand the design concepts of Azure Cosmos DB. In this section, let us take a look at the ways to create, query, and traverse Graph database models.

  1. To log in to the Azure portal, browse to portal.azure.com and enter the required credentials
  2. To create the Azure Cosmos DB account, click the + New button in the left pane and type the search string “Cosmos” to look up the Azure Cosmos DB component

  3. Click the Create button at the bottom of the screen

  4. In the Azure Cosmos DB New account form, enter the required configuration for the Azure Cosmos DB account

      The supported programming models are:
    • Key-Value
    • Column family
    • Documents
    • Graph
  5. In the New Account page, enter the settings for the new Azure Cosmos DB account

    Each setting is listed below with the value to enter and a short description:

    • ID: a unique name. The unique name graphdemo is used to identify this Azure Cosmos DB account. The URI, graphdemodb.documents.azure.com, is the new unique identifier.
    • API: Gremlin (graph). The Graph API, Gremlin (graph), is selected from the suite of five APIs.
    • Subscription: your subscription. Select the Azure subscription to use for this Azure Cosmos DB account. In this case, a pay-as-you-go subscription is selected.
    • Resource group: enter the resource group. Enter a new or existing resource group. In this case, a new resource group called graphresource is created.
    • Location: select the nearest location. Use the closest location to get the best performance when accessing the data.
    • Enable geo-redundancy: check or un-check. Check this option to create a replicated version of the database in a second (paired) region.
    • Pin-to-dashboard: select. This option adds the database to the dashboard for easy access.

  6. Click Create


  7. The account creation step may take a few minutes

  8. Use the Data Explorer tool in the Azure portal to create a Graph database. Select the Quick start option and click Create ‘Persons’ container

  9. Follow the steps below to create and query vertices in a Graph database under the ‘Persons’ container. Vertices can be created in two ways:
    • The first is using the GUI. Click the New Vertex button; this opens a form to enter the properties of the vertex. Once done, click the OK button to create the vertex. The interface translates the entered values into Gremlin commands, which you can see in the message pane at the bottom

    • The second is using a Gremlin command. Enter the following command in the Query filter section and click Apply filter to run the Gremlin query

      g.addV('person').property('firstName', 'Prashanth').property('lastName', 'Jayaram').property('age', 35).property('skillset', 'SQL')

      You can run the g.V().count() command to count all the vertices

    • Next, let’s create two more vertices: Brian Lockwood with SQL as his skillset and Samir Behara with .Net as his skillset
      1. g.addV('person').property('firstName', 'Brian').property('lastName', 'Lockwood').property('age', 50).property('skillset', 'SQL')
      2. g.addV('person').property('firstName', 'Samir').property('lastName', 'Behara').property('age', 35).property('skillset', '.Net')
    • Let’s run a simple command to list all the vertices. Type the following g.V() and click Apply filter

    • Let’s try a simple query that lists only the person whose first name is Prashanth. You can also view the JSON document by selecting the JSON tab

      g.V().hasLabel('person').has('firstName', 'Prashanth')

    • To get the count of vertices in the graph, use the simple g.V().count() command
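If you have many vertices to add, typing each command by hand gets tedious. Here is a minimal Python sketch that builds the same addV commands from plain dictionaries. The helper names gremlin_literal and add_vertex_query are my own for illustration and are not part of any Cosmos DB SDK; the sketch only produces the query text, which you would still paste into the Query filter box or send with a Gremlin client.

```python
def gremlin_literal(value):
    """Render a Python value as a Gremlin literal (strings quoted, numbers bare)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    # Escape backslashes and single quotes so the quoted string stays well-formed.
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def add_vertex_query(label, properties):
    """Build a g.addV(...) command with one .property(...) step per entry."""
    steps = [f"g.addV({gremlin_literal(label)})"]
    for key, value in properties.items():
        steps.append(f".property({gremlin_literal(key)}, {gremlin_literal(value)})")
    return "".join(steps)


# The three persons used in this walk-through.
people = [
    {"firstName": "Prashanth", "lastName": "Jayaram", "age": 35, "skillset": "SQL"},
    {"firstName": "Brian", "lastName": "Lockwood", "age": 50, "skillset": "SQL"},
    {"firstName": "Samir", "lastName": "Behara", "age": 35, "skillset": ".Net"},
]

queries = [add_vertex_query("person", person) for person in people]
for query in queries:
    print(query)
```

Each printed line matches one of the addV commands shown above, so the output can be compared against them directly.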

  10. Steps to create edges in a graph database. This section outlines the steps that define the relationships between the vertices. The relationship is “knows”. Let’s define a relationship in which Brian knows Prashanth and Samir, using the firstName property of the person label

    g.V().hasLabel('person').has('firstName', 'Brian').addE('knows').to(g.V().hasLabel('person').has('firstName', 'Prashanth'))

    g.V().hasLabel('person').has('firstName', 'Brian').addE('knows').to(g.V().hasLabel('person').has('firstName', 'Samir'))

    On clicking each vertex, we can see the relationship between all three

    g.V().hasLabel('person').has('firstName', 'Brian')
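To make the meaning of the two “knows” edges concrete, here is a small in-memory Python sketch of the same three vertices and two edges. It is a plain-Python model, not the Cosmos DB API: the functions add_edge and out are hypothetical helpers of mine, with out loosely mirroring the Gremlin out('knows') step, which is not used elsewhere in this article.

```python
# Vertices keyed by firstName; edges are (source, label, target) tuples.
vertices = {
    "Prashanth": {"label": "person", "skillset": "SQL"},
    "Brian": {"label": "person", "skillset": "SQL"},
    "Samir": {"label": "person", "skillset": ".Net"},
}
edges = []


def add_edge(source, label, target):
    """Add a directed edge, like addE(label) from the source .to(target)."""
    if source not in vertices or target not in vertices:
        raise KeyError("both endpoints must exist before adding an edge")
    edges.append((source, label, target))


def out(source, label):
    """Follow outgoing edges with the given label from the source vertex."""
    return [dst for src, lbl, dst in edges if src == source and lbl == label]


add_edge("Brian", "knows", "Prashanth")
add_edge("Brian", "knows", "Samir")

print(out("Brian", "knows"))      # ['Prashanth', 'Samir']
print(out("Prashanth", "knows"))  # [] because the edges are directed
```

The second call returns nothing because the edges point from Brian to the other two persons, not the other way around.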

  11. Let’s see an example that updates the vertex Prashanth with an age of 34. This can be done in two ways; the Gremlin command is shown here:

    g.V().hasLabel('person').has('firstName', 'Prashanth').property('age', 34)

    Let us query to find the persons who have SQL as their skillset

    g.V().hasLabel('person').has('skillset', 'SQL')
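The update and the filter above both follow the same pattern: narrow down the vertices with has, then either change a property or return what is left. A rough Python sketch of that pattern over a list of dictionaries is shown here. The helpers has and set_property are hypothetical names that only imitate the Gremlin steps on in-memory data; they do not talk to Cosmos DB.

```python
people = [
    {"label": "person", "firstName": "Prashanth", "age": 35, "skillset": "SQL"},
    {"label": "person", "firstName": "Brian", "age": 50, "skillset": "SQL"},
    {"label": "person", "firstName": "Samir", "age": 35, "skillset": ".Net"},
]


def has(items, key, value):
    """Keep only the items whose property equals the value, like .has(key, value)."""
    return [item for item in items if item.get(key) == value]


def set_property(items, key, value):
    """Set a property on every selected item, like .property(key, value)."""
    for item in items:
        item[key] = value
    return items


# Update: select the person named Prashanth, then set age to 34.
selected = has(has(people, "label", "person"), "firstName", "Prashanth")
set_property(selected, "age", 34)

# Filter: list the persons who have SQL as their skillset.
sql_people = [p["firstName"] for p in has(people, "skillset", "SQL")]

print(people[0]["age"])  # 34
print(sql_people)        # ['Prashanth', 'Brian']
```

The filtered list holds references to the same dictionaries, which is why the update is visible in the original people list.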

    SQL Query

    Now, let’s run a query against the dataset to understand some of the key aspects of Azure Cosmos DB’s SQL query language. For example, the following query returns the documents where the id field matches “1”. The output of the query is a JSON document containing the selected fields.

    To run a query against the Persons collection

    1. Click New Query
    2. Type the following SQL in the query space, which is marked as 2 in the image below

      SELECT {"id": p.id, "Name": p.name, "Age": p.age} FROM Persons p WHERE p.id = "1"

    3. To run the query, press the execute icon, which is marked as 3 in the image below
    4. The output, a JSON document, is displayed in the result pane
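To show what the SELECT clause does with each document, here is a minimal Python sketch that applies the same filter and projection to a couple of sample documents. The sample documents and the function select_person are assumptions of mine for illustration; the field names id, name and age simply follow the query, and the exact shape of the result returned by Cosmos DB may differ from this plain list.

```python
import json

# Hypothetical sample documents; only id, name and age are used by the query.
persons = [
    {"id": "1", "name": "Prashanth", "age": 34, "skillset": "SQL"},
    {"id": "2", "name": "Brian", "age": 50, "skillset": "SQL"},
]


def select_person(documents, wanted_id):
    """Filter on id, then project id, Name and Age as the SELECT clause does."""
    return [
        {"id": doc["id"], "Name": doc.get("name"), "Age": doc.get("age")}
        for doc in documents
        if doc["id"] == wanted_id
    ]


result = select_person(persons, "1")
print(json.dumps(result, indent=2))
```

Note that the skillset field is dropped, because the projection keeps only the three fields named in the SELECT clause.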

    That’s all for now…

    Summary

    Thus far, we’ve covered a lot of ground in this first article. We started with NoSQL and a comparison of SQL and NoSQL. We then introduced Azure Cosmos DB as Microsoft’s distributed NoSQL database. You also saw how to get started using the Azure portal to create a Cosmos DB account and how the Graph database API integrates with Cosmos DB. Then we created the sub-documents in the collection and queried the collection. In my next article, I will include a few complex models to give more insight into Cosmos DB integration. Thanks for reading this article. Feel free to comment below…



About Prashanth Jayaram

I’m a Database technologist with 11+ years of rich, hands-on experience in Database technologies. I am a Microsoft Certified Professional and hold a Master’s degree in Computer Applications. My specialty lies in designing and implementing high availability solutions and cross-platform DB migration. The technologies I currently work on are SQL Server, PowerShell, Oracle and MongoDB.
