Frank Solomon
Creating a new Cosmos DB collection.

Link an Azure Cosmos DB into a SQL Server Stored Procedure

August 3, 2020 by

Introduction

Microsoft Azure offers hundreds of cloud products, with more on the way. In the database space, Microsoft offers Azure Cosmos DB, a NoSQL database product. We can build Java, .Net, etc. applications around Cosmos DB resources. To extend this capability, this article will show how to query Cosmos DB directly from SQL Server.

The SQL Server database

We’ll start with the sample OFFICE_EQUIPMENT_DATABASE database seen in my earlier SQL Shack articles, with modifications. Built-in SQL Server 2014 Standard on a Windows 10 Education edition device, it has three tables, as shown below:

  • OFFICES
  • OFFICE_EQUIPMENT
  • OFFICE_MANAGERS

And this stored procedure, SP_SQLSERVER_COSMOSDB_OFFICE_EQUIPMENT.

Run the first script in SQL Shack article Lever T-SQL to dynamically define duplicate SQL Server database table rows to build the basic OFFICE_EQUIPMENT_DATABASE database. Then, run the below script to build the resources in that database that we’ll need for this article:

We’ll use Azure Cosmos DB to host data that the SQL Server stored procedure will query. Although Azure products can become expensive, depending on data volumes and product use, Azure offers a free trial pricing plan. The data we’ll host will fall within the trial plan limits. This SQL Shack article,Getting Started with Azure Cosmos DB, explains how to get going with it, so we won’t cover that topic in this article. For this article, we’ll need an Azure Cosmos DB resource with firewall and virtual network settings that will accept queries that originate from the internet.

Placing data in the Azure Cosmos DB

The original OFFICE_EQUIPMENT_DATABASE table has two tables that we’ll clone over to Cosmos DB. We’ll focus first on the original OFFICES table. This table has six rows. The data we’ll import needs a JSON format, so first, we must convert the source data. The screenshot below shows the original data:

Source data for a Cosmos DB collection.

This query builds the result set seen in the above screenshot:

In an earlier SQL Shack article, How to import/export JSON data using SQL Server 2016, Marko Zivkovic explained that in SQL Server 2016 and later, the FOR JSON clause export data as JSON. For earlier SQL Server versions, we’ll use the STUFF< function, as shown below:

Every source table column becomes an expression in the SELECT clause. If needed, JSONLint, for example, can validate the output JSON format. In this query, the FOR XML PATH clause places each row into a formatted JSON row, with key/value pairs that match the column/value pairs of the original rows. To get the data ready, the empty (”) value in the FOR XML PATH() clause at line 10 separates each XML row with a default comma. At line 11, the STUFF function arguments format the result set as a string and remove the leading “.” in the original data. Save the finished result XML-format result set as a JSON file. This file will become the data we’ll import.

A Cosmos DB database has zero or more collections, which correspond to SQL Server tables. A collection has zero or more documents, which correspond to SQL Server table rows. In the Cosmos DB

OFFICE_EQUIPMENT_DATABASE database, we’ll create a collection called OFFICES, and then place the OFFICES table rows from the SQL Server OFFICE_EQUIPMENT_DATABASE into that collection. Similarly, create an OFFICE_MANAGERS collection. First, create a database called OFFICE_EQUIPMENT_DATABASE beforehand. This screenshot shows the new database:

A Cosmos DB in the Cosmos DB Data Explorer.

Click OFFICE_EQUIPMENT_DATABASE and click the dots at the right, as shown in this screenshot:

Creating a new Cosmos DB collection.

Drill down to New Container, as shown in this screenshot:

Creating a new Cosmos DB collection.

This opens the Add Container pane. We’ll call the container “OFFICES” and we’ll add OFFICE_ID as the partition key, as seen in this screenshot:

Creating a new Cosmos DB collection.

Click OK. Then, click OFFICES -> Itemsand drill into Upload Items, as seen in this screenshot:

Uploading data to a Cosmos DB collection.

Click Upload Item to open the Upload Items pane, as seen in this screenshot:

Pick a source directory to upload data to a Cosmos DB collection.

Click the circled icon to open a file picker, to upload the JSON format OFFICES table data file we built earlier. Verify that the upload steps worked with the query:

You can see in the below screenshot:

Examine Cosmos DB collection data.

With similar steps, use the SQL Server OFFICE_MANAGERS table to create the OFFICE_MANAGERS collection OFFICE_EQUIPMENT_DATABASE. At that point, we can access both collections in the Data Explorer, as seen in this screenshot:

Access the collections of an Azure Cosmos DB.

Configure the ODBC connection

We need to make the Cosmos DB resource visible to our SQL Server instance. We’ll build a new ODBC connection for this, with a specific Microsoft tool. Starting at this page, download the relevant MSI file.

On that page, Microsoft provides the Microsoft Azure Cosmos DB 64-bit.msi (installs the ODBC Data Sources 64-bit application) and Microsoft Azure Cosmos DB ODBC 32×64-bit.msi (installs the ODBC Data Sources 32-bit application) installation files as downloads for 64-bit Windows devices. This article relied on the second MSI file.

At the Azure portal, select the OFFICE_EQUIPMENT_DATABASE, and in the Settings category, click Keys, as shown in this screenshot:

Keys of a Cosmos DB.

Copy the URI and PRIMARY KEY values, blanked out in the screenshot above. Next, open the ODBC Data Source Administrator tool, and click the System DSN tab, as shown in this screenshot:

Use this tool to build a new ODBC connection to a Cosmos DB.

Click Add to open the Create New Data Source window. Scroll down to Microsoft Azure DocumentDB ODBC Driver and click Finish, as shown in this screenshot:

Pick the correct driver type for the new ODBC connection to the Cosmos DB.

The DocumentDB ODBC Driver DSN Setup window will open, as shown in this screenshot:

Use the Setup window to build the ODBC connection to the Cosmos DB.

We copied values from the Keys page earlier; place the URI value in the Host box, and the PRIMARY KEY value in the Access Key box. Except for the Host textbox value – blanked out in this screenshot – it should have these values as shown:

Place the correct values in the ODBC Driver DSN Setup window.

For this connection, ignore Logging Options, Schema Editor, and Advanced Options; click Test… to test the driver, and finally, click OK. The Administrator tool System DSN tab will list the driver, as shown in this screenshot:

The ODBC connection to the Cosmos DB, tested and ready to go.

Click OK to close the tool. When building the ODBC connection, the first 64-bit MSI file might look like the logical choice to install the tool. However, it could lead to an error message, as shown in this screenshot:

Connecting to a Cosmos DB using an ODBC driver built with the Microsoft Azure Cosmos DB ODBC 64-bit.msi tool could lead to an error.

In that case, try the second 32×64-bit MSI file, and proceed.

Build a Linked Server

In a SQL Server instance, we can use a SQL Server Linked Server to work with a data resource outside of that instance. From the OFFICE_EQUIPMENT SQL Server database that we built, we want to query data in the OFFICE_EQUIPMENT Cosmos DB database. We just built an ODBC driver that can see that database, so we have everything we need to build a Linked Server. First, right-click Linked Servers in Object Explorer, and click New Linked Server… as shown in this screenshot:

Build a SQL Server Linked Server - first step.

The New Linked Server dialog box will open. Fill it in as shown in this screenshot:

Fill in the SQL Server New Linked Server window.

In the first textbox, we named this Linked ServerCOSMOSDB’. In the Provider dropdown, pick the Microsoft OLE DB Provider for ODBC Drivers option. In the Data source textbox, use CosmosDB – the name of the ODBC driver we built above. Click OK. Drill down to see the collections in the SQL Server Object Explorer, as shown in this screenshot:

Cosmos DB collections available through the COSMOSDB Linked Server.

Link an Azure Cosmos DB into a SQL Server Stored Procedure

The OFFICE_EQUIPMENT_DATABASE has one stored procedure, SP_SQLSERVER_COSMOSDB_OFFICE_EQUIPMENT , with this code:

Also seen in the following screenshot:

At line 3, this stored procedure has one integer parameter, @OFFICE_ID, which can have a NULL value. It has two derived tables, one between lines 19 and 26, and one between lines 29 and 35.

The first derived table has a conventional SELECT clause, as shown in this code:

In the FROM clause, the OPENQUERY function supplies the basic set of rows, from the Azure Cosmos DB database, for the query itself. In the OPENQUERY function, the first parameter specifies the name of the linked server we want to use. Here, we’ll use COSMOSDB. The second parameter specifies a query string that we want to execute at the linked server. Lines 23 and 24 list the column names of the originally uploaded data, and we could easily place only the columns used in the SELECT clause in lines 20 and 21. At lines 23 and 24,

SELECT * would also work. However, this would include the following :

  • _attachments
  • _etag
  • _rid
  • _self
  • _ts
  • id

Line 26 aliases the derived table as COE, for “Cosmos Office Equipment.” Line 18 LEFT OUTER JOINs the COE derived table to the OFFICE_EQUIPMENT SQL Server table on the OFFICE_ID column, to catch OFFICE_EQUIPMENT rows that have no OFFICE_ID values. Between lines 29 and 35, the second derived table operates in a similar way. Line 28 LEFT OUTER JOINs the COM, or Azure Cosmos DBCosmos Office Manager” derived table, to the OFFICE_EQUIPMENT SQL Server table on the OFFICE_ID column. This catches OFFICE_EQUIPMENT rows that might have no OFFICE_MANAGER_ID value. At line 37, the WHERE clause uses the @OFFICE_ID parameter to filter the result set at the OFFICE_EQUIPMENT.OFFICE_ID value. If @OFFICE_ID has a NULL value, the WHERE clause filters no rows and returns the entire unfiltered result set. This screenshot shows tests of the SP_SQLSERVER_COSMOSDB_OFFICE_EQUIPMENT stored procedure:

Testing the SP_SQLSERVER_COSMOSDB_OFFICE_EQUIPMENT stored procedure.

Conclusion

As the importance of cloud products grows, we need a way to integrate cloud data resources – including NoSQL resources like Azure Cosmos DB – with the SQL Server data products that we build. This article shows how to do it.

Frank Solomon
168 Views