Hadi Fadlallah
Imported data into Neo4j

Migrating SQL Server graph databases to Neo4j

March 9, 2021 by

This article provides a step-by-step guide for migrating SQL Server graph databases to Neo4j using C#.

Why migrating graph databases from SQL Server?

Even if Microsoft introduced the SQL Server graph database in 2017, this feature is still far from the leading database providers in this domain, especially Neo4j. There are many articles published online where you can find different comparisons between graph technologies. In brief, the following are some reasons to migrate graphs from SQL Server to Neo4j:

  1. Neo4j reached a higher level of maturity since it was first released more than 10 years
  2. While SQL Server graphs are slowly evolving and improved, a wide community contributes to the Neo4j open source project
  3. Neo4j is open-source, while SQL Server graph technology is not
  4. Cypher is more straightforward than the extended graph functionalities in SQL Server
  5. In SQL Server, graphs cannot be visualized without an external tool
  6. Most importantly, Microsoft is a customer for Neo4j!

neo4j customers

Figure 1 – Neo4j customers

When are SQL Server graphs useful?

One question comes to mind: “Are SQL Server graph functionalities useless?” In my opinion, there is no general answer to this question. Suppose your data is stored within SQL Server databases. You need to implement some graphs within your ecosystem and integrate them with relational data. In that case, SQL Server graph database is preferable. Moreover, you may need all functionalities provided by Neo4j in several cases.

You can also direct to this article, An introduction to a SQL Server 2017 graph database, to gain a quick understanding of SQL Server graph databases.

Migrating SQL Server graphs to Neo4j

Even if I prefer using SSIS for data transfer operations, Neo4j doesn’t have any official or stable (free) SSIS component. While searching, I found a third-party component that is still in the beta version.

Another approach for migrating SQL Server graphs to Neo4j is to export data into flat files and then import them into Neo4j.

The third approach is to develop a small application using C# to migrate Nodes and Edges created in SQL Server to a Neo4j database. This approach is explained in detail in this section.

Note that in this tutorial, we will use Visual Studio 2019. And the source SQL Server database is built based on this article published previously on SQL Shack: Understanding graph databases in SQL Server.

Create a new console application project

First of all, we should open Visual Studio and create a new console application, as shown in the images below:

creating a console application project

Figure 2 – Adding a console application

Configuring console application

Figure 3 – Naming project

Installing NuGet packages

To build this project, we have to install two NuGet packages:

  1. Neo4jClient: A third party class library that is used to connect with Neo4j from .Net applications
  2. System.Data.SqlClient: A library developed by Microsoft, previously it was a part of the .Net framework. Since we are using .Net core to build the application, we need to install it using NuGet packages

To install NuGet packages, in the Visual Studio menu strip, go to Tools > NuGet package Manager > Manage NuGet packages for solution…

Managing NuGet packages

Figure 4 – Opening Nuget packages manager

When the Nuget packages manager window appears, go to Browse, and search for Neo4jClient. Then, as shown in the image below, click on the package, select the project, and click “Install”.

Installing Neo4jClient package

Figure 5 – Install Neo4jClient NuGet package

Once installed, repeat the same steps with the System.Data.SqlClient NuGet package.

Adding Column, Node, and Edge classes

In order to read and write columns, nodes, and edges information, we will create three classes as follows:

Column.cs class

Node.cs Class

Edge.cs

Reading data from SQL Server

To read data from the SQL Server database, I create a class called SQLReader.cs. The only parameter the user needs to pass in this class constructor is the connection string (SQL Server instance + database + authentication).

I will define two functions GetNodes() and GetEdges() to read nodes and edges from the SQL Server dabatase.

SQLReader.cs Class

Writing nodes and edges to Neo4j

To write exported nodes and edges from SQL Server into the Neo4j database, I am going to create Neo4jWriter.cs class. This class takes the Neo4j database URI as a parameter. I am also going to define two functions ImportNodes() and ImportEdges() to import nodes and edges into the Neo4j database.

Neo4jWriter.cs Class

Importing data into Neo4j

After creating the classes we mentioned above, we will use the following code in the Program.cs class to import data from an SQL Server graph database into Neo4j database.

Note that you should first start the Neo4j database, as explained in this series’s previously published article.

To ensure that data is imported successfully, you can open the Neo4j desktop (or from a web browser) and run the MATCH(n) RETURN n Cypher command from the database console to retrieve all data in the database.

As shown in the image below, the data is migrated successfully.

Imported data into Neo4j

Figure – Visualizing migrated data

Conclusion

In this article, we first explained some reasons to migrate graph databases from SQL Server into Neo4j. Then we provided a step-by-step guide to migrate Nodes and edges using C#. The code provided may or may not be optimal. I published it on GitHub, so feel free to use it or improve it.

Table of contents

Import data from MongoDB to SQL Server using SSIS
Getting started with the Neo4j graph database
Migrating SQL Server graph databases to Neo4j
Export indexes and constraints from SQL Server graph databases to Neo4j
Migrating SQL Server graph databases to ArangoDB

Hadi Fadlallah
Graph database, SQL Server 2017

About Hadi Fadlallah

Hadi is an SQL Server professional with more than 10 years of experience. His main expertise is in data integration. He's one of the top ETL and SQL Server Integration Services contributors at Stackoverflow.com . Also, he published several article series about Biml, SSIS features, Search engines, Hadoop, and many other technologies. Besides working with SQL Server, he worked with different data technologies such as NoSQL databases, Hadoop, Apache Spark. He is a Neo4j and ArangoDB certified professional. On the academic level, Hadi holds two master's degrees in computer science and business computing. Currently, he is a Ph.D. candidate in data science focusing on Big Data quality assessment techniques. Hadi really enjoys learning new things everyday and sharing his knowledge. You can reach him on his personal website. View all posts by Hadi Fadlallah

751 Views