Haroon Ashraf
SQL Machine Learning Service using Python Language Import SQL data Manipulate data in Python Export data in SQL

SQL Machine Learning in simple words

May 15, 2023 by

This article is about SQL Machine Learning which is one of the most interesting topics equally attractive to both beginners and professionals of different areas of expertise.

Most importantly this article intends to give the basic understanding that anyone with some background in databases or business intelligence requires to begin their SQL Machine Learning journey.

Additionally, easy-to-understand tips equipped with concepts that help the readers to know the connection between SQL and machine learning along with simple examples to implement at your own pace are part and parcel of this article.

About Machine Learning

Let us first define machine learning so that we know the very basics of what we are going to explore further ahead in this article.

What is Machine Learning

Machine learning is a field of Artificial Intelligence or Computer Science which makes use of (data) models that improve their accuracy with time through experience somewhat similar to humans but with a lot more limitations.

Simple definition

We can simply say that it is all about making machines (data models) learn independently (automatically with limitations) without any pre-programmed or hard-coded logic or instructions.

Who works in this field

Data scientists, Machine Learning Engineers, Data Analyst Programmers, and BI Professionals are capable of working in this area; however, it cannot only be limited to these professionals.

We can say any data professional with the right capability can make use of SQL Machine Learning or similar technologies to achieve their business or professional life objectives whether it be a company revenue generation goal or highly beneficial research in the field of medicine.

About SQL Machine Learning (ML)

As the name suggests SQL ML is a service that you install on your SQL Server instance to use it for the desired purposes.

Two modes of installation

The SQL Machine Learning service can be installed in the following two ways:

  1. You install the SQL Server instance, and you add the service straight away during the installation
  2. You have already installed a SQL instance and you add it later on as a feature

Three flavors (languages) of installation

There are three flavors for using SQL Machine Learning which means we can add any of the three languages to utilize this service:

  1. Python
  2. R
  3. Java

Three Benefits

Now we know that SQL Server (databases) is one of the best places for the data to reside but in the past Data, Scientists had to take the data out of SQL to work on it for machine learning purposes.

So, SQL ML provides three main benefits including addressing the above-mentioned concerns:

  1. In-Database processing (no need to take data out of SQL)
  2. Full data set processing (we can have full-length data set processing without worrying about space or storage)
  3. Built-in Security wrapper (SQL security remains intact during the ML processes as well)

Using and running Python Code (SQL Machine Learning)

There are many merits of using Python code in SQL as part of ML:

  1. The same query editor that we use for writing and running SQL scripts can be used for running Python code
  2. The Python coding is supported in both main Microsoft SQL database development and management tools like SQL Server Management Studio (SSMS) and Azure Data Studio.
  3. You have the freedom to store Python code as procedures
  4. The stored procedures that contain Python code can be reused by other developers easily

SQL Machine Learning (ML) Installation

Let us assume we are installing SQL Server for the first time, and we want to install SQL ML service at the same time so this means we have to install it as a feature.

Please remember if you have multiple instances installed then you can add ML service to any of the installed instances plus it is an optional service that you have to exclusively choose to install.

Please download the SQL Server 2019 version or any compatible version that supports ML by visiting the following page:

SQL Server Downloads | Microsoft

Run the setup and choose New SQL Server stand-alone installation:

SQL Server Installation Center 
New SQL Server installation

After running some configuration checks then Select Python under Machine Learning Services and Language in the Feature Selection tab:

Selecting SQL Machine Learning Services and Language (Python) to be installed

Accept the terms:

Accepting the terms.

If the Next button is disabled then you have to get Python files from the links mentioned below:

Following the links to get the desired Python files in order to install SQL Machine Learning feature with Python language.

Once the installation files are downloaded, please copy them into the Setup folder as follows:

Setup folder with Python cab files

You should be now able to proceed further with SQL Server installation along with SQL Machine Learning service:

Successful installation.

Verifying SQL Server Machine Language Installation

Open SQL Server Management Studio (SSMS) or Azure Data Studio and connect to your locally installed SQL Server instance:

Connecting to Server (SQL 2019).

Enable External Scripts (sp_configure)

Now the first thing is to enable the external scripts and here in our case, it is Python scripts.

Please run the following script against the Master database of the instance:

The following output is shown:

External script enab

Check the status of the external script

Please run the following statement to check the status now:

The output is shown below:

external script enabled 1 1 1

Please note even after enabling the external scripts you still see that config_value or run_value is 0 for the external scripts enabled column then please consider restarting the SQL instance by clicking on the name of the instance in the Object Explorer and clicking Restart.

Test Python Code

Let us now test some Python code.

Please remember Python is strictly case sensitive (Print is not the same as print) and the code should be well indented too.

Print out

You can simply print out the article titled SQL Machine Learning in Simple Words in Python by using the print command as follows:

The output is as follows:

The output shows SQL Server Machine Learning in Simple Words

Get System Version

You can view the system version by the following code:

The output is as follows:

System version: 3.7.1

Using Variables in Python

Unlike SQL you don’t need to declare a variable type in Python. A python variable can simply be defined and initialized.

Subtracting two numbers in Python

To subtract two numbers in Python we run the following script using two variables First and Second:

The results are as follows:

The result is 100 (SQL Machine Learning)

Loop in Python

We can also use looping constructs in Python (SQL Machine Learning language) such as While or For loop. A simple While loop can be executed as follows:

The results are shown below:

The output is:
Counter 1
Counter 2
Counter 3
Counter 4
Counter 5

SQL Machine Learning (ML) using Python

The three most important objectives of SQL Machine Learning using Python are as follows:

  1. We need to send/load data from SQL into Python (script)
  2. Process the loaded/imported data in Python
  3. Return the processed data to SQL (in a form understandable by SQL)

The Role of InputDataSet variable from Query Output (results) to Python Processing

In other words, we need to understand the following flow:

  1. SQL query output is changed into the Python script
  2. This is done by redirecting the SQL output to a special variable called InputDataSet
  3. Then Python Data Frame processes the data that is in InputDataSet

Now we simply need T-SQL to import SQL data into Python for machine learning purposes such as to be able to process it via Python since Python is a highly efficient language for data processing.

Another way to understand this is as follows:

  1. Import (data from SQL into Python)
  2. Manipulate (data in Python)
  3. Export (data from Python back to SQL)

This can be illustrated as follows:

SQL Machine Learning Service using Python Language
Import SQL data Manipulate data in Python Export data in SQL

Setting up the Sample database

Let us quickly build a sample database called StudentsExamDemo:

SET IDENTITY_INSERT [dbo].[Student] OFF

Load data from SQL into Python (Import data into Python using T-SQL) – InputData1

In the pursuit of SQL Machine Learning let us now perform the first step of loading this data into Python script which means we are going to convert this data into a Python Data Frame.

Now we have to think in terms of variables to do these tasks. For example, there is a variable that will be assigned to the SQL script to be loaded into the Python script for the second step which is manipulating data. We make use of the input_data_1 property to initialize with SQL script of interest to be then manipulated by Python (data frame).

Build and run the script as follows:

The results are as follows:

The command ran successfully

The importance of the Input_Data_1 property and InputDataSet variable

Now, input_data_1 is the property that is going to take the SQL script and convert it into a data frame for Python. When input_data_1 converts the SQL output (resultset) into a data frame it is called Pandas Data Frame and it can be accessed by a variable called InputDataSet.

So, InputDataSet is the variable that can display the contents of the result set ready to be processed (manipulated) using SQL Machine Learning service compatible language like Python.

We can see all this in action by running the following script:

The output is as follows:

STDOUT message(s) from external script

So, we can see that the variable InputDataSet shows the contents of the Python Data Frame which was first imported into Python with the help of the input_data_1 property.

An interesting point to note here is that in Python the index begins with 0.

Manipulating the Python Data Frame (Processing/Manipulation)

Now we have the data in the required form but the reason we did all that is just to be able to use the best technology (SQL Machine Learning service) to manipulate the data to explore it and to get value out of it.

To manipulate the data (which now exists as Python Data Frame) we make use of Python libraries also called Pandas Data Frame API reference which is a list of all the available functions that we can use to understand the data from SQL Machine Learning perspective.

Finding integer location iloc of the data (pandas.DataFrame.iloc)

Let us find the integer location of the data by using the iloc property of the Pandas Data Frame:

The result set is as follows:

STDOUT message(s) from external scripts

So, this result showed us the student at index location 4 in the data frame.

Similarly, we can also view the data by specifying a range of indexes. So, to view the data at an index ranging from 1 to 3 we can run the following script:

The output is as follows:

SQL Machine Learning using Python

Please note that the range 1:3 means the index location 1 to 2 while index 3 is not included.

Using the head function to manipulate the data frame (pandas.DataFrame.head)

Similarly, we can use the head function to manipulate data by inputting the number of rows to be displayed.

For example, to only view the first 3 rows we will use the head function as follows:

The result set is as follows:

STDOUT message(s) from external script

Please note that there is a rich library of functions for a comprehensive data manipulation although we just focus on the basic functions available in SQL Machine Learning using Python.

Export Data Frame (back to SQL)

We began with importing our SQL result set into Python followed by converting it into a Python data frame and applying some manipulations on the data but finally, we need to push the data back to SQL.

The main benefit of SQL Machine Learning using Python is that we can revert to SQL after doing the complex manipulations and this can be used for the following purposes:

  1. Reporting based on the returned data set in SQL
  2. Publishing the returned data set in SQL to a website
  3. Using the returned data set in SQL as a parameter to be passed into another function for further work (processing)

Export Manipulated data using OutputDataSet Variable

There is a handy variable OutputDataSet available that exports the data back to SQL and this is done by assigning manipulated data frame to it where the manipulated data frame is held by InputDataSet.

Example: Finding the shape of the student table

Let us take an example of the student table we have in our sample database. Now we want to know more about our table which means we want to know the student in at index 0 and 1.

Let us write a script that does the following:

  1. Import the table into Python Data Frame
  2. Get the shape of the table (number of rows and columns)
  3. Export the shape back into SQL

This is done as follows:

The results can be seen below:

The SQL query output in Grid

Conclusion

The most important part is to understand the business needs of your organization and how machine learning or simply this three-step process (import, manipulate and export to SQL) can help you to achieve those business goals and objectives. You can formalize this process based on what you learn from the day-to-day data needs of your organization.

Congratulations, you have successfully learned the basics of SQL Machine Learning and can also explore it further now.

Haroon Ashraf
Latest posts by Haroon Ashraf (see all)
Machine learning, Python

About Haroon Ashraf

Haroon’s deep interest in logic and reasoning at an early age of his academic career paved his path to become a data professional. He holds BSc and MSc Degrees in Computer Science and also received the OPF merit award. He began his professional life as a computer programmer more than a decade ago, working on his first data venture to migrate and rewrite a public sector database driven examination system from IBM AS400 (DB2) to SQL Server 2000 using VB 6.0 and Classic ASP along with developing reports and archiving many years of data. His work and interest revolves around Database-Centric Architectures and his expertise include database and reports design, development, testing, implementation and migration along with Database Life Cycle Management (DLM). He has also received passing grade to earn DevOps for Databases verified certificate, an area in which he finds particular interest and potential. View all posts by Haroon Ashraf

168 Views