Ben Richardson

Dynamic Data Masking In SQL Server

October 15, 2019 by

Security has been one of the prime concerns of database developers since the inception of database management systems. Various data protection schemes have been introduced to provide secure access to sensitive data.

One such security feature introduced in SQL Server 2016 is called dynamic data masking. Dynamic data masking (as distinct from static data masking) is used to hide data from the user on the client-side.

It is important to mention that data masking is not the same as data encryption and should not be used as a primary security layer. It is only used to mask sensitive information such as social security numbers, email addresses, phone numbers, credit card numbers, etc. Following are the examples of dynamic data masking:

  1. Phone number -> xxxx-xxx-xx-5458
  2. Email -> xxxx@xxxx.com

The data is actually not masked physically in the database. Rather, the data is masked in the query result. By default, all the users will see the masked data in the output. The unmasked data is visible in the actual database.

For an introduction to Dynamic Data Masking in Azure see Implementing Dynamic Data Masking in Azure SQL database article, and for an overview of Dynamic Data Masking see Using Dynamic Data Masking in SQL Server 2016 to protect sensitive data article.

Data masking types

Default

The default mask, masks complete values in the specified column. To specify a mask for a particular column, you have to use the “MASKED WITH” clause. Inside the MASKED WITH clause, you have to specify the FUNCTION that you want to use for masking. If you want to perform default masking, you use the “default()” function.

Let’s take a look at a simple example of default masking. Let’s create a database named Masks. Execute the following script:

Inside the Masks database, we will create a table DefaultMask with four columns: ID, Name, BirthDate, and Social_Security. The Name, BirthDate and Social_Security columns will have default dynamic data masking. Execute the following script:

Let’s now insert some dummy data in the DefaultMask table that we just created:

Execute the following query to SELECT all the records from the DefaultMask table:

The output looks like this:

Data Table Showing Output Without Data Masking

In the output, you can see the unmasked values. This is because we returned the record as a database user that has full access rights. Let’s create a new user that can only access the DefaultMask table and then select the records using our new user. Execute the following script:

In the output, you can see that all of the data has been masked:

Data Table Showing Output Using Full Data Masking

Partial

The default mask hides everything in the column it is applied to. What if we want to partially display the information in the column while leaving some part of it hidden?

This is where partial masks come in handy. To use a partial mask, you have to pass “partial(start characters, mask, end characters” as the value for the function parameter of the MASKED WITH clause. It is important to mention that the partial mask is only applicable to string type columns.

Let’s take a look at a simple example of a partial mask. Let’s create a table PartialMask in the Masks database that we created earlier. Execute the following script:

The PartialMask table has three columns ID, Name and Comment. The Name and the Comment columns have been partially masked. For the Name column, we used the “partial(2, XXXX, 2)” mask. This mask will display the first two characters and the last two characters of the string value in the Name column. The remaining characters will be replaced by XXXX. Similarly, for the Comment column. the first 5 and last 5 characters will be displayed while the remaining characters will be masked by XXXX.

Let’s insert some dummy data in our PartialMask table:

Let’s select all the records from the PartialMask table using the database user:

The output looks like this:

Data Table Showing Output For Second Set Of Data

Again, let’s create a new user and grant him/her access to the PartialMask table to see what results he/she gets. Execute the following script:

The output of the script looks like this:

Data Table Showing Output When Data Table 2 Is Partially Masked

As you can see from the output, for Name column, we have the first 2 and last 2 characters displayed and for the Comment column, we have the first 5 and last 5 characters displayed while the rest of the data is hidden.

Email

The email mask is used to dynamically mask data which is in the email format. The function used is “email()”. Let’s create a new table with a column called Email and mask it using an email mask. Execute the following script:

The following script inserts some dummy records into the EmailMask table:

Let’s first retrieve the records using the database user:

The output looks like this:

Data Table Showing Email Data Table Before Masking

Let’s create a new user and then access the data from the EmailMask table to see masked data. Execute the following script:

The output looks like this:

Data Table Showing Output When Using Email Mask

Random

The Random mask is used to mask the integer columns with random values. The range for random values is specified by the random function. Look at the following example. Execute the following script:

In the script above, we masked the SSN and Age columns. The values in the SSN column will be replaced by a value between 1 and 99 while the values in the Age column will be replaced by a value between 1 and 9.

Let’s insert some dummy data in the RandomMask table that we just created:

Finally, we will create a new user that will access the data from the RandomMask table:

The output looks like this:

Data Table Showing Output Using Random Mask

From the output, you can see that the values for the SSN and Age columns have been masked using our random mask.

Conclusion

In this article, we saw how to perform dynamic data masking with the help of several examples. Dynamic Data Masking feature was introduced in SQL Server 2016 to enhance data security on the client-side. There are four major types of masks available in SQL Server: Default, Partial, Random, and Email and we went through all of them.

Suggested Links:

Ben Richardson

Ben Richardson

Ben Richardson runs Acuity Training a leading provider of SQL training the UK. It offers a full range of SQL training from introductory courses through to advanced administration and data warehouse training – see here for more details. Acuity has offices in London and Guildford, Surrey. He also blogs occasionally on Acuity’s blog

View all posts by Ben Richardson
Ben Richardson
370 Views