Rajendra Gupta
Filter results for description starting with character A or L

T-SQL RegEx commands in SQL Server

September 17, 2019 by

This article explores T-SQL RegEx commands in SQL Server for performing data search using various conditions.

Introduction

We store data in multiple formats or data types in SQL Server tables. Suppose you have a data column that contains string data in alphanumeric format. We use LIKE logical operator to search specific character in the string and retrieve the result. For example, in the Employee table, we want to filter results and get the only employee whose name starts with character A.

We use regular expressions to define specific patterns in T-SQL in a LIKE operator and filter results based on specific conditions. We also call these regular expressions as T-SQL RegEx functions. In this article, we will use the term T-SQL RegEx functions for regular expressions.

We can have multiple types of regular expressions:

  • Alphabetic RegEx
  • Numeric RegEx
  • Case Sensitivity RegEx
  • Special Characters RegEx
  • RegEx to Exclude Characters

Pre-requisite

In this article, we will use the AdventureWorks sample database. Execute the following query, and we get all product descriptions:

product descriptions table

Let’s explore T-SQL RegEx in the following examples.

Example 1: Filter results for description starting with character A or L

Suppose we want to get product description starting with character A or L. We can use format [XY]% in the Like function.

Execute the following query and observe the output contains rows with first character A or L:

Filter results for description starting with character A or L

Example 2: Filter results for description with first character A and second character L

In the previous example, we filtered results for starting character A, or L. Suppose we want starting characters of descriptions AL. We can use T-SQL RegEx [X][Y]% in the Like operator.

In the output, you can we get only records with first character A and second characters L.

Filter results for description with first character A and second character L

We can specify multiple characters as well to filter records. The following query gives results for starting characters [All] together:

filter results using T-SQL RegEX

Example 3: Filter results for description and starting character between A and D

In the previous example, we specified a particular starting character to filter the results. We can specify character range using [X-Z]% functions.

The following query gives results for description starting character from A and D:

Filter results for description and starting character between A and D

Similarly, we can specify multiple conditions for each character. For example, the below query does the following searches:

  • The first character should be from A and D alphabets
  • The second character should be from F and L alphabet

In the output, you can see that both result set satisfies both conditions.

specify multiple conditions

Example 4: Filter results for description and ending character between A and D

In the previous examples, we filtered the data for the starting characters. We might want to filter for the end position character as well.

In the previous examples, note the position of percentage (%) operator. We specified a percentage character at the end of search characters.

In the following query, we changed the position of percentage character at the beginning on search character. It looks for the characters with the following condition:

  • Ending character should be from G and S

In the output, we get the character that satisfies our search condition.

Filter results for description and ending character between A and D

Example 5: Filter results for description starting letters AF and ending character between S

Let’s make it a bit complex. We want to search using the following conditions:

  • Starting character should be A (first) and F (second)
  • Ending character should be S

Execute the following query, and in the output, we can see it satisfies our requirement:

Filter results for description starting letters AF and  ending character between S

Example 6: Filter results for description starting letters excluding A to T

In the following example, we do not want the first character of output rows from A to T. We can exclude characters using [^X-Y] format in Like operator.

In the output, we do not have any first characters from A to T.

Filter results for description starting letters excluding A to T

Example 7: Filter results for description with a specific pattern

In the example below, we want to filter records using the following conditions:

  • The first character should be from R and S character – [R-S]
  • We can have any combination after the first character – %
  • We require the P character – [P]
  • It should be followed by either an [P] or [I] – [PI]
  • It can have any other character after previous condition- %

Filter results for description with a specific pattern

Example 8: Case sensitive search using T-SQL RegEx functions

By default, we do not get case sensitive results. For example, the following queries return the same result set:

Case sensitive search using T-SQL RegEx functions

We can perform case sensitive search using the following two ways:

  1. Database collation setting: Each database in SQL Server have a collation. Right-click on the database and in the properties page, you can see the collation

    database collation

    We have SQL_Latin1_General_CP1_CI_AS performs case insensitive behaviour for the database. We can change this collation to case sensitive collation. It is not a simple solution. It might create issues for your queries. It is not a recommended way unless you explicitly require case sensitive collation.

    We can use Column Collation with T-SQL RegEx functions to perform case sensitive search.

    In the table, we have letter A in upper and lowercase. If we run the following select statement, it returns both uppercase and lowercase:

    sample data

    Suppose we want to filter the uppercase letter in the result. We can use column collation as per the following query:

    It returns uppercase letter A in the output.

    case sensitive search

    Similarly, the following query returns lowercase letter in the output:

    case sensitive search

  2. We can use T-SQL RegEx function to find both upper and lowercase characters in the output.

    We want the following output:

    • The first character should be uppercase character C
    • The second character should be lowercase character h
    • Rest of the characters can be in any letter case

    case sensitive search for both upper and lowecase

Example 9: Use T-SQL Regex to Find Text Rows that Contain a Number

We can find a row that contains the number as well in the text. For example, we want to filter the results with rows that contain number 0 to 9 in the beginning.

search for a number pattern

Example 10: Use T-SQL Regex to Find valid email ID’s

Let’s explore a practical scenario of the RegEX function. We have a customer table, and it holds the customer email address. We want to identify valid email address from the user data. Sometimes, users make typo mistake and enter @@ instead of @ character.

First, create the sample table and insert some email address into it in different formats.

Execute the following select statement with the T-SQL RegEx function and it eliminates invalid email addresses.

We do not have following invalid email address in the list.

  • ABC@@gmail.com
  • ABC.DFG.LKF#@gmail.com

Use T-SQL Regex to Find valid email ID's

Conclusion

In this article, we explored T-SQL RegEx functions to perform a search using various conditions. You should be aware of these to search based on specific requirements.

Rajendra Gupta
Latest posts by Rajendra Gupta (see all)
168 Views