Jignesh Raiyani

Converting commas or other delimiters to a Table or List in SQL Server using T-SQL

January 31, 2020 by

Database developers often need to convert a comma-separated value or other delimited items into a tabular format. Delimiters include pipe “|”, hash “#”, dollar “$” and other characters. SQL Server compatibility level 130, and subsequent versions, support a string_split() function to convert delimiter-separated values to rows (table format). For compatibility levels under 130, developers have previously done this with a user-defined function, which incorporates a While loop or Cursor to extract the data.

Why is comma-separated input required for a procedure or T-SQL query?

Execute a program in an iteration using a loop is a common method in the back-end application as well as the database. When iteration count is smaller than expected, the execution time can be less however the number of iterations is bigger which leads to longer processing time. For example, if the application is fetching data from a database with executing a piece of program in a loop by parsing different input parameter application will need to wrap the database response of each iteration. To handle this type of situation, we can execute the same program with a single execution with parsing the comma or delimiter separated values as an input parameter in the database and make it to tabular format in the T-SQL program to proceed with further logic.

SQL thread Execution

Today, many platforms have been changed to micro-service architecture. Not only application but database design could also be structured in micro-service-based architecture and most of the internal communication between the databases is done with integer references only. That is the reason to get some tricky way to achieve better execution from the database side.

Multiple inputs to the same parameter can be accomplished with comma-separated values in the input parameter of the stored procedure or input to the tabular function, and used with the table in a T-SQL statement. There may be more situations for this strategy when using T-SQL. While programming with SQL Server, a developer must break a string into elements using a separator. Using the split function against the string defines the separator as an input parameter. The entire string will be split and returned as a table. In certain circumstances, the input parameter can be a combination of two or more character-separated values for each set of input. For instance:

N’203616, 198667, 193718, 188769,…’ N’1021|203616$1021|198667$1022|193718$1022|188769$…’ N’1021|203616,198667$1022|193718,188769$…’

Single Thread Execution

There are advantages to single-thread execution with comma-separated values over multi-thread execution in SQL Server:

  • Single execution for part of the program, resulting in the number of input
  • Single read or write operation over the database tables instead of several iterations using a loop
  • One-time server asset use
  • No more hurdles on the back-end application to set up a response from multiple result sets
  • One-shot commit means that the database/SQL Server won’t block the transaction and slow things down. Thus, no blocking possibilities
  • Easy to oversee transaction consistency

Split function using loop

In most cases, a developer writes the table-valued function with a loop to extract the data from comma-separated values to a list or table format. A loop extracts the tuple one by one with the recursion. Performance issues can arise when working with the large input parameter since it will take more time to extract it. Here, we have a simple function with the name of split_string, which is using a loop inside the table-valued function. This function allows two arguments or input parameters (1 – input string and 2 – delimiter):

A result-set table returned by this function contains every substring of the input parameter string which is separated by the delimiter. The substrings in the table are in the order in which they happen in the input parameter. If the separator or delimiter doesn’t coordinate with any piece of the input string, then the output table will have only one component. The tabular function will return the result set in a row-column format from the comma-separated string.

Recent versions of SQL Server provide a built-in function string_split() to do the same task with the input parameters of the input string and delimiter. But there is another efficient way to perform this task using XML.

Split function using XML

The Split function using XML logic is a useful method to separate values from a delimited string. Using XML logic with the T-SQL based XQUERY approach results in the T-SQL not being characterized as a recursion using loop or cursor. Essentially, the input parameter string is converted to XML by replacing the XML node in the T-SQL articulation. The resulting XML is used in XQUERY to extract the node value into the table format. It also results in better performance.

For example, we have a comma-separated value with the variable @user_ids in the below T-SQL statement. In a variable, a comma is replaced by the XML tags to make it an XML data type document and then extracted using XQUERY with the XPATH as ‘/root/U’:

T-SQL code for Comma separated to list

In the above function, we have used the VARCHAR (100) data type for the extracted value rather than BIGINT. There can be changes in the input parameter string, too. For example, if the input string has one extra ‘,’ at the beginning or end of the string, it returns one extra row with 0 for the BIGINT data type and ” for VARCHAR.

For example,

T-SQL sample

To handle this type of ” or 0 data in the result set, we can add the condition below in the T-SQL statement:

Here, f.x.value(‘.’, ‘BIGINT’) <> 0 excludes the 0 and f.x.value(‘.’, ‘VARCHAR(100)’) <> ” excludes the ” in the query result set. This T-SQL condition can be added into the table-valued functions with the number of input parameter to handle this extra delimiter.

Function:

Execution:

Combination of characters in delimiter-separated string

But what about when the input parameter is a combination of two values with multiple separators? In this case, the input parameter must be extracted twice with the help of a subquery, as shown below:

This example is similar to the above example but we have utilized pipe “|” as a delimiter with a second delimiter, dollar “$”. Here, the input parameter is the combination of two-columns, role_id, and user_id. As we see below, user_id and role_id are in a separate column in the table result set:

T-SQL code for combinational delimiter separated to list

We can use any character in the above function, such as a single character-separated string or combination of character-separated strings. A developer must simply replace the character in the T-SQL statements above. Follow the steps described here to convert any delimited string to a list. Simply remember that the XML function of SQL Server accepts a string input and the input string will be delimited with the specified separator using the tabular function.

Jignesh Raiyani
168 Views