Esat Erkec
SELECT INTO statement and parallel insert

A deep dive into SQL Table Variables

July 28, 2021 by

In this article, we will explore some secrets about SQL Table Variables.

Introduction

Table variables are special variable types and they are used to temporarily hold data in SQL Server. Working with the table variables are much easier and can show remarkable performance when working with relatively small data sets. In spite of that, they have some unique characteristics that separate them from the temporary tables and ordinary data tables. In the next section of this article, we will take a glance at the unique characteristics of the SQL table variables.

Parallel processing

SQL Server can execute the queries using more than one thread in other words, it can process queries in a parallel manner. However, there is an exception for the table variables. The query optimizer can’t generate a parallel query plan when we want to populate data into the SQL table variables. At first, we look at the temporary table behavior pattern in an insert statement. The following query inserts all rows of the SalesOrderDetail table into the #TempSalesOrder temporary table.

Execution plan of a temporary table

The graphical execution plan of the query shows us, the query optimizer has decided to use the parallel operators until the table inserts operation. The Index Scan operators have fulfilled their tasks in parallel. In this execution plan, we need to consider one point about the insert operation. The insert operation did not execute in a parallel manner. The reason we did not use the TABLOCK hints in the insert operator. When we use a TABLOCK hint in the insert query, the insert operations will be performed in parallel.

Temporary table parallel insert and TABLOCK

The Actual Number of Rows attribute shows the number of rows processed by each thread.

Table insert operator in a execution plan

At the same time, the SELECT… INTO statement can enable parallel insertion operations. In this statement type, we don’t need to explicitly define the temporary table.

SELECT INTO statement and parallel insert

Now, we will use the table variable instead of the temporary table in the insert query and then observe its query plan output.

An insert execution plan of a table variable

As we can see clearly, the query optimizer could not use parallel operators because of the SQL table variable. The reason for this issue is explained in the NonParallelReason attribute of the INSERT operator by the optimizer. This attribute value indicates CouldNotGenerateValidParallelPlan and it means that something is preventing generating a parallel plan. For our example, this problem is related to the SQL table variables.

NonParallelReason attribute

Another point about the parallelism and table variables is related to the select queries. When we execute the following query the query optimizer will generate a serial plan for the query.

SELECT statement and parallel plans

This issue is directly related to the estimated number of rows because the optimizer thinks the created table variable has only one row. Despite that, the actual number of rows is very far from this estimation.

Table variables and estimated number of rows relation

If we are using SQL Server 2019 version, there is no need to worry about this issue. The SQL Table Variable Deferred Complication feature resolves this problem without any changes. Only we need to change the database compatibility level to 150.

Change compatibility level of a database

However, if our version is less than SQL Server 2019, we can add the RECOMPILE hint at the end of the query to overcome this problem. With the help of this hint, the query optimizer will make a more proper estimation because the optimizer will consider the accurate number of rows for the table variable.

RECOMPILE option and table variables

Looking at the image above, we can realize that the Actual Number of Rows value is equal to the Estimated Number of Rows so the optimizer makes the proper decisions.

At the same time, we can use TRACE FLAG 2453 to overcome the same problem. We enable the trace flag at the top of the query and then disable it again at the end of the query.

CHECK Constraints

The check constraints are used to enable control of any condition when a row of data is inserted or modified. In addition to this, the check constraints are effectively used by the query optimizer to avoid consuming unnecessarily I/O operations. Such as, the SalesOrderHeader table includes the CK_SalesOrderHeader_Status constraint and does not allow adding a row with a value not between 0 and 8 in the Status column. The following query execution plan does not generate any I/O because of this constraint.

Check constraints effects the query plans

The query optimizer knows that there is no row in the table with the Status column value equal to 9 because of the check constraint. As a result, the optimizer does not perform any read operations.

We can create check constraints on the SQL table variables during the declaration of the table variable but these constraints can not be used by the optimizer. In the following query, we declare a table variable and we also add a check constraint that is similar to the SalesOrderHeader.

Table variable and check constraint performance relation

As we can see, the query optimizer did not consider the check constraint when deciding on the query plan.

Transactions

Transactions are the smallest logical unit that helps to manage the CRUD (insert, select, update and delete) operations in the SQL Server. However, the transaction operations are not possible for SQL table variables. In the following query, we will start an explicit transaction and then insert some rows into the table variable and then we will roll back the transaction. In this case, we think the query result set will be empty.

Table variable and transactions relation

TRUNCATE TABLE statement and SQL table variables

TRUNCATE table statements remove all rows of the data from a table and generate a minimal log in the transaction log. However, we can not use this statement for the table variables. In the following query, we try to truncate the @SpringMonthList table variable but the query will return an incorrect syntax error.

Table variable and TRUNCATE table statement

Transaction Logging

The common misconception about the SQL table variables is that they are thought to be stored in memory. This knowledge is completely wrong because the table variables are stored in the tempdb database. At the same time, another wrong knowledge about the table variables is that they don’t generate any log activity. However, during the life-scope of the table variables, they generate log activity on the tempdb transaction log files. Let’s prove this working mechanism as an example.

As a first step, we will write the flush log buffer to the disk with help of the CHECKPOINT command. In the second step, we will declare a very simple SQL table variable and insert one row into it. The table variable is created and dropped implicitly at the start and the end of the query for this reason we will monitor the log activity during the same scope. The fn_dblog is a function that helps to retrieve information about the portion of the transaction log file. This function takes two parameters start and end Log Sequence Number (LSN) but if we pass these two parameters as NULL, we can retrieve all activities on the log file. When we execute the fn_dblog function after the CHECKPOINT statement only we will see the checkpoint start and end logged.

CHECKPOINT and log file

The LOP_BEGIN_CKPT operation specifies the CHECKPOINT operation starts and the LOP_END_CKPT specifies the CHECKPOINT operation completes.

Now, we will bring the all parts together and execute the following query and try to understand its output.

Table variable and log file interaction

The above image illustrates the generated logs during declaring and inserting a row into the table variable. The LOP_BEGIN_XACT operation indicates the transaction is opened and the LOP_COMMIT_XACT operation indicates the transaction is committed.

Conclusion

In this article, we have explored some secrets about the SQL table variables. Table variables have some unique limitation characteristics so we need to consider these attributes when we decide to use table variables in our queries.

Esat Erkec
Latest posts by Esat Erkec (see all)
659 Views