Ahmad Yaseen

SQL Server Execution Plan Operators – Part 3

October 29, 2018 by

In the previous articles of this series, we discussed a group of SQL Server Execution Plan operators that you will face when studying the SQL Execution Plan of different queries. We showed the Table Scan, Clustered Index Scan, Clustered Index Seek, the Non-Clustered Index Seek, RID Lookup, Key Lookup and Sort Execution Plan operators. In this article, we will discuss the third set of these SQL Execution Plan operators.

At the beginning, we will implement a testing table with 3K records to be use it in the examples of this article. This demo table is created and filled with data using the T-SQL script below:

SQL Server Aggregate Operator – Stream Aggregate

The Aggregate Operator is mainly used to calculate the aggregate expressions in the submitted query, by grouping the values of an aggregated column. The aggregate expressions include the MIN, MAX, COUNT, AVG, SUM operations. Let us run the below SELECT statement that retrieved the top 10 students who got the highest grade. And we will include the Actual SQL Execution Plan of the query. This can be achieved by aggregating the grades of the students to get the maximum grades using the MAX aggregate function. If all information about these top students is required, we should add all these columns in the GROUP BY clause, as shown below:

Checking the SQL execution plan generated after executing that query, you will see that the SQL Server Engine will retrieve the rows from the table, then sort these values based on the columns specified in the GROUP BY clause and aggregate these values using the fastest aggregation method, which is the Stream Aggregate Operator. The Stream Aggregate operator is fast due to the fact that it requires the rows to be sorted based on the columns specified in the GROUP BY clause before aggregating these values. If the rows are not sorted in the Seek or Scan operator, the SQL Server Engine will force the use of the SORT operator, as shown below:

Another example for the Stream Aggregate operator is the AVG aggregate function order, that will compute the SUM and COUNT of the aggregated column and stored the values in Expr1004 and Expr1005 , in order to evaluate the AVG value, as shown below:

SQL Server Compute Scalar Operator

SQL Server Compute Scalar operator is used to calculate a new value from the existing row value by performing a scalar computation operation that results a computed value. These Scalar computations includes conversion or concatenation of the scalar value.

Let us run the below SELECT T-SQL statement to generate a sentence that describes the grade for each student. And we will include the Actual SQL Execution Plan of the query:

You will see from the execution plan, generated after executing the query, that the SQL Server Engine uses the Compute Scalar operator to perform a concatenation for the two specified columns to return a new scalar value, as shown below:

You can see from the previous SQL Execution Plan that the Compute Scalar operator is not an expensive operator, where it only costs 2% of the overall weight of our query, causing a minimal overhead.

SQL Server Concatenation Operator

SQL Server Concatenation operator takes one or more sets of data in sequence and returns all records from each input data set. One of the most popular examples of this operator is the UNION ALL T-SQL statement. Let us run the below T-SQL statement that combines the result of two SELECT statement using the UNION ALL statement and we will include the Actual Plan of that query:

Then checking the Execution Plan generated after executing the query, you will see that the result returned from the two SELECT statements will be concatenated using the Concatenation operator to generate one result set, as showing below:

SQL Server Assert Operator

The SQL Server Assert operator is used to verify if the inserted values meet the previously defined CHECK or FOREIGN KEY constraints on the table. Assume that we have defined the below constraint on the demo table to ensure that only positive values are inserted into the STD_Grade column:

If you try to perform the below INSERT statement, including the Actual SQL Execution Plan of the query:

You will see from the SQL Execution Plan generated after executing the query, that the SQL Server Engine uses the ASSERT operator to validate if the inserted grade for that student meets the defined CHECK constraint, as shown below:

SQL Server Hash Match Join Operator

When joining two tables together, the SQL Server Engine divides the tables’ data into equally sized categories called buckets in order to access these data in a quick manner. This data structure is called a Hashing Table. It uses an algorithm to process the data and distribute it within the buckets. This algorithm called a Hashing Function.

Assume that we have created a new table that follows the absence of the students and fill it as below:

After that, we will run the below SELECT statement that joins the base Students table with the Absence table, including the Actual SQL Execution plan of that query:

You will see from the Execution Plan generated after executing the query that, after reading data from the two joined table, the SQL Server Engine uses the Hash Match Join operator, in which it fills the hash table with data from the small, also called Probe table, then process the second large table, also called Build table, depending on the hash table values, to speed up the access to the requested data, as shown below:

SQL Server Hash Match Aggregate operator

SQL Server Hash Match Aggregate operator is used to process the large tables that are not sorted using an index. It builds a hash table in the memory, calculates a hash value for each record, then scan all other records for that hash key. If the value is not existing in the hash table, it will create a new entry in that hash table. In this way, the SQL Server Engine will guarantee that there is only one record for each group of data.

Let us run the below SELECT statement that returns the number of duplicates for each ID value. And we will include the Actual SQL Execution Plan of the query:

You will see from the SQL Execution plan generated after executing the query that, the SQL Server Engine will use the Hash Match Aggregate operator to perform and speed up the COUNT aggregation operation, as shown below:

SQL Server Merge Join Operator

SQL Server Merge Join operator is used with the different types of the JOIN operation, but only when the two JOIN data sets are sorted according to the join predicate. In this case, the Merge Join operator will read from the two input data sets at the same time, compare it then return the matched results.

Let us run the below query that joins the ExPlanOperator_P3 table with itself, and include the Actual SQL Execution Plan of the query:

You will see from the SQL Execution Plan generated after executing the query that, the SQL Server Engine will scan each input, sort the data based on the ID column, as there is no index created on that table yet, then join the two inputs, that are sorted now based on the JOIN predicate using the fast Merge Join operator, as shown below:

Although the Merge Join operator joins the sorted tables very fact, there still be an overhead, as both inputs will be loaded in memory for comparison purpose. The cost of the Merge Join operator is the sum of the two operator inputs.

SQL Server Nested Loops Join Operator

The SQL Server Nested Loops Operator is used to join the upper input, also known as the outer input, by executing it one time, with the lower input, also known as the inner input, by executing it number of times equal to the number of records that matched the outer input. The SQL Server Query Optimizer decides to use the Nested Loops Join operator only when the outer input table is small, and the inner input table has an index created on the join predicate key.

Assume that we have created the below two indexes on our demo tables, as shown below:

Then run the below SELECT statement that returns the students who missed the classes more than days, by joining the two demo tables together, including the Actual SQL Execution Plan of the query:

From the SQL Execution Plan, generated after executing the query, you can see that the SQL Server Engine joins the two tables using the Nested Loops Join operator, but performing an Index Scan operation for the outer input one time then performing an Index Seek operation 100 times for the inner input, as shown below:

Again, although the cost of the Nested Loops operator is low compared to the overall weight of the query, you need to consider that the cost of that operator is highly depends on the multiplication of size of the outer input table and the size of the inner input table.

Stay tuned for the next article in which we will discuss the fourth set of the SQL Server Execution Plan operators.

Table of contents

SQL Server Execution Plans Overview
SQL Server Execution Plans Types
How to Analyze SQL Execution Plan Graphical Components
SQL Server Execution Plans Operators – Part 1
SQL Server Execution Plans Operators – Part 2
SQL Server Execution Plans Operators – Part 3
SQL Server Execution Plans Operators – Part 4
SQL Execution Plan enhancements in SSMS 18.0
Journey with SQL Server Execution Plan 3rd Party Tool – ApexSQL Plan
SQL Server Execution Plans and Queries Performance Tuning
Saving the SQL Server Execution Plans

Ahmad Yaseen

Ahmad Yaseen

Ahmad Yaseen is a SQL Server database administration leader at Aramex International Company with a bachelor’s degree in computer engineering as well as .NET development experience.

He is a Microsoft Certified Professional with a good experience in SQL server development, administration, performance tuning, monitoring and high availability and disaster recovery technologies. Also, he is contributing with his SQL tips in many blogs

View all posts by Ahmad Yaseen
Ahmad Yaseen
190 Views