One of the main tasks of a SQL Server database administrator is performance tuning. Sometimes, though, coders or developers don’t always prioritize database performance or query optimization. Here is a typical scenario
- Imagine that developers create a new table and then insert some records in a test environment and test their queries to retrieve data from it
- The query executed successfully and does not exhibit any symptoms of performance problems
- The developer team release this table and query into production
- One day you take a telephone from your colleague and he says my report is very slow
- Bingo! In production, this table contains a lot of records and this is resulting in performance bottlenecks when querying it
I think this scenario is all too common to many SQL Server DBAs.
In my opinion developers who work with SQL Server often have a little knowledge about query performance tuning. In this scenario, a little knowledge would have gone a long way to preventing these issues in production.
Sargable is a word that concatenates the three words: search, argument and able.
As per wikipedia SARGable is defined as “In relational databases, a condition (or predicate) in a query is said to be sargable if the DBMS engine can take advantage of an index to speed up the execution of the query. The term is derived from a contraction of Search ARGument ABLE”
Advantage of sargable queries include:
- consuming less system resources
- speeding up query performance
- using indexes more effectively
Now we will shortly discuss about index basics then we will cover some examples of both sargable and non-sargable queries.
Indexes are created on tables or views. Indexes can be likened to a book directory and improve the speed of searches. Some examples of different types of indexes are clustered indexes, non-cluster indexes and columnstore indexes.
An index scan is when SQL Server reads all the data in the index pages. The cost of an index scan is very expensive for the SQL Server Engine.
An index seek is when SQL Server reads only matching data in the index pages. This method is more efficient for query performance because it will reduce IO and time consumption.
Now we will start examples. In our examples we will look at non-sargable queries and their execution plans. Then we will convert a non-sargable query to a sargable query. Finally, we will compare non-sargable and sargable queries time and IO statistics.
Now we will create our dummy table and insert records for queries
DROP TABLE IF EXISTS Dummy_PersonTable
CREATE TABLE Dummy_PersonTable
( ID [int] NOT NULL PRIMARY KEY IDENTITY(1,1),
[PersonType] [nchar](2) NOT NULL,
[NameStyle] [dbo].[NameStyle] NOT NULL,
[Title] [nvarchar](8) NULL,
[FirstName] [dbo].[Name] NOT NULL,
[MiddleName] [dbo].[Name] NULL,
[LastName] [dbo].[Name] NOT NULL,
[Suffix] [nvarchar](10) NULL,
[EmailPromotion] [int] NOT NULL,
[AdditionalContactInfo] [xml](CONTENT [Person].[AdditionalContactInfoSchemaCollection]) NULL,
[Demographics] [xml](CONTENT [Person].[IndividualSurveySchemaCollection]) NULL,
[rowguid] [uniqueidentifier] ROWGUIDCOL NOT NULL,
[ModifiedDate] [datetime] NOT NULL )
CREATE NONCLUSTERED INDEX [NonClustered_FirstName] ON [dbo].[Dummy_PersonTable]
CREATE NONCLUSTERED INDEX [NonClustered_ModifiedDate] ON [dbo].[Dummy_PersonTable]
CREATE NONCLUSTERED INDEX [NonClustered_MiddleName] ON [dbo].[Dummy_PersonTable]
INSERT INTO Dummy_PersonTable
This dummy table includes a non-clustered index on the FirstName, ModifiedDate and MiddleName columns.
Using functions in the WHERE clause
In this first example, we will select records in the Person table which starts with the letter “K”.
SELECT FirstName FROM Dummy_PersonTable where LEFT(FirstName,1)='K'
The SQL Server query optimizer cannot find the result of ths LEFT function values in the index pages. For this reason, the query optimizer chooses a cluster index scan and it needs to read the whole table.
In the below picture, the Estimated Number of Rows to be Read value is 5,991,600. This value is the total row number of the table. But the matching records are approximately 371,400
These functions create the same execution plans in the same conditions
- User defined functions
Now we will make a little syntax change in the previous query
SELECT FirstName FROM Dummy_PersonTable where FirstName LIKE 'K%'
SQL Server query optimizer decides to use an index seek operator when the operator cost is low and it can easily find matching records using the B-Tree structure. An index seek operator provides significant performance gains.
In the below picture, the Estimated Number of Rows to be Read value is equal to the Estimated Number of Rows because index seek operator directly achieves a matching result of the query
|Type||Table||Scan count||Logical reads||Physical reads|
This table show us IO statistics of these two queries. The sargable query performs significantly better than the Non-sargable query
|Type||Table||CPU Time(ms)||Elapsed time(ms)|
This table shows us time statistics of queries and it can be seen that the sargable query demonstrates better performance than the Non-sargable query.
Tip: We will add a calculated column for the LEFT function and create a non-clustered index
DROP INDEX IF EXISTS FirstName_Left ON Dummy_PersonTable
ALTER TABLE dbo.Dummy_PersonTable ADD FirstName_Left AS LEFT(FirstName,1)
CREATE NONCLUSTERED INDEX [NonClusteredIndex_LeftFirstName] ON [dbo].[Dummy_PersonTable]
INCLUDE ( [FirstName])
Now we will look again at the non-sargable query execution plan.
It will use an index seek
This is an alternative solution for performance gain but changing T-SQL syntax is much easier than changing the table structure and adding an index.
In this example, our target is to write a select query for a specific year. We can write this query as shown below.
SELECT ModifiedDate FROM Dummy_PersonTable where YEAR(ModifiedDate)=2009
We will look at this query execution plan
The SQL Server query optimizer cannot use an index seek because it cannot find the values of the YEAR function on the index pages.
Tip: Maybe you can ask this question. Why does SQL Server query optimizer decide to use a parallel plan? The SQL Server query optimizer decides on a parallel plan because the optimizer works on a cost-based method and it decides that a parallel plan is cheaper than a serial plan. We will compare the parallel and serial execution plans Estimated subtree cost
The parallel execution plan cost is 15.03 and the serial execution plan cost is 17.08. This is the main reason for this decision. There are other reasons to take this decision though:
- SQL Server engine has more than one CPU
- The SQL Server maximum degree of parallelism (MAXDOP) setting is more than 1
- The execution plan cost is greater than the cost threshold for parallelism setting value
… then SQL Server query optimizer choice parallel plan.
In this step we will change the condition in this query.
SELECT ModifiedDate FROM Dummy_PersonTable where ModifiedDate BETWEEN '20090101' AND '20091231'
The SQL Server query optimizer performs an index seek. Now we will compare IO and time statistics
|Type||Table||Scan count||Logical reads||Physical reads|
|Type||Table||CPU time(ms)||Elapsed time(ms)|
Our results demonstrate that the sargable query has out-performed the non-sargable query
Using ISNULL function on WHERE clause
In this sample we want to select all of records where the MiddleName is equal to “E” or NULL. We can write the query like this. We will activate Include Client Statistics for comparing non-sargable and sargable queries.
SELECT MiddleName FROM Dummy_PersonTable where ISNULL(MiddleName,'E') ='E'
We will change syntax of query and use an index seek
SELECT MiddleName FROM Dummy_PersonTable where (MiddleName IS NULL OR MiddleName='E')
As seen above, the non-sargable (Trial 1) query execution plan used an index scan and the sargable query (Trial 2) used an index seek. These differences the between two queries affected performance. This case clearly shown in the below client statistics chart.
In this article, we discussed sargable and non-sargable queries. Non-sargable queries can result in poor system performance. We can often improve these queries by changing the code, to leverage indexes and in doing so, convert them to sargable queries. Sargable queries can improve performance on CPU and IO as was demonstrated by our analysis and findings.