Rajendra Gupta
SQL Server clustered index

Overview of Non-Clustered indexes in SQL Server

January 16, 2020 by

This article gives an introduction of the non-clustered index in SQL Server using examples.

Introduction

In a previous article Overview of SQL Server Clustered indexes, we explored the requirement of an index and clustered indexes in SQL Server.

Before we proceed, let’s have a quick summary of the SQL Server clustered index:

  • It physically sorts data according to the clustered index key
  • We can have only one clustered index per table
  • A table without a clustered index is a heap, and it might lead to performance issues
  • SQL Server automatically creates a clustered index for the primary key column
  • A clustered index is stored in b-tree format and contains the data pages in the leaf node, as shown below

SQL Server clustered index

Non-Clustered indexes are also useful for query performance and optimization depending upon query workload. In this article, let’s explore the non-clustered index and its internals.

Overview of the non-clustered index in SQL Server

In a non-clustered index, the leaf node does not contain the actual data. It consists of a pointer to the actual data.

  • If the table contains a clustered index, leaf node points to the clustered index data page that consists of actual data
  • If the table is a heap (without a clustered index), leaf node points to the heap page

In the below image, we can look at the leaf level of non-clustered index pointing towards data page in the clustered index:

Non-Clustered index in SQL Server

We can have multiple non-clustered indexes in SQL tables because it is a logical index and does not sort data physically as compared to the clustered index.

Let’s understand the non-clustered index in SQL Server using an example.

  • Create an Employee table without any index on it

  • Insert few records in it

  • Search for the EmpID 2 and look for the actual execution plan of it

    It does a table scan because we do not have any index on this table:

  • Create a unique clustered index on the EmpID column

  • Search for the EmpID 2 and look for the actual execution plan of it

In this execution plan, we can notice that the table scan changes to a clustered index seek:

View actual execution plan

Let’s execute another SQL query for searching Employee having a specific contact number:

We do not have an index on the EmpContactNumber column, therefore Query Optimizer uses the clustered index, but it scans the whole index for retrieving the record:

Clustered Index Scan

Right-click on the execution plan and select Show Execution Plan XML:

Show Execution Plan XML

It opens the XML execution plan in the new query window. Here, we notice that it uses the clustered index key and reads the individual rows for retrieving the result:

Execution Plan XML

Let’s insert a few more records in the Employee table using the following script:

We have six employees’ records in this table. Now, execute the select statement again for retrieving employee records with a specific contact number:

Property of clustered index scan

It again scans all six rows for the result based on the specified condition. Imagine we have millions of records in the table. If SQL Server has to read all index key rows, it would be a resource and time-consuming task.

We can represent clustered index (not actual representation) in the B-tree format as per the following image:

clustered index in the B-tree format

In the previous query, SQL Server reads the root node page and retrieves each leaf node page and row for data retrieval.

Now Let’s create a unique non-clustered index in SQL Server on the Employee table on the EmpContactNumber column as the index key:

Before we explain this index, rerun the SELECT statement and view the actual execution plan:

Non-clustered index plan

In this execution plan, we can see two components:

  • Index Seek (NonClustered)
  • Key Lookup (Clustered)

To understand these components, we need to look at a non-clustered index in SQL Server design. Here, you can see that the leaf node contains the non-clustered index key (EmpContactNumber) and clustered index key (EmpID):

data representation

Now, if rerun the SELECT statement, it traverses using the non-clustered index key and points to a page with clustered index key:

execution of a select statement

It shows that it retrieves the record with a combination of clustered index key and non-clustered index key. You can see complete logic for the SELECT statement as shown below:

complete logic for the select

  1. A user executes a select statement to find employee records matching with a specified contact number
  2. Query Optimizer uses a non-clustered index key and finds out the page number 1001
  3. This page consists of a clustered index key. You can see EmpID 1 in the above image
  4. SQL Server finds out page no 101 that consists of EmpID 1 records using the clustered index key
  5. It reads the matching row and returns the output to the user

Previously, we saw that it reads six rows to retrieve the matching row and returns one row in the output. Let’s look at an execution plan using the non-clustered index:

Index seek property

Non-unique non-clustered index in SQL Server

We can have multiple non-clustered indexes in a SQL table. Previously, we created a unique non-clustered index on the EmpContactNumber column.

Before creating the index, execute the following query so that we have duplicate value in the EmpAge column:

Let’s execute the following query for a non-unique non-clustered index. In the query syntax, we do not specify a unique keyword, and it tells SQL Server to create a non-unique index:

As we know, the key of an index should be unique. In this case, we want to add a non-unique key. The question arises: How will SQL Server make this key as unique?

SQL Server does the following things for it:

  • It adds the clustered index key in the leaf and non-leaf pages of the non-unique non-clustered index
  • If the clustered index key is also non-unique, it adds a 4-byte uniquifier so that the index key is unique

Non-Unique Non-Clustered index

Include non-key columns in non-clustered index in SQL Server

Let’s look at the following actual execution plan again of the following query:

Include non-key columns in non-clustered Index

It includes index seek and key lookup operators, as shown in the above image:

  1. The index seeks: SQL Query Optimizer uses an index seek on the non-clustered index and fetches EmpID, EmpContactNumber columns
  2. In this step, Query Optimizer uses key lookup on the clustered index and fetches values for EmpName and EmpAge columns

    Key lookup

  3. In this step, Query Optimizer uses the nested loops for each row output from the non-clustered index for matching with the clustered index row

    Index seek non-clustered index

The nested loop might be a costly operator for large tables. We can reduce the cost using the non-clustered index non-key columns. We specify the non-key column in the non-clustered index using the index clause.

Let’s drop and created the non-clustered index in SQL Server using the included columns:

Included columns are part of the leaf node in an index tree. It helps to fetch the data from the index itself instead of traversing further for data retrieval.

In the following image, we get both included columns EmpName and EmpAge as part of the leaf node:

Included columns

Re-execute the SELECT statement and view the actual execution plan now. We do not have key lookup and nested loop in this execution plan:

Actual execution plan using included columns

Let’s hover the cursor over the index seek and view the output columns list. SQL Server can find all the columns using this non-clustered index seek:

Seek Predicates

We can improve query performance using the covering index with the help of included non-key columns. However, it does not mean we should all non-key columns in the index definition. We should be careful in index design and should test the index behavior before deployment in the production environment.

Conclusion

In this article, we explored the non-clustered index in SQL Server and its usage in combination with the clustered index. We should carefully design the index as per the workload and query behavior.

Rajendra Gupta
336 Views