Esat Erkec
B-Tree structure and last page insert issue

SQL Server Performance Tuning: Resolving Last Page Insert Contention

June 25, 2021 by

In this article, we will focus on one major SQL Server performance issue that we may experience during heavy data insertion in a table.


SQL Server allocates some memory from the operating system, and this is because it is designed to do all read and modification operations in memory instead of on disk. The main purpose of this functionality is to reduce physical I/O operations and also reduce the response time of the operations. There is no doubt that this architecture improves SQL Server’s overall performance positively. This reserved memory area is called the buffer pool or buffer cache. A latch is an internal SQL Server mechanism and it is responsible to coordinate and also protect the integrity of the index and data pages in the buffer pool. However, in a heavy workload, we can experience some performance issues related to this internal mechanism.

What is the PAGELATCH?

PAGELATCH is a thread synchronization mechanism and it works as a conductor to synchronize the access to index or data pages that are located in the buffer pool. The goal of this synchronization mechanism is to ensure the consistency of data or index pages in the buffer pool. We can say that surely, the PAGELATCH wait type can be seen on every busy SQL Server and does not affect the SQL Server performance negatively.

The last page insert contention

Primary keys constraints uniquely identify each row in the table and automatically creates a clustered index on the underlining table. This duo is frequently used in table design by database developers. At the same time, if this column is decorated with an identity constraint thus we obtain a sequential incremental index key column. The clustered index creates a sorted data structure of the table for this reason a newly inserted row will be added at the end of the clustered index page until that page is filled. When solely one thread adds data to the above-mentioned table, we will never experience a last page insert contention because this problem will occur with concurrent usage of this table. In the high-volume insert operations, the last page of the index is not accessed by all threads concurrently. All threads start waiting for the last page to be accessible to them because the last page is locked by a thread. This bottleneck affects the SQL Server performance and the PAGELATCH_EX wait type begins to be observed too much.

B-Tree structure and last page insert issue

Now we will demonstrate this issue with a very simple example. At first, we create a sample table through the following query.

As we can see, the Id column is specified as a clustered index, and also an identity constraint enables the auto-increment feature for this column. In order to generate a heavy insert workload on this table, we will use the SQLQueryStress tool. With help of the SQLQueryStrees, we generate an intensive insert workload on the table

After launching the SQLQueryStrees, we will set the database credentials and click the OK button.

Use SQLStress to measure performance

As the second step, we will set the Number of Iterations as 100 and the Number of Threads as 100. So that, SQLQueryStress the query will run 10 times by the 50 threads concurrently.

Use SQLStress for workload tests in SQL Server

After starting the SQLQueryStress tool, we will monitor the system to identify the resource bottlenecks.

How to monitor SQL Server Performance

The above image illustrates that the session status is suspended for various rows and their wait types are PAGELATCH_EX. This situation means that they are waiting for a particular page that is not accessible at that time.

Use the DBCC PAGE command to find the page object

The wait resource gives some number combinations and their explanations :

11: Database Id

1: File Id

14339: Page Id

We can look inside this page through the DBCC PAGE command.

Under the PAGE HEADER section of the report, we can find out all details about this page.

m_type =1 indicates that this page store data records in a heap or clustered index leaf-level

IndexId = 1 indicates this page belongs to the index which is index id is equal to 1

ObjectId = 581577110 specifies the object id of the database object.

We can find out this table and index name very quickly with help of the following query.

Find the table name from its object id

As you can see, the problematic index belongs to our example table. Now let’s examine how we can overcome this problem. Meanwhile, our test was completed in 58 seconds.

Use SQLStress to simulate a workload test


OPTIMIZE_FOR_SEQUENTIAL_KEY feature has been introduced with SQL Server 2019 and it limits the number of threads allowed to request the latch to one per scheduler so it will reduce time spent in the runnable queue once the latch is acquired. To enable this feature, we will execute the following query:

After enabling the OPTIMIZE_FOR_SEQUENTIAL_KEY feature, we will run SQLQueryStress with the same query and with same settings.

OPTIMIZE_FOR_SEQUENTIAL_KEY help to improve SQL Server performance of the insert operation performance

The PAGELATCH_EX wait type gives its place to BTREE_INSERT_FLOW_CONTROL wait type.

Monitor SQL Server performance

Despite this wait type, we gained some SQL Server performance improvements. The chart below shows the comparison of when the OPTIMIZE_FOR_SEQUENTIAL_KEY property is enabled and when it is disabled.

Define Non-Clustered index on the Primary Key column

The main difference between the clustered index and non-clustered index in SQL Server is that the clustered index creates a sorted structure of the table. On the other hand, the heap tables do not contain any clustered index therefore the insert operations are not enforcing a strict order, so the insert operation is faster. To resolve the last page insert problem and reduce the PAGELATCH_EX wait type we can take advantage of this design approach so it improves SQL Server performance. At first, we will drop the clustered index and primary key on the InsertTestTable.

As a second step, we will create a primary key with a non-clustered index.

Non-clustered index and primary key

After creating the non-clustered index, we will run the SQLQueryStress with the same parameters and for the same query.

SQL Server stress test result

The test was completed in 57 seconds and does not much affect the SQL Server performance of the insert operation.

Use In-Memory Optimized Tables

We can use the in-memory optimized tables instead of the disk-based tables. In the context of the SQL Server performance, memory-optimized tables can resolve the latches and lock issues. To create a memory-optimized table, we need to apply the following settings :

  • We add a memory-optimized file group to hold the memory-optimized data in the database

  • We add a new file into the memory-optimized filegroup

After creating the required filegroup and files on the database, we will create durable (SCHEMA_AND_DATA) memory-optimized tables. SQL Server offers two types of memory-optimized table SCHEMA_ONLY and SCHEMA_AND_DATA.


In this table type, the data is lost when the SQL Server is restarted but the schema is always persisted. The SCHEMA_ONLY tables are so fast because they don’t generate any physical I/O.


In this table type, the data and schema are not lost as the disk-based tables.

The following query will create a memory-optimized table and also this table includes a non-clustered index and primary key. The memory-optimized tables do not support clustered indexes.

We will start the same test for this table.

Memory-optimized tables improve performance

As we can see, using the memory-optimized table dramatically improves the SQL Server performance for insert operation and we don’t monitor any PAGELATCH_EX wait type.


In this article, we have analyzed the last page insert issue and also have learned how it occurs. The last page insert issue affects SQL Server performance negatively on the intensive insert operation. To resolve this type of problem, we can apply the following methods :

  • Consider using memory-optimized tables
  • Define Non-Clustered index on the Primary Key column
Esat Erkec
Latest posts by Esat Erkec (see all)