Database checkpoints – Enhancements in SQL Server 2016

When a new row is inserted or an existing one is updated in your database, the SQL Server Database Engine will keep that change in the buffer pool in the memory first, without applying each change directly to the database files for IO performance reasons. These data pages located in the buffer pool and not reflected yet to the database files are called Dirty Pages. The SQL Server Database Engine uses a special kind of processes to write these dirty pages to the data and log database files periodically. These processes are called Checkpoints. The Checkpoint creates a mark that is used by the SQL Server Database Engine to redo any transaction that is committed, written to the database transaction log file without reflecting the data changes to the data files due to an unexpected shutdown or crash. Also, this recovery point that is created by the Checkpoint will be used to roll back any data changes associated with uncommitted transaction, by reversing the operation written in the transaction log file. In this way the SQL Server Engine will guarantee the database consistency. The time that is taken by the SQL Server Database Engine to redo and undo the transactions is called the Recovery Time. All information about the Checkpoints will be written to the database boot page to identify till which point the database files are synchronized with the buffer pool when the system wakes up after crash or shutdown.

There are four types of Checkpoints that the SQL Server Database Engine supports: Internal, Manual, Automatic and Indirect.

Internal Checkpoints are issued in response to several server events, such as the taking a database backup, creating a database snapshot, adding or removing database files, a clean shutdown for the SQL Server service is performed or 70% of the transaction log of a simple recovery model database is full.

Manual Checkpoints are triggered by executing the CHECKPOINT T-SQL command, where you can optionally specify the checkpoint_duration parameter that identifies the amount of time, in seconds, required by the checkpoint to complete. If the checkpoint duration parameter is not specified, the Manual Checkpoint will run to completion, consuming amount of time depends on the number of dirty pages that the operation must write. The below T-SQL statement will issue a Manual Checkpoint for maximum 10 seconds:

CHECKPOINT 10

Automatic Checkpoints are issued automatically in the background when the number of log records in the buffer pool reaches the estimated number of logs that the SQL Server Database Engine can process in a server configurable amount of time called Recovery Interval. The server-level recovery interval option specifies the maximum amount of time required by the SQL Server Database Engine to recover the database after restarting the SQL Server. The default recovery interval value is 0, which issues an automatic checkpoint every 60 seconds.

Recovery interval value can be configured using the sp_configure T-SQL command as below:

USE [master]

EXEC sp_configure 'show advanced options', 1;

RECONFIGURE

EXEC sp_configure 'recovery interval', 0

RECONFIGURE

It can also be configured using the SQL Server Management Studio from the Database Setting tab of the Server Properties window as follows:

Automatic Checkpoints also depend on the number of log records generated in the database; huge number of log records generated will issue the automatic checkpoints more frequently. The automatic checkpoint will run to the completion, same as the manual checkpoint.

The time required by the SQL Server Database Engine to recover a database after a system crash depends mainly on the amount of time required to redo the dirty pages, which is longer than the recovery interval server option most of the time. So that, changing the default recovery interval value may enhance the performance, mainly if rolling back the long transactions in your database takes more than 60 seconds, or if you notice a very frequent checkpoint processes that harm the database performance, as the automatic checkpoint increases the database I/O activities when issued. It is recommended to override the default recovery interval value gradually with small increments each time and test the effect of each increase carefully, as increasing this values will increase the time required to recover your database. The automatic checkpoint is the default checkpoint type in SQL Server versions prior to SQL Server 2016.

Indirect Checkpoints introduced the first time in SQL server 2012, in which the recovery time can be configured at the database level, providing faster and more accurate recovery time comparing to the automatic checkpoint, as it ensures that the number of dirty pages will be always less that a specific threshold. Indirect checkpoints reduce the I/O spikes related to the checkpoints as it keeps writing the dirty pages to the database files in the background smoothly. It also considers random I/O during the REDO process, which keeps the database recovery time within the upper-bound limit.

If the indirect checkpoint is configured on a database with heavy OLTP workload and disk system bottleneck, it may cause a performance degradation, as it will keep writing the dirty pages to the disk very frequently in order to meet the configured recovery interval value, increasing the write load for the SQL Server.

Starting from SQL Server 2016, the default checkpoint type is the indirect checkpoint, and the default value for the recovery time option is 60 seconds for the databases created under that SQL Server version and 0 for the databases created under the older SQL Server versions, indicating that the database uses automatic checkpoints.

The physical data pages transfer in SQL Server 2016 is more efficient in the indirect checkpoints, as the SQL Server Database Engine transfers the data pages in bulks, with each bulk size up to 1 MB, where the maximum transfer size is 256 KB in the older versions. The time that the SQL Server Database Engine will wait for before adjusting the I/O outstanding amount also increased in SQL Server 2016, where the Database Engine will wait 20ms in the previous SQL Server versions, but now it will wait for 50ms before applying that adjustment.

The recovery time option of the database can be checked by querying the sys.databases system table for the target_recovery_time_in_seconds database property as below:

SELECT name, target_recovery_time_in_seconds FROM sys.databases WHERE name = 'SQLShackDemo'

The result for our SQLShackDemo test database will be like:

It can be also checked using the SQL Server Management Studio tool from the Options tab of the Database Properties as follows:

Although we are using SQL Server 2016 version, the 0 value in the previous result indicates that the SQLShackDemo database is using automatic checkpoints, as it is restored from old SQL Server version or the current instance was upgraded from older SQL Server version, which makes sense.

SQL Server also allows you to configure the indirect checkpoint speed as a startup parameter, by providing the (-k) followed by a decimal value that indicates the checkpoint speed in MB per second.

Let’s have a small demo to check how the I/O writes operation enhanced in SQL Server 2016, which by default uses the indirect checkpoints type. Assume that we need to apply a heavy DML operation in the SQLShackDemo database, by filling each table in that database with extra 100 rows using the ApexSQL Generate tool as below:

As we previously found, the SQLShackDemo database uses the automatic checkpoints type. If we try to monitor the Avg. Disk Sec/write performance counter under the LogicalDisk counters group using the Performance Monitor windows tool during the insertion process, the write process will be in spikes form as below:

If we change the checkpoints type to indirect checkpoints, by modifying the database Target_Recovery_Time database option to 60 rather than 0 using the following ALTER DATABASE statement:

ALTER DATABASE [SQLShackDemo] SET TARGET_RECOVERY_TIME = 60 SECONDS WITH NO_WAIT

Or using SQL Server Management Studio, from the Options tab of the Database Properties window as below:

And apply the same DML load on the database while monitoring the I/O write behavior using the Performance Monitor tool, the “smooth” writing result will be like:

It is clear from the previous results that the write process will be in the form of big spikes when using the automatic checkpoints. When the database is configured to use the indirect checkpoints, the I/O stream became smooth small spikes providing more consistent response time on the disk with better overall performance.

Conclusion

A Checkpoint is the means in which the SQL Server Database Engine guarantees the databases consistency, where no data will be lost when the system is crashed as the dirty pages that are located in the memory and resulted from committed transaction will be redone although they are not written to the database files yet. SQL Server provides four main checkpoint types; Internal, Manual, Automatic and Indirect checkpoints. Starting from SQL Server 2016 version, the Indirect Checkpoint is the default checkpoint type that allows you to configure a faster and accurate recovery time at the database level and overcome the large writing spikes issue by writing in smooth small spikes as we clearly saw in the demo shown within the article.

Next article in this series:

SQL Server 2016 Memory-Optimized Tables – The Checkpoint operation

Ahmad Yaseen

Ahmad Yaseen is a Microsoft Big Data engineer with deep knowledge and experience in SQL BI, SQL Server Database Administration and Development fields.

He is a Microsoft Certified Solution Expert in Data Management and Analytics, Microsoft Certified Solution Associate in SQL Database Administration and Development, Azure Developer Associate and Microsoft Certified Trainer.

Also, he is contributing with his SQL tips in many blogs.

View all posts by Ahmad Yaseen

Conclusion

Useful links

Related posts: