Rajendra Gupta
SQL Server Always On Availability Groups: Validate the record timestamps

Session timeouts in SQL Server Always On Availability Groups

February 8, 2021 by

In this 41st article for SQL Server Always On Availability Groups, we will explore the session timeout configurations for AG replicas.

Prerequisites

You should follow articles in the series, SQL Server Always On Availability Groups and prepare the following environment.

  • You require a minimum of two-node synchronous commit AG replicas
  • AG replicas must be in a synchronous and healthy state
  • You should get a basic understanding of the availability group, its components and functionality

Overview of the Hard and Soft errors

In a SQL AG configuration, we configure two or more replicas in either synchronous or asynchronous commit mode. These replicas are part of a Windows failover cluster configuration. We might face errors such as hardware issues, operating system issues, networking or storage issues, DNS issues, packet drops between replicas. We usually classify these errors as Hard errors or Soft errors that could cause issues in database synchronizations. The error detection speed and reaction time for a failure depends on hard or soft errors. For example, a server or network failure is reported immediately. The timeout for specific components is not handled at the SQL Server level. SQL Server handles the soft error as per its internal session-timeout mechanism.

What is a hard error or soft error? Let’s explore it in the upcoming part.

Hard errors

The availability replica could cause a session failure in case of any errors on the hardware or the operating system level. For example, if the log drive goes unresponsive on the primary replica, SQL Server gets notified about the critical error.

The example of hard errors can be:

  • Network issues
  • Firewall issues
  • operating system issues
  • Log drive issues on the primary replica.
  • DNS related issues

In case of a hard error, the affected components notify SQL Server about the failures.

Soft errors

SQL Server Always On Availability Groups implements a session time mechanism to detect soft errors. Suppose you have a synchronous secondary replica, but it went into the hung state. You cannot RDP to secondary replica or connect remotely to SQL instance. In this case, OS would not notify the SQL Server process about the error, and the primary replica might wait for infinite for the server to respond. In the case of the synchronous commit, a long queue might build-up for the transaction to commit first on the secondary replica.

To prevent the above scenario, AG implements a session-timeout mechanism. By default, the session timeout is set for 10 seconds. It sends out a ping request response from each connected replica at a specific interval. The successful ping response reveals that AG replicas can communicate with each other. It also resets the timer on a successful ping response.

If another replica does not give a ping response in a configured session-timeout period, you get a connection timeout message in the error log. The replica goes into the DISCONNECTED state. If you have the synchronous replica, it automatically converts internally into the asynchronous mode that means the transaction does not wait for transaction acknowledgement from the secondary replica.

Error messages

The timeout due to soft errors might happen due to issues such as:

  • Network packet drops, corrupted or incorrect order packet
  • Hung status of replica or database
  • Server timeout issues
  • Insufficient resources such as high CPU utilization, Disk latency, transaction log space
  • Thread exhaustion

Verify Session time configuration in SQL Server Always On Availability Groups

View the configuration using SSMS

To verify the session-timeout configuration, connect to the primary replica and open the AG replica properties. In the availability replicas, you get session timeout (seconds) as shown below.

Verify Session time configuration in SQL Server Always On Availability Groups

View the configuration of SQL Server Always On Availability Groups using DMV

You can also query the sys.availabilty_replicas to check the current value of session timeout for SQL Server Always On availability group.

 

View the configuration using DMV

Validate the behavior of Session timeout for synchronous replicas

We explained earlier that if there is no ping response from the other replica in configured session timeout, AG automatically works as an asynchronous replica whether you have configured it for synchronous commit. In this section, let’s validate and observe this behavior practically.

To see the session timeout configuration in action, we will intentionally generate a timeout scenario for the secondary replica. First, verify that SQL Server Always On Availability Groups is in healthy status.

In the below image, we have the following configurations.

  • Primary AG replica: SQLNode2\INST1
  • Secondary AG replica: SQLNode1\INST1
  • AG database: MyNewDB
  • Session timeout: Default 10 seconds
  • No data loss failover mode

SQL Server Always On Availability Groups: Validate the behaviour of Session timeout

Create the following [AGSessionTimeOut] table on primary replica in availability group database [MyNewDB]. Next, we use a while loop to insert the records into the [AGSessionTimeOut] table and wait for 1 second (using the WAITFOR DELAY command) before inserting the next record.

We have a synchronous AG secondary replica, so it waits for transaction acknowledgement and commits records on the primary replica. After some time, stop the secondary replica SQL instance. It causes connection timeouts for the secondary replica. Now, stop insert the statement on the primary replica after 20-30 seconds.

To troubleshoot the connection timeout, check the SQL Server error log on the primary replica. In the below image, we see that it recorded a connection timeout error at 2020=11-24 11:34:13 AM.

View SQL Server error log

Now, query the [AGSessionTimeOut] table. We can notice the following events in the below image.

  • SQL query inserts records in the [AGSessionTimeOut] table every second till 11:34:03 AM
  • We observe a gap of 10 seconds for the next data insert (between rows 9 & 10)
  • The next record was inserted at 11:34:14 AM

The reason for the delay is as below:

  • SQL Service stopped at the secondary replica at 11:34:03 AM, as shown in the following event viewer entry. Till this point, the AG replica is working in the synchronous commit mode

    View Windows event logs

  • Since the SQL Service is not down on the secondary replica, the primary replica could not receive a ping response, and it waits for 10 seconds as per the configured session timeout value
  • SQL Server error logs record connection timeout after completing the session period, i.e. 11:34:13
  • Query inserts next records every second starting from 11:34:14 AM. At this point, despite synchronous mode, AG works as an asynchronous commit and inserts records without waiting for any acknowledgement from the secondary replica. You do not get any prompt for switching synchronous to asynchronous mode

SQL Server Always On Availability Groups: Validate the record timestamps

Modify session timeouts in SQL Server Always On Availability Groups

We should set an appropriate session timeout in the availability group depending upon our infrastructure requirements. The session timeout value should set session timeout greater than or equal to 10 seconds. It is not recommended to use session timeouts below 10 seconds because we might have the possibility of missing ping response in a busy system, and it could cause a false alarm.

To modify the default value, change the session time in the availability group properties from the primary replica.

Modify session timeout

Alternatively, you use the ALTER AVAILABILITY GROUP statement to modify the session_timeout values. The below scripts modifies the session timeout value to 20 seconds.

You cannot set a value for SESSION_TIMEOUT in less than 5 seconds. If you try to do so, you get the following error message.

Session timeout less than 5 seconds

If there is any delay in data commits on the primary replica due to soft errors, you can notice a high value of HADR_SYNC_COMMIT wait on the primary instance.

Conclusion

In this article, we explored the session timeout in SQL Server Always On Availability Groups for handling soft errors. It enables SQL Server to continuously writing data into the primary replica even if AG is in synchronous mode and the secondary replica does not respond for acknowledgement. It helps to understand the AG related errors, their causes and steps to tackle those errors.

Table of contents

A comprehensive guide to SQL Server Always On Availability Groups on Windows Server 2016
Configure Domain Controller and Active Directory for SQL Server Always On Availability Groups
Configure failover clusters, storage controllers and quorum configurations for SQL Server Always On Availability Groups
Install SQL Server 2019 on Windows Server 2016 with SQL Server Always On Availability Groups
Add a new node into existing SQL Server Always On Availability Groups
Configure Managed Service Accounts for SQL Server Always On Availability Groups
Add or remove a node from SQL Server Always On Availability Groups using T-SQL scripts
Database-level health detection in SQL Server Always On Availability Groups
Automatic Page Repair in SQL Server Always On Availability Groups
Deploy a domain-independent Windows Failover Cluster for SQL Server Always On Availability Groups
Configure a SQL Server Always On Availability Group on the domain-independent Failover Cluster
An overview of distributed SQL Server Always On Availability Groups
Deploy a distributed SQL Server Always On Availability Group
Monitor and failover a Distributed SQL Server Always On Availability Group
Transparent Data Encryption for SQL Server Always On Availability Groups
Configure SQL Server replication for a database in SQL Server Always On Availability Groups
Configuring SQL Server replication for distribution databases in SQL Server Always On Availability Groups
Explore Cross-database MSDTC for distributed transactions in SQL Server Always On Availability Groups
Deploy MSDTC for distributed transactions in SQL Server Always On Availability Groups
Restore an existing availability group database participating in SQL Server Always On Availability Groups
Exploring AG dashboards for monitoring SQL Server Always On Availability Groups
Backup compression in TDE enabled databases in SQL Server Always On Availability Groups
Impact of dropping a login in the active directory tied to SQL Server Always On Availability Groups
Refresh SQL Server Always On Availability Group databases using DBATools PowerShell
Configure SQL Server Reporting Services databases in SQL Server Always On Availability Groups
Suspend and Resume Data Movement in SQL Server Always On Availability Groups
Explore failover types in SQL Server Always On Availability Groups
Explore SQL Server PolyBase External Tables in SQL Server Always On Availability Groups
SQL Server Always On Availability Groups for SQL Server Linux instances
Column-level SQL Server encryption with SQL Server Always On Availability Groups
Make the most of secondary replicas in SQL Server Always On Availability Groups
Apply SQL Server patches or cumulative updates in SQL Server Always On Availability Groups
Monitor SQL Server Always On Availability groups using extended events
The Hub and Spoke model of policy-based management for SQL Server Always On Availability Groups
Custom policies for AG dashboards of SQL Server Always On Availability Groups
Explore dynamic management views for monitoring SQL Server Always On Availability Groups
Dynamic management views for monitoring availability replicas and databases for SQL Server Always On Availability
Configure SQL Server Always On Availability Groups using Windows PowerShell scripts
Configure Integration Services Catalog Database SSISDB in SQL Server Always On Availability Groups
Synchronize logins between Availability replicas in SQL Server Always On Availability Groups
Session timeouts in SQL Server Always On Availability Groups

Rajendra Gupta
Always On Availability Groups

About Rajendra Gupta

As an MCSA certified and Microsoft Certified Trainer in Gurgaon, India, with 13 years of experience, Rajendra works for a variety of large companies focusing on performance optimization, monitoring, high availability, and disaster recovery strategies and implementation. He is the author of hundreds of authoritative articles on SQL Server, Azure, MySQL, Linux, Power BI, Performance tuning, AWS/Amazon RDS, Git, and related technologies that have been viewed by over 10m readers to date. He is the creator of one of the biggest free online collections of articles on a single topic, with his 50-part series on SQL Server Always On Availability Groups. Based on his contribution to the SQL Server community, he has been recognized with various awards including the prestigious “Best author of the year" continuously in 2020 and 2021 at SQLShack. Raj is always interested in new challenges so if you need consulting help on any subject covered in his writings, he can be reached at rajendra.gupta16@gmail.com View all posts by Rajendra Gupta

659 Views