Accelerated database recovery will be the topic of this article, including killing an active query, abnormal shutdown and the accelerate recovery feature itself, in SQL 2019
SQL Server Database recovery is an essential and critical task for the DBA. We take regular database backups to recover databases from any unexpected downtime. We face many scenarios where DBAs do not have control over the actual recovery, and the only solution is to wait for recovery to finish. In this article, we will discuss about SQL Server database recovery scenario along with new feature in SQL Server 2019 Accelerated Database Recovery.
We will first prepare the environment and then explain the recovery issues. In this example I am using SQL Server 2019). You can verify instance version using the select @@Version command.
Create a sample table using the following query
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
CREATE TABLE [dbo].[tblSQLShackDemo](
[S.No.] [int] IDENTITY(0,1) NOT NULL,
[value] [uniqueidentifier] NULL,
[Date] [datetime] NULL
) ON [PRIMARY]
ALTER TABLE [dbo].[tblSQLShackDemo] ADD DEFAULT (getdate()) FOR [Date]
Scenario 1: Kill an active running query
Suppose you are running a large insert or updates DML query. A query is in executing state but due to some reasons such as high CPU or memory consumption, blocking, deadlock, database performance issues you need to KILL it. Once you execute the Kill command, the query goes into RollBack state, and it might take a long time to complete the recovery process.
We are inserting 500K records into the tblSQLShackDemo table to demonstrate this issue. Execute the following query to begin a transaction.
Declare @Id int
Set @Id = 1
While @Id <= 1000000
Insert Into tblSQLShackDemo(value) values (newid())
Set @Id = @Id + 1
Once we executed the query, we can check its status using sp_who2 ‘SPID’ command.
While the query is still executing, we can check the table count using NoLock hint along with our table name.
select count(*) from tblSQLShackDemo(nolock)
It is executing from 3 minutes and inserted 457134 records until now.
Now, we need to kill the SPID to start the rollback process. Execute the command KILL 55. In this command 55 is the SPID in which insert query is running.
In the sp_who2 command, we can see the status of the query as ROLLBACK.
We can track the progress of rollback command using the following query.
KILL 55 with Statusonly
In the following screenshot, you can see it shows estimated rollback time is 3567 seconds that is approximately 60 minutes.
If the query goes longer before you kill it, it might take a few hours as well before the rollback completes. You need to bear the extra load in terms of CPU, Memory in rollback as well. It also blocks the current transactions on the particular table. We cannot do anything in this scenario except waiting to get it complete.
Scenario 2: Abnormal shutdown while the query is running
Let us imagine another scenario in which you started a transaction to insert a large number into our sample table. Suddenly the system crashed. Once the system is up, you need to start the SQL Services. SQL Server service is online. However, the user database is still performing recovery.
Once the SQL Server is back online, expand the databases. In the following screenshot, you can see that database status is In Recovery.
We cannot access the database at this time. We can check more details in the SQL Server Logs. In the logs, you get the following message.
Recovery of database ‘SQLShackDemo’ (5) is 0% complete (approximately 36351 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
As per the error log entry, it will take approximately 36,351 seconds that is approximately 10 hrs. Really?! Do we need to wait for SQL Server database to come online for 10 hours? It is true. We need to wait for the database to come fully online. The worst part is that we cannot do anything apart from refreshing the error logs and monitor the progress. It is indeed a helpless condition for DBAs.
As per following screenshot recovery of database recovery phase 3 is started. At this point, database is available for the users. Wait, the database is accessible but SQL Server still making the recovery of the database.
Once the database recovery is completed, we get the following message in the error log.
Recovery completed for database SQLShackDemo (database ID 5) in 1802 second(s) (analysis 1375 ms, redo 551401 ms, undo 1246756 ms.) This is an informational message only. No user action is required.
SQL Server took 1802 seconds approximately 30 minutes to recover this database. It might take longer depending on the work SQL Server to do to bring database in a consistent state after recovery.
We will cover more about recovery stages in the later part of the section.
We can see following pain points for DBA until now in SQL Server.
- Huge recovery time
- Roll back takes longer time
Let us repeat these scenarios in the following step with the new feature of SQL Server 2019 Accelerated database Recovery.
Accelerated Database Recovery with SQL Server 2019
SQL Server 2019 introduced a new database recovery feature Accelerated Database Recovery. It redesigns the database recovery process in SQL Server. We can do an immediate rollback of any query. It also improves the database recovery in case of any disaster such as server crash, cluster or AG failover.
We need to enable the Accelerated Database Recovery feature at the database level. It is disabled by default for all databases.
In this example, we created another database SQLSHACKDEMO_ADR along with the same table tblSqlShackDemo.
We get a new column in sys.databases to check whether Accelerated Database Recovery is enabled or not on a particular database.
select name,create_date,compatibility_level ,physical_database_name,is_accelerated_database_recovery_on from sys.databases
Enable Accelerated Database Recovery using following alter database command
ALTER DATABASE SQLSHACKDEMO_ADR SET ACCELERATED_DATABASE_RECOVERY = ON;
It took approx 7 minutes for me to enable this feature on a blank database. It might get improved in future releases of SQL Server 2019.
Now, run the sys.database command mentioned above. In the following screenshot, we can see that Accelerated Database Recovery is enabled for SQLShackDemo_ADR database.
Let us perform both the scenario with this Accelerated Database Recovery enabled database.
Scenario 1: Kill an active running query
Run the query to insert bulk records in tblSQLShackDemo table and Kill the session after approximately 3 minutes.
Here is the difference
- Rollback without Accelerated Database Recovery database took approximately 60 minutes to finish rollback.
- Rollback with Accelerated Database Recovery database performed the rollback quickly.
Scenario 2: Abnormal shutdown while the query is running
Let us repeat scenario 2 and restart the SQL Server while the query is still executing. Once the server is back, connect to the instance. We can see that the database is online now.
Yes, it is true. We do not wait for long to wait in a painful situation for refreshing the error logs and wait to see the message that database is online.
Let us go to the error log, and we get the following message.
Recovery completed for database SQLShackDemo_ADR (database ID 6) in 12 second(s) (analysis 8162 ms, redo 2593 ms, undo 236 ms.) This is an informational message only. No user action is required.
Here is the difference you can notice between both the executions.
In the following screenshot, you can notice the database recovery time difference in a graphical way.
In SQL Server, we have following three phases of database recovery.
In the following table, we can understand these three phases of recovery.
- Analysis Phase: SQL Server periodically runs the internal checkpoint process. When SQL Server starts, it starts reading the transaction log from the last successful checkpoint. It reads the log forward, rebuilds the transactions table, and dirty pages table. At the end of the analysis phase, we have either committed transaction (requires roll-forward) or uncommitted transaction (requires rollback)
- Redo Phase: In this phase, SQL Server starts reading from the oldest uncommitted transaction and with the help of a dirty page table, it takes system at the point of the crash. SQL Server (from SQL Server 2005 onward) is accessible for the users after Redo phase
- Undo Phase: SQL Server needs to roll back all the active changes at the time of system crash. SQL Server starts reading transaction log in the backward direction and with the help of Active transaction table rolls back the transactions
When we kill an active transaction, SQL Server needs to perform Undo recovery process. Therefore, it might take a long time to roll back as well.
In the following image (Reference – Microsoft Docs) shows the overall Database recovery process.
Accelerated Database Recovery in SQL Server 2019
Once we enabled Accelerated Database Recovery on a SQL Server Database, it stores the version of all modifications. It is similar to versioning in Read Committed Snapshot Isolation level. SQL Server stores the previous version in a secondary memory optimized log called s-log.
- Persisted Version Store (PVS): In Persisted version store, SQL Server stores the row version in the database enabled with Accelerated Database Recovery Feature
- Logical Revert: SQL Server uses the PVS to undo the changes immediately, and it does not need to read the details from the transaction log that is a time-consuming process
- sLog: It stores log records for log records for non-versioned operations. These operations can be DDL command, bulk queries. It makes the redo and undo processing quicker because they only need to process non-versioned operations
- Cleaner: Cleaner process automatically removes the version that is not required by SQL Server for the recovery
In the following image (Reference – Microsoft Docs) shows the overall Database recovery process with Accelerated Database Recovery.
In this article, we explored the SQL Server 2019 Accelerated Database Recovery feature. It improves the database recovery time and resolves DBA painful situations.
Table of contents
|Accelerated Database Recovery; Instant Rollback and Database Recovery|
|Accelerated Database Recovery and Long Running Transactions with Transaction Log Growth|