Backup and Restore (or Recovery) strategies for SQL Server database

One of the most important roles of a database administrator is to constantly protect the integrity of the databases and maintain the ability to recover quickly in case of a failure. In light of this, it’s critically important to have a backup-and-recovery strategy in place in order to be ready for an emergency.

A key responsibility of a database administrator is to ensure that a database is available whenever it’s needed, and prepare for various scenarios wherein the availability or the performance is impacted. Therefore, if a database, for whatever reason, gets corrupted, gets dropped, gets accidentally deleted, or goes into an unusable state, it is a database administrator’s responsibility to bring the database back up in a working state with little to no loss as per the defined service level agreements or government policies.

Database administrators must be prepared to deal with disaster recovery scenarios. One way of doing that is by testing SQL Server backup and restore strategies at regular intervals. This ensures seamless recovery of data. And seamless recovery means quick recovery of systems with minimal or no data loss. Of course, a database administrator’s responsibility is also to safeguard data from the various data failures.

While designing the backup and restore plan, we need to consider the disaster recovery planning with reference to specific needs of business and the environment. For example, how do we recover from a case of multiple data failures across three prime locations in the environment? How long would it take to recover the data and how long would the system be down? What amount of data loss can the organization tolerate?

Another important point that database administrators have to concentrate on is the nature of the storage of data. This directly impacts the usefulness as well as the efficiency of the backup-and-restore process.

There are plenty of advanced techniques available such as Clustering, AlwaysOn, LogShipping and Mirroring that help ensure higher availability but still disaster recovery is all about having a well-defined and tested backup-and-restore process.

Points to consider defining good backup strategy including the:

frequency at which data changes
online transaction processing
frequency of the schema changes
frequency of change in the database configuration
data loading patterns
nature of the data itself

In this article, we will touch upon the aforementioned points, and take a look at the requirements as well, based on attributes such as the size of the databases, their internal details, and the frequency of changes. Let us now go ahead and look at the considerations that are important to building a solid backup strategy.

SQL Server backups include all database objects: tables, indexes, stored procedures and triggers. These backups are commonly used to migrate databases between different SQL Server instances running on-premises or in the cloud. They can be used for data ingestion, disaster recovery, and so forth as well, though. Native backups simplify the process of importing data and schemas from on-premises SQL Server instances, and they will be easy for SQL Server administrators to understand and use. We use backups to:

migrate databases
refresh development environments
recover from accidental deletes
move a database to a different drive
import and export data
offload reporting needs using high-availability Log Shipping
Handle single or multiple database instances
create database copies for testing, training and demonstrations
transfer backup files into and out of cloud storage to give an added layer of protection for disaster recovery

The following transact-SQL script can be used to fetch the growth trend of the database using the backup_history table. The backup information is always stored in the msdb database. In some cases, you may get a limited data, based on the maintenance tasks or clean up tasks created to purge data. The backup size details can be found in the dbo.backupset table which contains a row for every successful backup operation.

DECLARE @endDate datetime=getdate(),

-- last six month trends

@months smallint=6

SELECT

[database_name] AS DatabaseName,

MAX(YEAR(backup_start_date) * 100

+ MONTH(backup_start_date)) AS YearMonth ,

CONVERT(numeric(10, 1), AVG([backup_size]/1024/1024)) AS BackupSizeMB,

CONVERT(numeric(10, 1), AVG([compressed_backup_size]/1024/1024)) AS Compressed-BackupSizeMB,

CONVERT(numeric(10, 1), AVG([backup_size]/[compressed_backup_size])) AS Compres-sionRatio

FROM msdb.dbo.backupset

WHERE [database_name] = N'MES_DW2_PROD'

AND [type] = 'D' and

backup_start_date BETWEEN DATEADD(mm, - @months, @endDate) AND @endDate

GROUP BY [database_name],DATEPART(mm,[backup_start_date])

Order by YearMonth desc

The above query can be modified to get an overview of the backup sizes on a daily or weekly basis depending on the granularity you want.

The following T-SQL uses msdb.dbo.backup_history table to analyze the growth of the database over a given period. The metrics are useful for capacity planning and forecasting with reference to backup

-- Transact-SQL script to analyse the database size growth using backup history.

DECLARE

--Define the range to fetch growth metrics

@months smallint=6

@endDate datetime=getdate(),

;WITH HISTORY AS

(SELECT DBS.database_name AS DBName

,YEAR(DBS.backup_start_date) * 100

+ MONTH(DBS.backup_start_date) AS YearMonth

,CONVERT(numeric(10, 1), MIN(DBF.file_size / 1048576.0)) AS MinMB

,CONVERT(numeric(10, 1), MAX(DBF.file_size / 1048576.0)) AS MaxMB

,CONVERT(numeric(10, 1), AVG(DBF.file_size / 1048576.0)) AS AvgMB

FROM msdb.dbo.backupset as DBS

INNER JOIN

msdb.dbo.backupfile AS DBF

ON DBS.backup_set_id = DBF.backup_set_id

WHERE NOT DBS.database_name IN

('master', 'msdb', 'model', 'tempdb')

AND DBF.file_type = 'D'

AND DBS.backup_start_date BETWEEN DATEADD(mm, - @months, @endDate) AND @endDate

GROUP BY DBS.database_name

,YEAR(DBS.backup_start_date)

,MONTH(DBS.backup_start_date))

SELECT H.DBName

,H.YearMonth

,H.MinMB

,H.MaxMB

,H.AvgMB

- (SELECT TOP 1 H1.AvgMB

FROM HISTORY AS H1

WHERE H1.DBName = H.DBName

AND H1.YearMonth < H.YearMonth

ORDER BY H1.YearMonth DESC) AS DBGrowthMB

FROM HISTORY AS H

ORDER BY H.DBName

,H.YearMonth

Summary

One of the key aspects of safeguarding or protecting the data is performing backups.

To define the right backup and recovery strategy, one should understand:

The type and nature of the data
The frequency of backup
The nature and speed of the underlying hardware
The backup media: tape drive, local disc, network share, remote share or cloud services
The network bandwidth

We should also consider testing the backups frequently because knowing the best-suited way to restore the data is equally important.

In reality, most of the organizations can absorb some sort of data loss, along with unavailability of the database for a few minutes or hours.

A zero-data-loss backup solution strategy may require special design and that may attract additional cost to manage the resource required to ensure that the data streams are backed up to multiple destinations regularly.

The other important aspect is the Recovery-Time-Objective, or RTO. This takes into consideration the amount of time that the system can be down for, in order to balance the cost of the system being offline against the cost of restoring the system quickly and maintaining the infrastructure and personnel to do this at a moment’s notice.

Let’s calculate the time required for the backup and restoration process that meets the objectives of the RTO (Recovery Time Objective). Assume that the size of database “X” is 32 GB, (or 32*1024=32768 MB). Let’s assume that the backup throughput is 120 MB/minute. Therefore, the time required to backup database “X” would be 32768/120 = 273 minutes (which is ~ 4 hours 33 minutes). Let’s assume that the restore throughput 100 MB/minute, therefore, the time it takes to restore database”X” would be 32768/100 which is 327 minutes (or ~5 hours 27 minutes).

As we can see, the time taken to create a backup is about ~4.5 hours, and the restoration process takes ~5.5 hours. This is the RTO (Recovery Time Objective) for this database. This is the calculation we need to base our RTO timelines on, which we need to keep in mind while planning our backup-and-restore strategy.

The frequency of the backups defines the dimension of the data when a disaster occurs. The more the backups, better is the restore process and able to meet the RPO (Recovery Point Objective)

Backup and storage are complimentary to each other. If the frequency of the database backup is higher, more storage is needed, which therefore brings in some budgetary constraints on just how practical it is to backup more frequently. The storage location is another point that is important for the strategy. The onsite and offsite storage have determined significance on Recovery-Point-Objective. If you store all the backup data on premises and some physical disaster occurs at your location, it will not only destroy the servers but also destroy all the backup media, meaning there is no way to restore the lost data. If you’re storing some of the backup off site, it may lead to an increase in the restoration time. So in most of the designs, during the initial few weeks, the data is stored locally. The backups are later moved offsite. Storing data offsite requires mandatory measures to protect the data using proper encryption and security policies. Once a backup has left your physical control, you have given up some amount of security.

Multiple rounds of testing will help us identify most of the problems.

Tape media is fragile and has a finite lifespan. It would be less than desirable if we proceeded to do a restore and found that the tapes were unreadable. If you’re testing them on a regular basis, hopefully, you will catch this problem during one of your tests rather than during a real disaster. If you’re testing the restore on a regular basis, then you should be very comfortable performing it and should be able to perform it quickly.

The larger the database, the more time it will take to restore the database. The only way to be sure about how long it’s going to take is to do some trial runs and share the results with your organization’s leadership so that the decision-makers know what to expect when it comes to the time required for a restore.

Database Backup and Restore process in SQL Server – series intro

An overview of the process of SQL Server backup-and-restore

Understanding the SQL Server Data Management Life Cycle

Understanding SQL Server database recovery models

Understanding SQL Server Backup Types

Backup and Restore (or Recovery) strategies for SQL Server database

Discussing Backup and Restore Automation using SQLCMD and SQL Server agent

Understanding Database snapshots vs Database backups in SQL Server

SqlPackage.exe – Automate SQL Server Database Restoration using bacpac with PowerShell or Batch techniques

Smart database backup in SQL Server 2017

How to perform a Page Level Restore in SQL Server

Backup Linux SQL databases Using PowerShell and Windows Task Scheduler

SQL Server database backup and restore operations using the Cloud

Tail-Log Backup and Restore in SQL Server

SQL Server Database Backup and Restore reports

Database Filegroup(s) and Piecemeal restores in SQL Server

In-Memory Optimized database backup and restore in SQL Server

Understanding Backup and Restore operations in SQL Server Docker Containers

Backup and Restore operations with SQL Server 2017 on Docker containers using Azure Data Studio

Interview questions on SQL Server database backups, restores and recovery – Part I

Interview questions on SQL Server database backups, restores and recovery – Part II

Interview questions on SQL Server database backups, restores and recovery – Part III

Interview questions on SQL Server database backups, restores and recovery – Part IV

Prashanth Jayaram

I’m a Database technologist having 11+ years of rich, hands-on experience on Database technologies. I am Microsoft Certified Professional and backed with a Degree in Master of Computer Application.

My specialty lies in designing & implementing High availability solutions and cross-platform DB Migration. The technologies currently working on are SQL Server, PowerShell, Oracle and MongoDB.

View all posts by Prashanth Jayaram

Latest posts by Prashanth Jayaram (see all)

Stairway to SQL essentials - April 7, 2021
A quick overview of database audit in SQL - January 28, 2021
How to set up Azure Data Sync between Azure SQL databases and on-premises SQL Server - January 20, 2021

SQLShack

Backup and Restore (or Recovery) strategies for SQL Server database

Summary

Table of contents

References

Summary

Table of contents

References

Related posts: