
Planning a SQL Server Backup and Restore strategy in a multi-server environment using PowerShell and T-SQL

August 15, 2017 by Prashanth Jayaram

Introduction

Database availability is critical to every enterprise; conversely, unavailability can have a direct and severe negative impact on the business. As database administrators, it is important that we take all possible steps to minimize data loss. While it would be naïve to think that such precautions make our databases invincible, we can always aim to bring the databases back into operation as quickly as possible while meeting the RPO (Recovery Point Objective) and RTO (Recovery Time Objective). This way, the business is unaffected, and we also meet our SLAs.

Apart from good knowledge of the system, it is important that we have a well-defined and tested backup-and-restore strategy. Testing the backup-and-restore procedure to ensure quick recovery with minimal administrative overhead is one means towards that end. While advanced techniques such as Always On availability groups and database mirroring help ensure higher availability, disaster recovery is all about well-defined and tested backup and restoration methods.

When planning for disaster recovery techniques, here are some points about volatility that we need to keep in mind:

  • How frequently are the tables updated?
  • Are your database transactions highly frequent? If so, log backups must be performed more frequently than full backups.
  • How often does the database structure change?
  • How often does the database configuration change?
  • What are the data loading patterns and what is the nature of the data?

We must plan our backup-and-restore strategy keeping these aspects in mind. And to ensure the strategy is close to airtight, we need to try out all possible combinations of backup and restoration, observe the nuances in the results, and tune the strategy accordingly. So, we must:

  1. Consider the size of the database, the usage patterns, nature of the content, and so on.
  2. Keep in mind any configuration constraints, such as hardware or backup media limitations.

In this article, we touch on the aforementioned points and decide on the requirements based on attributes such as database sizes, their internal details, and the frequency of changes. Let us now go ahead and look at the considerations that are important to building a backup strategy.

By the end of the article, we will be equipped with a couple of scripts that present all the vital information in a tabular format, which helps with planning and decision-making when it comes to backups and their restoration.

Background

Take a look at the business requirements, as each database may have different requirements based on the application it serves. The requirements may be based on:

  1. How frequently does the application access the database? Is there a specific off-peak period when the backups can be scheduled?
  2. How frequently does the data get changed? If the changes are very frequent, you may want to schedule differential backups in between full backups; differential backups also reduce the restoration time.
  3. If only a small part of a large database changes frequently, partial and/or file backups can be used.
  4. Estimate the size of a full backup. Usually, the backup is smaller than the database itself, because it does not record the unused space in the database; a rough way to check the used space is sketched just after this list.
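The following is a minimal sketch of how the used space, which approximates the size of a full backup, can be checked with the SqlServer PowerShell module introduced later in this article. The instance name MyServer and the database name SqlShackDemo are placeholders used only for this illustration.

# Approximate the size of a full backup by measuring the used space in the
# data files, since a full backup does not include unallocated pages.
Import-Module SqlServer

$usedSpaceQuery = @"
SELECT  DB_NAME()                                         AS DatabaseName,
        SUM(FILEPROPERTY(name, 'SpaceUsed')) * 8 / 1024.0 AS UsedSpaceMB
FROM    sys.database_files
WHERE   type_desc = 'ROWS';
"@

# 'MyServer' and 'SqlShackDemo' are placeholder names
Invoke-Sqlcmd -ServerInstance 'MyServer' -Database 'SqlShackDemo' -Query $usedSpaceQuery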

This post demonstrates one of the ways to gather an inventory of database backup information. The output of the script includes various columns that show the internals of a database, along with the latest completion statuses and the corresponding backup sizes. Though the output is derived from the msdb database, having the data consolidated in one place gives us better control and provides greater visibility into the backup process. A clear understanding of these parameters is a good way to forecast storage requirements. We can schedule a job to pull the data into a central repository and then develop a report for capacity planning and forecasting across the broader spectrum of database backups. That way, we'd have insight into the sizing of every backup type, such as Full, Differential, and Log. With such information, we can easily decide on the type of backup required at a very granular level, i.e., at the database level.

Going this way, we can optimize our investments in storage and backup, and it helps us decide on alternatives if needed. The basis of every decision is accurate data, and accurate information gives us a baseline for deciding on the viable options to back up the SQL databases. The idea behind this implementation is to gather the useful data using a technique native to SQL, and to have brainstorming sessions based on that data to identify the best design and solution for the backup problem.

Getting started

There are many ways to gather data into a central server repository, such as T-SQL and PowerShell. In this section, I'm going to discuss the data-gathering tasks using PowerShell cmdlets.

The prerequisites are:

  • SSMS version 16.4.1 or later
  • The SQL Server PowerShell module (SqlServer)

New cmdlets have been introduced with the SqlServer module, which replaces SQLPS while retaining its old functionality and adding a richer set of libraries. It is safe to remove SQLPS and load the SqlServer module.

You can remove SQLPS using the command below:

Remove-Module SQLPS

Load the SqlServer module using the command below:

Import-Module SqlServer
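As a quick, optional check, you can confirm that the module is loaded and that the new table cmdlets are available:

# Confirm the SqlServer module is loaded and the new cmdlets are present
Get-Module SqlServer
Get-Command -Module SqlServer -Name Read-SqlTableData, Write-SqlTableData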

The new cmdlets Read-SqlTableData and Write-SqlTableData have been introduced to read and write data as SQL tables.

With these cmdlets, a query can be executed and the results can be stored in a very simple way. It’s like storing everything as a table.
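As a minimal illustration of the pattern (not the full script from the appendices), the sketch below pipes the output of a query against msdb straight into a table and reads it back. The instance name MyServer, the SqlShackDemo database, and the dbo.BackupInfo table are assumptions used only for this example.

# Run a query and land the results directly in a table; -Force creates
# dbo.BackupInfo in SqlShackDemo if it does not already exist.
Invoke-Sqlcmd -ServerInstance 'MyServer' -Database 'msdb' `
    -Query 'SELECT database_name, type, backup_size, backup_finish_date FROM dbo.backupset' |
    Write-SqlTableData -ServerInstance 'MyServer' -DatabaseName 'SqlShackDemo' `
        -SchemaName 'dbo' -TableName 'BackupInfo' -Force

# Read the rows back as objects
Read-SqlTableData -ServerInstance 'MyServer' -DatabaseName 'SqlShackDemo' `
    -SchemaName 'dbo' -TableName 'BackupInfo' -TopN 10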

  • In this case, we write the T-SQL query results into a table using the Write-SqlTableData cmdlet.
  • Two parameters of Write-SqlTableData are worth discussing:
    • Passthru – it just inserts the records into the destination table
    • Force – creates the destination object if it is missing
  • The T-SQL (Appendix A) provides three different levels of output; an illustrative sketch of the first level follows this list:
    • Base details – one row per backup
      • Shows the backup size, the previous backup size, and the change between the two
      • The previous backup size is determined per backup type and per database
    • Latest backups per database per type
      • One row per database per backup type
      • With size changes from the previous backup
    • Latest backups only
      • One row per database, containing details about the latest backup
      • Each row includes individual columns for the last time each backup type was completed
      • Each row includes backup sizes, types, previous sizes, and the difference between the two
  • The output shows the nature of each backup and its corresponding size. This gives a clear indication of the size estimates required for future growth, in line with the retention period objective.
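To make the first level concrete, here is a cut-down, illustrative sketch of that kind of query (not the full script in Appendix A). It uses LAG() over msdb.dbo.backupset to pair each backup with the previous one of the same type for the same database; the instance name MyServer is a placeholder.

# "Base details" level: one row per backup, with the previous backup size
# (per database and backup type) and the change between the two.
$baseDetailsQuery = @"
SELECT  bs.database_name,
        bs.type AS backup_type,  -- D = Full, I = Differential, L = Log
        bs.backup_finish_date,
        bs.backup_size / 1048576.0 AS backup_size_mb,
        LAG(bs.backup_size / 1048576.0)
            OVER (PARTITION BY bs.database_name, bs.type
                  ORDER BY bs.backup_finish_date) AS previous_size_mb,
        bs.backup_size / 1048576.0
            - LAG(bs.backup_size / 1048576.0)
                  OVER (PARTITION BY bs.database_name, bs.type
                        ORDER BY bs.backup_finish_date) AS size_change_mb
FROM    msdb.dbo.backupset AS bs
ORDER BY bs.database_name, bs.backup_finish_date;
"@

Invoke-Sqlcmd -ServerInstance 'MyServer' -Database 'msdb' -Query $baseDetailsQuery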

Let’s run the T-SQL to see the output yielding the desired data; the T-SQL is available in Appendix A. The output of the T-SQL is given below.

Let us now try to get the org-level information that would help us with planning. We’re going to use another T-SQL script, coupled with PowerShell. The PowerShell script is available in Appendix B.

The PowerShell output window shows the data in a simple manner. With very few lines of code, the data is pulled and stored in a central repository (SqlShackDemo, in the BackupInfo table).

The T-SQL code is shown in gray. It’s just those few lines that do all the magic.
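For orientation, here is a minimal sketch of what such a collection loop might look like. It is not the exact script in Appendix B; the input file path, the CentralServer instance name, and the reuse of $baseDetailsQuery from the earlier sketch are assumptions made for this illustration.

# Read the list of instances and collect backup details from each one into
# the central repository table dbo.BackupInfo in SqlShackDemo.
Import-Module SqlServer

# One instance name per line; the path is a placeholder
$servers = Get-Content -Path 'C:\Temp\servers.txt'

foreach ($server in $servers) {
    Invoke-Sqlcmd -ServerInstance $server -Database 'msdb' -Query $baseDetailsQuery |
        Write-SqlTableData -ServerInstance 'CentralServer' -DatabaseName 'SqlShackDemo' `
            -SchemaName 'dbo' -TableName 'BackupInfo' -Force
    # In practice, the query would also return the source instance name
    # (e.g. SERVERPROPERTY('ServerName')) so that rows can be told apart.
}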

Summary

As administrators, we should understand the business impact that any amount of downtime causes. Business impact is measured by weighing the cost of the downtime and the data loss against the cost of reducing that downtime and minimizing the data loss.

One of the ways of cost optimization is proper backup and restoration. To optimize costs, decision-making is required, and for that, data is required. We fetch the necessary data, and store it in a centralized repository, which is a great place to understand how the system is functioning. This data also helps us generate projections, pertaining to capacity planning and forecasting in terms of database backup.

Having a good understanding of the data and of the various backup test procedures will help an administrator guard against various forms of failure by maintaining a low MTTR (Mean Time To Recover) and a high MTBF (Mean Time Between Failures).

Also, the use of the SQL Server PowerShell module along with the new cmdlets has made our lives a lot easier, in that the PowerShell script output can be stored as SQL tables with minimal effort. The SqlServer module replaces SQLPS; the old functionality still exists, but it has been merged with a set of new capabilities.

The implementation discussed in this post gives us ample data to help us decide on a good backup strategy and planning methodology that meets our business needs.

Appendix (A)

Appendix (B)

The T-SQL is executed across all the SQL Server instances listed in the input file.

