Mustafa EL-Masry

Microsoft SQL Server database deployment process

December 14, 2016 by

Introduction

Database deployments are critical tasks that can affect negative in on performance in production. In this article we’ll describe some performance related best practices for database deployments.

In this article, I’ll cover several aspects of Microsoft SQL Server databases deployment process including :

  1. Database deployment prerequisites
  2. Database Schema design assessments
  3. Database stress tests
  4. Application stress tests
  5. Database index analysis

Prerequisites

The first thing we should do is to collect some fundamental information about this new database to be familiar with the business logic of it, usage, capacity plan, etc. The following are some examples

  1. Is there any field(s) in this database that contains(s) XML or Binary information? IF << YES >> we need a bit more clarity about the usage type of those data entities,
    • Are they used as transactional or lookup data entities?
    • If transactional, are there any reports to be generated from XML
    • What is the maximum expected number of records for further one year?

  2. Are there any special configuration(s) that should be considered during the deployment such as:
    • DTC (Distributed Transaction Coordinator)
    • Service Broker
    • CLR and assembly, if so what type of assembly, external access, unsafe, safe are supported by GAC (Global assembly cache) or it might need any other external Common Language Runtime modules or PowerShell files?
    • RCSI (Read Committed Snapshot Isolation level)
    • Encryption and if so what kind of encryption algorithms used
    • What is the key provisioning used there, I mean what DMK, Certificate, asymmetric key, etc. ?
    • Elevated privileges for application service account?  
    • Certain data engineering solution like data archiving, data migration, data cleansing jobs?  
    • Special scheduled jobs to send reports or do some DML operations?
    • SSRS (SQL Server Reporting service) Reports? if so please mentioned their URLs and grant content manager privileges for us  
    • Any existing OLAP cubes or BI solution?
    • Linked server or distributed queries
    • Replication or transaction log shipping with other Databases on other servers?
    • Any direct access from outside by any other users or systems?

  3. What the AD (Active Directory user) that will be Application services account?
  4. What the application server (Name/IP)? To open the SQL port with it

These provide a good set of example checks to help conclude the Database size, usage, requirements, configuration etc

Analysis

Before we start we should do a pre-deployment analysis of potential performance issues including:

  • Database Schema Design review
  • Stored procedures Stress tests
  • Application Stress tests
  • Index analysis

Database schema design review

The following are some example

  1. What are the fields and objects we should check?
  2. What is the performance risks that we should keep in mind while designing the database deployment process, for example:
  • Columns data types should be appropriate for the usage of this column. Therefore Nvarchar (MAX) or Varchar (MAX) are generally inappropriate because we can’t include them by any index because the index has size limitation 900 bytes and these columns store more than 8000 bytes.

  • BOLB and XML Columns: Check the columns with data types Text, Image, Next, FILESTREAM, XML, varbinary} as ideally, we ought to avoid these, unless absolutely needed

Uniqueidentifier Columns AS primary key: is not preferred for many reasons

  • GUID is 16 byte but the INT columns is 4 byte so the GUID will be required to have more read and write logical reads apart from the storage requirements which GUIDs require more.
  • If we use the GUID as clustered index for every insert we should change the layout of the data as it is not like the Identity INT columns

For better results, consider using integer identity columns. Or at least newsequentialid() instead of a Primary Key on GUID

  • Columns size : the columns should be created with most suitable size to be able to cover it by the index model,
    An index key length is 900 byte. If we assume we have column NVARCHAR (500) the actual size is 1000 byte and if we created any index with key columns size > 900 bytes it would impact on the OLTP transaction, and it will show for us the below warning:

    The maximum key length is 900 bytes. The index has a maximum length of 1200 bytes. For some combination of large values, the insert/update operation will fail

3.png

  • Heap Tables: This means the tables without primary key or clustered indexes, and these kinds of tables should be not accepted because they will generate extremely poor SQL Server Execution Plans for queries trying to get the data from the application.

Attached is a Stored Procedure Check_Heap_Tables and this SQL Server Stored Procedure will generate a T-SQL script for each Database and Table which is Heap Table and it will print out the required clustered index to be created for highlighted tables.

Stored procedures stress test

In this second phase our main focus in the most used stored procedures that we should execute it on 200 concurrent user minimum and 1000 concurrent user maximum for the purpose of stress testing the Application Level Stored Procedures for performance testing.

We can do this process by utilizing any stress tool, but we should fill the tables by at least 1 million of rows. In our case we’ll use ApexSQL Generate to help us highlight any potential issues via the query execution plan

Although there are multiple ways to tune-up TSQL some examples given below:

  • Use table hint {MAXDOP, FAST, Keep FixdPlan, Set Nocount on, With Nolock , Index forceseek ..Etc}
  • Create sufficient indexes to reduce the IO and CPU
  • Use dynamic queries

Application stress test

Most recently, many applications use Entity Framework at the application layer to access the database. This Framework does not allow the queries to be written directly; instead it generates queries on its own as required by the application and business logic based on the underlying tables. So, the database does not have the business logic inside the SQL Server.

For optimizing and stress testing the code we need to capture the business logic code by running stress test scenarios for like 200~300 concurrent users. And ultimately able to capture poorly written or worst performing TSQL.

Eventually, the goal here is to get the bad TSQL in hand and optimize it and give it back to the application team so that they can integrate it and merge inside the application for optimizing the business by converting the queries into SQL Server Custom Written Stored Procedures.

Index Analysis

After all of the stress testing performed on the database for potential problematic Stored Procedures and T-SQL queries, SQL Server engine will save the required missing indexes in the system tables which we can now query.

After analyzing the Missing Indexes from the System Stored Procedures, we can start to create the appropriate missing indexes on the tables. This process should be done only by experienced DatabaseAs and in collaboration with the Development team to enhance the performance of the poorly performing quires and Stored Procedures.

Conclusion

Be aware that your production environment is a restricted area and never deploy anything without a clear and well documented assessment process, to ensure the best performance of your production database.

References


Mustafa EL-Masry
General database design, Performance

About Mustafa EL-Masry

Mustafa EL-Masry is a Senior database consultant and one of the experts in Database performance tuning in the Middle East. Currently, he is working as a Senior consultant production DBA and Development DBA in many projects in multiple government sectors. He is a Top SQL Server blogger in the Middle East, founder of the community mostafaelmasry.com, and is the second Arabic author on Microsoft MSDN in SQL Server. Based on his current position, he solved fairly interesting problems on fairly large databases and highly sensitive performance cases. View all posts by Mustafa EL-Masry

168 Views