Prashanth Jayaram

How to setup custom SQL Server transactional replication with a central publisher and multiple subscriber databases

August 29, 2018 by

In this article, you’ll learn how to setup a simple, custom distributed database replication system.

Introduction

In general, a typical setup of transactional replication model of a central publisher with multiple subscribers includes the creation of a replica database(s) which may serve multiple purposes including:

  • for reporting
  • the population of operational data for business analysis
  • population of a subset of data when it is in conjunction with a data warehouse

In this system, it is important to understand and consider the scope of the objects that need to be replicated across databases and servers. Replication allows to you choose particular tables, or even subsets of tables, to be replicated to other databases or you can replicate entire databases

I have been in a process to setup a mission-critical replication setup many times. It is very difficult to setup 100% synchronous data propagation between publisher and subscriber. In this article, I’ll explore some alternate approaches and technologies to create a custom replication solution, independent of traditional SQL Server replication

Replication

One Publisher can have multiple Subscribers, and in many cases a Publisher can also act as a Distributor. This replication model is referred to as central publisher-multiple subscribers. In this case, you’ve got one publisher server publishing data that is sent to several subscriber servers. The publisher server the central piece in a process that distributes data into multiple subscriber servers. The publisher server partitions the data and only sends out data that is relevant to the subscribing servers. The publishing server transmits the transactions to subscribing servers based on some set of conditions. For example, sales data at the regional level or data for specific manufacturing plants or data update for specific insurance policy etc. In any case, the users log onto the local subscriber server and retrieve the data set that is relevant to their respective job and responsibilities.

Design

Let’s take a brief look at the overall schematic of this system. It shows a simple layout of a single publisher database, where the transactions are read and distributed to four different subscriber databases.

In general, this model is used when you want to setup isolated read-only subscriber databases. Let us discuss the setup in detail:

  1. In the above depiction, the setup has one publisher and four subscribers, although in this article, we’ll run the example for only two subscribers
  2. In the publisher database, the subsets of tables are replicated to other subscriber databases using a filter mechanism
  3. The transaction reading mechanism is configured to track LSNs so that no transactions are missed or duplicated. This technique ensures and maintains the integrity of the transactions.
  4. The data is segregated based on the condition clause that is defined at the object level
  5. Using batch file automation, the segregated data SQL files are generated at regular intervals e.g. every 5 minutes
  6. The generated SQL files, containing all of the transactions, are replayed at the subscriber database

Technical implementation

For this custom SQL Server database replication system, we are going to use a ApexSQL Log as our SQL Server database transaction log reader agent and a combination of Batch files, PowerShell scrips and SQL Agent jobs

Scenario

Let us discuss the setup further in detail to understand the data availability scenarios.

  1. If the publisher goes down for some reason, no data is replicated across the subscriber database
  2. If subscriber 1 goes down, the data is still available in the publisher database. The SQL files are generated at the publisher are kept intact with the reference to the corresponding subscription database. You can synchronize the data at some later date.
  3. If all subscribers are down, the segregated data still available at the publisher.

Demo

Let us walk through the entire process using Adventureworks 2016 database.

  1. Backup the source database—Publisher
  2. Restore the backup file on all the target instance
  3. Initial preparation
    1. Load the data into the publisher database. For example, the cities table is loaded with sample data.
    2. Query the publisher and subscriber database. You noticed that there is a difference in the number of rows

  4. Build a batch file using ApexSQL Log
    1. Start ApexSQL Log
    2. Type in Server and Database details to connect to the Publisher database, AdventureWorks2016

    3. Configure ApexSQL Log to use the default online log setup to ensure the transaction availability for the defined interval

    4. Now, create Undo/Redo script

    5. Next, choose continuous auditing. This maintains the LSN (Log Sequence Number) value. The configuration ensures transactions integrity.

    6. Let’s setup a filter. In this case, the table dbo.cities is included. In this step, transactions are audited at the granular level. In this way, we can opt for tables to be included and/or excluded from the setup.

    7. In the Advanced options, Select the Filed Values to setup a filter on the selected table. Browse the selected table (Cities) to apply the condition clause.

    8. Select the operator, Field, Comparison operator and define the value. In this case, the data segregation or subset of data is derived for “US East”.

    9. You can see that the condition is placed in the filter condition column which is shown below

    10. The configuration is done in the way we want it, let’s save the ApexSQL Log CLI commands in the ‘Batch file’. Save the command replication.BAT file.

    11. Click finish to run the solution. You can see that 160 insert statements are generated in this process. This concludes that the setup and filter are working in the way we want it.

    12. Run through the same steps again to generate another subset from the Cities table using a condition clause. In this case, the data is segregated based on the “US West” region.

      Now you can see that 110 insert statements are generated in this process.

  5. Now, let’s take a look at the bath file. You can see that both the condition clauses are in the same batch file. It is very easy to add additional clauses with very few lines of code.

    @echo off

    SET “filename=%1”

    “E:\Program Files\ApexSQL\ApexSQL Log\ApexSQLLog.com” /server:HQDBT01 /database:AdventureWorks2016 /trusted /redo:G:\BKP\Subscriber1_%filename%.sql /operations:DMLALL /transactions:COMMIT BEGIN UNKNOWN /continuous:g:\bkp\AWPublisher.axtr /tables:[dbo].[Cities] /fields:” ([Cities].[C_Region] = ‘US East’ )”

    “E:\Program Files\ApexSQL\ApexSQL Log\ApexSQLLog.com” /server:HQDBT01 /database:AdventureWorks2016 /trusted /redo:G:\BKP\Subscriber2_%filename%.sql /operations:DMLALL /transactions:COMMIT BEGIN UNKNOWN /continuous:g:\bkp\AWPublisher.axtr /tables:[dbo].[Cities] /fields:” ([Cities].[C_Region] = ‘US West’ )”

  6. Prepare the PowerShell script. Save the below content in Replication.ps1 and run the script


    We can see that files are generated with the unique filename corresponding to the subscriber database.

  7. Test the data

    We can see that they are now equal numbers of rows have been replicated to the Subscriber databases.

  8. Schedule job

    To automate, schedule the PowerShell script to run at any interval you want e.g. 5 minutes. For more information on scheduling, in this type of scenario, you can refer to the scheduling section in this article How to set up a DDL and DML SQL Server database transactional replication solution

    Wrapping Up

    If you’re thinking of setting up scale-out, distributed replication model, evaluate the environment and make sure there is room for acceptable latency for subscriber access. In real-time, it is near impossible to replicate synchronous data propagation. In this article, we reviewed how to setup a custom transactional replication system with a central publisher and multiple subscriber model. In the article we reviewed the multiple subscriber use-case and then designed and built the system using a 3rd party SQL Server transaction log reader and common scripting technology e.g. PowerShell. If you feel like giving it a try, feel free to let me know how it went on the comments below

    Table of contents

    SQL Server replication: Overview of components and topography
    SQL Replication: Basic setup and configuration
    How to Add/Drop articles from existing publications in SQL Server
    How to do a quick estimated compare of data in two large SQL Server databases to see if they are equal
    SQL Server transactional replication: How to reinitialize a subscription using a SQL Server database backup
    How to setup a custom SQL Server transaction replication model with a Central Subscriber and Multiple Publisher databases
    How to setup custom SQL Server transactional replication with a central publisher and multiple subscriber databases
    How to set up a DDL and DML SQL Server database transactional replication solution
    How to setup cross-platform transactional SQL Server replication for database reporting on Linux
    SQL Server database migrations with zero data loss and zero downtime
    Using transactional data replication to replay and test production loads on a staging server
    How to setup SQL Server database replication for a reporting server
    SQL Server transactional replication: How to reinitialize a subscription using a “Replication support only” –TBA
    SQL Server Replication Monitoring and setting alerts using PowerShell –TBA

    Prashanth Jayaram

    Prashanth Jayaram

    I’m a Database technologist having 11+ years of rich, hands-on experience on Database technologies. I am Microsoft Certified Professional and backed with a Degree in Master of Computer Application.

    My specialty lies in designing & implementing High availability solutions and cross-platform DB Migration. The technologies currently working on are SQL Server, PowerShell, Oracle and MongoDB.

    View all posts by Prashanth Jayaram
    Prashanth Jayaram
3,217 Views