Prashanth Jayaram

How to setup a custom SQL Server transaction replication model with a Central Subscriber and Multiple Publisher databases

September 12, 2018 by

In this data-driven era, replication is often a critical requirement for achieving a modern, agile database management environment. It is believed designing an enterprise-grade dataset is the to achieving this requirement but building datamarts from datasets always presents certain challenges

In this article, we’ll discuss what it takes to setup “central subscriber with multiple publishers” replication model, to create an aggregate dataset from multiple sources, and you’ll also see how to scale with the data.

Here’s a sneak peek of the topics:

  • Replication model; Central subscriber-multiple publisher
  • Understanding the state of the data
  • Highlights of a custom transaction log reading/replication
  • Setup and internals
  • And more!

Technology

For our SQL Server transaction log reading agent we will use ApexSQL Log, a 3rd party tool that can read the SQL Server transaction log and convert transactions into “redo” statements that can be executed on our various subscribers

Replication Model

The “central subscriber–multiple publishers” model is commonly used in situations where data from multiple sites needs to be consolidated at a central location while providing access to the local site with local data. A shipment data warehouse is a typical example. Where orders are placed at all publisher locations and the order confirmation details are pushed to central subscriber from which the order is shipped to the respective parties.

In short, the multiple publisher databases replicate their data to the same subscription table but the process may pose a unique problem. We need to have a model to include the information to create a unique identifier for every row in every publisher database. For example, in this model, several publishers replicate data to a single, central subscriber. Basically, this is a process to support the data consolidation concept at a central database.

In some other instances, the publisher databases replicate the data at the table level. In this case, the synchronization is straight-forward, as we deal with higher granularity.

For each article to be published in a Central Subscriber topology, a unique identifier should be defined to leverage the action. We can use a lookup table to generate the unique combination of rows in all the publishers. To publish rows from Publisher 1, unique combination of keys is defined in the article so that it will not end up in creating a duplicate row. Likewise, to publish rows from Publisher 2 and Publisher 3, and Publisher 4, the same logic is placed across all the articles.

Setup

Let us walk through the entire process in detail

  • Initial preparation
    • Load the data into the Orders table on entire 4 publisher database. For example, the insert statement is generated in combination with region-specific details. In this case, the Order_id and Region_ID is a primary-key constraint.
    • Query the publisher and subscriber database. You noticed that there are missing numbers of rows at the subscriber database.

  • Build ApexSQL CLI commands
    • Start ApexSQL Log
    • In the Select database, type in Server and Database details and Click Next

    • In Select data sources, leave the default Online transaction log option. This option allows the application to read online transaction log files.

    • Next, prepare redo script by selecting Undo/Redo script

    • Now, In the Filter setup, select continuous auditing feature. The tracking file is a unique feature and it induces an intelligence to track the entire transactions by maintaining the LSN (Log Sequence Number) value. The safeguards integrity of the entire transactions.

    • In the Tables, select the object to be published on the central subscriber. In this case, the table Orders is selected as publishing article.

    • The configuration is all most done for the publisher database. Let’s save the batch script in the “Batch file”. Save the command as replication.BAT.

    • The batch file content is shown below

      “E:\Program Files\ApexSQL\ApexSQL Log\ApexSQLLog.com” /server:HQDBT01 /database:AdventureWorks2016 /trusted /redo:G:\Replication\redo.sql /operations:DMLALL /transactions:COMMIT BEGIN UNKNOWN /continuous:g:\bkp\AdventureWorks2016.axtr /tables:[dbo].[Orders]

    • Next, prepare the CLI commands for other publishers.
      • You also have an option to test individual set of CLI commands or you can have all in a batch file and run it through PowerShell
      • In this case, the CLI commands are integrated. It is very simple and straight-forward. You need to copy and paste the content one after the other as per the number of publisher databases.
      • As you see, the ‘/database’ switch has four different databases and the ‘/continuous’ switch has four unique tracking files.
      • The SQL file is uniquely identified with a name. The name of the file <Publisher Name_<Date and timestamp>.sql is prepared and fed to ‘/redo’ switch

    Build PowerShell Script Replication.ps1

    • Load the SQL Server Module
    • Assign the current date and timestamp value to $datetime variable
    • Pass the variable $datetime to ApexSQL Log CLI command
    • The batch file “replication.bat” is invoked and it generates four SQL replay files
    • Execute the SQL file in the same cadence using Invoke-SQLCMD

    Execute PowerShell script Replication.ps1

    Test the data

    We can see that the aggregated set of rows got inserted into the subscriber databases.


    Schedule a job

    To automate, schedule the PowerShell script to run at any interval you want e.g. 5 minutes. Refer to the scheduling section in this article How to set up a DDL and DML SQL Server database transactional replication solution

    Wrap Up

    In the article we reviewed the “central subscriber with multiple publishers” use-case and then designed and built the system using a 3rd party tool, ApexSQL Log, a SQL Server transaction log reader, and common scripting technologies e.g. batch files and PowerShell.

    I re-iterate the importance of replication design. Data replication design is a balancing act between managing data at its various forms. Replication design techniques are vital and things get complicated quickly if we’ve not designed properly. For example, in this model, subscribers should be treated as read-only because changes are not propagated back to the publisher. So inserts, updates, and deletes should be avoided at the subscriber database as this could lead to non-convergence.

    That’s all for now… If you feel like giving it a try, feel free to let me know how it went on the comments below

    Appendix

    Batch Script

    @echo off

    SET “redofile=%1”

    “E:\Program Files\ApexSQL\ApexSQL Log\ApexSQLLog.com” /server:HQDBT01 /database:AdventureWorks2016 /trusted /redo:G:\Replication\publisher1_%redofile%.sql /operations:DMLALL /transactions:COMMIT BEGIN UNKNOWN /continuous:g:\bkp\AdventureWorks2016.axtr /tables:[dbo].[Orders]

    “E:\Program Files\ApexSQL\ApexSQL Log\ApexSQLLog.com” /server:HQDBT01 /database:AdventureWorks2014 /trusted /redo:G:\Replication\publisher2_%redofile%.sql /operations:DMLALL /transactions:COMMIT BEGIN UNKNOWN /continuous:g:\bkp\AdventureWorks2014.axtr /tables:[dbo].[Orders]

    “E:\Program Files\ApexSQL\ApexSQL Log\ApexSQLLog.com” /server:HQDBT01 /database:AdventureWorks2012 /trusted /redo:G:\Replication\publisher3_%redofile%.sql /operations:DMLALL /transactions:COMMIT BEGIN UNKNOWN /continuous:g:\bkp\AdventureWorks2012.axtr /tables:[dbo].[Orders]

    “E:\Program Files\ApexSQL\ApexSQL Log\ApexSQLLog.com” /server:HQDBT01\SQL2017 /database:WideWorldImporters /trusted /redo:G:\Replication\publisher4_%redofile%.sql /operations:DMLALL /transactions:COMMIT BEGIN UNKNOWN /continuous:g:\bkp\WideWorldImporters.axtr /tables:[dbo].[Orders]

    SQL

    Table of contents

    SQL Server replication: Overview of components and topography
    SQL Replication: Basic setup and configuration
    How to Add/Drop articles from existing publications in SQL Server
    How to do a quick estimated compare of data in two large SQL Server databases to see if they are equal
    SQL Server transactional replication: How to reinitialize a subscription using a SQL Server database backup
    How to setup a custom SQL Server transaction replication model with a Central Subscriber and Multiple Publisher databases
    How to setup custom SQL Server transactional replication with a central publisher and multiple subscriber databases
    How to set up a DDL and DML SQL Server database transactional replication solution
    How to setup cross-platform transactional SQL Server replication for database reporting on Linux
    SQL Server database migrations with zero data loss and zero downtime
    Using transactional data replication to replay and test production loads on a staging server
    How to setup SQL Server database replication for a reporting server
    SQL Server transactional replication: How to reinitialize a subscription using a “Replication support only” –TBA
    SQL Server Replication Monitoring and setting alerts using PowerShell –TBA

    Prashanth Jayaram

    Prashanth Jayaram

    I’m a Database technologist having 11+ years of rich, hands-on experience on Database technologies. I am Microsoft Certified Professional and backed with a Degree in Master of Computer Application.

    My specialty lies in designing & implementing High availability solutions and cross-platform DB Migration. The technologies currently working on are SQL Server, PowerShell, Oracle and MongoDB.

    View all posts by Prashanth Jayaram
    Prashanth Jayaram

    Latest posts by Prashanth Jayaram (see all)

114 Views