The volume of data retained, managed, and accessed today is unprecedented. Businesses expect the IT department to keep data online and accessible indefinitely, putting intense pressure on the databases required to store and manage it. To meet today’s needs; we need to replace outdated and inefficient legacy processes with new, more agile techniques. SQL Server Replication is one of the techniques to accommodate such demands.
In this article, let’s you shape your understanding of the full SQL Server replication topography including components, internals and the SQL to bind it all together. After you complete reading this article, you‘ll understand:
- SQL Server replication, in general
- Components of transactional SQL Server replication, in particular
- How to get distributor properties
- How to find the publisher using the same distributor
- What are the databases used for SQL Server replication
- The general topology of a replication environment
- What are the articles that are mapped to the type of SQL Server replication model
- How to get publication details
- How to get subscription details
- SQL Server Replication agents
- And more…
SQL Server replication is a technology for copying and distributing data and database objects from one database to another and then synchronizing between databases to maintain consistency and integrity of the data. In most cases, replication is a process of reproducing the data at the desired targets. SQL Server replication is used for copying and synchronizing data continuously or it can also be scheduled to run at predetermined intervals. There are several different replication techniques that support a variety of data synchronization approaches; one-way; one-to-many; many-to-one; and bi-directional, and keep several datasets in sync with each other.
Transactional SQL Server replication components
The following diagram depicts the components of transactional SQL Server replication.
Including the SQL Server replication …
- Publication database
- Distribution database
- Subscription database
- Replication agents
SQL Server replication diagram
An article is the basic unit of SQL Server Replication. An article can consist of tables, stored procedures, and views. It is possible to scale the article, horizontally and vertically using a filter option. We can also create multiple articles on the same object with some restrictions and limitations.
Using the New Publication wizard, the Article can be navigated. It allows us to view the properties of an article and provide options to set properties for the articles. In some case, the properties can be set during the time of publication creation and it’s a read-only property.
After the creation of a SQL Server replication publication, for instance, if some property requires a change, it will, in turn, require a new replication snapshot to be generated. If the publication has one or more subscriptions then the change requires all subscriptions to be reinitialized. For more information, see How to add/drop articles to/from existing publication in SQL Server article.
To list all the articles that are published, run the following T-SQL
,Art.[article] [Article Name]
INNER JOIN [distribution].[dbo].[MSpublications] Pub
ON Art.[publication_id] = Pub.[publication_id]
To get the details of articles in transactional or merge SQL Server replication in a published database, run the following T-SQL.
SELECT st.name [published object], st.schema_id, st.is_published , st.is_merge_published, is_schema_published
FROM sys.tables st WHERE st.is_published = 1 or st.is_merge_published = 1 or st.is_schema_published = 1
SELECT sp.name, sp.schema_id, 0, 0, sp.is_schema_published
FROM sys.procedures sp WHERE sp.is_schema_published = 1
SELECT sv.name, sv.schema_id, 0, 0, sv.is_schema_published
FROM sys.views sv WHERE sv.is_schema_published = 1;
To get detailed information about an article in the listed publisher, run the following T-SQL
DECLARE @publication AS sysname;
SET @publication = N'PROD_HIST_Pub';
@publication = @publication;
To get column level details, run the following T-SQL
sp_helparticlecolumns @publication = N'PROD_HIST_Pub' , @article = 'tb_Branch_Plant'
To list the columns that are published in transactional replication in the publication database, run the following T-SQL
SELECT object_name(object_id) [published table], name [published column] FROM sys.columns sc WHERE sc.is_replicated = 1;
A Publication is a logical collection of articles from a database. The entity allows us to define and configure article properties at the higher level so that the properties are inherited to all the articles in that group.
The publisher is a database that contains a list of objects that are designated as SQL Server replication articles are known as publication database. The publisher can have one or more publications. Each publisher defines a data propagation mechanism by creating several internal replication stored procedures.
select * from MSpublications
The Publisher is a database instance that makes data available to other locations through SQL Server replication. The Publisher can have one or more publications, each defining a logically related set of objects and data to replicate.
The Distributor is a database that acts as a storehouse for replication specific data associated with one or more Publishers. In many cases, the distributor is a single database that acts as both the Publisher and the Distributor. In the context of SQL Server replication, this is commonly known as a “local distributor”. On the other hand, if it’s configured on a separate server, then it is known as a “remote distributor”. Each Publisher is associated with a single database known as a “distribution database” aka the “Distributor”.
The distribution database identifies and stores SQL Server replication status data, metadata about the publication, and, in some cases, acts as a queue for data moving from the Publisher to the Subscribers.
Depending on the replication model, the Distributor might also be responsible for notifying the Subscribers that have subscribed to a publication that an article has changed. Also, the distribution database maintains the integrity of the data.
Each Distributor must have at least one distribution database. The distribution database consists of article detail, replication meta-data and data. A Distributor can hold more than one distribution database; however, all publications defined on a single Publisher must use the same distribution database.
To find out whether …
a server is a distributor or not?1SELECT @@ServerName Servername, case when is_distributor=1 then 'Yes' else 'No' end status FROM sys.servers WHERE name='repl_distributor' AND data_source=@@servername;
a distribution database installed or not?1SELECT name FROM sys.databases WHERE is_distributor = 1
a Publisher is using this Distributor or not?1EXEC sp_get_distributor
or just to interrogate various Distributor and Distribution database properties?123EXEC sp_helpdistributor;EXEC sp_helpdistributiondb;EXEC sp_helpdistpublisher;
A database instance that consumes SQL Server replication data from a publication is called a Subscriber. The subscriber can receive data from one or more publishers and publications. The subscriber can also pass data changes back to the publisher or republish the data to other subscribers depending on the type of the replication design and model.
A subscription is a request for a copy of a publication to be delivered to a Subscriber. The subscription defines what publication data will be received, where, and when.
There are two types of subscriptions: push subscriptions and pull subscriptions
- Push subscription: Distributor directly updates the data in the Subscriber database
- Pull subscription: the Subscriber is scheduled to check at the Distributor regularly if any new changes are available, and then updates the data in the subscription database itself.
A target database of a replication model is called a subscription database.
SQL Server replication uses a pre-defined set of standalone programs and events are known as agents, to carry out the tasks associated with data. By default, SQL Server replication agents run as scheduled jobs under SQL Server Agent. Replication agents can also be run from the command line and by applications that use Replication Management Objects (RMO). SQL Server replication agents can be monitored and administered using Replication Monitor and SQL Server Management Studio.
Replication snapshot Agent
The Replication snapshot Agent is used with all types of SQL Server replication technology as it provides the required data set to perform the initial data synchronization of the publication database with the subscription database. It prepares schema and initial data of published articles, snapshot files, and records information about the synchronization type in the distribution database.
Log Reader Agent
The Log Reader Agent is used only with transactional replication. It moves replication transactions from the online transaction log of the publication database to the distribution database.
The Distribution Agent is used only with Replication snapshot and Transactional SQL Server replication. This agent applies the initial replication snapshot to the subscription database and later, the data changes are tracked and recorded in the distribution database and applied to the subscription database.
The Merge Agent is used with the merge replication model. By default, the Merge Agent uploads changes from the Subscriber to the Publisher and then downloads changes from the Publisher to the Subscriber. Each subscription has its own Merge Agent that connects to both the Publisher and the Subscriber and updates both. The Merge Agent runs at either the Distributor for push subscriptions or the Subscriber for pull subscriptions. Here, the synchronization is bi-directional. The data conflicts are handled by a set of triggers that supports the entire process
Thus far, we’ve seen a walk-through of some of the important concepts of SQL Server replication. Also, T-SQL scripts are shown to query system tables and replication stored procedures to answer most of the commonly asked question about SQL Server replication.
I’ll discuss more about SQL Server replication in upcoming articles. If you feel, something can be improved in this article, feel free to leave your comment below…
Table of contents
My specialty lies in designing & implementing High availability solutions and cross-platform DB Migration. The technologies currently working on are SQL Server, PowerShell, Oracle and MongoDB.
View all posts by Prashanth Jayaram