Rakesh Patil

SQL Interview Questions on SQL Server replication – Part 2

October 30, 2018 by

The following is the second part of SQL interview questions series, focusing mostly on the replication monitoring, most common SQL Server Replication errors and different replication scenarios.

SQL Interview Questions 1: How to monitor latency in replication?

There are three methods:

  • Replication monitor
  • Replication commands
  • Tracer Tokens

Replication Monitor

In replication monitor from the list of all subscriptions, double-click on the desired subscription. There we find three tabs:

  • Publisher to Distributor History
  • Distributor to Subscriber History
  • Undistributed commands

Replication Commands

  • Publisher.SP_ReplTran – Checks the pending transactions at p
  • Distributor.MSReplCommands and MSReplTransactions – Gives the transactions and commands details. Actual T_SQL data is in binary format. From the entry time, we can estimate the latency.
  • Distributor.SP_BrowseReplCmds – It shows the eaxct_seqno along with the corresponding T-SQL command
  • sp_replmonitorsubscriptionpendingcmds – It shows the total number of pending commands to be applied at Subscriber along with the estimated time.

Tracer Tokens

Tracer tokens are nothing but transactions which are written into Publisher’s transaction log. It is read by Log reader agent and put into the Distribution database, then it is written to the Subscriber. All these records can be seen on the replication monitor.

Below is the T-SQL code to use Tracer tokens to troubleshoot the latency issues.

SQL Interview Questions 2: What are the various ways to monitor replication?

  • Replication Monitor
  • T-SQL Commands & system CATALOGS
  • Tracer Tokens
  • Performance Monitor
  • Extended Events
  • Verbose Logging

SQL Interview Questions 3: Which is the best way to monitor replication? Have you ever implemented automations in replication monitoring?

  • Replication Monitor: One of the best ways to monitor but not all the time. It showcases the current situation to answer the question “How are things right now?” but it doesn’t baseline. Also, replication monitor internally runs the commands to get the required details. On a busy topology it might be a performance intensive
  • T-SQL Commands & system CATALOGS: We absolutely suggest this as it directly gets only the required things. It allows us to baseline and customizing the monitoring
  • Tracer Tokens: We don’t suggest this. We have seen some scenarios where it took a lot of time to respond
  • Performance Monitor: This is also a resource intensive method, but sometimes it might be helpful. Even for the performance counters we would suggest using T-SQL scripts with the required DMV
  • Extended Events: We never used this method. Moreover, in SQL Server 2012 Books Online Microsoft mentioned that this method is for customer support engineers (Microsoft FTE) to collect the information for troubleshooting
  • Verbose Logging: This method may not be useful for monitoring replication but it’s helpful at the time of troubleshooting. In this method, we just enable Agent History profile for the detailed logging and it helps us understanding the detailed progress and error message. Make sure we are using this only for the troubleshooting as it’s also a performance intensive

We suggest using a customized solution

  1. Create a list of tables to capture the replication agents, latency and pending transactions/commands details using T-SQL code
  2. Define baselines for parameters
  3. Create scripts to capture the replication health details and store it on predesigned
  4. Tables
  5. Create scripts to compare the captured values with the baseline values and to send a health check report to the DBA team if there are any issues or delays

SQL Interview Questions 4: What are the things that we need to monitor periodically to know the replication health?

There are basically three things that helps us to understand the health of replication:

  • Replication Agent status
  • Latency – Publisher-Distributor-Subscriber
  • Outstanding commands to be subscribed

SQL Server Interview Questions 5: In transactional replication setup, Distribution agent failed with the error message “Cannot insert duplicate key row in object ‘XXXX’ with unique index ‘PK_XXXXX’. The statement has been terminated.” Have you ever seen this kind of error? If yes, can you explain how to resolve?

We may get this error when a row on the Subscriber already exists on the Subscriber database. We can fix this error by skipping failed transactions or deleting already existing row at Subscriber.

SQL Server Interview Questions 6: In transactional replication setup Distribution agent failed with the error message “The row was not found at the Subscriber when applying the replicated command.” Have you ever encountered this kind of error? If yes, can you explain how to resolve?

This error occurs when row on which we are requesting transaction does not exist at Subscriber. We can skip failed error to move forward.

SQL Interview Questions 7: How to skip Distributed Agent errors in transactional replication?

By default, when the Distribution Agent encounters an error, the agent stops. If you use the -SkipErrors parameter and specify expected errors or errors that you do not want to interfere with the replication, the agent will log the error information and then continue running.

For example, if you want to specify that the Distribution Agent should log duplicate key violations, but continue to process subsequent transactions, specify that the agent should skip errors 2601 (Cannot insert duplicate key row in object ‘%.*ls’ with unique index ‘%.*ls’.) and 2627 (Violation of %ls constraint ‘%.*ls’. Cannot insert duplicate key in object ‘%.*ls’.):

-SkipErrors 2601:2627
The most common way to use the -SkipErrors parameter is to use the Distribution Agent profile titled Continue on Data Consistency Errors. The Distribution Agent will then skip errors 2601, 2627, and 20598 (The row was not found at the Subscriber when applying the replicated command).

SQL Interview Questions 8: What are counters to monitor for transactional replication?

Object: SQL Server: Replication Log Reader – Counter:
Log Reader: Delivered Cmds/Sec and Tx/Sec

These two counters are repeated for each publication; they display the commands or transactions read per second and indicate how many commands or transactions are read by the Log Reader per second. If this counter increases, it indicates that the Tx rate at the Publisher has increased.

Object: SQL Server: Replication Dist. – Counter:
Dist: Delivered Cmds/Sec and Tx/Sec

These two counters are repeated for each subscription and display the commands or transactions per second delivered to the subscriber. If this numberis lower than the Log Reader delivered number, it is an indication that commands may be backing up on the Distribution database. If it is higher than the Log Reader rate and there is already a backlog of commands, it might indicate that replication is catching up.

SQL Interview Questions 9: How to resolve the error “Could not execute sp_MSadd_repl_commands.” Here is the full error message: “The process could not execute ‘sp_MSadd_replcmds’ on ‘server\instance’. (Source: MSSQLServer, Error number: 1007) Cannot insert duplicate key row in object ‘dbo.MSrepl_commands’ with unique index ‘ucMSrepl_commands’. (Source: MSSQLServer, Error number: 1007)”

If any other user is executing the replication commands in the Publisher and his session is open for a long time, you will get this error

OR

One of the publications receives many inserts in one transaction. After the transaction is committed, the Log Reader Agent starts to process and split the transaction according to the MaxCmdsInTran parameter. The Snapshot Agent on another publication starts before the Log Reader Agent finishes the task.

In this scenario, the Log Reader Agent fails, and you receive the error message

Resolution:

  • If this issue occurs 2 or 3 times, we can ignore it and simply start the logreader jobs to succeed.
  • If the issue persists, then find out the session that’s causing the problem, kill the SPID and restart the agent.
  • If it’s frequently occurring set parameters as below:
    • @sync_method <> “concurrent”
    • @immediate_sync <> “True”
    • MaxCmdsInTran = 0

Microsoft declared it’s a known issue in SQL Server 2008 R2 SP1 and SQL Server 2012. To prevent these error messages, apply the required Cumulative Update Packages released after SQL Server 2012 and SQL Server 2008 R2 SP1

SQL Interview Questions 10: Log reader fails with “The process could not execute ‘sp_replcmds’” error. How do you troubleshoot it?

We need to give DBO rights to the job owner, which will remove above error.

SQL Server Interview Questions 11: Have you ever encountered the error message” Subscriptions getting expired” and “subscriptions being marked as inactive”? What does it mean?

  • Subscription Expired

    When subscription is expired, subscription references are deleted from Publisher, and we need to re-create subscription.

  • Subscription Inactive

    When subscription is Inactive, it means it will not receive replicated commands, and it needs to be re-created to make sync with Publisher.

SQL Interview Questions 12: We have transactional replication on database. Replication is running successfully and now we need to add a new Article. Can we add a new Article without generating snapshot for all existing Articles?

It is possible. To generate snapshot only for the newly added Articles only we need to change two publication properties:

  • allow_anonymous
  • Immediate_sync

Disable Publication Properties:

Add new Article: In SSMS from the publication properties add new Article Start the Snapshot Agent. Right click the publication and select “View Snapshot Agent Status” and start the snapshot agent. If you check the snapshot folder you can see the snapshot generated only for the newly added Article.

SQL Interview Questions 13: In a Transactional replication, a transaction failed on Publication. Does the failed transaction replicate to Subscriber?

Only committed transactions are replicated to Subscribers. We can verify using sp_browsereplcmds

SQL Server Interview Questions 14: A transaction successfully completed on Publisher but subsequently failed on Subscriber. What happens to this transaction on Subscriber and Publisher?

If transaction is successful on Publisher, it means transactions are committed on Publisher. If it is failed on Subscriber, then according to ACID properties, it will be rolled back automatically.

SQL Server Interview Questions 15: My Publisher is in SQL Server 2014 and we need to subscribe to SQL Server 2012 instance. Is it possible?

Yes, it is possible.

SQL Interview Questions 16: As per the business requirement one of the Articles needs to be published in two publications. Is that possible?

Yes, we can do it.

If we need to publish an Article in a transactional publication and a merge publication, then we need to set @published_in_tran_pub property TRUE. The Article will not be published in the merge and transactional publication with queued updating Subscriptions.

SQL Interview Questions 17: In transactional replication log reader agent is failing with the error message “The process could not execute ‘sp_repldone/sp_replcounters’ on ‘Publisher’.” Have you ever encountered this issue? If yes can you explain why this happens and how to resolve it?

There might be an orphan Log Reader Agent from the situation where user manually executes sp_repldone, sp_replcmds, or sp_replshowcmds against the same database and forgot to close the connection.

Resolution:

  1. Check the replication monitor to determine if any extra log reader agent is running against the given database. It mostly never showcases even an orphan log reader running
  2. Check any open connections for published databases
  3. Restart the Log Reader Agent.
  4. Execute sp_replflush at the Publisher on the publication database, and then restart the Log Reader Agent.

SQL Interview Questions 18: The log reader agent fails with this error message: “Timeout expired”. How to deal with this?

Quickly check the network connection between the Publisher and Distributor (if it is a remote Distributor). Also, cross check if the service/logins are working without any issues.

If you can connect to the Publisher, run the following query under the published database

The sp_replcmds will find out what transactions must be picked up and must be sent to the Distribution database. If the sp_replcmds stored procedure returns data, gradually increase the parameter value of the sp_replcmds Ex. 10, 50, 100 etc. Repeat this until the same timeout expired error occurs.

Reduce the value for the “ReadBatchSize” parameter from the default of 500 transactions to the working value you found through your sp_replcmds testing.

Increase the “QueryTimeOut” parameter from 300 seconds to 3,000 seconds

If no record returns when you run “sp_replcmds 1,” but you know that there are replication transactions that are waiting in the log, then we must run DBCC CHECKDB

SQL Interview Questions 19: Replication has broken as the Subscriber was down. When it comes online we were trying to reinitialize the Subscription. It was failed with the error message “Cannot Reinitialize the Subscriber.” Any idea how to fix this?

We need to connect Subscriber, it contains Msreplication_subscriptions table. If value of immediate_sync parameter changed to 1.0, then it indicates non-immediate sync publications, after that it will initialize Subscriber.


Rakesh Patil

Rakesh Patil

Data Architect at Phoenix Innovations
I am Rakesh Patil, having 6+ years of experience in SQL database Administration, worked with Top MNCs with major Financial clients. currently working as Data Architect with Phoenix Innovations.
I have hands-on Experience in Performance tunning, Database Disaster recovery,
AlwaysOn,Replication and AWS.
Rakesh Patil
Replication, SQL interview questions

About Rakesh Patil

I am Rakesh Patil, having 6+ years of experience in SQL database Administration, worked with Top MNCs with major Financial clients. currently working as Data Architect with Phoenix Innovations. I have hands-on Experience in Performance tunning, Database Disaster recovery, AlwaysOn,Replication and AWS.

110 Views