Rakesh Patil

SQL interview questions and answers on SQL Server Failover Clusters – I

November 5, 2018 by

SQL interview questions 1: What is a Windows failover cluster?

The cluster contains Group of servers called a node, those servers work together to act as a single entity for a client. The cluster is used in the context of server failover technology.

SQL interview questions 2: What is a cluster Node?

A cluster node is a server within the cluster with the Cluster service installed.

SQL interview questions 3: What is a cluster Service?

A Cluster service is used to operate Failover operations.

SQL interview questions 4: What is a cluster Group?

A cluster group contains a group of cluster Resources.

SQL interview questions 5: What is a Private Network in windows cluster?

It is used to monitor each node status and for node communication.

SQL interview questions 6: What is a Heartbeat in the context of a Windows cluster?

Heartbeats are messages that Cluster Service regularly sends between the instances of the Cluster Service that are on each node to manage the cluster.

SQL interview questions 7: What do the terms ‘Failover’ and ‘Failback’ mean in the context of a Windows Cluster?

  • Failover: Failover is a process of transferring or moving one server’s functionality operations, once its failed, to another server in the same cluster
  • Failback: Failback is the process which returns server functionality back to the server where it was running before it failed

SQL interview questions 8: What is a Quorum?

A quorum is an element which should be online to allow the cluster to continue running.

Example:

A cluster with 7 Nodes can still have a Quorum (4 Nodes Online) after 3 nodes fail, as the majority wins, but remember it does not just depend on the Quorum; the other 4 nodes should have the capacity to serve clients.

  • Case 1: On 7 Node Cluster 3 Failed and 4 Online and these 4 are capable of handling the load Cluster to allow it to continue serving
  • Case 2: On 7 Node Cluster 3 Failed and 4 Online and these 4 are not capable of handling the load Cluster which leads to all nodes being offline
  • Case 3: On 7 Node Cluster 4 Failed and 3 remain Online Cluster but this makes all nodes offline as the majority votes lead to it being offline

SQL interview questions 9: What are the various Quorum modes available?

  • Quorum Mode: Strategy to define the elements in the cluster which can cast a vote to make the Quorum.
  • Node Majority: Each node that is available and in communication can vote.
  • Node and Disk Majority: Each node plus a designated disk in the cluster storage (the “disk witness”) can vote, whenever they are available and in communication.
  • Node and File Share Majority: Each node plus a designated file share created by the administrator (the “file share witness”) can vote, whenever they are available and in communication.
  • No Majority: Disk Only: The cluster has a quorum if one node is available and in communication with a specific disk in the cluster storage.

SQL interview questions 10: What is Node Majority model?

Node Majority quorum has an odd number of nodes that can vote

SQL interview questions 11: What is Disk Majority model?

The Disk majority model has even number of nodes in the cluster.

SQL interview questions 12: Where are the results of validation tests stored?

These reports are automatically stored for you in <WinDir>\Cluster\Reports as HTML files.

SQL interview questions 13: We have a SQL Server cluster instance. Based on business requirements, we would like to install a new database, which is not critical, and failover. Can we keep this database files on local disk instead of clustered disk?

No, it’s not possible. There should be a database on clustered resources. We can’t use local drives or drives which are not part of the cluster group for holding user database files.

SQL interview questions 14: Can the TEMPDB database be configured on a local drive?

Yes, we can Configure the TempDB database on a local drive, as it supports cluster configurations.

SQL interview questions 15: What is a SMB share?

It is a Server Message Block Protocol, which is used for file sharing so that application can read and write files.

SQL interview questions 16: What are the advantages of using a SMB File Share?

  • It is not as expensive as SAN
  • It can be used for non-production servers when they want to use cluster installation
  • It can be used for Temporary storage

SQL interview questions 17: How is clustering is different from AlwaysOn?

  • AlwaysOn Availability Groups is an advanced feature of Database Mirroring which allows adding multiples databases in a single group for failover.
  • AlwaysOn Failover Cluster is the same as SQL Server Failover Cluster with some new features.

SQL interview questions 18: Can you tell me the best feature in SQL Server 2012 AlwaysOn Failover Cluster when compared to SQL Server 2008 R2 Failover Cluster?

AlwaysOn offers cluster nodes on different subnets.

SQL interview questions 19: In a Failover Cluster, what are the elements should be matched between nodes?

Failover cluster nodes should have the same hardware and software configuration, Operating system version and patch level.

SQL interview questions 20: What are the different services managed by Windows Server Failover Cluster (WSFC) service?

The WSFC service manages

  • Server cluster configuration
  • Quorum configuration
  • Failover policy
  • Failover operations
  • Virtual Network Name (VNN)
  • Virtual IP addresses

SQL interview questions 21: What are the various failures that cause Cluster Failover from one node to other?

  • Hardware failures
  • Operating System Failures
  • Application/Service failures
  • Planned/Manual Failover

SQL interview questions 22: What are the primary elements of a failover cluster?

  • WSFC Resource Group:
    A SQL Server FCI runs in a WSFC resource group. Each node in the resource group maintains a synchronized copy of the configuration settings and check-pointed registry keys to ensure full functionality of the FCI after a failover and only one of the nodes in the cluster owns the resource group at a time was known as an active node.
  • SQL Server Binaries:
    These are installed on a node of the failover cluster
  • Storage:
    Failover cluster uses shared storage between nodes
  • Virtual IPs:
    Failover cluster instance can be multi-subnet, virtual IP is assigned to each subnet in the cluster.

SQL interview questions 23: Can you explain how you know the current quorum configuration on your cluster?

We can choose cluster name in summary in quorum configuration in failover cluster manager.

  • Command Prompt: c:\cluster/quorum

SQL interview questions 24: Can we change the Quorum settings after installing the windows cluster?

Yes, we can change the Quorum setting after the Windows Cluster installation.

SQL interview questions 25: What meant by a Split Brain situation in the context of a Cluster?

Split brain means the same resources will be online on multiple nodes at the same time. This happens when nodes are not able to communicate with each other.

SQL interview questions 26: How is the Split-Brain situation resolved?

The Cluster resource should be online on a single node; Resources will be online on a group which has more votes.

SQL interview questions 27: Where we can check whether a node is active or Passive.

We can check Active and Passive node in SQL Server group in the Cluster Administrator.

SQL interview questions 28: What is the difference between an Active\Passive and Active\Active cluster?

  • Active-Passive: One node is active, and another node is passive. A passive node is online. When an active node is failed then the passive node will take over.
  • Active-Active: Both nodes are active and online. The instance will be run on either nodes.

SQL interview questions 29: On active directory, what the permissions are required for a cluster SQL Server service account?

A clustered SQL Server Service account should have OBJECT CREATION permissions in the Active Directory.

SQL interview questions 30: How can I estimate the approximate time required for failover?

There are lots of factors that are involved in failover time. Ex: SQL Server database recovery time and buffer cache that needs to be written to disk etc. I would recommend the below points:

  • Create a test load (Should match with peak load) on your server and capture the failover time.
  • To make sure the failover time is matching with your service level agreement, have a look into indirect checkpoints.

SQL interview questions 31: Can we implement clustering on a virtualized server?

Yes, we can create failover clusters with virtual servers with VMware or Hyper-V and try SQL Server clustering.

SQL interview questions 32: You have an incident on a premium server due to a cluster failover. While fixing that what is your recommendation; cluster management console or PowerShell?

I always recommend using PowerShell script.

  • Most of the times Cluster Management Console will respond very slowly in case of failures and failovers
  • Make sure you are prepared with the all require PS commands to fix production issues
  • You should run PS in admin mode

SQL interview questions 33: What are “LooksAlive” and “IsAlive” checks?

  • Windows cluster service uses “HeartBeat” process to check nodes availability for both OS and SQL Server.

    It performs 2 health checks:
    • “LooksAlive”
    • “IsAlive”
    • “LooksAlive” Health Check

  • It’s a quick lightweight health check, By default runs for every 5 Seconds. It doesn’t perform a detailed check therefore it may not identify the services which are running but responding or in a hung state.

  • If in case LooksAlive check fails it calls the “IsAlive” check Polling interval can be changed by adjusting LooksAlivePollInterval property of Cluster service “IsAlive” Health Check:

    It’s a detailed health check by default runs for every 60 Seconds Run @@SERVERNAME to ensure that SQL Server is responding to queries.

  • It can’t identify database failure which means it makes sure SQL Server is up, running and responding for queries but it doesn’t check a database level check. When unable to connect to SQL Server it retries for 5 times After continues, 5 failures Windows cluster service initiate the failover to another node. Polling interval can be changed by adjusting IsAlivePollInterval property of Cluster service.

SQL interview questions 34: How can I add a new node to an existing SQL Server Cluster?

For 2008 R2 and 2012: –

  • It’s as simple as installing a new SQL Server instance
  • Run SQL Server Setup
  • Select “Add node to SQL Server Failover Cluster”
  • Next, give all the required details
  • On Cluster Node Configuration Page Name on the node will automatically populate again, continue with the next steps and finally “Add Node Progress” after successfully adding node we need to make sure the node is working as expected
  • Manually Move/Failover SQL Server Service to newly installed node
  • After moving, cross check all services are online on the newly added node

SQL interview questions 35: How to remove a SQL Server cluster node from a failover cluster configuration?

Let’s say we have 2 node cluster and we are going to remove Node2:

  • Connect to the node2 instance
  • Mount SQL Server installation setup
  • Run setup.exe as administrator
  • Go to tab “Maintenance” and select “Remove Node from SQL Server Failover Cluster”

After that, it checks the rules and takes you through the removal steps. A successful cluster node removal, you can verify at Failover Cluster Manager or using DMV

sys.DM_OS_Cluster_Nodes.

SQL interview questions 36: What are the various log files that can help us in troubleshooting issues and in Root Cause Analysis?

  • SQL Server error logs: Check SQL Server error logs to find out the errors.
  • Windows Event Viewer (System/Application): As a usual check, this place to find if anything suspicious.
  • Failover Cluster Manager:
    We can see the high-level details from Cluster Wizard Summary page. Also, we can see events and logs by filtering the required events.
  • Generate Cluster Log Files Using Cluster.exe:

    To generate Cluster.log issue below command from CMD:

    C:\> cluster/cluster log/g

    Generate Cluster Log Files Using PowerShell:

    Cluster.exe is in deprecated features from SQL Server 2012 whereof we can use the Powershell command to get the Cluster.log:

    PS C:\Import-Module FailoverClusters
    PS C:\get-clusterlog

    To get cluster log on required location

    PS C:\get-clusterlog –destination C:\Temp

    To get Cluster log for last 30 Min

    PS C:\get-clusterlog -TimeSpan 30

  • Failover Cluster Instance Diagnostics Log using T-SQL:

    Start Diagnostic Logging
    ALTER SERVER CONFIGURATION SET DIAGNOSTICS LOG ON;

  • Set Diagnostic File Location:

    ALTER SERVER CONFIGURATION SET DIAGNOSTICS LOG
    PATH = ‘C:\logs’;
    Specifying the maximum size of each diagnostic log
    ALTER SERVER CONFIGURATION SET DIAGNOSTICS LOG
    MAX_SIZE = 30 MB;
    Stop Diagnostic Logging
    ALTER SERVER CONFIGURATION SET DIAGNOSTICS LOG
    OFF;
    Server\MSSQL11.MSSQLSERVER\MSSQL\Log\SQLNODE1_MSSQLSERVER_SQLDIAG_)
    AS XEventData
    ORDER BY Time;
    Stop Diagnostic Logging
    ALTER SERVER CONFIGURATION SET DIAGNOSTICS LOG
    OFF;

SQL interview questions 37: How can you revalidate your cluster?

In the Failover Cluster Console select Failover Cluster Management and click on “Validate a configuration”. Follow the wizard instructions and on Summary page click on “View Report”.

The same report will be stored on folder.

“%SystemRoot%\Cluster\Reports\Validation Report date and time.html”

SQL interview questions 38: Can you tell me a few T-SQL commands that you use to quickly know the cluster status?

SQL interview questions 39: Any SQL DBA who works on cluster instances should have a minimum knowledge on using PowerShell. Do you know any PowerShell commands which can be useful in knowing the cluster status?

We use PowerShell commands to get the cluster instance details and, most of the times PS is the only way to fix problems in a clustered environment. There are 2 things should be done while using PowerShell:

Start PowerShell with Run as Administrator Import the FailoverClusters module. To Import the module FailoverClusters PS C:\> Import-Module FailoverClusters

Most common cmdlets we use:

  • Get-ClusterGroup:
    To know the status of all cluster groups in our cluster: Cluster Name “SQLCUST01”

    PS C:\> Get-ClusterGroup -Cluster SQLCLUSTER
    Get-ClusterResource & Where-Object & Sort-Object:
    Get-ClusterResource: Get all clustered resources information
    Where-Object: Filter only SQL Server Group resources.Sort-Object: Sort based on the Group name
    To start a cluster resource SQL Server Input Cluster Name and Resource Name

    Example: PS C:\> Start-ClusterResource -Cluster SQLCLUSTER -Name “SQL Server (INST1)”

  • Stop-ClusterResource:
    To stop a cluster resource SQL Server Input Cluster Name and Resource Name Ex: PS C:\> Stop-ClusterResource -Cluster SQLCLUSTER -Name “SQL Server (INST1)”

  • Move-ClusterGroup:
    It can move an entire resource group from one node to other in a cluster we need to input Resource Group Name, Cluster Name, and Node to which the group has to be moved

    PS C:\> Move-ClusterGroup “SQL Server (INST1)” -Cluster
    SQLCLUSTER -Node SQLNODE02

  • Note:
    Also, we can run T-SQL commands at PowerShell to do that we need to import the module sqlps

    PS C:\> Import-Module sqlps
    Now check SQL Server ERRORLOG using Powershell:
    PS SQLSERVER:\> Invoke-Sqlcmd -Query “EXEC
    master.dbo.XP_READERRORLOG 0, 1, N’error’, `
    null,null,null, N’desc’” -ServerInstance “SQL1\INST1”

SQL interview questions 40: I need a solution for both my services and data what the best will be solution available in SQL Server?

There are a few ways:

  • SQL Server 2008 R2: SQL Server Failover Cluster + Database Mirroring/Log shipping/Replication
  • SQL Server 2012/2014: AlwaysOn Failover Clustering + AlwaysOn Availability Groups/Log shipping/Replication
  • SQL Server 2008 R2/2012/2014: Geo-Clustering/Multisite Failover/Active-Active Cluster configuration: In this, we need not use a central storage and us can use individual disks. But we need a disk level replication solution. We can use a third-party solution for replicating data between disks Example: Data Keeper Cluster Edition. Also, the nodes can be in different subnets this is the new feature added in SQL Server 2012.

SQL interview questions 41: Can we implement clustering on a virtualized server?

Yes, we can create failover clusters with virtual servers with VMware or Hyper-V and try SQL Server clustering.

SQL interview questions 42: Can we make a Cluster Node online without Quorum?

We were not able to connect to SQL Server. We tried to connect to Cluster Manager Console even it’s not responding for a long time. We found that SQL Server Instance is offline, and Windows Failover Cluster Instance is offline too.

  • RCA:
  • 2 Node Active/Passive Cluster
  • Node 1 – Active is having 1 vote
  • Node 2 – Stand By is having 1 vote
  • Quorum Disk is having 1 vote
  • Standby node and Quorum Disk went offline
  • Windows Failover Cluster Instance (WFCI) checks that majority of votes are offline
  • WFCI is also going to offline Of course, SQL Server too went offline

Resolution:
We tried to make Windows Failover Cluster Instance Online without Quorum.
Connect to the current Active Node and make sure Cluster Service is not running from services.msc.

If it is running stop it. Connect to Power shell console with Admin rights and import required module FailoverClusters.Start the cluster node using the below command to start cluster node without the quorum.

Start-ClusterNode –Name “WSCLUSTER-Premium” -FixQuorum
(Get-ClusterNode –Name “WSCLUSTER-Premium”). NodeWeight = 1

Once the WSFC is online, clustered instance also online. Then fix the issues on Quorum Disk. Try to bring the other cluster node online.


Rakesh Patil

Rakesh Patil

Data Architect at Phoenix Innovations
I am Rakesh Patil, having 6+ years of experience in SQL database Administration, worked with Top MNCs with major Financial clients. currently working as Data Architect with Phoenix Innovations.
I have hands-on Experience in Performance tunning, Database Disaster recovery,
AlwaysOn,Replication and AWS.
Rakesh Patil
Clustering, SQL interview questions

About Rakesh Patil

I am Rakesh Patil, having 6+ years of experience in SQL database Administration, worked with Top MNCs with major Financial clients. currently working as Data Architect with Phoenix Innovations. I have hands-on Experience in Performance tunning, Database Disaster recovery, AlwaysOn,Replication and AWS.

52 Views