In this 35th article in SQL Server Always On Availability Groups, we will implement a custom policy for monitoring AG groups.
The previous article, The Hub and Spoke model of Policy-based Management for SQL Server Always On Availability Group explores the following topics:
- Policy-Based Management (PBM) components
- AG replica health model
- Explore PBM policy categories for Always On
- Evaluate policy and view conditions states
- Integrate the AG dashboard and its equivalent PBM policy
SQL Server integrates a few systems defined PBM policy, conditions for monitoring purposes. However, you might want to customize the AG dashboard or AG health model as per your requirements. Let’s begin our journey for a custom AG policy.
The Custom Policy for SQL Server Always On Availability Groups
Suppose you implemented a two-node synchronous commit replica for your environment. In the synchronous replica, primary waits for transaction commit acknowledgement from the secondary replica. In this case, your secondary asynchronous replica might be lagging depending upon your transactions, network bandwidths.
You do not want your secondary replica to lag behind more than 30 minutes. If it lags, the PBM policy evaluation should fail. The AG dashboard should show a warning in the AG dashboard regarding the sync lag. Similarly, this policy also evaluates whether the estimated recovery time is less than 5 minutes or not.
Define the conditions
Previously, we learned about the different parts of defining policy in the hub-spoke model of policy-based evaluation for the SQL Server. It starts with defining the conditions that your policy will evaluate. To define a condition, connect to the primary replica in SSMS and navigate to Management-> Policy Management->Conditions. Here, right-click on Conditions and select New Condition.
In the Create New Condition window, enter the following values.
- Name: Specify a condition name
Facet: As we described earlier, a facet is a set of logical properties. We need to select the facet to define the expressions for a condition. You can browse the available facet and choose the properties that suit your requirements
For our demonstration, we use the Database Replica State that contains the properties for an AG database replica
- Expression: Here, we define the Boolean expressions for policy evaluation. From the field section, select the field, operator and specify the value for your condition. During the policy evaluation, if the conditions are not met (The boolean expression is false), it returns an error
You can specify the optional description to describe the conditions for your team members.
Click Ok, and it creates the conditions for PBM, as shown below.
Create a Policy for the defined condition
Once we have defined the condition, we can create a policy for PBM. To create the policy, in the primary replica SSMS connection, navigate to Management->Policy Management-> Policies. Right-click on the Policies and choose New Policy.
Enter the following values for a PBM policy:
- Name: Enter a policy name
- Check Condition: In the check condition column, select the defined policy from the drop-down. It automatically selects the PBM targets
- Evaluation mode: By default, the policy evaluation mode is On Demand. We can schedule it as well, but for the AG dashboard, it is acceptable to run while we run the evaluation
Now, click on the description from the left-hand menu. In this page, we can select in case of policy failure, AG dashboard displays it as an error or warning. Here, select the appropriate category for the PBM policy. As we want to use a warning for policy evaluation failure, we select the Availability database warnings category.
Put an optional description and text to display. You can put a familiar message or the suggested steps to take in case of policy failure.
It also allows you to specify a URL. For example, you defined policy, and in case of a policy failure, you want to follow specific steps from a specified URL, enter the URL address here.
Click Ok, and it creates a policy for SQL Server Always On Availability Group as shown below.
Let’s evaluate the policy manually. It evaluates successfully, as shown below.
Click on the view to get details of the conditions to check. In the detailed view, we see the followings:
- Boolean expression expected and actual value. For a successful policy evaluation, the actual value should meet with the expected value or range
- Policy description: Here, it shows the text we entered in the policy description
- Additional help: In the additional help, it shows the text to display string. If you specified a URL in the Policy address field, it shows a hyperlink in the additional help section
Enable User-Defined Policies in Availability Group dashboard
By default, SQL Server Always On Availability Group dashboard, runs the system defined policies that we explained earlier. Now, to implement an additional user-defined policy, we need to modify the properties in SQL Server Management Studio.
Navigate to SSMS Tools->Options -> SQL Server Always On. In this option, put a tick on the enable user-defined Always On Policy checkbox as highlighted below.
It implements the user-defined policy for SQL Server Always On Availability Group. You can refresh the AG dashboard or relaunch the dashboard to see it in action.
User-defined policy in action
For demonstration purposes, I modify the condition so that it fails the evaluation intentionally and displays a warning in the AG dashboard.
The updated policy conditions are shown below. Ideally, the estimated data loss should be zero for a healthy AG replica. In this case, we modified the condition that policy evaluates successfully if it is less than zero.
Click Ok and evaluate the policy manually. It fails because policy expects the estimated data loss less than zero, but the actual value is zero.
Now, launch the AG dashboard, and you see a warning symbol for the secondary replica.
Click on the warnings(1), and it shows the policy evaluation result. It failed, and you get the description for that we specified during policy creation. It is the reason you should specify an accurate description so that DBA’s can read it and take action accordingly.
AG Failover and Custom policies for SQL Server Always On Availability Groups
It would be best if you defined the custom policies on all participating nodes in SQL Server Always On Availability Group replicas. Let’s perform a manual failover and verify if the custom policies are available on the secondary replica (after failover new primary replica) as well.
Before failover changes the synchronization to Synchronous mode and initiates a manual failover as shown below, the new primary replica is SQLNode2\INST1.
My AG dashboard looks healthy after the failover, and we do not have the custom policies in the new primary replica instance SQLNode2\INST1.
It shows that if we deploy a custom policy, you need to implement on all participating replicas in SQL Server Always On Availability Group.
We can still connect to the old primary replica (current secondary replica) and export the policy. Right-click on the policy you wish to export and click on Export Policy…
It generates an XML for the policy.
Click on Save and copy this policy XML file into the new primary replica. In the new primary replica, right-click on Policies and Import Policy.
Specify the policy path and import it with the default options. You can choose to override or replace duplicates with items imported.
Now, you get the policy and its defined condition in the new primary replica, as shown below.
You might need to change the SSMS configuration for the user-defined policy in the AG dashboard if you are connecting it from a different SSMS.
Now, relaunch the AG dashboard. We see that despite the no data loss in the failover readiness, we see the secondary replica shows a warning. It is due to our policy that we intentionally modified to generate a warning in the AG dashboard.
In this article, we explored defining and deploying a custom policy to monitor the SQL Server Always On Availability Group using the AG dashboard. You can define the policies and their conditions as per your requirement and deploy to satisfy your requirement. You should carefully define these policies and conditions as the irrelevant policy can increase your team efforts irrespective of AG is working fine or not.