SQL Server Setup – Database Alerts

August 31, 2018 by James Rhoat

Why should you enable alerts

While the article “How to create and configure SQL Server Agent Alerts” by SQLShack author Minette Steynberg discusses the features of alerting through SQL Server Agent and some conditions for testing, it does not go through some of the common alerts that you should have enabled on your SQL Server.

To give a brief overview, SQL Server Agent alerts can monitor important events or conditions in three different ways:

  • SQL Server event alerts – alerts when a specific SQL Server event occurs
  • SQL Server performance counters – alerts when a performance counter reaches the specified threshold
  • WMI events – alerts when a specific WMI event occurs

This is important to make sure you are properly keeping an eye on your SQL Server. In my opinion, monitoring WMI events or performance counters with SQL Server is an expensive substitute for a monitoring solution. Thus, we will be focusing on the SQL Server event alerts.

Important Severity Alerts

When implementing this, it is common practice for DBAs to enable alerts for severity 17 and higher on their SQL Servers, because these errors cannot be corrected by end users. Again, I believe everyone should have a monitoring solution in place, even if it only monitors resources on the machine; for this reason, I only enable alerts for severity 18 and above. If you don’t have a monitoring solution, though, please enable alerts for severity 17 as well. Severity 17 indicates that a statement caused SQL Server to run out of resources.

Full documentation around the severities can be found here.

| Error Severity | What it indicates |
|---|---|
| 18 | There is a problem with the database engine software. |
| 19 | Nonconfigurable Database engine limits were exceeded, and the batch was terminated. |
| 20 | A statement has encountered a problem with the current task, unlikely to cause damage to the database itself. |
| 21 | A problem was encountered that affects all tasks in the database, unlikely to cause damage to the database itself. |
| 22 | The table or index specified in the message has been damaged by a software or hardware problem. |
| 23 | The integrity of the entire database is in question because of a hardware or software problem. |
| 24 | Media failure. Most likely means a restore of the database and a call to your hardware vendor. |
| 25 | Unexpected errors, this is the catch all for Microsoft SQL Server. |

As you can see from the descriptions in the table, these errors typically take your machine offline and can even be the result of serious corruption and loss of data.

Error messages

Next, you should also set up alerts for error messages 823, 824 and 825. These are signs that your underlying storage system is having issues and should be investigated by your system administrator and hardware vendor. Additionally, if you receive these messages as a DBA, you should check the suspect pages table in SQL Server and run DBCC CHECKDB to confirm the state of your database. To query your suspect pages table, use the query below; more details about the event types can be found here.
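A minimal sketch of such a query, assuming the standard `msdb.dbo.suspect_pages` table; the column list and sort order are an illustrative choice, and `YourDatabase` in the commented follow-up is a hypothetical name:

```sql
-- List every page SQL Server has flagged as suspect, newest first.
SELECT
    DB_NAME(sp.database_id) AS database_name,
    sp.file_id,
    sp.page_id,
    sp.event_type,          -- 1 = 823/824 error, 2 = bad checksum, 3 = torn page,
                            -- 4 = restored, 5 = repaired, 7 = deallocated by DBCC
    sp.error_count,
    sp.last_update_date
FROM msdb.dbo.suspect_pages AS sp
ORDER BY sp.last_update_date DESC;

-- Follow-up consistency check (hypothetical database name):
-- DBCC CHECKDB (N'YourDatabase') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```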

On a healthy server, this query returns a nice empty table.

However, to monitor this daily or more frequently, you can use a SQL Server Agent job to check this table and confirm that the count is zero and, if it is not, send an email using Database Mail.
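The job step for such a check could look roughly like the sketch below. It assumes Database Mail is already configured; the profile name `DBA Mail Profile` and the recipient address are hypothetical placeholders, and filtering on event types 1 to 3 (the unresolved ones) is my assumption rather than part of the original check.

```sql
-- Count pages that are still flagged as damaged (not yet restored or repaired).
DECLARE @SuspectPageCount int;

SELECT @SuspectPageCount = COUNT(*)
FROM msdb.dbo.suspect_pages
WHERE event_type IN (1, 2, 3);   -- 823/824 error, bad checksum, torn page

IF @SuspectPageCount > 0
BEGIN
    DECLARE @Subject nvarchar(255) =
        N'Suspect pages found on ' + @@SERVERNAME;
    DECLARE @Body nvarchar(max) =
        CAST(@SuspectPageCount AS nvarchar(10))
        + N' row(s) in msdb.dbo.suspect_pages need attention.';

    EXEC msdb.dbo.sp_send_dbmail
        @profile_name = N'DBA Mail Profile',       -- hypothetical profile name
        @recipients   = N'dba-team@example.com',   -- hypothetical recipient
        @subject      = @Subject,
        @body         = @Body;
END;
```

Scheduled as a T-SQL job step, this sends nothing while the table stays clean and one email per run once a damaged page is recorded.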

SQL Server logs all high-severity issues and error messages 823, 824 and 825 in the SQL Server error log. So, if the error does not destroy your database, you will be able to get more details about what happened there. If your database is lost, check the Windows Event Viewer application log for entries from the SQL Server source. It should contain the same errors with the same, or roughly the same, information.
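One way to search the current error log from a query window is sketched here. It relies on `sp_readerrorlog`, which ships with SQL Server but is not formally documented, so treat the parameter meanings in the comments as commonly observed behavior rather than a guaranteed contract.

```sql
-- Parameters: log file number (0 = current), log type (1 = SQL Server error log),
-- and a search string. The trailing comma avoids matching longer error numbers.
EXEC sys.sp_readerrorlog 0, 1, N'Error: 823,';
EXEC sys.sp_readerrorlog 0, 1, N'Error: 824,';
EXEC sys.sp_readerrorlog 0, 1, N'Error: 825,';
```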

Building on this topic a bit more, since we mentioned potential corruption: it is something that can happen to anyone, and these alerts will not prevent corruption of your database. You may then ask, what can we do to prevent it? Sadly, there is no good answer; you really can’t prevent it. You may just be unlucky enough that the first error is the worst one and it corrupts your database. For example, if the damage is to your boot page (1:9), you have no choice but to restore from a backup. On top of that, corruption can come from memory, a bad checksum on a page or your disk.

However, monitoring these errors may give you a head start to plan a migration to a new disk subsystem or to work with the vendor before you run into a bigger problem. This sounds like a ton of work, but it is less painful than recovering from corruption or restoring from backup when your system is down.

To summarize at a high level, alerts should be created for:

  • Events with Severity >= 18 if you have a monitoring solution, but if you don’t, enable 17 and above alerts
    • These are high-severity errors that should be investigated by the system administrator/DBA
  • Errors 823, 824 and 825 (I/O errors and read-retry warnings)
    • These errors spell doom for your disk subsystem

Finally, as a word of warning, always validate your backups. There is a saying I have taken a liking to over the years: your last backup is only as good as the one you verified you can restore. There are many spins on this saying, but it is true: if you don’t know it will work, you are flying blind. Secondly, when you receive one of these errors, take it seriously, especially when speaking to hardware vendors and your system admins. Make sure they understand the gravity of these errors, and stress the importance of acting on them. After all, it could very well keep you from updating your resume and leaving town.
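As a first-pass check, a backup file can be verified without restoring it, as in the sketch below. The file path is a hypothetical placeholder, and `WITH CHECKSUM` assumes the backup was taken with checksums (the statement raises an error if it was not). This only confirms the backup set is complete and readable; it is not a substitute for an actual test restore.

```sql
-- Verify that the backup set is readable and its page checksums are valid.
RESTORE VERIFYONLY
FROM DISK = N'D:\Backups\YourDatabase_full.bak'   -- hypothetical path
WITH CHECKSUM;
```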

To quickly enable all these alerts and severities, you can run the following code block. It gives you an easy way to customize the alert names and the Database Mail operator name without configuring each alert manually. Before running it, validate that the settings are populated as you expect and add anything else you need in the other fields, and make sure you want these alerts to go to email instead of that old pager in the office somewhere (wink).
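A minimal sketch of what such a script can look like follows. It assumes a SQL Server Agent operator already exists; the operator name `DBA Team`, the `Alert - ` name prefix and the 60-second delay between responses are hypothetical values to replace with your own.

```sql
DECLARE @OperatorName sysname      = N'DBA Team';   -- hypothetical operator name
DECLARE @AlertPrefix  nvarchar(50) = N'Alert - ';   -- hypothetical naming convention
DECLARE @AlertName    sysname;

-- Severity alerts: 18 through 25 (start at 17 if you have no monitoring solution).
DECLARE @Severity int = 18;
WHILE @Severity <= 25
BEGIN
    SET @AlertName = @AlertPrefix + N'Severity ' + CAST(@Severity AS nvarchar(3));

    IF NOT EXISTS (SELECT 1 FROM msdb.dbo.sysalerts WHERE name = @AlertName)
    BEGIN
        EXEC msdb.dbo.sp_add_alert
            @name = @AlertName,
            @message_id = 0,                     -- 0 = alert on severity, not message
            @severity = @Severity,
            @enabled = 1,
            @delay_between_responses = 60,       -- seconds; limits email floods
            @include_event_description_in = 1;   -- 1 = include error text in email

        EXEC msdb.dbo.sp_add_notification
            @alert_name = @AlertName,
            @operator_name = @OperatorName,
            @notification_method = 1;            -- 1 = email
    END;

    SET @Severity += 1;
END;

-- Error message alerts: 823, 824 and 825.
DECLARE @MessageId int = 823;
WHILE @MessageId <= 825
BEGIN
    SET @AlertName = @AlertPrefix + N'Error ' + CAST(@MessageId AS nvarchar(10));

    IF NOT EXISTS (SELECT 1 FROM msdb.dbo.sysalerts WHERE name = @AlertName)
    BEGIN
        EXEC msdb.dbo.sp_add_alert
            @name = @AlertName,
            @message_id = @MessageId,
            @severity = 0,                       -- must be 0 when message_id is set
            @enabled = 1,
            @delay_between_responses = 60,
            @include_event_description_in = 1;

        EXEC msdb.dbo.sp_add_notification
            @alert_name = @AlertName,
            @operator_name = @OperatorName,
            @notification_method = 1;
    END;

    SET @MessageId += 1;
END;

-- Optional test on a non-production server (requires sysadmin or ALTER TRACE):
-- RAISERROR (N'Test of severity 18 alert', 18, 1) WITH LOG;
```

The `IF NOT EXISTS` guards make the script safe to rerun, since `sp_add_alert` fails when an alert with the same name is already defined.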


FAQs

What alerts should I enable?

High severity alerts, along with error messages 823, 824 and 825.

How can I quickly enable Database mail alerts in SQL Server?

Run the script in the article above.

Where are Database alerts logged?

High-severity errors are logged to the Windows application log (visible in Event Viewer) and to the SQL Server error log. Read more in the article above.


About James Rhoat

I am a healthcare information IT professional with a passion for SQL Server and other data technologies. I have two bachelor’s degrees, the first in business administration and the second in management information systems with a specialty in business intelligence. I have grown from a support specialist for an electronic medical record company to a cloud engineer who is the certified system administrator of the business intelligence platform (Qlik Sense). However, my heart still lies with SQL Server, as it is what I polished my skills on. My curious nature leads me to learn about different methodologies for accomplishing tasks more efficiently without compromising on quality. This does tend to lead one down the rabbit hole, but it often ends in valuable experience that I enjoy sharing with anyone willing to take the time. You can find me on LinkedIn.
