Ahmad Yaseen

Monitor the Query timeout expired message from a SQL Server Agent job

January 11, 2016 by

SQL Server provides you with a good solution to automate a lot of your administrative tasks using the SQL Server Agent jobs. These jobs are handled from the operating system side by a Windows service that is responsible for executing these jobs and feeding the SQL Server systems tables with the metadata about these jobs. The system database that is used by the SQL Server Agent for the job management called the msdb database. All information related to the job steps, schedules and the history can be found in the msdb database tables. The msdb system database is also responsible for the SQL Server Mail, Service Broker, SQL Server Maintenance Plans and the databases backup history.

The msdb database tables that store information about the SQL Server Agent jobs are: the dbo.sysjobactivity which contains information about the SQL Server Agent job activates, the dbo.sysjobhistory table that keeps information about the execution history for the agent jobs , the dbo.sysjobs table that contains information about all the SQL Agent jobs, the dbo.sysjobschedules table that stores information about the schedules for all SQL Agent job, the dbo.sysjobservers table that contains the association of the SQL Agent jobs with the target servers, the dbo.sysjobsteps table that stores information about the SQL Agent jobs steps and the dbo.sysjobstepslogs table that stores logs about the SQL Agent jobs steps.

There are three main database fixed roles in the msdb database that control the access to the SQL Server Agent. The first role is SQLAgentUserRole that is least privileged one in the msdb database. This role members have access on the local jobs and local job schedules that they owned only. This role members can see only the Jobs node from the SQL Server Agent.

The second role is SQLAgentReaderRole, that members have access to see all the jobs and job schedules they owned and the ones they don’t own. But they can’t change on the jobs and schedules that they don’t own. This role members can see only the Jobs node from the SQL Server Agent.

The last role is SQLAgentOperatorRole, which is the most privileged role in the msdb database. The members of this role can start and stop the execution of the local jobs, enable and disable it, and they have the ability to delete the jobs history for these local jobs. But they are not able to change in the jobs that they don’t own. This role members can see all the SQL Server Agent nodex except the Error Logs one.

How could we get the correct job result on the step level?

One of the DBA’s daily tasks is checking the SQL Server Agent jobs that are scheduled at night and report back any failure to the corresponding system owners. A system owner reported to me that her scheduled job failed last night even though the evidence indicates that it was completed successfully. To get to the bottom of this I went through the job steps, and I found out that the job contains remote query that is using linked server and it gave Query Timeout Expired error.

Doing the below steps, I was under the impression that the system owner’s job was completed successfully, even though at later stage I found out the opposite.

The result of the SQL server agent jobs can be checked by drilling down the job to View History as below:

The job history window will be displayed showing valuable information about the last few runs of the chosen scheduled job including Date, Server, Job Name, Result Message and the Duration in seconds:

As shown in the snapshot above, the last run of our job succeeded. But the system’s owner complained that the job failed?

Let’s expand the job log to show the result of each step as below:

The step’s result still misleading us by showing the green tick sign beside the step, showing that the step also completed successfully. Going through the step result message carefully, we are shocked with this message: “Query timeout expired". [SQLSTATE 01000] (Message 7412).  The step succeeded.”. This means that the query duration exceeded the server’s remote query timeout duration, raising this message from the SQL linked server OLE DB provider.

SQL Server remote query is a query that contains outgoing connection to a remote database. A SQL server parameter called remote query timeout is used to decide how long a remote query will take before initiating timeout message. The default remote query timeout value is 600 seconds. Setting the value to 0 will disable the timeout, so the query will wait until it is canceled.

If there is a network latency between you server and the remote SQL server, you can change this value to resolve the query timeout issue from the Connections tab of the Server Properties dialog box below:

Our main concern here, how we could have an automated way to trace the query timeout message within the SQL Agent job steps result in our case, or any failure message in your case, in order to get a better indication of the job status.

Let’s try first to view the job steps result that contains query timeout message, which is stored in the sysjobhistory and sysjobs tables from the msdb system database:

Executing the query, the result will be as follows:

To be initiative and to avoid future complains related to the same problem, it is better to automate this process and review it as a part of your daily check. To achieve that, we will create a table that will host the query results, modify the previous script to fill that table with the previous day’s jobs result.

Let’s start with the table creation using the simple T-SQL script below:

Once the table is ready, we will modify our script to insert into that table, any query timeout message during the last day. It is better to schedule the below script as a SQL Agent job that will be run the first thing in the morning, to make sure that all the scheduled jobs are finished:

What is required from your side now, is a simple select statement from that table, every morning, to make sure that no query timeout occurred last night on your scheduled jobs. Also you can have more advanced automated way to send you the content of that table by email. You can also modify the previous script to trace any type of misleading messages you may face by replacing the query timeout message in LIKE statement inside WHERE clause.

You can simulate a test scenario for the “Query Timeout Expired” problem by creating a SQL stored procedure in the remote database, including WAITFOR delay exceeding the default 10 minutes.

Ahmad Yaseen
Latest posts by Ahmad Yaseen (see all)
Jobs, Maintenance, Monitoring

About Ahmad Yaseen

Ahmad Yaseen is a Microsoft Big Data engineer with deep knowledge and experience in SQL BI, SQL Server Database Administration and Development fields. He is a Microsoft Certified Solution Expert in Data Management and Analytics, Microsoft Certified Solution Associate in SQL Database Administration and Development, Azure Developer Associate and Microsoft Certified Trainer. Also, he is contributing with his SQL tips in many blogs. View all posts by Ahmad Yaseen

168 Views