Gerald Britton

How to see the schema of a result set before running a query

November 30, 2016 by

Introduction

Suppose you’ve been asked to run a query against some SQL Server database but you don’t know anything more than the name of a stored procedure that you need to call. If you are going to do anything with the result set, you need to know at least the names and types of the columns to expect. This is actually a problem faced by many applications, including SQL Server Integration Services (SSIS) and SQL Server Reporting Services (SSRS).

In this article, we’ll look at how SSIS and SSRS handle this problem and how you can exploit the solution to your advantage.

The Tried and True Method

If you wanted to discover the schema of the (first) result set returned by a stored procedure, you needed some way of executing the stored procedure without – well – executing it! After all, you want to know what to expect, so that you can set up any data structures you need to receive the results. You need to know that the first column is called “AnInteger” and that it is of datatype integer, the second column is called “ADate” and is a date type, and so on. At the same time, you don’t want to just run the stored procedure, about which you know little. Perhaps it queries local and remote servers, does a massive join and takes an hour to run before the first row is returned! Clearly that is not acceptable.

Before SQL Server 2012, you could achieve this by using the following SET command:

The name implies its function. It ONLY returns the ForMaT of the results, not the results themselves. With this setting, when a stored procedure is executed, SQL Server will follow all possible logic paths until it has some result set. Then, it returns the format of that set. No actual results are returned.

When I said “all possible logic paths” I meant it literally. No actual conditions are checked. That means that if my stored procedure reads, in part:

SQL Server will return the format of the first select! It ignores the condition in its quest for the first result set. In this case, it will inform the caller that the result set consists of two columns called “One” and “SaintNick” respectively, with datatypes integer and character.

This type of call is used in a .NET program, which can then read the results into a data table for further processing. See “How To Retrieve Column Schema by Using the DataReader GetSchemaTable Method” in the Useful Links section for more details.

Here’s a snippet of what that might look like in C#:

After the boilerplate code, you can see that the ExecuteReader method is called with a special flag asking for the KeyInfo. This causes the query to be executed with FMTONLY ON. The results are not a normal result set but are passed as a .NET object which can then be processed with a DataReader using the GetSchemaTable method. The name, data type, size and other attributes are available for each column returned by the query, among other information. Refer to the official documentation for more details.

SQL Server 2012 and Forward

Starting with SQL Server 2012, a new extended stored procedure is available to accomplish the same thing as SET FMTONLY ON.

sp_describe_first_result_set takes a T-SQL batch, containing one or more T-SQL statements (nvarchar(n) or nvarchar(max)) as input and returns a result set with 1 row per column detailing the names and datatypes of each column for the first result set found. As with the previous method, conditions are not checked and all paths are tried until a result set is found.

There is, however, one major difference with the previous method. The column information is returned as a normal result set. This means that you can easily see it in SQL Server Management Studio. For example:

Yields, in part:

There are several more columns available showing attributes such as collation, key information, identity status, case sensitivity and other things.

What’s In It For Me?

At this point, you may be saying, “Hey, this is interesting, but how can I use it? I’m not writing my own ETL engine!”, or something like that. Good question!

Back at the start of this article, I commented, “queries local and remote servers.” Suppose that you are not just executing, but writing a view or stored procedure that does query remote servers. Further, suppose that at least one of those remote servers is not SQL Server. Perhaps it is Oracle, MySQL, DB2 or some other RDBMS. Now imagine that the processing to be done takes minutes or hours. As mentioned before, you don’t want to wait that long just to see the format of the result set. But, what can you do? You have to send the query to the remote machine! You have to await the results!!

Here’s where the FMTONLY trick comes in. (I still call it FMTONLY, even since SQL Server 2012, since it works the same way by testing all code paths.) Let’s start with an arbitrary query to a MySQL database:

If the table has billions of rows and there is no index in column Three, the query may take a while. Plus we have no control over when MySQL starts sending rows back to SQL Server. However we know that we could add a LIMIT clause like this:

This will likely come back instantly. But what’s the point? You need that WHERE clause! Well, suppose your query was inside an Integration Services package or a Reporting Services data set. When you first create the Source component (SSIS) or Data Set (SSRS), those systems will first query the format of the result set. Knowing that that means that the query will be running in FMTONLY mode, you can do something like this:

Here, we test if the query is running in FMTONLY mode. Since we know that in FMTONLY mode, conditions are ignored, the SET statement will be executed, and the query will be altered to remove the WHERE clause and replace it with a LIMIT clause. Then the finished query is sent to MySQL (using OPENQUERY in this case). This should execute quickly, especially since we’re asking that no rows be returned!

The consequence of this sleight of hand is that when you or SSIS or SSRS or one of many other similar programs is asking about the shape of the result set, the query executes quickly. However, when it is run “for real”, the WHERE clause is in effect! This speeds up your development without changing the final result.

Does the Program I’m Using Work This Way?

We know that SSIS works this way but suppose you have a third-party package and do not know (or the vendor won’t tell you) if it uses SET FMTONLY ON or sp_describe_first_result_set? SQL Server Profiler to the rescue!

If I fire up SQL Server Profiler and start a default trace against the target database, I can see all the SQL statements being executed. Then, I look for something like this:

Aha! The evidence is plain to see! The program is using FMTONLY mode and I can use the trick to get fast results when in that mode.

Solving the Problem of Temp Tables

If you use integration services, you’ probably hit this situation. To get data, you execute some stored procedure. However, the procedure is complicated and builds temp tables on its way to producing the finished result set. Here’s a toy example:

Now, let’s call this stored procedure in an Integration Services package:

If I click on the Preview button, I receive a scary-looking error message:

Visual studio cannot preview the results. It’s not that it cannot execute the stored procedure. The problem is that it cannot determine the format of the result set. Since the temp table is created at runtime, the format of the table is unknown when calling the sp_describe_first_result_set stored procedure or using SET FMTONLY ON, in SQL Server 2008. Now, you may think it should be able to figure it out! How hard can it be parsing a CREATE TABLE statement? After all, isn’t DDL one of the things SQL Server is supposed to be good at? Well, imagine a long, convoluted stored procedure that creates several temporary tables, sometimes with run-time-dependent schemas, then joins them together at the end to return a result set. Sound far-fetched? Think again! This kind of thing is actually quite common.

There are, however, limits to what SQL Server can do. Deriving a result set when temporary tables are used is one of its limitations. So, what do we do to make this package work? There are two methods, depending on which version of SQL Server we are running.

If running SQL Server 2008 or earlier, we can change the stored procedure to use the FMTONLY trick! The revised procedure could look like this:

You can see the changes at the beginning of the BEGIN/END section. I test for FMTONLY and if that is the case, return a dummy result set with a format that matches the real result set. Changing the stored procedure this way will enable SSIS to successfully retrieve the result set format.

Since SQL SERVER 2012, a better way is used: WITH RESULT SET. In the OLEDB Source Connector, we write the query like this:

This will enable SSIS to successfully retrieve the format of the result set for systems running SQL Server 2012 and above.

Summary

SET FMTONLY ON was devised as a way to instruct SQL Server to return only the format of the first result set. In SQL Server 2012 and up, this has been superseded with the new sp_describe_first_result_set extended stored procedure.

We’ve seen that this can be useful when you want to short-circuit a long-running query that may use external databases. This is especially useful when running queries from SSIS, SSRS, and many other applications.

Knowing these techniques will save you time and energy whenever you use software that queries the result format or just want to know what a query will give you without actually executing it.

Useful Links

Gerald Britton
Latest posts by Gerald Britton (see all)
168 Views