Sifiso Ndlovu

Introduction to pagination in SQL Server

May 12, 2014 by
Microsoft SQL Server distinguishes between paging and pagination. Paging refers to handling of memory bottlenecks whereas pagination, the focus of this article, refers to dividing T-SQL query result set into discrete parts. The general concept of Pagination is prevalent in various professional disciplines (i.e. journalism, publishing, web development etc.). According to Wikipedia Pagination is the process of dividing content (i.e. website search results, newspaper article etc.) into separate yet related pages. The major benefit to such dividing of content is usually a cleaner and clearer structuring of content which enhances the reading (or browser-navigation) experience to the consumer of such content.

An example of pagination in a website search results is illustrated in Figure S1 which shows a Microsoft Bing search engine results regarding a search for “ApexSQL”. The numbers at the bottom of image indicate to the user that there are further search results regarding “ApexSQL”. It is then up to the user to navigate to more of these search result pages.

Figure S1
Microsoft Bing search engine results regarding a search for ApexSQL

Pagination in non-Microsoft database platforms

In most database platforms, pagination refers to reducing the number of records from a structured query language (SEQUEL or SQL) query result sets. Similarly, the implementation of pagination has been mostly consistent across database platforms. For instance, most database platforms implements SQL result set pagination by using the OFFSET/FETCH and LIMIT clauses.

The following sub-section provides examples of how the OFFSET/FETCH and LIMIT clauses is used in database platforms such as IBM’s DB2, MySQL, Oracle and PostgreSQL.

  1. Pagination using the OFFSET/FETCH clause in Oracle and DB2

    Figure S2 shows a sample pagination SQL query in Oracle:

    Figure S2
    Figure showing a sample pagination SQL query in Oracle

    DB2 has different implementation of pagination: one method is by specifying the OFFSET clause (i.e. as shown in Figure S3) whereas another method is by using only the LIMIT clause with two comma-separated values – as shown in Figure S4:

    Figure S3
    Specifying the OFFSET clause
    Figure S4
    Using only the LIMIT clause with two comma-separated values
  2. Pagination using the LIMIT clause in MySQL and PostgreSQL

    Figure S5 shows a sample pagination of MySQL query:

    Figure S5
    Sample pagination of MySQL query

    Finally, Figure S6 shows a sample pagination script in PostgreSQL:

    Figure S6
    Sample pagination script in PostgreSQL

Pagination in Microsoft SQL Server

Unlike other database platforms that conveniently handled pagination through the OFFSET/FETCH and LIMIT clauses, you’ve had to jump through a lot of hoops (i.e. use ranking functions) in order to get the pagination of query result sets to work in SQL Server. The following sub-section takes a closer look at some of the pagination work-around options that have been implemented in SQL Server prior to the SQL Server 2012 version.

  1. Pagination-Work Around in SQL Server Versions Prior to SQL Server 2012

    A popular work-around technique to paginating result sets in SQL Server has been by using the ROW_NUMBER function. The ROW_NUMBER function, like many of its siblings (i.e. RANK etc.), is a window function that returns a sequential integer value per row of a given partition of result set. The windows function was introduced in the ISO SQL:2003 revision of SQL database query language. Figure S7 shows a T-SQL query that returns the first 10 rows from SalesOrderDetail table of sample AdventureWorks2012 database.

    Figure S7
    Figure showing a T-SQL query that returns the first 10 rows from the SalesOrderDetail table of sample AdventureWorks2012 database

  2. Pagination in SQL Server 2012 Using the OFFSET-FETCH Filter

    The OFFSET-FETCH filter is Microsoft’s implementation of the ISO SQL:2008 standardised FETCH FIRST clause. The standardised clause works by limiting the result set in a given SQL query. Prior to the introduction of OFFSET-FETCH filter, Microsoft had introduced the TOP clause for limiting the result set of a T-SQL query. The syntax for using the OFFSET-FETCH clause is as follows:

    Figure S8
    The syntax for using the OFFSET-FETCH clause

    A detail description of the parameters of the [<offset_fetch>] argument is available in the Microsoft TechNet site.

    The OFFSET and the FETCH clauses have different usages in the OFFSET-FETCH filter. For instance, the OFFSET argument is used to indicate the number of rows to omit whereas the FETCH clause is used to indicate the number of rows that ought to be retrieved after the OFFSET is applied. Figure S9 shows a T-SQL query that extracts the top 10 rows off the SalesOrderDetail table.

    Figure S9
    Figure showing a T-SQL query that extracts the top 10 rows off the SalesOrderDetail table

    Similarly to T-SQL functions, there are always terms and conditions to successfully using the OFFSET-FETCH filter. Some of the rules that you ought to bear in mind when applying the OFFSET-FETCH filter are as follows:

    1. Rule Number 1: You can’t have it both ways

      Since the OFFSET-FETCH clause was introduced as an alternative to the TOP clause, it makes perfect sense then that you are not allowed to have both the OFFSET-FETCH and TOP clauses in the same T-SQL query.

      Should you decide to go ahead (thus, defying Rule Number 1) and implement both the OFFSET-FETCH and TOP clauses in the same T-SQL query then you are likely to run into an error message as shown Figure S10.

      Figure S10
      An error message shown when trying to implement both the OFFSET-FETCH and TOP clauses in the same T-SQL query

    2. Rule Number 2: Mind your Data Types

      If you want to successfully paginate your results in SQL Server 2012 then you need to comply with data type usage: the integer or expression values for the OFFSET arguments are not allowed to be less than zero (i.e. negative numbers) and the integer or expression values for the FETCH are not allowed to be less than one (1).

      Figure S11 illustrates what would when happen when incorrect integer values are provided in an OFFSET-FETCH filter. In this example, I asked SQL Server to skip negative one (-1) rows.

      Figure S11
      Error message thrown when incorrect integer values are provided in an OFFSET-FETCH filter

      In Figure S12 I have asked SQL Server to skip the first 5 rows and then fetch nothing (zero rows). SQL Server stands up to me and questions why I would want zero rows to be returned especially after I had asked it to skip the first 5.

      Figure S12
      Error message thrown when asking SQL Server to skip the first 5 rows and then fetch zero rows

    3. Rule Number 3: You can’t have one without the other

      For a T-SQL query with an OFFSET-FETCH filter to successfully parse, both the OFFSET and FETCH clause should be included in a given query. In scenarios whereas no rows are intended to be omitted then zero (0) should be provided along with the OFFSET argument. For instance Figure S13 shows an incorrect implementation of fetching 10 records without skipping any records whereas Figure S14 shows the correct implementation of fetching 10 records without skipping any records.

      Figure S13
      Incorrect implementation of fetching 10 records without skipping any records
      Figure S14
      Correct implementation of fetching 10 records without skipping any records
    4. Rule Number 4: Without ORDER, you are nothing

      If you want to successfully paginate your results in SQL Server 2012 then you need to include the ORDER by clause. Figure S15 shows the error message that SQL Server 2012 will return whenever an ORDER By clause is omitted in a T-SQL query that ought to paginate a given result set.

      Figure S15
      Figure showing the error message returned whenever an ORDER By clause is omitted in a T-SQL query that ought to paginate a given result set

    5. Rule Number 5: Know your ORDER

      The OFFSET-FETCH filter should always succeed the ORDER BY clause and not the other way around. Figure S16 shows the error message that SQL Server 2012 will return whenever an ORDER By clause is forced to succeed the OFFSET-FETCH filter in a T-SQL query that ought to paginate a given result set.

      Figure S16
      Error message returned by SQL Server 2012 when an ORDER By clause is forced to succeed the OFFSET-FETCH filter in a T-SQL query that ought to paginate a given result set

    6. Rule Number 6: These Interchangeable Clauses Still Do the Same Thing

      The OFFSET-FETCH filter does not dictate on whether ROW or ROWS argument should be used – anyone of the aforementioned arguments are valid. However, it would grammatically be correct to use ROW instead of ROWS when you intend fetching/skipping one (1) row. Both the syntax used in Figure S17 and Figure S18 is correct but if you are very pedantic about the usage of English grammar in T-SQL – the syntax shown in Figure S18 should be most desirable for you.

      Figure S17
      Figure showing sample of the correct SQL syntax
      Figure S18
      Figure showing a sample of grammatically most desireable and correct SQL syntax

      Furthermore, usage of FIRST and NEXT clauses is interchangeable in a T-SQL query with OFFSET-FETCH filter. Consequently, both syntaxes in Figure S19 and Figure S20 are valid.

      Figure S19
      An example of a valid SQL syntax using the FIRST clause

      Figure S20
      An example of a valid SQL syntax using the NEXT clause

The query execution plan of a paginated T-SQL query result set

In SQL Server, the query execution plan illustrates the steps undertaken by the SQL Server query optimiser to execute a given T-SQL query. In this section we will take a look at the execution plan of T-SQL queries that use the OFFSET-FETCH filter versus the queries that uses the TOP and the ROW_NUMBER clauses. For the purposes of this query execution demo, I have started off by clearing all existing cached entries. SQL Server 2012 permits the use of Database Console Command (DBCC) to clear the cache. Figure S21 shows the sample script that we have ran to clear all plan cache.

Figure S21
Figure showing sample script for clearing all plan cache

  1. The Execution Plan of a Paginated T-SQL Query Result Set: OFFSET-FILTER filter vs TOP Clause

    Remember that the OFFSET-FETCH has been introduced in SQL Server 2012 as an alternative against the usage of the TOP clause? So how does SQL Server handle a T-SQL query that uses a TOP clause versus a T-SQL that uses OFFSET-FETCH clause? Well, it would seem that the execution plan of a T-SQL query that uses both these data limiting clauses (i.e. TOP and OFFSET-FETCH clauses) is the same.

    Figure S22 and Figure S23 illustrate execution plans for T-SQL queries that use a TOP and OFFSET-FETCH clause, respectively. You will notice that similar steps are taken to return a result set – regardless of the fact that one query uses TOP clause and the other uses the OFFSET-FETCH clause.

    Figure S22
    Figure illustrating execution plans for T-SQL queries using a TOP clause

    Figure S23
    Figure illustrating execution plans for T-SQL queries using an OFFSET-FETCH clause

  2. The Execution Plan of a Paginated T-SQL Query Result Set: OFFSET-FILTER filter vs ROW_NUMBER Clause

    Unlike the similar execution plans of OFFSET-FETCH vs TOP clause, the execution plan for a T-SQL query that uses the OFFSET-FETCH filter is different from a T-SQL query that uses the ROW_NUMBER ranking function. Figure S24 and Figure S25 show the T-SQL pagination queries that have been implemented via the ROW_NUMBER and OFFSET-FETCH clauses respectively. The queries are basically returning the top 10001 rows after skipping the first 100000 rows.

    Figure S24
    T-SQL pagination queries that have been implemented via the ROW_NUMBER clause

    Figure S25
    T-SQL pagination queries that have been implemented via the OFFSET-FETCH clause

    Prior to looking at the execution plans of both queries, I have analysed the IO and Time statistics of both queries. Figure S26 and Figure S27 show that for both queries, SQL Server reads data pages from a cache (logical read) instead of a hard disk (physical read). Figure S26 shows that the T-SQL pagination query that uses the OFFSET-FETCH filter performs less logical reads to extract the data from the SalesOrderDetail object (i.e. 15 logical reads) as compared to the query that uses the ranking function (i.e. 1371 logical reads). Noticeably too are the differences in the scan counts that are done: the plan of the query that uses the ranking function causes the SalesOrderDetail object to be read 5 times as compared to a single scan which is performed by the query that uses OFFSET-FETCH filter.

    Figure S26
    Figure shownig that the T-SQL pagination query that uses the OFFSET-FETCH filter performs less logical reads

    Figure S27
    Figure shownig that the T-SQL pagination query that uses the ROW_NUMBER ranking function performs more logical reads

    Finally, Figure S28 shows that the graphical execution plan for paginating a T-SQL query result set that uses the OFFSET-FETCH filter is significantly different from the plan (Figure S29a and Figure S29b – the image of the plan was dissected into two parts so to fit into the width of this article) used for the result set that uses the ROW_NUMBER ranking function. Some of the significant differences are as follows:

    1. The first thing that stands out between the two graphical execution plans is the number of total icons generated per query. There were more graphical icons (thus, more operations) generated for executing a query that uses the ROW_NUMBER clause as compared to the one that uses the OFFSET-FETCH filter. Remember, and as Tim Ford highlights this point – Despite most operations in the query that uses the ROW_NUMBER clause having a cost of zero percent – there was still some time incurred to carry out those operations.
    2. The clustered index scan performed by the plan that uses the OFFSET-FETCH clause costs 2% less than the one conducted by the ROW_NUMBER function
    3. 16 percent of the overall execution plan for the query that uses the OFFSET-FETCH filter went to parallel query plan operation (i.e. Gather Streams) whereas the other query incurred less costs of parallelism.
    4. For both queries, the majority of the costs were incurred during sorting the result sets. However, the query that uses the ROW_NUMBER clause incurred 12 percent more to sort the result sets as compared to the other query

      Figure S28
      Execution plan for query using an OFFSET-FETCH filter

      Figure S29a
      Execution plan for query using ROW_NUMBER function

      Figure S29b
      Execution plan for sorting the result sets for query using ROW_NUMBER function

Conclusion

The notion of pagination has been around in many forms and across different professional bodies. In Structured Query Language, the ISO standard on SQL has provided some level of consistency in as far as the implementation of pagination across database platforms is concerned. Paginating a T-SQL query result set in SQL Server 2012 has provided a much more flexible alternative to the TOP clause. However, for a successful implementation of the pagination feature in SQL Server 2012 – rules governing the usage of the OFFSET-FETCH filter should be obeyed. Finally, in terms of the execution plan, SQL Server 2012 handles the queries that implement the OFFSET-FETCH filter similarly to those T-SQL queries that uses the TOP clause but the cost of a query that uses OFFSET-FETCH filter is significantly less compared to a T-SQL pagination query that uses the ROW_NUMBER ranking function.

Sifiso Ndlovu
168 Views