Prashanth Jayaram

Overview of the SQL ROW_NUMBER function

November 13, 2018 by

In this article, we’re going to discuss the SQL ROW_NUMBER function. This is a continuation of the SQL essential series. In this guide, I’ll explain what a window function is all about, and you’ll see sample examples to understand the concepts behind the SQL ROW_NUMBER function.

Introduction

The most commonly used function in SQL Server is the SQL ROW_NUMBER function. The SQL ROW_NUMBER function is available from SQL Server 2005 and later versions.

ROW_NUMBER adds a unique incrementing number to the results grid. The order, in which the row numbers are applied, is determined by the ORDER BY expression. Most of the time, one or more columns are specified in the ORDER BY expression, but it’s possible to use more complex expressions or even a sub-query. So, it creates an ever-increasing integral value and it always starts off at 1 and subsequent rows get the next higher value.

You can also use it with a PARTITION BY clause. But when it crosses a partition limit or boundary, it resets the counter and starts from 1. So, the partition may have values 1, 2, 3, and so on and the second partitions again start the counter from 1, 2, 3… and so on, and so forth.

Basics:

  1. The SQL ROW_NUMBER function is a non-persistent generation of a sequence of temporary values and it is calculated dynamically when then the query is executed.
  2. There is no guarantee that the rows returned by a SQL query using the SQL ROW_NUMBER function will be ordered exactly the same with each execution.
  3. ROW_NUMBER and RANK functions are similar. The output of ROW_NUMBER is a sequence of values starts from 1 with an increment of 1 but whereas the RANK function, the values are also incremented by 1 but the values will repeat for the ties.
  4. If you’ve an experience with Oracle then the ROWNUM is more familiar to you. It is a Pseudo-Column. It starts off with 1 and goes all the way down increasing by one, to the end of the table.
  5. The SQL ROW_NUMBER function is dynamic in nature and we are allowed to reset the values using the PARTITION BY clause
  6. The ORDER BY clause of the query and the ORDER BY clause of the OVER clause have nothing to do with each other.

Syntax

ROW_NUMBER

ROW_NUMBER followed by OVER function and then in the parentheses use an ORDER BY clause. It is required to use the ORDER BY clause in order to impose sort of order for the result-set.

OVER

The OVER clause defines the window or set of rows that the window function operates on, so it’s really important for you to understand. The possible components of the OVER Clause is ORDER BY and PARTITION BY.

The ORDER BY expression of the OVER Clause is supported when the rows need to be lined up in a certain way for the function to work.

PARTITION BY

The Partition By clause is optional. On specifying the value, it divides the result set produced by the FROM clause into partitions to which the SQL ROW_NUMBER function is applied. The values specified in the PARTITION clause define the boundaries of the result- set. If PARTITION BY clause is not specified, then the OVER clause operates on the all rows of the result set as a single data-set. This clause may consist of one or more columns, a more complex expression, or even a sub-query.

order_by_clause

The Order by clause is a mandatory clause. It determines the sequence and association of the temporary value to the rows of a specified partition. The ORDER BY clause is an expression of the OVER Clause and it determines how the rows need to be lined up in a certain way for the function.

Demo

In this section, we’ll take a look at the SQL ROW_NUMBER function. For the entire demo, I’ve used AdventureWorks2016 database.

How to use ROW_NUMBER in SQL Query

The following examples, we’ll see the use of OVER clause.

Let us get the list of all the customers by projecting the columns such as SalesOrderID, OrderDate, SalesOrderNumber, SubTotal, TotalDue and RowNum. The Row_Number function is applied with the order of the CustomerID column. The temporary value starts from 1 assigned based on the order of the CustomerID, and the values are continued till the last rows of the table. The order of CustomerID is not guaranteed because we don’t specify the ORDER BY clause in the query.


How to use Order by clause

The following example uses the ORDER BY clause in the query. The ORDER BY clause in the query applied on the SalesOrderID column. We can see that the rows in output are still ordered and returned. The Row_Number is still applied to the CustomerID. The output indicates that the ORDER BY of the query and the ORDER BY of the OVER Clause are independent of the output.


How to use multiple columns with the OVER clause

The following example you can see that we have listed customerID and OrderDate in the ORDER BY clause. This gives the customer details with the most recent order details along with the sequence of numbers assigned to the entire result-set.


How to use the SQL ROW_NUMBER function with PARTITION

The following example uses PARTITION BY clause on CustomerID and OrderDate fields. In the output, you can see that the customer 11019 has three orders for the month 2014-Jun. In this case, the partition is done on more than one column.

The partition is a combination of OrderDate and CustomerID. The Row_Number will start over for each unique combination of OrderDate and CustomerID. In this way, it’s easy to find the customer who has placed more than one order on the same day.


How to return a subset of rows using CTE and ROW_NUMBER

The following example we are going to analyze SalesOrderHeader to display the top five largest orders placed by each customer every month. Using the Month function, the orderDate columns is manipulated to fetch the month part. In this way, the sales corresponding to specific month (OrderDate) along with customer (CustomerID) is partitioned.

To list the five largest orders in each month for each customer, a CTE is used. A window is created on the partition data and it is assigned with the values and then the CTE is being called to fetch the largest orders.


Summary

So far, we reviewed the SQL ROW_NUMBER function in detail. We’ve discussed several examples from simple to complex. Also, we discussed how to use the SQL ROW_NUMBER function with CTEs (Common Table Expressions). In most cases, you’ll always see an over clause with every window function.

The over clause defines the window that each row sees. Within the over clause, there is a partition by, again it is supported by every window function, followed by the order by clause. That’s all for now…Hope you enjoyed reading the article.


Prashanth Jayaram
168 Views