In this article, we are going to learn about the different types of MySQL Window Functions and how to use them as per different use cases. MySQL is one of the most commonly used databases in the software world today. Almost every other web application running uses MySQL as a database, either on-premise or on the cloud. It becomes extremely necessary for developers and database users to master their querying abilities such that they can query the databases as per the requirements and produce suitable results. In this article, we are going to focus on the MySQL Window Functions and how to use them with practical examples. An important point to note is that most of the MySQL Window Functions are applicable only to versions 8 or higher.
What is a Window Function?
In order to understand MySQL window functions, let us first understand what a window function in SQL means. In SQL, window functions are special types of pre-built functions that return a value from a group of rows for each row. This might sound confusing in the beginning but is not that complex. Simply put, a window function, calculates the value from multiple records for every single row in the context. Let us understand this with a simple example.
Figure 1 – Sample Data to understand how window functions work in SQL
Let us consider that we have data from a class of eight students grouped into two departments. Every student in the class has a dedicated unique Student Number. However, for the sake of ease, the head of the departments decided that each student should receive a student number within the department as well. If you consider the figure above, the column on the right does that exactly, i.e. providing each of the students a department-specific student number. This is an example of a window function where the student number resets as soon as the department changes. In this case, since the number of students is quite less, the department-specific student numbers can be assigned manually. However, in the case of classes that have more than hundreds of students across multiple departments, using SQL to assign the numbers is the safest option.
Let us now prepare our dataset to understand the MySQL window functions in more detail. You can use the following script to create the table in a MySQL database and then insert the data as well: https://gist.github.com/aveek22/7a895dd2dd2bf7eb89d9cd6b94e25fa8
Types of Window Functions
In SQL, there are basically two types of window functions – Aggregate window functions and analytical window functions.
- Aggregate Window Functions – As the name suggests, these types of window functions calculate the aggregated values of a group of rows from the table. Some examples of aggregate window functions are SUM, AVG, MIN, MAX etc. You need to use the GROUP BY clause in order to use these aggregate window functions with some other columns. This usually returns a scalar value
- Analytical Window Functions – These types of functions are used to calculate some window based on the current row and then calculate the results based on that window of records. The result is often returned in the form of multiple records in SQL. Common examples include RANK, DENSE_RANK, CUME_DIST, RANK, LEAD, LAG, etc.
Let us explore these functions in detail now.
Aggregate Window Functions in MySQL
Let us try to explore the simple aggregate window functions now.
SUM Window Function
The SUM window function calculates the total of a numeric column in a given window. This returns a scalar value if there are no columns to be grouped by. In case, we need to find the totals for each group, we can use a GROUP BY clause to get the totals for each row.
Figure 2 – Total Marks grouped by students
AVG Window Function
The AVG window function is used to find the average values of a numeric column. It works in the same manner as that of the SUM function, except that it returns the average.
Figure 3 – Average marks for each student
MIN and MAX Window Functions
The MIN and the MAX window functions are used to calculate the minimum and the maximum values within a window. If there is only a single row within the window, then it will return the value only from that row specifically.
Figure 4 – Calculating MIN and MAX in MySQL Window Functions
As you can see in the figure above, the students, Jason and Tom have two subjects each and hence the minimum and the maximum are calculated separately for them.
COUNT Window Function
The COUNT function is used to count the number of records within the given window. It can be used with a DISTINCT clause to calculate the distinct values within the given window.
Figure 5 – Calculating total students within the window using the DISTINCT clause
Analytical Window Functions in MySQL
As already mentioned in the previous section, analytical window functions are a bit special because they work with a window of rows within the context of a single row. Let us try to explore the different types of analytical window functions.
ROW_NUMBER Window Function
This is one of the most simple analytical window functions in MySQL. This function simply assigns an incremental row number to each of the records present in the table or within the selected window of records.
Figure 6 – Calculating Row Numbers in MySQL
As you can see in the figure above, all the records are sorted in the ascending order of the marks, and then an incremental row number has been allocated to each of the rows. This row number is generated dynamically while the query executes and will change as the marks of the students change in the table.
Additionally, we can also use the PARTITION BY clause with the department name to create a window that will reset the row numbers as soon as the department changes.
Figure 7 – Using the PARTITION BY clause to generate the Row Number
As you can see in the figure above, the row number automatically resets as the department changes. The sort order is still maintained in the increasing order of the marks as mentioned in the SQL query.
RANK and DENSE RANK Window Functions in MySQL
The Ranking is almost similar to the row number, with the fact that ties are handled in ranking, whereas they are not handled in the row number function. A tie is when two or more records have the same numeric value to which the function has been applied to. In the RANK function, the ties will be allocated the same number. For example, if you have three ties at rank 2, then the next listed rank will be 5 instead of 3. This is because RANK skips the rows that are within the tie. In order to evade this, we have something called the DENSE RANK function. The DENSE RANK will handle the ties without skipping the row counts. Let us see this with some examples now.
Figure 8 – Using RANK and DENSE_RANK Functions in MySQL
As you can see in the figure above, we have two columns for RANK and the DENSE RANK function. The RANK function ranks the last record as 10 whereas the DENSE RANK function ranks it as 8.
These are some of the most common MySQL Window Functions that are being used. Although there are some other analytical functions like Cumulative Distribution, NTile, etc., these are beyond the scope of this article.
In this article, we have understood the different types of MySQL window functions that are available to be used within a MySQL database. MySQL is largely used by many organizations across the globe and hence it is absolutely necessary to learn the advanced concepts of MySQL. MySQL window functions are pretty much similar to window functions in other popular databases like SQL Server and PostgreSQL etc. Understanding the concepts of window functions will help you better understand how multiple rows are selected for each row in context and then the functions are applied to them. You can also refer to the official reference guide to learn more about the MySQL window functions.
- Getting started with PostgreSQL on Docker - August 12, 2022
- Getting started with Spatial Data in PostgreSQL - January 13, 2022
- An overview of Power BI Incremental Refresh - December 6, 2021