Rajendra Gupta

How to use the CROSSTAB function in PostgreSQL

February 17, 2023 by

This article will explore the use of the CROSSTAB function in PostgreSQL.

Introduction

PostgreSQL is an open-source,multi-operating system supported (Windows, Linux, and Unix), object-oriented relational database with reliability, performance, and feature robustness. It uses multi-version concurrency control (MVCC) for managing the concurrency.

A pivot table is a popular tool in Microsoft Excel that shows summarized data and helps you analyze it in various ways. Pivot tables collect and organize data from different rows, columns, and tables. Pivot tables are a great way to summarise data, and a handy tool for analyzing sales revenue, products sold, sales performance, etc.

Relational database tables store data in multiple rows and columns. You can calculate data using various functions such as count, sum, and average. SQL Server provides the PIVOT and UNPIVOT functions for working with pivot tables. How do we create the pivot tables in PostgreSQL? Let’s find it out.

This article will explore the crosstab function and its different use cases.

Pre-requisites

You need the following requirements to work with this article.

  • In this article, I am using PostgreSQL 14.5 on Ubuntu. If you do not have Postgres installed on your system, refer to the URL https://www.postgresql.org for downloading software and installation guide. The SELECT Version() command returns the Postgres version, as shown below.

Postgres installation

  • PgAdmin4:The PgAdmin4 is an open-source tool with a graphical interface for interacting with the Postgres database. The PgAdmin tool is suitable for developing, maintaining, and performing administration activities. To use the latest PgAdmin4 tool, download it from the URL https://www.pgadmin.org.
  • Create a new (sample) database for executing the queries. I used the Postgres database [DemoDb] for this article, as shown below.

New database

  • I would recommend you be familiar with existing articles on PostgreSQL.

PostgreSQL CrossTab function

Crosstab function

The Crosstab function generates the Pivot tables in the PostgreSQL database. The crosstab gets the select statement as a parameter, and this select should be compliant with the following conditions:

  • The Select statement output must have three columns.
    • The first column identifies each row of the result table.
    • The Second column identifies the category of the pivot table
    • The third column represents values assigned to each cell.

Let’s create a sample table to explore the crosstab function in SQL Server. The query creates a table [ProductSales]. It inserts data for a few product sale counts in different years.

The query returns data in the tabular format, as shown below.

Sample data

Let’s look at the following crosstab query. It has the following sections:

  • Select columns output: The query uses a select statement that retrieves three columns, [Productname],[year], and [sales], from the [ProductSales] table. The query is enclosed in the single quotes. You will get an error if the query is not specified in the single quotes.
  • The crosstab function is invoked for the select statement using the FROM clause – Select * from crosstab()
  • We need to define the columns for the pivot table and their data types. For this example, we require the following columns in the output:

Column

Data type

Productname

Varchar(50)

Year1

Int

Year2

Int

Year3

Int

Year4

Int

Year5

Int

Year6

int

Before running the crosstab query, we need to enable the crosstab function. The crosstab function is part of the PostgreSQL extension called tablefunc. The tablefunc module contains various functions such as crosstab, normal_rand, connectby.

Execute the following query to enable the tablfunc module.

After you enable the tablefunc extension, it will be visible in the Postgres database under the extensions group.

tablefunc extension

Now, execute the crosstab query and view the result. The pivot table shows the product sales for different years. It makes it easy for you to understand the sales journey of a product over various years in a single row.

use the CROSSTAB function in PostgreSQL

Advanced Crosstab function

In the advanced form of the crosstab function, it can take two input parameters. The multiple (two) parameter helps to handle the null entries. To understand the advanced form better, let’s create three tables, [Departmentstores],[products], and [sales], using the below scripts.

The advanced crosstab function has the following syntax:

Crosstab(source_sql,category_sql)

  • The source_sql contains the select statement that returns the source set of data. As stated earlier, the crosstab must have three columns. The source_sql first column is used as the row_name. The last two columns are the category and value columns. For the source_sql, we will use the following select statement that has four columns as output:
    • Storename: Row name in pivot table
    • Productname: category column
    • Total_units: Value column
  • The category_sql is another select statement that returns the unique product names from the products table. It contains the DISTINCT to produce unique values.

We can apply the crosstab function using the source_sql and category_sql. We defined pivot columns and their data types explicitly.

The query generates the pivot table as shown below. We get a null value for row 1 and column (soap) because there is no matching data for the product sold in the specified store. For the remaining column, it returns the value as the respective stores sold those products.

check crosstab function output

crosstabN(text)

You might have noticed that we need to define the Pivot columns and their data types in the crosstab function. If your query returns text data types for the row_name and value columns, you can use the crosstabN function. The crosstabN function is a wrapper for the crosstab function. It works the same as the crosstab function, except that you do not need to determine the pivot columns.

Let’s create another productsales_t table with the text data types for this example.

As we have all text columns, we can use the crosstab3 function without specifying the pivot columns’ name and their data types.

The query returns the pivot table as shown below. It shows the column names as row_name, category_1, category_2, and category_3.

use the CROSSTAB3 function in PostgreSQL

What if you do not have text columns in the source tables? Can you use the crosstabN function? Yes, we can use that. We can create custom types and functions for the underlying crosstab() function.

  • Create the composite type with the desired output columns and data types. The following query creates a composite type my_crosstab_custom and defines the columns and data type similar to a regular table.
  • Create the function that accepts one text parameter and returns the desired output columns. The function crosstab_custom below uses the type my_crosstab_custom for the crosstab function.

Instead of the crosstab function, use the new function crosstab_custom without specifying the Pivot table columns and their data types.

Here is the output of the crosstab_custom function. You can see the column names as you define in type my_crosstab_custom.

create a custom crosstab function

Conclusion

This blog post was an introduction to the CROSSTAB function in PostgreSQL. It’s a powerful function for creating pivot tables in PostgreSQL that can be used in various ways. If you have any comments feel free to leave them below.


Rajendra Gupta
Latest posts by Rajendra Gupta (see all)
168 Views