Prashanth Jayaram

SQL Union overview, usage and examples

September 25, 2018 by

This article will provide a deep dive into the SQL Union operator, describing its many uses along with examples and explore some common questions like the differences between Union vs Union All.

To address real-world data requirements, we may need to combine result sets from multiple data sources so that we could do data analysis or create new datasets. The datasets may be identical but there are chances that they reference different tables. Is there a way to combine the data in a single query? Are Set Operators a viable option? Let’s get started and see how some of the existing operators can be used to help us address these common challenges.

In this article, we’ll review:

  1. What a Set operator is
  2. Union vs Union All and how they work
  3. Discuss the rules for using Union vs Union All
  4. SQL Operator Syntax
  5. How to use simple SQL Union clause in the select statement
  6. How to use SQL Union with the queries that have the WHERE clause
  7. How to use the SELECT INTO clause with Union
  8. How to use SQL Union with the queries that have a WHERE clause and order by clause
  9. How to use SQL Union and SQL Pivot
  10. How to use SQL Union with GROUP and HAVING clauses

Operators

An operator is a symbol or a keyword defines an action that is performed on one or more expressions in the Select statement.

Set Operator

Let’s get into the details of Set Operators in SQL Server, and how to use them

There are four basic Set Operators in SQL Server:

  1. Union
  2. Union All
  3. EXCEPT
  4. INTERSECT

Union

The Union operator combines the results of two or more queries into a distinct single result set that includes all the rows that belong to all queries in the Union. In this operation, it combines two more queries and removes the duplicates.

For example, the table ‘A’ has 1,2, and 3 and the table ‘B’ has 3,4,5.

The SQL equivalent of the above data is given below

In the output, you can see a distinct list of the records from the two result sets

Union All

When looking at Union vs Union All we find they are quite similar, but they have some important differences from a performance results perspective.

The Union operator combines the results of two or more queries into a single result set that includes all the rows that belong to all queries in the Union. In simple terms, it combines the two or more row sets and keeps duplicates.

For example, the table ‘A’ has 1,2, and 3 and the table ‘B’ has 3,4,5.

The SQL equivalent of the above data is given below

In the output, you can see all the rows that include repeating records as well.

INTERSECT

The interest operator keeps the rows that are common to all the queries

For the same dataset from the aforementioned example, the intersect operator output is given below

The SQL Representation of the above tables

The row ‘3’ is common between the two result sets.

EXCEPT

The EXCEPT operator lists the rows in the first that are not in the second.

For the same dataset from the aforementioned example, the Except operator output is given below

The SQL representation of the above tables with EXCEPT operator is given below

List the non-common rows from the first set.

Note: It is very easy to visualize a set operator using a Venn diagram, where each of the tables is represented by intersecting shapes. The intersections of the shapes, where the tables overlap, are the rows where a condition is met.

Syntax:

The syntax for the Union vs Union All operators in SQL is as follows:

SELECT Column1, Column2, … ColumnN
FROM <table>
[WHERE conditions]
[GROUP BY Column(s]]
[HAVING condition(s)]
UNION
SELECT Column1, Column2, … ColumnN
FROM table
[WHERE condition(s)];
ORDER BY Column1,Column2…

Rules:

There are a few rules that apply to all set operators:

  1. Expressions in each row or the number of columns that are defined in each query must have the same order
  2. Subsequent SQL statement row sets must match the data type of the first query
  3. Parentheses are allowed to construct other set operators in the same statement
  4. It possible to have an ORDER BY clause, but that should be the last statement of the SQL
  5. GROUP BY and HAVING clauses can be applied to the individual query

Note:

  1. All of these Set operators remove duplicates, except for the Union All operator
  2. The output column names are referred from the first query i.e. when we run the SELECT statements with any of the Set operators and result set of each of the queries may have different column names, so the result of the SELECT statement refers the column names from the first query in the operation.
  3. SQL JOIN is more often used combine columns from multiple related tables whereas SET Operators combines rows from multiple tables.
  4. When the expression types are the same but differ in precision, scale, or length, the result is determined based on the same rules for combining expressions

Examples:

The following T-SQL queries are prepared and run against the Adventureworks2014 database. You can download the sample AdventireWorks2014 database here

  1. How to use simple SQL Union clause in select statement

    In this example, the result set includes a distinct set of rows from the first set and second set.The following example is based on the rule 1, 3 & 5.

    The output is a combination of Union and Union All operators using parenthesis.

    .

  2. How to use SQL Union with the queries that have the WHERE clause

    The following example shows the use of Union in two SELECT statements with a WHERE clause and ORDER BY clause.

    The following example is based on the rule 1,2 and 3


  3. How to use SELECT INTO clause with SQL Union

    The following example creates a new dbo.dummy table using the INTO clause in the first SELECT statement which holds the final result set of the Union of the columns ProductModel and name from two different result sets. In this case, it is derived from the same table but in a real-world situation, this can also be two different tables. The following example is based on the rule 1 , 2 and 4.


  4. How to use SQL Union with the queries that have a WHERE clause and ORDER BY clause

    This is only possible when we use TOP or aggregate functions in every select statement of the Union operator. In this case, top 10 rows are listed from each result set and combined the rows using Union clause to get a final result. You also see that the order by clause is placed in all the select statement.


  5. How to use of SQL Union and SQL Pivot

    In the following example, we’re trying to combine more than one set of results. In a real-world situation, you may have financial numbers from various regions or departments and the tables might have the same columns and data types, but you want to put them in one set of rows and in a single report. In such a scenario, you would use the Union clause and it is very easy to combine results and transform the data into a more meaningful report.

    In this example, the ProductModel is categorized into Top10, Top100, Top 100 and transforming the rows as an aggregated set of the values into the respective columns. The following example is based on the rule 2.

    NULL values are very important with the Set Operators and are treated as second-class database citizens. Because NULLs are considered unique and if two rows have a NULL in the same column, then they would be considered identical, so in that case, you are actually comparing a NULL to a NULL and getting equality. In the following example, you see the use of NULL values. In this case, it is operated with the aggregate function, max.


How to use SQL Union with Group and Having clauses

The following examples use the Union operator to combine the result of the table that all have the conditional clause defined using Group by and Having clause.

The lastname is parsed by specifying the conditions in the having clause.

The following example is based on rule 5.

We can see that the last names are derived into two different columns using the Union operator

That’s all for now…

Summary

Thus far, we addressed various available options and rules to use Set operators and understand when to use them. When determining whether to use Union vs Union All there are a few points to consider. Use Union All if you know that the result sets from multiple queries don’t overlap or generate duplicates and remember, if you need to use parentheses, you can do that. You can also pivot and transform the output.

Just make sure your data types match up and if you need to do an ORDER BY, do it at the end after you’ve run all those set queries. By nature, when it comes to Union vs Union All, Union All is faster than Union; as Union operators incur additional overhead to eliminate duplicates.

I hope you enjoyed this article on the SQL Union operator. Feel free to ask any questions in the comments below.


Prashanth Jayaram
T-SQL

About Prashanth Jayaram

I’m a Database technologist having 11+ years of rich, hands-on experience on Database technologies. I am Microsoft Certified Professional and backed with a Degree in Master of Computer Application. My specialty lies in designing & implementing High availability solutions and cross-platform DB Migration. The technologies currently working on are SQL Server, PowerShell, Oracle and MongoDB. View all posts by Prashanth Jayaram

168 Views