Emil Drkusic

Learn SQL: How to Write a Complex SELECT Query

February 4, 2020 by

In my career, I’ve heard many times, things like “How to write a complex SELECT query?”, “Where to start from?” or “This query looks so complex. How you’ve learned to write such complex queries?”. While I would like to think of myself as of a brilliant mind or genius or add something like “query magician” to my social network profiles, well, writing complex SQL wouldn’t be the only thing required to do that. Therefore, in this article, I’ll try to cover the “magic” behind writing complex SELECT statements.

The Model

As always, I’ll start with the data model we’ll be using. Before you start to write (complex) queries you should understand what is where – which tables stored what data. Also, you should understand the nature of relations between these tables.

How to Write a Complex SELECT Query - the data model we'll use

If you don’t have these two on disposal, you have 3 options:

  • Ask somebody who created the model for the documentation (if that person is available). Same stands for understanding the business logic behind the data
  • Create documentation yourself. This takes time, but is really very useful, especially if you jump in the middle of an undocumented project
  • You can always do it without the documentation, but you should be pretty sure you know what you’re doing. E.g. I wouldn’t recommend you driving a car where I’ve repaired brakes. I mean, you can try it, but…

All these tips can be used regardless of what you are doing with your database. Having the overall picture will spare you a lot of time in the long-run, so invest some time when you’re starting.

Let’s Start with the Complex Query

In case I spent too many words so far, let’s remind ourselves of the original question – “How to write a complex SELECT query?”. And let’s start with a complex query.

And this is what query returns:

complex SQL SELECT query

As you can see, we have a complex query and 2 rows in the result. Without any comments, we can’t easily say what does this query does and how it works. Let’s change that now.

How to Write a Complex SELECT Query & Where is the Data?

We’re back to the original question. Now, we’ll answer this step by step. I’ll tell you what was the desired result of the query (assignment given to us).

Return all countries together with the number of related calls and their average duration in seconds. In the result display only countries where average call duration is greater than the average call duration of all calls.

The first thing we’ll do is to determine which tables we’ll be using in the process. In the data model, I’ve added colors to the tables we need to use.

the tables we'll use in the SELECT query

And how to determine which tables should be? The answer has two parts:

  • Use all tables containing data you need to display in your result. In our case, the tables in question are country (we need country_name) and call (we need start_time and end_time to calculate the average call duration)
  • If the tables from the previous bullet point are not directly related, you’ll also need to include all the tables between them (in our case that would be – how to get from the country table to the call table)

After this analysis we know we must use the following tables: country, city, customer, and call. If we want to use them properly, we need to JOIN these tables using foreign keys. Without even thinking about the final query, we now know it will contain this part:

We could do one thing, and that is to test what the query like this would return:

I won’t post the picture of the whole result because it simply has too many columns. Still, you can check it. I always advise that you test parts of your queries. While they won’t be displayed in the final results, they will be used in the background. By testing these parts you’ll get the idea of what is happening in the background, and could assume what the final result should be. But still, we have to answer on “How to write a complex SELECT query?”.

How to Write a Complex SELECT Query – Write Parts of the Query at the Time

We have already written part of the query and that’s a good practice. It will help you to build a complex query from simpler “blocks” but also, you’ll test your query along the way because you’ll be checking parts of it at a time as well, check how the query works when certain parts are added or executed.

I would start with this part “where average call duration is greater than the average call duration of all calls”. It’s obvious that we need to calculate the average duration from all calls (in seconds). So let’s do that.

subquery result

We’ve explained the aggregate functions in the previous article. So far, we haven’t talked about date & time functions, but it’s enough to say that the DATEDIFF function calculates the difference in the units of the given time period (we are after seconds here) between the start time and end time. The result returned implies that the average call duration was 354 seconds.

Now we’ll write down the query which returns aggregated values for all countries.

I would like to point out two things here:

  • SUM(CASE WHEN call.id IS NOT NULL THEN 1 ELSE 0 END) – This will sum up only existing calls. Since we’ve used LEFT JOIN, we’ll also join countries without any call. In case we’ve used COUNT, we would have value 1 returned for countries without any call, and we want 0 there (we want to see that info)
  • AVG(ISNULL(DATEDIFF(SECOND, call.start_time, call.end_time),0)) – This is very similar to the previously mentioned AVG. The difference here is that I’ve used ISNULL(…,0). This simply tests if the calculated value IS NULL, and if so, replaces it with 0. Calculated value could be NULL if there is not data (we’ve used LEFT JOIN)

Let’s see what this query returns.

complex SELECT query without HAVING

“How to write a complex SELECT query?” -> Now we’re really close to complete our query and get really close to this answer.

So, the result contains all countries with their number of calls and the average call duration. From this result, we’re interested only in these having average call duration greater than average call duration of all calls. That’s our original query, but with comments added.

You can see the query result in the picture below.

How to Write a Complex SELECT Query - final query with comments

Compared to the previous query, we’ve just added the HAVING part. While in the WHERE part of the query we test “regular” values, HAVING part of the query is used to test aggregated values. We’re using it to compare AVG values.

Comments are a crucial thing, not only in databases but in programming in general. By adding these 3 comment lines, the query should become much more readable. Even somebody who looks at this query for the first time will see what you did and why. That somebody could even be you if you’re looking at the code you wrote some time ago. While it takes some time to write these comments, don’t be lazy and do it. You’ll probably save yourself much more time when revisiting old queries/code.

Let’s Wrap up Everything

So, the question was – “How to write a complex SELECT query?”. While there is no easy answer, I would suggest the following steps:

  • Think of it as of LEGO bricks and build the query that way. Treat complex parts as black boxes – they will return what they need to and you’ll write (and incorporate into the main query) them later
  • Identify all the tables you’ll need in the query
  • Join tables containing the data you need to display or the data used in the WHERE part of the query
  • Display all data to check if you’ve joined everything correctly and to see the result of such a query
  • Create all subqueries separately. Test them to see do they return what they should. Add them to the main query
  • Test everything
  • Add comments

Could you give us your answer on “How to write a complex SELECT query?”. Which approach have you used?

Table of contents

Learn SQL: CREATE DATABASE & CREATE TABLE Operations
Learn SQL: INSERT INTO TABLE
Learn SQL: Primary Key
Learn SQL: Foreign Key
Learn SQL: SELECT statement
Learn SQL: INNER JOIN vs LEFT JOIN
Learn SQL: SQL Scripts
Learn SQL: Types of relations
Learn SQL: Join multiple tables
Learn SQL: Aggregate Functions
Learn SQL: How to Write a Complex SELECT Query
Learn SQL: The INFORMATION_SCHEMA Database
Learn SQL: SQL Data Types
Learn SQL: Set Theory
Learn SQL: User-Defined Functions
Learn SQL: User-Defined Stored Procedures
Learn SQL: SQL Views
Learn SQL: SQL Triggers
Learn SQL: Practice SQL Queries
Learn SQL: SQL Query examples
Learn SQL: Create a report manually using SQL queries
Learn SQL: SQL Server date and time functions
Learn SQL: Create SQL Server reports using date and time functions
Learn SQL: SQL Server Pivot Tables
Learn SQL: SQL Server export to Excel
Learn SQL: Intro to SQL Server loops
Learn SQL: SQL Server Cursors
Learn SQL: SQL Best Practices for Deleting and Updating data
Learn SQL: Naming Conventions
Learn SQL: SQL-Related Jobs
Learn SQL: Non-Equi Joins in SQL Server
Learn SQL: SQL Injection

Emil Drkusic
Latest posts by Emil Drkusic (see all)
SQL commands, T-SQL

About Emil Drkusic

Emil is a database professional with 10+ years of experience in everything related to databases. During the years, he worked in the IT and finance industry and now works as a freelancer. His past and present engagements vary from database design and coding to teaching, consulting, and writing about databases. Also not to forget, BI, creating algorithms, chess, philately, 2 dogs, 2 cats, 1 wife, 1 baby... You can find him on LinkedIn View all posts by Emil Drkusic

41,714 Views