Esat Erkec
How to improve performance of the SQL NULL values

Working with SQL NULL values

May 19, 2021 by

This article will explain what problems can occur when working with SQL NULL values ​​and it also gives some solution recommendations to overcome these issues.

What is a SQL NULL value?

In terms of the relational database model, a NULL value indicates an unknown value. If we widen this theoretical explanation, the NULL value points to an unknown value but this unknown value does not equivalent to a zero value or a field that contains spaces. Due to this structure of the NULL values, it is not possible to use traditional comparison (=, <, > and <>) operators in the queries. As a matter of fact, in the SQL Standards using the WHERE clause as the below will lead to return empty result sets.

For this reason, working with the NULL values might be a bit complicated and are required to use certain built-in functions which are customized for handling NULL values.

IS NULL Condition

The IS NULL condition is used to return rows that contain the NULL values in a column and its syntax is like below:

The following query will retrieve the rows from the Person table which are MiddleName column values are equal to NULL.

How to retrieve SQL NULL values to use IS NULL expression

The IS NOT NULL condition is used to return the rows that contain non-NULL values in a column. The following query will retrieve the rows from the Person table which are MiddleName column value is not equal to NULL values.

How to use IS NOT NULL expression

Handling SQL NULL values with Functions

As we stated earlier, SQL Server offers some functions that help to handle NULL values.

ISNULL(): The ISNULL() function takes two parameters and it enables us to replace NULL values with a specified value.

The expression parameter indicates the expression which we want to check NULL values.

The replacement parameter indicates the value which we want to replace the NULL values.

For example, in the following query, the ISNULL() function replaces the NULL values in the row with the specified value.

Handling SQL NULL values with ISNULL function

COALESCE(): The COALESCE() function takes unlimited parameters and returns the first non-null expression in a list.

In the following query, the COALESCE() function returns the SQLShack.com because it is the first non-null value in the list.

Handling SQL NULL values with COALESCE function

In this second example, the COALESCE() function returns the first non-null expression in the list of the columns.

COALESCE function usage

How to Count SQL NULL values in a column?

The COUNT() function is used to obtain the total number of the rows in the result set. When we use this function with the star sign it count all rows from the table regardless of NULL values. Such as, when we count the Person table through the following query, it will return 19972.

How to count SQL NULL values in a table

On the other hand, when we use the COUNT() function with a column name it counts only the non-NULL values in that column.

COUNT() function and SQL NULL values interaction

In order to count NULL values of a column, we can use the following query.

How to count null values in a column

AVG() function and SQL NULL values

The AVG () is used to calculate the average value of a result set, that is, it sums all the values ​​in that result set and divides that sum by the number of rows. One point to note about the AVG() function calculation is that NULL values will not be included in the average calculation.

When we calculate the weekly average fee, the Wednesday fee is not included in the calculation.

NULL values affect AVG function  results

If we want to include the NULL values in the calculation, we can use the COALESCE function.

NULL values and AVG function  interaction

What is the difference between ‘IS NULL’ and ‘= NULL’?

The equal (=) operator is used to compare two expressions values and if the compared values are equal the comparison result will be true. On the other hand, when we try to use an equal operator to find the rows that contain the null values, the result set will return nothing. For example, the following query will not return any rows.

SET ANSI NULL option and NULL function

The reason for this issue is related to the structure of the SQL NULL because it does not specify an explicit value thus comparing an unknown value with an exact value result will be false.

According to ANSI (American National Standard Institute) standards, any comparison or calculation performed with NULL values result must be NULL as we have just stated above. However, we can disable this behavior pattern by using the SET ANSI_NULLS command. By default, this option status is ON but we can disable this option during the session-level so the equals (=) and not equal (<>) comparison operators do not apply to the ANSI-SQL standard.

As seen above, the select statement returns the rows that have null values in the MiddleName column when we disabled the ANSI_NULLS option

Non-Clustered Indexes and SQL NULL values

The indexes can help to improve the performance of the queries but at this point, one question may appear in our minds.

“Can we use non-clustered indexes for the nullable columns in SQL Server and these indexes can improve performance of our queries?”

Let’s try to find out an answer to this with an example.

When we execute the following query, the query optimizer suggests a non-clustered index and it claims that this index dramatically improves the performance of the query.

NULL values  query plan

I/O statistics of a query

As seen in the execution plan, the optimizer has decided to use a non-clustered index scan operator, and it reads 107 data pages from the data cache. Now, we will apply the missing index suggestion.

In this step, we will re-execute the same query and re-analyze the execution plan and I/O statistics.

Index seek and query plan

I/O statistics of an optimized query

Without a doubt, the created index is used by the database engine and it boosts the performance of the query because the logical reads number is lower than the previous one.

The index seek operator details show us there is not any difference between the Actual Number of Rows and the Estimated Number of Rows.

Index Seek operator properties

This situation shows that the query optimizer finds accurate information about how many NULL rows are stored in the MiddleName column. Most likely, the following question will pop up in your mind.

“How does the SQL Server query optimizer know how many NULL rows are stored for a column in a table?”

Basically, the statistics store the data distribution about the tables, and the query optimizer makes use of this information during the creation of a query plan. When we create an index this data distribution information is also stored for the NULL valued columns. So that, optimizer estimates how much memory and other resources are required for a query

The DBCC SHOW_STATISTICS command gives us detailed information about the statistics and data distributions. Through the following query, we will return all details about the used index statistics for the IX_001_MiddleName index.

Statistics details of a index

As we can see there is an 8499 number of NULL values are stored in the MiddleName column and this information is used by the query optimizer.

In some cases, the ISNULL function is used with the where condition but this usage method may lead to use indexes inefficiently. The purpose of the following query is to fetch rows of the MiddleName column whose values are equal to A or NULL. However, this query cannot use the created non-clustered index so it will read all index pages and then return the appropriate rows.

How to improve performance of the ISNULL function

To eliminate this situation, we can make a little code modification in the query so that the query optimizer can use the indexes more efficiently. The following query returns some rows as like the previous one but it uses the indexes more efficiently.

How to improve performance of the SQL NULL values

Conclusion

In this article, we have learned how to work with SQL NULL values and have learned its performance impact on the queries.

Esat Erkec
39,697 Views