Ben Richardson

Understanding cursors and replacing them with JOINs in SQL Server

November 24, 2017 by

Relational database management systems including SQL Server are very good at processing data in sets.

However, if you want to process data on row-by-row basis rather than in sets, cursors are your only choice. Unfortunately, cursors are extremely slow and so where possible should be replaced with JOINS.

Why cursors are slow

To populate a cursor, database tables are iterated on row-by-row basis rather than in sets. While a cursor is being populated, the table being iterated is locked. Row operations can take a long time to execute depending upon the type of task being performed on each row. While the cursor is open, a table cannot be accessed or updated by the other users. This makes cursor-based operations extremely. Therefore, cursors can be avoided wherever they can by the set based operations.

In this article, we will see how cursors process data and look at how and when we can replace cursors with JOINS.

Preparing dummy data

Let’s start by creating some dummy data to work with.

Next, we need two tables “department” and “employee” with the department table having two columns: id and dep_name, and the employee table having four columns: id, name, dep_id, and salary.

The dep_id column of employee table will hold values from the id table of the department table, with a one too many relations between the department and employee tables. Remember this is not a perfectly normalized data table as we just want some data to execute example queries on.

Finally let’s add some dummy data.

A simple cursor example

Let start by creating a simple cursor that will loop through each row in the employee table. There are 15 rows in the employee table and so this won’t illustrate the performance issues that cursors can suffer but if the table contained >10,000 rows the performances issues would be clear.

The above code creates two variables @EmpId and @EmpName that will hold the id and name from the employee table respectively. To declare a cursor we use the DECLARE keyword followed by the cursor’s name and the FOR keyword.

Next, we declare the SELECT statement that will select the records that the cursor will process row-by-row.

The following lines create EmpCursor that will contain the id and name from the employee table:

At this point, data is added to the cursor. Before we can use the cursor we have to open it using the OPEN keyword. At this point, the cursor points at the top of the table. To move the cursor to the first row we use the FETCH NEXT command. We then use the INTO statement to fetch data from the cursor and insert it into our local variables. To move the cursor to next row we again use FETCH NEXT.

A better approach is to use a loop and call FETCH NEXT within the loop so that the cursor can iterate over all the rows. The @@FETCH_STATUS returns 0 as long as there are more rows in a set of records. In the above script, we used a WHILE loop to iterate through all the records in EmpCursor until @@FETCH_STATUS returns 0. When there are no more rows left, the @@FETCH_STATUS returns a non-zero value and the loop will exit. Inside the loop, we print the id and name from each row in the EmpCursor. Finally, we close the cursor we opened using CLOSE keyword.

The output of the above script will be as follows:

Cursor example involving multiple tables

In the next example, we will use a cursor that will store the id and dep_id of the employee. Inside the WHILE loop, we will retrieve the department name from the department table where the id of the department matches the dep_id of the employee table.

In addition, if the department name is “Sales”, we will increase the salary of the employee by 10% by multiplying the actual salary by 1.1. If the department name is “HR”, we will increase the salary of the employee by 20% by multiplying the actual salary by 1.2. Finally, if the department name is “IT”, we will increase the salary of the employee by 50% by multiplying the actual salary by 1.5.

Now if you SELECT records from the employee table, you will see an increase of 10%, 20% and 50% on the salaries of employees from Sales, HR and IT department, respectively.

The new salaries are shown in the following table:

Replacing cursors with JOINs

As explained earlier, cursors are extremely slow and should be replaced with JOINs whenever possible. Let’s look at how we replace the cursor that we saw in the last example with a JOIN statement. We will use a JOIN statement to increment the salaries of the employees from Sales, HR and IT department by of 10%, 20%, and 50% respectively, as we did in the previous example.

Note: Before you execute this script, truncate the employee table and insert dummy records as we did at the start of this article. This is to make sure that we achieve the same results using cursors and JOINs.

The above query will be at least 20 times faster than the cursor and will achieve same results. We have used a CASE statement to update the salary based on the dep_name. The employee and department tables have been joined so that we can access salary from the employee table and dep_name from department table in the same query. The JOIN has been implemented on the id column of the department table and the dep_id column of the employee table. The WHERE clause is used to filter records where the name of the department is Sales, HR or IT. This is because we are only updating the salaries of employees from these departments.

Conclusion

In this article we saw that how cursor based operations lead to performance issues. Therefore, JOINS should always be used in preference to cursors because of their efficiency and simplicity, unless you need to carry out a row-by-row operation on the records.

Other great articles from Ben

Sequence Objects in SQL Server
Debugging stored procedures in SQL Server Management Studio (SSMS)
Understanding cursors and replacing them with JOINs in SQL Server

Ben Richardson
168 Views