Eli Leiba

Creating a gap in sequences – TSQL Stored Procedure advisor

January 6, 2016 by

Introducing the Problem

Gaps existence in automatic sequenced columns occurs all the time. Missing identity values (or other sequencing values) occur for a variety of Reasons.

The most common reasons include: roll backed transactions, failed inserts and Deletes, large row deletes after delete commands that occur after many inserts to a sequenced table and so forth.

The sequence gaps appear in all kinds of sequences, IDENTITY, generic integer Columns that act as a sequenced primary key of a table and also dedicated sequence Objects.

In all of these cases, the sequenced number does not contain any duplicate values.

The sequence gaps scenarios problem can cause some of business questions since the values are not sequenced properly and values are missing for no apparent Reason.

In order to reduce gaps in sequences, I have constructed a stored procedure that analyze the gaps in a given table and sequence column.

The procedure gives back a new sequence number from one of four selected Methods (strategies):

1 – (The first fit gap strategy) – The first number that fit in the first gap

2 – (The last fit gap strategy) – The first number that fit in the last gap

3 – (The Largest fit gap strategy) – The first number that fit in the largest gap.

4 – (The smallest fit gap Strategy) – The first number that fit in the smallest gap.

Advantage and weakness

The advantage of this procedure is clear, it saves space. Rather than using a big integer that is 8 bytes for a large sequence column that advances all the time, we keep a shorter int column (only 4 bytes) and adjust the sequence value from time to time.

This scenario is useful when there are many deletes and not only inserts to the table.

The weakness of this method is clearly performance, executing the procedure takes time, and the execution time gets larger when the table is larger. Using an indexed (usually clustered index on the sequenced column), helps in this case.

How does the solution work?

I have created a stored procedure called GetNextSequence that acts as an advisor to the next value the sequence should have. The stored procedure uses a dynamic TSQL SELECT statement that is constructed inside the stored procedure. The statement uses the LEAD window function.

Starting from the SQL 2012 SQL Server edition, The LAG and LEAD window functions have been introduced as functions for accessing the prior or subsequent rows along with the current one.

If the table that is given as the parameter, has gaps in the given column, the table is then queried, comparing the Values that exist in the current row compared to the value in the next row.

If the difference between the two values is greater than 1, then a gap exists, and therefore, will be returned in the result set.

If there are no gaps in the table (that means that there are no adjacent values with difference greater than 1, no rows are returned from the query and the procedure returns -1)

The assumption of the procedure is that the sequence column is a positive integer (n integer and > 0) then a positive integer value is returned.

The Procedure gets three parameters and outputs one value.

The input parameters

  1. The Table name that has gaps in the sequenced column (i.e @table parameter)

  2. The name of the sequence column (i.e @col parameter)

  3. The method, as a string that has four valid codes (@i.e. the @method parameter)

    The codes are:

    • FIRST_FIT – The very first , minimal value for a sequence that fit in the first found gap
    • LAST_FIT – The first value of the sequence that fits in the last gap found
    • LARGEST_GAP – The first value of the sequence that fit in the largest gap found
    • SMALLEST_GAP – The first value of the sequence that fit in the smallest gap found
  4.  
  5. An Output parameter (The @sequence parameter):

    • If a sequence is found according to any method then a positive Integer value is returned according to the method selected
    • If -1 is returned it stands for – "No gaps found", so NEXT number should be Max (sequence) + 1

Note:

Note that it is up to the DBA that executes the procedure, to reseed the identity column on to alter the sequence back to the value suggested by the procedure.

The Procedure TSQL Code

Some Explanations for the code:

  • The procedure gets the table name , column name and method input parameters and constructs a dynamic TSQL statement in the following algorithmic format :

In The dynamic TSQL execution, the table and column are replaced by the corresponding parameters of the procedure.

  • The result of this dynamic TSQL statement is entered into a temporary, global table called ##sequences.
  • This temporary table is then queried, according to the method parameter.
  • If method is first-fit then the minimal value of the starting gap sequence is returned
  • If method is last-fit then the maximal value of the starting gap sequence is returned
  • If method is smallest-gap then the value of the starting gap sequence
  • Where the Gap is the smallest is returned.
  • If method is largest-gap then the value of the starting gap sequence
  • Where the Gap is the largest is returned

Here is an example for procedure execution with some explanations:

  1. A test case table called testTB is created with some test values

    As you can see there are gaps in the table:

  2. Then, we execute the procedure and test it with all the methods available:

We get the following output values, after this code execution:

13
853
56
401

Explanation for the above example:

  • The first gap is between 13 and 49, so according to the first fit method 13 is returned
  • The last gap is between 853 and 1053, so according to the last fit method 853 is returned
  • The smallest gap is between 56 and 89, so according to the smallest gap method 56 is returned.
  • The largest gap is between 401 and 849, so according to the largest gap method 401 is returned.

See more

To get 3 free licenses to a SQL Server monitoring tool, download ApexSQL Monitor and fill out this simple survey.

References

Finding (all the) gaps in an identity column (or any integer based column for that matter) using SQL 2005
Microsoft SQL Server: The Sequencing Solution
Get Rid of Gaps in Numerical Sequences


Eli Leiba

Eli Leiba

Eli Leiba is a senior application DBA at Israel Electric Company, a teacher at Microsoft CTEC, and senior database consultant with 24 years of experience working with both SQL Server and Oracle RDBMS. He is certified in Oracle and SQL Server database administration and implementation and has a B.S. in Computer Science.He can be reached at: iecdba@hotmail.com

View all posts by Eli Leiba
Eli Leiba
SQL Server performance tuning

About Eli Leiba

Eli Leiba is a senior application DBA at Israel Electric Company, a teacher at Microsoft CTEC, and senior database consultant with 24 years of experience working with both SQL Server and Oracle RDBMS. He is certified in Oracle and SQL Server database administration and implementation and has a B.S. in Computer Science. He can be reached at: iecdba@hotmail.com View all posts by Eli Leiba

316 Views