Ahmad Yaseen

How to track the history of data changes using SQL Server 2016 System-Versioned Temporal Tables

September 23, 2016 by

SQL Server 2016 introduces a new type of table that is designed to keep the full history of data changes, where row validity is managed by the system. This new table type is called a System-Versioned Temporal Table. In earlier SQL Server versions, user tables would enable you to hold only the most recent copy of the row, without being able to query the value before the UPDATE or DELETE operations. Using a Temporal Table, you will be able to query the recent state of the row as usual, in addition to the ability to query the full history of that row, which is fully managed by the SQL Server Engine, as you can’t define the rows validity period explicitly.

Each System-Versioned Temporal Table should contain two datetime period columns that are explicitly defined in the table; the Period Start column that will be used by the system to store the start time of the row, and the Period End column in which the system will store the end time of the row.

The main Temporal table will keep only the current version of the table rows, but it will be linked automatically to another table with the same exact schema, in which the previous versions of the modified rows will be stored automatically, this table is called a History table. When you create a Temporal table, the History table will be created automatically. If you don’t specify the history table name as a parameter in the SYSTEM_VERSIONING clause, it will be named with the naming pattern MSSQL_TemporalHistoryFor_<object_id>. Specifying the history table name while creating the Temporal table will allow the system to create the history table with the default system configuration, with the ability to create your own history table and provide its name in the Temporal table creation statement. Take into consideration that no changes can be performed on the history table’s data. If you try to delete any row from the history table, you will get the (Cannot delete rows from a temporal history table ‘<tableName>’) error, as it is only managed by the system.

When a new row is inserted, the system will assign the Period Start time column with the current transaction start time-based on the system clock, and assign the Period End time column with the maximum datetime2 value equal to 9999-12-31, and the row will be opened. If an existing row is updated, the system will copy the previous version of the row in the history table and fill the Period End time column with the current transaction start time based on the system clock, and the row will be closed. The Period End time column value of the main temporal table will be 9999-12-31 without any change.

Deleting an existing row, the row will be stored in the history table, with the Period End time column value equal to the current transaction start time based on the system clock, and the row will be closed after deleting it from the main temporal table.

System-Versioning can be enabled when a table is created using the CREATE TABLE statement or after creating the table using the ALTER DATABASE statement. In order to enable System-Versioning in a table, it must have a primary key, with two not-nullable datetime2 period columns defined as GENERATED ALWAYS AS ROW START or END, passed as parameters in the PERIOD FOR SYSTEM_TIME within the table definition.

The below CREATE TABLE statement is used to create the Temporal_Table_Demo temporal table, with DempID primary key, the two-period columns; TimeStart GENERATED ALWAYS AS ROW START and TimeEnd GENERATED ALWAYS AS ROW END, both designated as PERIOD FOR SYSTEM_TIME, the temporal table is linked to the dbo.Temporal_Table_Demo_History history table, taking into consideration that identifying the history table schema name is mandatory, and finally enabling the SYSTEM_VERSIONING feature as follows:

Browse the new temporal table from the Object Explorer, where you will see a special icon with a small clock (see below image) for the temporal table with System-Versioned between the parentheses. The history table will not have a separate table node; it can be shown by expanding the temporal table with a node under the main table identified with History between parentheses. A default clustered index will be created for the history table that contains the two Period Start and Period End columns as below:

Let’s assume that we need to enable System-Versioning on an existing table in our database. To do that, two Period Start and Period End not-nullable datetime2 columns should be added with all specs specified previously as in the below script:

In the previous script, the default constraints are defined as the period columns should be not-nullable and then dropped before enabling System_Versioning as the SQL Engine will be responsible for filling these two columns. The SQL Server Engine will perform a consistency check on both the temporal and history tables to verify that all schema requirements are applied before enabling System_Versioning. This includes checking that the Period End time is greater than the Period Start time and that the periods of the different versions of the same row will not overlap. The consistency check is enabled by default and can be disabled using the DATA_CONSISTENCY_CHECK = OFF statement in the System_Versioning enable part. Now the AWBuildVersion table is defined as a temporal table successfully:

Enabling System_Versioning on the table will not prevent you from modifying the table schema. The system will hold a schema lock on both the temporal and history table, where these changes will be replicated to the history table. A specific schema change can’t be performed directly using an ALTER DATABASE statement on a table with System_Versioning enabled, such as adding or removing an IDENTITY, COMPUTED, SPARSE, COLUMN_SET or ROWGUIDCOL columns. To apply such changes on a table with System_Versioning enabled, you need first to disable System_Versioning, perform the change you need then enable the System_Versioning again. The below script is used to add a new IDENTITY column to the Temporal_Table_Demo temporal table (we will not consider this added column in the coming demo):

Let’s start a demo to investigate this new feature. First, we will insert 5 records to the Temporal_Table_Demo temporal table using the following simple INSERT statement:

Querying the temporal table to check the inserted rows:

The result in our case will be like:

As we discussed previously, the Period start time will be filled by the system with the transaction start time, and the Period End time will be filled by the system with the maximum datetime2 value keeping the records opened. No changes applied to the history table as no action performed on that rows yet. If you try to query it, you will find it empty.

If we manage to delete the last row in the temporal table with DempID equal to 5:

And query the temporal table, we will find that the record is removed from the table, which is the normal situation:

But the new thing here is that, if we query the related history table:

We will find that the deleted record is inserted into the history table, and the Period End time is updated with the transaction start time, marking the row as closed:

Again, if we manage to update the number of employees in department 1:

And query the temporal table again, we will find that the row is updated and the Period Start time also updated with the new transaction time, keeping the row opened:

And the old record before the UPDATE operation will be inserted into the history table with the Period End time as the transaction current time, closing the row:

The previous results show us how useful the new System_Versioningfeature is since it allows us to rollback or minimum track the changes applied to that table without getting back to the old backup files for one or few records.

A new clause FOR SYSTEM_TIME is added to the SELECT statement to query the data across the history and main temporal tables, with five sub-clauses to provide you with many options to specify the required period of time. The first clause is the AS OF clause, which returns values after performing the union between the temporal and history tables and filtering the rows that are valid at the specified point in time. The row can be considered as valid if the Period Start time value is less than or equal to the specified time and that the Period End time is greater than the specified time.

The below SELECT statement uses the FOR SYSTEM_TIME AS FOR clause to retrieve all rows with TimeStart less than or equal to the provided time and the TimeEnd is greater than that provided time:

The final result for the valid rows after the internal union between the temporal and history tables will be as follows:

The second clause is the CONTAINED IN clause that returns all opened and closed rows within the time range defined by the provided two-period values. The below SELECT statement uses the FOR SYSTEM_TIME CONTAINED IN clause to retrieve all rows that opened and closed within the time range specified by the @StartTime and @EndTime values:

The result in our case will be:

The third clause is the FROM…TO clause that returns all active rows within the time range defined by the provided two values, regardless of the active starting time. A union will be performed internally between the main temporal table and the history table to retrieve the active rows within the provided range of time. The below SELECT statement uses the FOR SYSTEM_TIME FROM TO clause to retrieve all active rows within the time range specified by the @StartTime and @EndTime values:

The below result shows all active records within the provided range of time:

The Fourth clause is the BETWEEN…AND clause that works same as the FROM…TO clause except that it returns the only the rows that became active on the lower boundary or became inactive on the upper boundary provided in the statement. The below SELECT statement uses the FOR SYSTEM_TIME BETWEEN AND clause to retrieve the rows that became active on @StartTime or the rows that became inactive on @EndTime:

The query output will be like:

The last clause is the ALL clause that performs an internal union between the rows in the main temporal table and the history table without specifying any time range as follows:

The result will be similar to:

As any new feature introduced in SQL Server, a related DMV or property within the DMV will be added in order to query that feature’s information. The sys.tables system tables can be queried for the tables with temporal_type not equal to zero, to retrieve all tables with System_Versioning enabled with the corresponding history tables as below:

The result of the query will show us the two tables that we enabled the System_Versioning on with its related history tables:

The new system catalog view sys.periods can be used also to list all period columns of the tables with System_Versioning enabled on it. Full information can be retrieved by joining the sys.periods with the sys.tables as in the below query:

The query result will show us the two System_Versioning enabled tables with the Period Start and Period End columns names:

SQL Server allows you to add indexes to the temporal and history tables to enhance the query’ performance. Assume that we need to tune the performance of the below query that is running very frequent in our system and any enhancement in it will make a difference in the overall performance:

We turned the STATISTICS TIME on in order to get the query execution time accurately. Let’s run the query and check the query execution plan using the APEXSQL PLAN application and execution time. The execution plan of the query shows us that a Clustered Index Scan is performed in the temporal table as there is no index on the TimeStart and TimeEnd columns:

And the query took 107ms to run completely:

Let’s create a non-clustered index on the TimeStart and TimeEnd columns:

And execute the query again, the execution plan generated using the APEXSQL PLAN application shows us now that the Clustered Index Scan on the Temporal table changed to Index Seek:

And that the execution time is decreased clearly to 37ms:

As you can see, the index that we created in the temporal table enhanced the query performance and decreased the execution time considerably. Depending on the workload on your environment, you can customize the best index that fit your queries and enhance your system performance.

Conclusion

The System-Versioned Temporal table is a new type of table that is linked with a related history table in order to keep a history for each row change performed on that table. You can develop your own solution depending on the temporal table to track the data changes with the change time and rollback any disaster changes without the need to restore the backup files. You can get information about the temporal tables and the period columns using the existing and the newly introduced SQL Server system objects. The good point here is that it is managed only by the system requiring no effort from your side to define retention periods and follow up the insertion or deletion processes on the historical tables. As any new feature, test it carefully in your test environment before applying it to the live environment.


Ahmad Yaseen
Latest posts by Ahmad Yaseen (see all)
SQL Server 2016, Temporal tables

About Ahmad Yaseen

Ahmad Yaseen is a Microsoft Big Data engineer with deep knowledge and experience in SQL BI, SQL Server Database Administration and Development fields. He is a Microsoft Certified Solution Expert in Data Management and Analytics, Microsoft Certified Solution Associate in SQL Database Administration and Development, Azure Developer Associate and Microsoft Certified Trainer. Also, he is contributing with his SQL tips in many blogs. View all posts by Ahmad Yaseen

168 Views