Dmitry Piliugin

SQL Server 2017: Columnstore in-place updates

April 26, 2018 by

In this post, I continue the exploration of SQL Server 2017 and we will look at the nonclustered columnstore index updates.

Columnstore index has some internal structures to support updates. In 2014 it was a Delta Store – to accept newly inserted rows (when there will be enough rows in delta store, server compresses it and switches to Columnstore row groups) and a Deleted Bitmap to handle deleted rows. In 2016 there are more internal structures, Mapping Index for a clustered Columnstore index to maintain secondary nonclustered indexes and a deleted buffer to speed up deletes from a nonclustered Columnstore index.

Updates were always split into insert + delete. But that is now changed, if a row locates in a delta store, now inplace updates are possible. Another change is that it is now possible to have a per row (narrow) plan instead of per index (wide) plan.

Let’s make some experiments.

I will create a table, insert 100 000 rows into it and then create a nonclustered Columnstore index. All the rows will go to the compressed row group. After that, I insert one more portion of 100 000 rows, that is not enough to go directly to a compressed row group, so a delta store will be created and the new rows will go to it.

The result is:

Note: Notice, that I created a varchar(max) column in a table, that was done deliberately, because I wanted to test if LOB columns are supported in 2017. Unfortunatelly, when I tried to create a nonclustered Columnstore index, I got an error:

The statement failed. Column ‘Note’ has a data type that cannot participate in a columnstore index. Omit column ‘Note’.

However, if you create a clustered Columnstore, you will succeed! Though, you will get the following warning in CTP 1.2:

Warning: Using Lob types (NVARCHAR(MAX), VARCHAR(MAX), and VARBINARY(MAX)) with Clustered Columnstore Index is in public preview. Do not use it on production data without backup during the public preview period.

I think it is great that we can now have LOB varchar columns in Columnstore, because it was a blocker for some scenarios, at least I have seen some of those.

Now, let’s look at the update plans under compatibility level of SQL Server 2016 (130) and SQL Server 2017 (140):

The plans are:

The first plan is a per index plan, that means that a server will apply changes in each index one after another separately. The second plan is a per row plan, and the changes are done in one iterator. Usually, if a small portion of rows is updated, the per row plan is cheaper and more convenient.

We can see, that in 2017 the per-row narrow plan is now possible.

Next, we will check the in-place updates for delta stores. That means that an update operator should not be split as insert + delete, but rather done in place as a single update.

We will issue an update of the delta store row (OrderID > 100000) and then query transaction log, to see, what is done, was it an insert + delete or update. We do this first under compatibility level 2016 and then under 2017.

The result is:

You may see, that in case of 2016 there was first LOP_MODIFY_ROW for a base table which is a heap and then LOP_DELETE_ROWS and LOP_INSERT_ROWS for a delta store of dbo.Orders (function fn_dblog is very user friendly and even adds “(Delta)” for an allocation unit name, cool)!

In case of 2017, there are just two operations, both of them are LOP_MODIFY_ROW, one for a base heap and one for a delta store.

What I like about it, is that it is one of the additions that you get out of the box simply by upgrading to a newer version (when there will be an official RTM), as Microsoft used to say – it just runs faster. You have less pressure on transaction log and you save CPU instructions doing one operation instead of two, which, I believe, is quite important for the real-time analytics.

If you change the OrderID to the value that is stored in a compressed row group and run it (two times, to exclude PFS and GAM page actions when the deleted buffer is created for the first row), you will see that for a compressed row groups nothing is changed:

That’s all for today, but stay tuned, there are some more interesting query processing related things in 2017 to be covered.

Thank you for reading.

Table of contents

SQL Server 2017: Columnstore Indexes and Trivial Plan
SQL Server 2017: Columnstore in-place updates
SQL Server 2017: Scalar Subquery Simplification
SQL Server 2017: Interleaved Execution for mTVF
SQL Server 2017: Sort, Spill, Memory and Adaptive Memory Grant Feedback
SQL Server 2017: Statistics to Compile a Query Plan
SQL Server 2017: How to Get a Parallel Plan
SQL Server 2017: Adaptive Join Internals
Dmitry Piliugin