Sifiso Ndlovu

Columnstore Index in SQL Server

May 28, 2014 by

Prerequisite

The discussion pertaining to SQL Server 2012 columnstore indexing is better explained through theoretical and practical measures. Thus, for the practical measure part – I will be using the AdventureWorksDW2012 sample database. The rest of the prerequisites for a comprehensive understanding of this topic are as follows:


  • Conceptual understanding of clustered and non-clustered indexing in SQL Server.
    • Although columnstore indexing is a new feature of SQL Server 2012, like in many database platforms, the concept of indexing has largely been prevalent in SQL Server versions prior to SQL Server 2012. Thus, in order to fully appreciate columnstore indexing I suggest you familiarise yourself with SQL Server indexing capability.
  • Running Instance of SQL Server 2012 along with AdventureWorksDW2012 sample database

Introducing Columnstore Index in SQL Server 2012

Columnstore index is an enterprise feature that is intended to improve the query response on T-SQL queries against OLAP systems. This feature was introduced in SQL Server 2012 as an alternative to the traditional row-based indexing that has been dominant in SQL Server versions prior to SQL Server 2012. It must be noted however that the notion of indexing against columns of a given object is not novel to Microsoft instead has been largely pioneered by SAP’s Sybase IQ database technology.

Getting started with Columnstore Index in SQL Server 2012

If you familiar with the creation of clustered or non-clustered indexes you should be able to find your way around creating a columnstore index. Figure 1 shows the complete syntax for creating a SQL Server 2012 columnstore index.

Figure 1
Figure showing the complete syntax for creating a SQL Server 2012 columnstore index

Whilst the arguments of columnstore index creation are further explained in the TechNet site, it must be noted that the inclusion of the NONCLUSTERED argument does not make any difference during the creation of columnstore index in SQL Server 2012. The reason for this is that clustered columnstore indexes are not supported in SQL Server 2012 which means that every SQL Server 2012 columnstore index is always created as NONCLUSTERED columnstore index – regardless of whether or not you specify the NONCLUSTERED argument in your create columnstore index statement.

Thus, Figure 2 and Figure 3 show the respective creation of columnstore indexes against dbo.FactProductInventory_ApexSQL_CC1 and dbo.FactProductInventory tables.

Figure 2                                                                                                                   Figure 3
Figure showing the creation of columnstore indexes against dbo.FactProductInventory_ApexSQL_CC1 table    Figure 3 illustrating the creation of columnstore indexes against dbo.FactProductInventory table

Although the script in Figure 2 differs from the one shown in Figure 3 through its omission of the NONCLUSTERED argument, the execution of both scripts invariably results into the creation of NONCLUSTERED index as shown in Figure 4.

Figure 4
Figure illustrating the creation of NONCLUSTERED index

In addition to getting started with the syntax of creating columnstore index, you will notice that there is now a new icon in SQL Server 2012 object explorer that indicates the presence of a columnstore index in a given object as shown in Figure 5.

Figure S5
SQL Server 2012 Object Explorer - Icon indicating the presence of a columnstore index in a given object

Finally, as you continue getting familiar with columnstore index, take note of the list of data types that are permitted in a create columnstore index statement. Table 1 outlines a list of supported data types as per SQL Server Data Type Category.

Table 1

Character Strings Unicode Character Strings Exact Numerics Approximate Numerics Date and Time
CHAR NCHAR INT FLOAT DATE
VARCHAR – except for MAX values NVARCHAR – except for MAX values BIGINT REAL DATETIME
SMALLINT DATETIME2
TINYINT SMALLDATETIME
BIT TIME
MONEY DATETIMEOFFSET
– with a scale less than 2
SMALLMONEY
DECIMAL
NUMERIC

Limitations of Columnstore Index in SQL Server 2012

Unfortunately, notwithstanding some of the potential gains in columnstore indexing, there are several restrictions. In this section we will take a look at the limitations of columnstore index usage in SQL Server 2012. However, due to the fact SQL Server 2014 had already been released at the time of writing this article, it must then be noted that some of the columnstore index restrictions applicable to SQL Server 2012 have since been resolved in SQL Server 2014 – i.e. creation of clustered columnstore index is now supported in SQL Server 2014. Nevertheless, as of SQL Server 2012, the limitations of columnstore index were as follows:

  1. Restrictions on Column-Store Indexes:
    1. Columnstore index is not allowed to be created against a sparse column
    2. Columnstore index is not allowed to have more than 1024 columns
    3. Columnstore index is not allowed to be used as a primary key in a given object
    4. Columnstore index is not allowed to be used as a foreign key in a given object
    5. During a create columnstore index statement, you are not allowed to specify sorting keywords (i.e. ASC, DESC) on the columns that are used as part of the columnstore index
    6. Columnsstore index is not allowed to be created against a view or indexed view
    7. Columnstore create index statement is not allowed to contain the INCLUDE keyword
    8. You are not allowed to modify the columnsstore index using the ALTER INDEX statement
    9. A columnstore index can only be created in a table that currently has no data
    10. You are restricted to creating a single columnstore index per given object
    11. You are not allowed to update the data of an object that uses columnstore indexing
    12. Although you are allowed to add columns to a table that uses columnstore index, you are not allowed to drop any of the columns that are referenced in a columnstore index
  2. SQL Server 2012 Features that do not support Column-Store Indexes:
    1. SQL Server Replication
    2. Change Tracking
    3. Change Data Capture
    4. Filestream
    5. Compression
      1. Page
      2. Row
      3. SQL Server Vardecimal Storage Format
  3. SQL Server 2012 Data Types that do not support Column-Store Indexes:

    Table 2

    Character Strings Unicode Character Strings Exact Numerics Binary Strings Date and Time Other Data Types
    VARCHAR (MAX) NVARCHAR (MAX) DECIMAL – with precision greater than 18 digits BINARY DATETIMEOFFSET – with scale greater than 2 XML
    NUMERIC – with precision greater than 18 digits VARBINARY HIERARCHYID
    IMAGE SPATIALTYPES
    TIMESTAMP
    SQLVARIANT
    UNIQUEIDENTIFIER
    ROWVERSION

Tutorial: Row based index vs Columnstore Index in SQL Server 2012

In this section we take a closer look at some of the benefits that can be incurred by using a columnstore index in a data warehouse environment. We will be using the AdventureWorksDW2012.dbo.FactProductInventory table to test the regular index part of this tutorial. I have chosen this table as it has largest record count (at 776286) of all the objects that ships with the AdventureWorksDW2012 sample database.

Figure 6 shows the entity relationship diagram of the dbo.FactProductInventory table. The clustered index and primary key uses a combination of ProductKey and DateKey fields.

Figure 6
Figure illustrating the entity relationship diagram of the dbo.FactProductInventory table

For testing the columnstore index part, I have done the following:

  1. Created a new fact table called dbo.FactProductInventory_ApexSQL, which is based on the definition of the sample dbo.FactProductInventory table, and then
  2. Created a nonclustered columnstore index called CIX_FactProductInventory_ApexSQL on the dbo.FactProductInventory_ApexSQL table (Figure 7 shows the definition of the CIX_FactProductInventory_ApexSQL columnstore index).

    Figure 7
    Figure showing the definition of the CIX_FactProductInventory_ApexSQL columnstore index

Now for the purposes of testing both scenarios (row based vs columnstore indexes), I am going to apply the same sample user story against dbo.FactProductInventory and dbo.FactProductInventory_ApexSQL. The user story is as follows:

“As a user, I would like to see a report that show the totals of Unit Costs, Units In, Units Out, and Units Balance per Product”

Figure 8 and Figure 9 show respective T-SQL queries for retrieving the data to satisfy the user story against dbo.FactProductInventory and dbo.FactProductInventory_ApexSQL tables.

Figure 8
Figure illustrating a T-SQL query for retrieving the data to satisfy the user story against dbo.FactProductInventory table

Figure 9
Figure showing a T-SQL query for retrieving the data to satisfy the user story against dbo.FactProductInventory_ApexSQL table

Figure 10 and Figure 11 show IO Statistics for the respective execution of T-SQL queries shown in Figure 8 and Figure 9. Just by comparing the statistics in terms of logical reads and object scan count, it can be seen that there are significant performance benefits for the table that uses a columnstore index.

Figure 10
Illustrating IO statistics for the execution of a query for retrieving the data against dbo.FactProductInventory table

Figure 11
Illustrating IO statistics for the execution of a query for retrieving the data against dbo.FactProductInventory_ApexSQL table

Finally, as probably could have been expected after looking at the IO Statistics, the execution plan of the T-SQL query that uses the columnstore index (shown in Figure 13) is much simpler and less costly as compared to that of the T-SQL query that uses a regular row-based index (shown in Figure 12).

Figure 12
The execution plan of the T-SQL query that uses a regular row-based index

Figure 13
The execution plan of the T-SQL query that uses the columnstore index

Conclusion

The newly introduced concept of indexing columns in SQL Server 2012 known as columnstore index can significantly improve the response to T-SQL query requests. However, in spite of such improvements, there are several restrictions to columnstore index usage that may force one to relinquish SQL Server 2012 features such as replication and change tracking for the benefit of columnstore indexing.

Sifiso Ndlovu
168 Views