Dmitry Piliugin


Dmitry is a SQL Server enthusiast from Moscow, Russia. He started his journey into the world of SQL Server more than ten years ago. Most of that time he has worked as a developer of corporate information systems based on the SQL Server data platform.

Currently he works as a lead database developer, responsible for the development of production databases in a media research company. He is also an occasional speaker at various community events and tech conferences. His favorite topic to present is the Query Processor and anything related to it. Dmitry has been a Microsoft Data Platform MVP since 2014.


SQL Server 2017: Adaptive Join Internals

April 30, 2018 by Dmitry Piliugin

SQL Server 2017 brings new query processing methods that are designed to mitigate cardinality estimation errors in query plans and adapt plan execution based on the execution results. This innovation is called Adaptive Query Processing and consists of three features:

Adaptive Memory Grant Feedback;
Interleaved Execution;
Adaptive Joins.

SQL Server 2017: How to Get a Parallel Plan

April 28, 2018 by Dmitry Piliugin

SQL Server chooses parallel plans based on costing (there are also some other conditions that must be met before a plan can go parallel). Sometimes a serial plan is slightly cheaper than a parallel one, so it is assumed to be faster and is picked by the optimizer. However, because the costing model is just a model, this is not always true (for a number of reasons, listed in Paul’s article below), and the parallel plan may run much faster.

SQL Server 2017: Statistics to Compile a Query Plan

April 28, 2018 by Dmitry Piliugin

While preparing the post about Adaptive Joins, I’d like to share a quick post about a hidden gem in SQL Server 2017 CTP 2.0 that I discovered recently. In this short post, we will look at how you can determine which statistics are used by the optimizer during plan compilation in SQL Server 2017.

Prior to SQL Server 2017, there were two ways to do it, both undocumented and involving undocumented trace flags.

SQL Server 2017: SQL Sort, Spill, Memory and Adaptive Memory Grant Feedback

April 27, 2018 by Dmitry Piliugin

This article explores SQL Sort, Spill, Memory and the Adaptive Memory Grant Feedback mechanism in SQL Server.

SQL Server 2017: Interleaved Execution for mTVF

April 27, 2018 by Dmitry Piliugin

In this post, we are going to look at a new feature in SQL Server 2017: interleaved execution. You need to install SQL Server 2017 CTP 1.3 to try it. If you are ready, let’s start.

Now that CTP 2.0 of SQL Server 2017 is out, you don’t need to turn on the undocumented TF described below, and the plans are also different. The examples in this post use CTP 1.3 and are probably out of date by now (I was asked to hold this post until the public CTP 2 was out and interleaved execution was officially announced). However, the post demonstrates the details of interleaved execution and might still be interesting.

SQL Server 2017: Scalar Subquery Simplification

April 26, 2018 by Dmitry Piliugin

Nowadays a lot of developers use Object-Relational Mapping (ORM) frameworks. ORM is a programming technique that maps data from an object-oriented to a relational format, i.e. it allows a developer to abstract away from a relational database (SQL Server, for example), use an object-oriented language (C#, for example) and let the ORM do all the “talking” to the database engine by generating query texts automatically. ORMs are not perfect, especially if they are used in the wrong way. Sometimes they generate inefficient queries, e.g. a query with redundant expressions. SQL Server has a mechanism to deal with that inefficiency, called query simplification.

SQL Server 2017: Columnstore in-place updates

April 26, 2018 by Dmitry Piliugin

In this post, I continue the exploration of SQL Server 2017, and we will look at nonclustered columnstore index updates.

A columnstore index has some internal structures to support updates. In SQL Server 2014 these were a Delta Store, which accepts newly inserted rows (when there are enough rows in the delta store, the server compresses it and switches it into columnstore row groups), and a Deleted Bitmap, which handles deleted rows. SQL Server 2016 added more internal structures: a Mapping Index for a clustered columnstore index, to maintain secondary nonclustered indexes, and a Deleted Buffer, to speed up deletes from a nonclustered columnstore index.

Updates were always split into a delete plus an insert. That has now changed: if a row is located in a delta store, in-place updates are possible. Another change is that it is now possible to have a per-row (narrow) plan instead of a per-index (wide) plan.
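To make the difference between the two update paths concrete, here is a toy Python model. It is only a sketch of the idea described above, not SQL Server's actual storage code: the class, its method names and the dictionary-based "storage" are all hypothetical, and compression, row groups and the deleted buffer are ignored.

```python
class ToyColumnstore:
    """Hypothetical, heavily simplified model of a columnstore index."""

    def __init__(self, compressed_rows):
        self.compressed = dict(compressed_rows)  # row_id -> value, treated as immutable
        self.deleted_bitmap = set()              # row ids logically deleted from compressed storage
        self.delta_store = {}                    # row_id -> value, ordinary updatable rows
        self.log = []                            # records which path each operation took

    def insert(self, row_id, value):
        # New rows always land in the delta store.
        self.delta_store[row_id] = value
        self.log.append(("insert", row_id))

    def update(self, row_id, value):
        if row_id in self.delta_store:
            # The row lives in the delta store: change it in place.
            self.delta_store[row_id] = value
            self.log.append(("in-place update", row_id))
        elif row_id in self.compressed and row_id not in self.deleted_bitmap:
            # The row lives in compressed storage: mark it deleted, then insert the new version.
            self.deleted_bitmap.add(row_id)
            self.log.append(("delete", row_id))
            self.insert(row_id, value)
        else:
            raise KeyError(row_id)

    def read(self, row_id):
        if row_id in self.delta_store:
            return self.delta_store[row_id]
        if row_id in self.compressed and row_id not in self.deleted_bitmap:
            return self.compressed[row_id]
        raise KeyError(row_id)


store = ToyColumnstore({1: "a", 2: "b"})
store.insert(3, "c")      # goes to the delta store
store.update(3, "c2")     # delta store row: in-place update
store.update(1, "a2")     # compressed row: delete + insert
print(store.log)
```

In this model, only the update of row 1 touches the deleted bitmap; the update of row 3 never leaves the delta store.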

Let’s run some experiments.

SQL Server 2017: Columnstore Indexes and Trivial Plan

April 25, 2018 by Dmitry Piliugin

Some time ago, SQL Server 2017 was released as a CTP. The most exciting news in that CTP was that SQL Server now supports Linux! This is awesome, and I consider it to be great news for many people.

I am personally interested in the new query processing features, and I finally had some time to install SQL Server 2017 and dig into it a little. Currently, CTP 1.2 is available, and I will use this version for my experiments.

While exploring the new extended events, I found an interesting event, compilation_stage_statistics. One of its columns is trivial_plan_scanning_cs_index_discarded, with the following description: “Number of trivial plans discarded or could have been discarded which scan Columnstore index”. That pushed me to investigate the topic.

SQL Server 2016: Scalar UDF Estimation and Project Normalization

April 25, 2018 by Dmitry Piliugin

In this post, we will continue to look at the cardinality estimation changes in SQL Server 2016. This time we will talk about scalar UDF estimation. Scalar UDFs (sUDF) in SQL Server have quite bad performance, and I encourage you to avoid them in general; however, a lot of systems still use them.

Scalar UDF Estimation Change

I’ll use the Microsoft sample DB AdventureworksDW2016CTP3 and write the following simple scalar function. It always returns 1, regardless of the input parameter. I run my queries against Microsoft SQL Server 2016 (SP1) (KB3182545) – 13.0.4001.0 (X64).

Join Estimation Internals in SQL Server

April 24, 2018 by Dmitry Piliugin

In this post we continue looking at the Cardinality Estimator (CE). The article explores some join estimation algorithms in detail. However, this is not a comprehensive join estimation analysis; the goal of this article is to give the reader a flavor of join estimation in SQL Server.

The difficulty of the CE process is that it must predict the result without any execution (at least in the current versions). In other words, it has to somehow model the real execution and derive the number of rows from that model. Depending on the chosen model, the predicted result may be closer to or further from the real one. One model may give very good results in one type of situation but fail in another, while a second model may fail in the first type and succeed in the second. That is why SQL Server uses different approaches when estimating different types of operations with different properties. Joins are no exception.

Query Plan on a busy SQL Server

April 24, 2018 by Dmitry Piliugin

Yesterday I came across a question on one of the SQL forums that I may rephrase like this:

“Does query plan compilation depend on how busy SQL Server is?”

Before we go further, I should explicitly mention that we are talking about a Compiled plan, not an Executable plan. Plan execution will of course depend on how busy the server is: for example, the query may wait for a memory grant before it starts executing, or execution may be slow because there are no cached pages in the Buffer Pool, etc.

However, the question was about a Compiled plan: does the shape of a plan depend on the server load?

At first glance, it should not. But…

Query Trace Column Values in SQL Server

April 23, 2018 by Dmitry Piliugin

Sometimes, when I saw expressions like ‘Expr1002’ or ‘WindowCount1007’ or something similar in the Output List of a query plan, I asked myself: is there a way to project those columns into the final result to look at the values? That question first came to me out of curiosity when I was playing with window aggregate functions and the Window Spool plan operator in SQL Server 2012. I wanted to look into the Window Spool to understand how it performs an aggregation.

Interestingly, SQL Server 2016 CTP3.0 allows us to look deep inside an iterator and observe the data flowing through it. Let’s turn on the “x-ray machine” and take a look.

Runtime Constants Sniffing in SQL Server

April 23, 2018 by Dmitry Piliugin

Most people know about so-called “Parameter Sniffing”. This topic has been discussed from many angles in a number of great articles. It is interesting that not only parameters may be “sniffed” during the first execution, but also runtime constant functions. Let’s look at an example.

Test Data

I will use a test server and an administrator account to run the script below. Be sure you have enough privileges on your test server if you want to try it out.

Few Outer Rows Optimization in SQL Server

April 20, 2018 by Dmitry Piliugin

In this blog post, we will look at one more Nested Loops (NL) Join Post Optimization Rewrite. This time we will talk about parallel NL and Few Outer Rows Optimization.

For demonstration purposes, I will use the enlarged version of AdventureWorks2014. In the sample query, I will also use trace flag (TF) 8649. This TF forces a parallel plan when possible and is very convenient here, as we need one for the demo. There are also a few other undocumented TFs: TF 3604 directs diagnostic output to the console, TF 8607 outputs the physical operator tree before Post Optimization Rewrite, and TF 7352 outputs the tree after the Post Optimization Rewrite phase.

The sample query asks for some data based on the periods table.

Batch Sort and Nested Loop in SQL Server

April 18, 2018 by Dmitry Piliugin

This continues my blog post series after the 24HOP Russia talk “Query Processor Internals – Joins”. In this blog post and the next one, we will talk about the Nested Loop Post Optimization Rewrite optimizations.

Some of you may know that the Nested Loop join algorithm preserves the order of the outer table.
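The order-preserving property is easy to see in a naive implementation. The Python sketch below is an illustration only: the function name and the sample rows are made up for this example, and it says nothing about how SQL Server implements the operator internally.

```python
def nested_loops_join(outer_rows, inner_rows, key):
    """Naive nested loops join: for each outer row, scan the whole inner input."""
    for outer in outer_rows:
        for inner in inner_rows:
            if outer[key] == inner[key]:
                yield (outer, inner)


# The outer input is deliberately NOT sorted by id.
outer = [{"id": 3, "name": "c"}, {"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
inner = [{"id": 1, "qty": 10}, {"id": 2, "qty": 20}, {"id": 3, "qty": 30}, {"id": 1, "qty": 11}]

joined = list(nested_loops_join(outer, inner, "id"))
outer_order = [o["id"] for o, _ in joined]
print(outer_order)  # prints [3, 1, 1, 2]
```

Because the outer loop drives the join, the output rows come out in the same relative order as the outer input (3, 1, 2), with duplicates only where an outer row has several matches.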

SQL Server – Hash Join Execution Internals

April 17, 2018 by Dmitry Piliugin

Some time ago, at 24HOP Russia, I talked about Query Processor internals and joins. Although I had three hours, I felt short of time and some things were left out, because it is a huge topic if you try to cover its different aspects in detail. In the next few articles, I’ll try to describe some interesting parts of my talk in more detail. I will start with Hash Join execution internals.

The Hash Match algorithm is one of the three available algorithms for joining two tables together. However, it is not only about joining. You can find the complete list of logical operations that Hash Match supports in the documentation:

SQL Server – Yet another X-Ray for the QP

April 16, 2018 by Dmitry Piliugin

This post is for “optimizer” fans, for those who are not afraid when they hear “query transformation rules”. If this is you and you are interested in the optimizer internals, welcome.

Consider the following DB, opt, and the following query:

FORCESCAN and Partitioned table in SQL Server

April 13, 2018 by Dmitry Piliugin

I would like to share one curious case that I recently came across.

Long story short:

This bug may lead to incorrect results if you use a partitioned table and the FORCESCAN hint.

Bug

Consider the following example; let’s keep it simple.

SQL Server NOLOCK and Top Optimization

April 12, 2018 by Dmitry Piliugin

Sometimes people use the nolock hint as a “turbo” button for their queries, assuming that not taking locks will speed up query execution. There are many good articles describing the dangers of this approach, which stem from the read uncommitted isolation level. However, the focus of this article is a performance problem that you may encounter when using the nolock hint in some cases.

Let us set up some test data first.

MTVF and CE Model Variation in SQL Server

April 5, 2018 by Dmitry Piliugin

This is a note about multi-statement table-valued functions (MTVF) and how their cardinality is estimated in the new CE framework.

In the old CE framework, an MTVF had a fixed estimate of one row. In the new one the estimate is still fixed; however, now it is 100 rows. That’s the whole story. =)

Ascending Key and CE Model Variation in SQL Server

April 5, 2018 by Dmitry Piliugin

In this note, I’m going to discuss one of the most useful and helpful cardinality estimator enhancements – the Ascending Key estimation.

We should start by defining the problem with ascending keys and then move to the solution provided by the new CE.

Overpopulated Primary Key and CE Model Variation in SQL Server

April 4, 2018 by Dmitry Piliugin

In this blog post, we are going to talk about another cardinality estimation model enhancement in SQL Server 2014 – Overpopulated Primary Key (OPK).

Consider a fact table that contains information about sales, for example, and a date dimension table. Usually, a fact table contains data for the current and past years, but a dimension table usually also contains data for the next few years.

Join Containment Assumption and CE Model Variation in SQL Server

April 3, 2018 by Dmitry Piliugin

In this post we are going to talk about one of the model assumptions that was changed in the new cardinality estimation mechanism in SQL Server 2014: the Join Containment Assumption.

You may find some information about this assumption in Ian Jose’s blog post, Query Processor Modelling Extensions in SQL Server 2005 SP1, which describes the so-called simple assumption and base assumption. Another source of information is the white paper by Joseph Sack, Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator.

Filtered Stats and CE Model Variation in SQL Server

April 2, 2018 by Dmitry Piliugin

In this blog post, we are going to look at an interesting model variation that I found while exploring the new CE.

A model variation is a new concept in the 2014 cardinality estimation framework that makes it easy to turn some model assumptions and cardinality estimation algorithms on and off. Model variations are based on a mechanism of pluggable heuristics and may be used in special cases. I think they are left for Microsoft support, to be able to address specific client CE issues in a targeted way.

Today we are going to look at an interesting model variation that creates filtered statistics on the fly. I should give a disclaimer here.

Cardinality Estimation Framework Version Control in SQL Server

March 30, 2018 by Dmitry Piliugin

This is a short post about how you can control the cardinality estimator version and determine which version was used to build a plan.

The version of the cardinality estimation framework is determined by the database context of the query, where the database has a specific compatibility level.

When you create a database in SQL Server 2014, it has the latest compatibility level, 120, by default. If you issue a query in that database context, the new cardinality estimation version will be used. You may verify this by inspecting the plan property “CardinalityEstimationModelVersion” of the root language element (the one with the green icon), SELECT, for example.
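If you save a plan as XML, the same property can also be read programmatically. The Python sketch below is only an illustration under a few assumptions: the sample XML is a hand-written, heavily trimmed fragment rather than a real plan, and the StmtSimple element name and the namespace URI are taken from the showplan XML schema as I understand it, so check them against an actual plan before relying on this.

```python
import xml.etree.ElementTree as ET

# Assumed showplan XML namespace; verify against a plan saved from your server.
SHOWPLAN_NS = "http://schemas.microsoft.com/sqlserver/2004/07/showplan"


def ce_model_versions(showplan_xml):
    """Return the CardinalityEstimationModelVersion of every statement in the plan."""
    root = ET.fromstring(showplan_xml)
    return [
        stmt.get("CardinalityEstimationModelVersion")
        for stmt in root.iter(f"{{{SHOWPLAN_NS}}}StmtSimple")
    ]


# Hand-written sample, not output captured from SQL Server.
sample_plan = """<ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.2">
  <BatchSequence><Batch><Statements>
    <StmtSimple StatementText="SELECT 1" StatementType="SELECT" CardinalityEstimationModelVersion="120" />
  </Statements></Batch></BatchSequence>
</ShowPlanXML>"""

versions = ce_model_versions(sample_plan)
print(versions)  # prints ['120']
```

A value of 120 here corresponds to the new cardinality estimation version described above.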