Murilo Miranda

AlwaysOn Availability Groups – Curiosities to make your job easier – Part 2

October 6, 2015 by

In continuation to the previous article, where we talked about what happens when I add a readable secondary and why there are different options to control connections to the AG, it’s time to continue and demystify more curiosities about the beloved Availability Groups!

And the next curiosity is….

How does statistics work in a readable secondary?

Have you ever thought about that? Maybe if your database is running fine, you aren’t even bother to know if your stats are updated in the primary replica… why burn your brain thinking about the secondary replica?

Well, this is something nice to know, at least to explain to the customers

Statistic is an object containing a histogram of the values distribution in a column. It’s common to have a statistic created for a single column, but there are statistics for multiple columns. Based on those created statistics, the Query Optimizer will create the execution plans, in order to better execute a query request.

The statistics can be created manually or automatically, if the AUTO_CREATE_STATISTICS is set to ON, in the database properties, or when you create indexes.

As you may guess, the primary replica is where all the statistics are created, as this is the read / write replica. No matter if the statistic was created automatically or manually.

As you can see in the picture above, the stats are transferred as other database objects, with the objective of synchronize the primary and all the secondary replicas of an Availability Group. This way the databases should be identical in a point in time. But wait… Identical? Hmm… this might be a problem…

Why the word “identical” called my attention? Simple… I will answer with a question: Is your workload identical in the primary and secondary replicas?

If your answer is “yes”, probably you are either using the secondary replica as standby, for disaster recovery purposes, or using to balance the load randomly, so the workload in the primary and secondary is the same.

If you answer was “no” or “I’m not sure”, you might not be using the statistics that are generated in the primary replica. Queries executed on Primary, may not be similar to the ones executed in the Secondary. But this is not the only concern here… Statistics on Secondary Databases, may be stale at some point, retuning bad execution plans.

The two described examples are common and are a big problem if you think that the secondary replica is read-only! So, how could I create statistic to support the reports that are being done in the secondary replica? How can I update a stale statistic?

Hopefully, Microsoft developers thought about this and they came up with a solution. And this is simpler than you can imagine

When a new statistic is created in the secondary replica, the TempDB will be used to store the object. This means that all the activity done in the secondary replica will be covered with the proper statistics. This also means that when you restart your secondary replica all the temporary statistics will disappear!

Those temporary stats are identifiable by their name, with the suffix “readonly_database_statistic”. You can use the sys.stats view in order to check the statistics of specific table (filtering by object id) or the entire database. This view will show the statistic name and this way you can identify the temporary ones. If your objective is identify only the temporary statistics, the same view has a column called “is_temporary” to distinguish the permanent and non-permanent statistics.

Here is a sample table:

Do you remember the first part of this series, where we talked about the need of a readable secondary to convert the transactions to snapshot isolation level? Because of this, the TempDB was being used to store all the row versions, right? Notice that the TempDB is also used in the statistics case! This enforces the importance of the TempDB in an Availability Groups environment. Make sure you follow all the best practices in order to avoid problems!

Some hands-on

If you want to test and see this happening with your own eyes, I have something ready for you. You will need an “AdventureWorks2014” database, which you can find here. Setup a simple Availability Group including this database and execute the code as follow.

Execute the following in the primary replica:

This code will create a table, insert one row to this table and, finally, execute a select. This select is not returning any results. When you query the sys.stats view, you will notice that a statistic was created to cover the executed query.

Now run the following code in the secondary replica:

This code is querying the sys.stats view and showing that both the primary and secondary replicas have the same statistics for that table. After that we run a different query (from the primary replica) and we query again the sys.stats. You will notice that a new temporary statistic was created!

I hope you enjoyed and keep following the Availability Curiosities 🙂

Thank you for reading.

Murilo Miranda