Extracting information from execution plans

Analysing an execution plan is fairly simple, even if you are using SSMS with no additional add-ons.

You can compare two plans too, either with SSMS itself or with third-party tools that do a good job of highlighting potential problems.

The problem comes when you are used to Query Store and these “rainbow queries” have hundreds of execution plans.

Sometimes SQL Server has a hard time deciding which plan to use, and there are just too many crayons in the box.

Since all these tools rely on a DBA looking into the plan and identifying the actual issues, you’d have to review them one at a time. What if you could do it automatically?

Query Store already has the plans, and they contain a ton of information:

  • The statements executed (only one for Query Store’s plans, but plans in the SQL cache or stored in exported plan files could contain several).
  • Indexes that SQL server considers may improve the performance of the query (also known as “missing indexes”).
  • Filtered indexes that exist on the tables but were not used because of the values passed to the query in the WHERE clauses.
  • The details of each node of the execution plan (the little boxes you see in the graphical plan).
  • Cursors used in the statement.
  • Operations involving indexes (scan, seek, update…).
  • Columns accessed by the statement (for either read or write operations).
  • Statistics used by the SQL Engine to generate the execution plan.
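
All of that lives in the showplan XML, which Query Store exposes directly. As a minimal sketch (this is not PlanMiner itself, and the query_id value is just a hypothetical example), the raw XML can be pulled out like this:

    -- Grab the plan(s) recorded by Query Store for one query.
    -- sys.query_store_plan stores the plan as text, hence the TRY_CONVERT.
    SELECT
        q.query_id,
        p.plan_id,
        qt.query_sql_text,
        TRY_CONVERT(XML, p.query_plan) AS query_plan_xml
    FROM sys.query_store_plan AS p
    JOIN sys.query_store_query AS q
        ON q.query_id = p.query_id
    JOIN sys.query_store_query_text AS qt
        ON qt.query_text_id = q.query_text_id
    WHERE q.query_id = 42;   -- hypothetical query ID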

So I thought… why not get all this information from the plan?

… right, because an execution plan’s XML can be complex even for the simplest SELECT.

There was no way I was gonna navigate all that nightmare by hand, but I remembered something I had written almost two years ago to take an XML file and turn it into smaller chunks (one per line), both to make it more manageable and to be able to extract certain fields. Back then I was only listing the tables and columns involved, so I decided to update it and extract more information.

The process goes as follows:

  • The execution plan’s XML is extracted into a temp table one line at a time, treated as if it were plain text, turning every line into valid XML on its own (replacing closing tags such as “</label>” with “<label/>”, and appending “/>” where needed so each line is a well-formed XML fragment). A rough sketch of this splitting step follows the list.
  • Certain lines are then modified (for example, multi-line closing tags are converted into a separate custom tag).
  • Based on the contents of each line, the information is stored in the proper table (indexes, statistics, columns…), whilst keeping track of the node it is associated to.
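
As a rough illustration of the first step only (this is not the actual PlanMiner code, and it assumes SQL Server 2022 or Azure SQL Database so STRING_SPLIT can return the ordinal), the “one tag per line” idea looks more or less like this:

    -- Hypothetical sketch: break a plan's XML text into one tag per row,
    -- keeping the order so each fragment can later be tied to its node.
    DECLARE @plan NVARCHAR(MAX) =
        (SELECT TOP (1) query_plan FROM sys.query_store_plan WHERE plan_id = 1); -- example plan_id

    DROP TABLE IF EXISTS #PlanLines;

    SELECT
        s.ordinal AS LineNumber,
        s.value   AS LineText
    INTO #PlanLines
    FROM STRING_SPLIT(
             REPLACE(@plan, N'><', N'>' + NCHAR(10) + N'<'),
             NCHAR(10),
             1) AS s;   -- 1 = also return the ordinal, preserving line order

From there, each row gets patched into a self-contained XML fragment and shredded into the destination tables.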

QDSToolbox’s PlanMiner does all this and, combined with the plans available in Query Store, can be used to incrementally load new plans’ information into those tables for later consumption.

The next step would be defining rules to determine which execution plans have issues, what kind of situations should be flagged, and what reports to generate:

  • Missing indexes with an impact > 80% (a sketch of this rule follows the list).
  • Queries that would be impacted if a certain table or column is modified or dropped.
  • Queries that are strictly read-only and could be offloaded to a read-only replica in your Availability Groups.
  • Queries using forced indexes that would need rewriting if the indexes are removed.
  • New plans generated in the last 24 hours with potential issues.
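
PlanMiner’s own tables would be the natural place to run these rules against, but since their exact schema isn’t reproduced here, a hedged sketch of the first rule can be written directly against the plan XML kept by Query Store:

    -- Sketch of the "missing index with an impact > 80%" rule, evaluated
    -- straight from the showplan XML stored in Query Store.
    ;WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
    SELECT
        p.plan_id,
        g.grp.value('(@Impact)[1]', 'float')             AS EstimatedImpact,
        mi.node.value('(@Database)[1]', 'nvarchar(128)') AS DatabaseName,
        mi.node.value('(@Schema)[1]',   'nvarchar(128)') AS SchemaName,
        mi.node.value('(@Table)[1]',    'nvarchar(128)') AS TableName
    FROM sys.query_store_plan AS p
    CROSS APPLY (SELECT TRY_CONVERT(XML, p.query_plan) AS plan_xml) AS x
    CROSS APPLY x.plan_xml.nodes('//MissingIndexes/MissingIndexGroup') AS g(grp)
    CROSS APPLY g.grp.nodes('./MissingIndex') AS mi(node)
    WHERE g.grp.value('(@Impact)[1]', 'float') > 80;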

These rules are not in place at the moment, just the extraction of the data. I am no expert in performance, but anyone who wants to try this tool and define some rules to apply to the execution plans is welcome to do so: I’ll try my best to integrate them with the tool and allow enough flexibility in the rules so anyone can define their own.

SQL, why are you acting up? SQL & Statistics

Whenever I see a query regression in Query Store caused by SQL going nuts and using an execution plan which is obviously not the best available, I wonder why it is doing that. But since I don’t have access to its source code (and even if I did, I probably wouldn’t fully understand it), the only things I could do were either force the better plan and hope SQL will use it (which it doesn’t always do: there are certain limitations to that), or update the statistics on all the tables involved in the query and hope for the best.
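
The plan-forcing half of that choice is at least a one-liner; as a hedged example (both IDs are hypothetical values you would take from a Query Store report), Query Store exposes it directly:

    -- Force a specific plan for a query, and unforce it later if needed.
    EXEC sys.sp_query_store_force_plan @query_id = 42, @plan_id = 113;
    -- EXEC sys.sp_query_store_unforce_plan @query_id = 42, @plan_id = 113;

Even then the forced plan doesn’t always stick, so updating statistics was often the fallback.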

This was a very manual operation:

  • Open the query text.
  • Identify all the tables involved in the query, which may include checking the definition of any views it accesses, and that’s assuming the query text is not truncated, with some of them lost because of that.
  • Find all the statistics on the identified tables that were sampled at a low rate, or that had a high percentage of rows modified since they were last updated, rendering them obsolete.
  • Write a script to update them using a sample rate appropriate for the row count of each table (a sketch of these last two steps follows the list).
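
As a hedged sketch of those last two steps (dbo.SomeTable and the thresholds are made-up examples), sys.dm_db_stats_properties exposes the sample rate and the modification counter for each statistic:

    -- Find statistics on a hypothetical table that were sampled at a low
    -- rate, or that have had many modifications since their last update.
    SELECT
        s.name                                       AS StatisticName,
        sp.last_updated,
        sp.rows,
        sp.rows_sampled,
        sp.modification_counter,
        100.0 * sp.rows_sampled / NULLIF(sp.rows, 0) AS SamplePercent
    FROM sys.stats AS s
    CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
    WHERE s.object_id = OBJECT_ID(N'dbo.SomeTable')
      AND (   100.0 * sp.rows_sampled / NULLIF(sp.rows, 0) < 10            -- low sample rate
           OR 100.0 * sp.modification_counter / NULLIF(sp.rows, 0) > 20);  -- many changes since last update

    -- ...and the kind of command the script would end up generating:
    UPDATE STATISTICS dbo.SomeTable [StatisticName] WITH SAMPLE 25 PERCENT;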

But since the execution plans detail all the statistics used by the SQL Engine to generate the plan, and those plans are kept in Query Store, I decided to extract those details from them.

You know a task is annoying when querying an XML file looks easier.
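
For the curious, this is roughly what that XML query looks like. It is a minimal sketch only (StatisticsUsed adds the thresholds and generates the commands for you), it assumes the plans were compiled by a build recent enough to include the OptimizerStatsUsage element, and the query_id is a hypothetical example:

    -- List the statistics the optimizer reported using for each plan of a query.
    ;WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
    SELECT
        p.plan_id,
        si.node.value('(@Table)[1]',             'nvarchar(128)') AS TableName,
        si.node.value('(@Statistics)[1]',        'nvarchar(128)') AS StatisticName,
        si.node.value('(@SamplingPercent)[1]',   'float')         AS SamplingPercent,
        si.node.value('(@ModificationCount)[1]', 'bigint')        AS ModificationCount,
        si.node.value('(@LastUpdate)[1]',        'datetime2(3)')  AS LastUpdate
    FROM sys.query_store_plan AS p
    CROSS APPLY (SELECT TRY_CONVERT(XML, p.query_plan) AS plan_xml) AS x
    CROSS APPLY x.plan_xml.nodes('//OptimizerStatsUsage/StatisticsInfo') AS si(node)
    WHERE p.query_id = 42;   -- hypothetical query ID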

The new tool added to the QDSToolBox, StatisticsUsed, does just that: given a query ID (or a list of them), or even a whole object, it generates a table with the details of the statistics used. Based on factors like when the statistics were last updated, the row count of the table, or the percentage of rows changed since then, it generates the appropriate UPDATE STATISTICS command. This is what the final result looks like:

Demo of the result of StatisticsUsed