Joe Sack

Ten Common Threats to Execution Plan Quality

The quality of an execution plan is highly dependent on the accuracy of the estimated number of rows output by each plan operator. If the estimated number of rows is significantly skewed from the actual number of rows, the quality of the query's execution plan can suffer considerably. Poor plan quality can be responsible for excessive I/O, inflated CPU, memory pressure, decreased throughput, and reduced overall concurrency.

By "plan quality" – I'm talking about having SQL Server generate an execution plan that results in physical operator choices that reflect the current state of the data. By making such decisions based on accurate data, there is a better chance that the query will perform properly. The cardinality estimate values are used as input for operator costing, and when the values are too far off from reality, the negative impact to the execution plan can be pronounced. These estimates are fed to the various cost models associated to the query itself, and bad row estimates can impact a variety of decisions including index selection, seek vs. scan operations, parallel versus serial execution, join algorithm selection, inner vs. outer physical join selection (e.g. build vs. probe), spool generation, bookmark lookups vs. full clustered or heap table access, stream or hash aggregate selection, and whether or not a data modification uses a wide or narrow plan.

As an example, let's say you have the following SELECT query (using the Credit database):

SELECT
  m.member_no,
  m.lastname,
  p.payment_no,
  p.payment_dt,
  p.payment_amt
FROM dbo.member AS m
INNER JOIN dbo.payment AS p
ON m.member_no = p.member_no;

Based on the query logic, is the following plan shape what you would expect to see?

Plan with nested loop join

And what about this alternative plan, where instead of a nested loop we have a hash match?

Plan with hash join

The "correct" answer is dependent on a few other factors – but one major factor is the number of rows in each of the tables. In some cases, one physical join algorithm is more appropriate than the other – and if the initial cardinality estimate assumptions aren't correct, your query may be using a non-optimal approach.

Identifying cardinality estimate issues is relatively straightforward. If you have an actual execution plan, you can compare the estimated versus actual row count values for operators and look for skews. SQL Sentry Plan Explorer simplifies this task by allowing you to see actual versus estimated rows for all operators in a single plan tree tab versus having to hover over the individual operators in the graphical plan:

Plan Tree tab in SQL Sentry Plan Explorer
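
If you prefer to capture the actual plan from a query window instead, SET STATISTICS XML returns the plan, including both estimated and actual row counts per operator, alongside the result set:

SET STATISTICS XML ON;

SELECT
  m.member_no,
  p.payment_no
FROM dbo.member AS m
INNER JOIN dbo.payment AS p
ON m.member_no = p.member_no;

SET STATISTICS XML OFF;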

Now, skews don't always result in poor quality plans, but if you are having performance issues with a query and you see such skews in the plan, this is an area worthy of further investigation.

Identification of cardinality estimate issues is relatively straightforward, but the resolution often isn't. There are a number of root causes as to why cardinality estimate issues can occur, and I'll cover ten of the more common reasons in this post.

Missing or Stale Statistics

Of all the reasons for cardinality estimate issues, this is the one that you hope to see, as it is often the easiest to address. In this scenario, your statistics are either missing or out-of-date. You may have the database options for automatic statistics creation and updates disabled, "no recompute" (NORECOMPUTE) enabled for specific statistics, or tables large enough that your automatic statistics updates simply aren't happening frequently enough.
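
To check how stale a table's statistics are, and (on more recent builds) how many modifications have accumulated since the last update, you can query sys.dm_db_stats_properties. A sketch against the member table used in this post:

SELECT
  s.name AS stats_name,
  sp.last_updated,
  sp.rows,
  sp.rows_sampled,
  sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.member');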

Sampling Issues

It may be that the precision of the statistics histogram is inadequate – for example, if you have a very large table with significant and/or frequent data skews. You may need to change your sampling rate from the default, or, if even that doesn't help, investigate using separate tables, filtered statistics, or filtered indexes.
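
As a sketch of the first two options: a full scan rebuilds the histogram from every row, and a filtered statistic targets just the skewed slice of data (the charge_dt column and date range below are purely illustrative):

-- Rebuild the statistics on the table using every row instead of a sample
UPDATE STATISTICS dbo.charge WITH FULLSCAN;

-- Create a filtered statistic over the heavily-skewed portion of the data
CREATE STATISTICS stat_charge_recent
ON dbo.charge (charge_dt)
WHERE charge_dt >= '20140101';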

Hidden Column Correlations

The query optimizer assumes that columns within the same table are independent. For example, if you have a city and a state column, you may intuitively know that these two columns are correlated, but SQL Server does not understand this unless you help it out with an associated multi-column index or with manually created multi-column statistics. Without helping the optimizer with correlation information, the selectivity of your predicates may be exaggerated.

Below is an example of two correlated predicates:

SELECT 
  lastname,
  firstname
FROM dbo.member
WHERE city = 'Minneapolis'
AND state_prov = 'MN';

I happen to know that 10% of our 10,000 row member table qualify for this combination, but the query optimizer is guessing that it is 1% of the 10,000 rows:

Misleading estimates

Now contrast this with the appropriate estimate that I see after adding multi-column statistics:

More accurate estimates
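
The multi-column statistics behind that second estimate can be created with a statement along these lines (the statistics name is arbitrary, and WITH FULLSCAN is optional):

CREATE STATISTICS stat_member_city_state
ON dbo.member (city, state_prov)
WITH FULLSCAN;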

Intra-Table Column Comparisons

Cardinality estimation issues can occur when comparing columns within the same table. This is a known issue. If you have to do so, you can improve the cardinality estimates of the column comparisons by using computed columns instead or by re-writing the query to use self-joins or common table expressions.
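
As a sketch of the computed-column approach, using hypothetical charge columns: persisting the comparison as its own column lets SQL Server create statistics on it, instead of guessing at a column-to-column predicate.

-- Hypothetical columns: flag rows where the charged amount exceeds an approved amount
ALTER TABLE dbo.charge
ADD charge_exceeds_approved AS
  CASE WHEN charge_amt > approved_amt THEN 1 ELSE 0 END;

-- Filter on the computed column rather than comparing the two base columns directly
SELECT charge_no
FROM dbo.charge
WHERE charge_exceeds_approved = 1;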

Table Variable Usage

Using table variables much? Table variables show a cardinality estimate of "1" – which for just a small number of rows may not be an issue, but for large or volatile result sets can significantly impact query plan quality. Below is a screenshot of an operator's estimate of 1 row versus the actual 1,600,000 rows from the @charge table variable:

If this is your root cause, you would be well-advised to explore alternatives like temporary tables and/or permanent staging tables where possible.
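
A minimal sketch of swapping the table variable for a temporary table, which does get statistics and therefore more realistic row estimates (the column list is abbreviated):

-- Instead of DECLARE @charge TABLE (...), use a temporary table
CREATE TABLE #charge
(
  charge_no INT NOT NULL,
  member_no INT NOT NULL,
  charge_amt MONEY NOT NULL
);

INSERT INTO #charge (charge_no, member_no, charge_amt)
SELECT charge_no, member_no, charge_amt
FROM dbo.charge;

-- Joins against #charge can now benefit from statistics on the temporary table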

Scalar and MSTV UDFs

Similar to table variables, multi-statement table-valued and scalar functions are a black-box from a cardinality estimation perspective. If you're encountering plan quality issues due to them, consider inline table functions as an alternative – or even pulling out the function reference entirely and just referencing objects directly.

Below shows an estimated versus actual plan when using a multi-statement table-valued function:

Comparison of estimated vs. actual plans using multi-statement TVF
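
As a sketch of the inline alternative: an inline table-valued function is expanded into the calling query much like a view, so the optimizer estimates against the underlying table rather than a fixed guess (the function name and logic here are illustrative):

CREATE FUNCTION dbo.PaymentsByMember (@member_no INT)
RETURNS TABLE
AS
RETURN
(
  SELECT
    p.payment_no,
    p.payment_dt,
    p.payment_amt
  FROM dbo.payment AS p
  WHERE p.member_no = @member_no
);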

Data Type Issues

Implicit data type conversions in search and join conditions can cause cardinality estimate issues. They can also surreptitiously eat away at server-level resources (CPU, I/O, memory), so it's important to address them whenever possible.
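
A common example is comparing a VARCHAR column to an NVARCHAR variable or parameter, which can force an implicit conversion on the column side and hurt both the estimate and index usage. The sketch below assumes lastname is VARCHAR:

-- Implicit conversion: the N prefix makes the variable NVARCHAR, so the column may be converted
DECLARE @lastname_n NVARCHAR(30) = N'Sack';

SELECT member_no
FROM dbo.member
WHERE lastname = @lastname_n;

-- Matching the variable's data type to the column avoids the conversion
DECLARE @lastname_v VARCHAR(30) = 'Sack';

SELECT member_no
FROM dbo.member
WHERE lastname = @lastname_v;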

Complex Predicates

You’ve probably seen this pattern before – a query with a WHERE clause that has each table column reference wrapped in various functions, concatenation operations, mathematical operations and more. And while not all function wrapping precludes proper cardinality estimates (such as for LOWER, UPPER and GETDATE) there are plenty of ways to bury your predicate to the point that the query optimizer can no longer make accurate estimates.
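
For instance, wrapping a date column in a function hides it from the statistics histogram, while an equivalent range predicate leaves it visible (the charge_dt example below is purely illustrative):

-- Buried predicate: the function on the column blocks an accurate estimate (and an index seek)
SELECT charge_no
FROM dbo.charge
WHERE YEAR(charge_dt) = 1999;

-- Rewritten as a range the optimizer can estimate directly against the histogram
SELECT charge_no
FROM dbo.charge
WHERE charge_dt >= '19990101'
AND charge_dt < '20000101';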

Query Complexity

Similar to buried predicates, are your queries extraordinarily complex? I realize “complex” is a subjective term, and your assessment may vary, but most can agree that nesting views within views within views that reference overlapping tables is likely to be non-optimal – especially when coupled with 10+ table joins, function references and buried predicates. While the query optimizer does an admirable job, it isn’t magic, and if you have significant skews, query complexity (Swiss-army knife queries) can certainly make it next to impossible to derive accurate row estimates for operators.

Distributed Queries

Are you using distributed queries with linked servers and seeing significant cardinality estimate issues? If so, be sure to check the permissions associated with the linked server principal being used to access the data. Without at least the db_ddladmin fixed database role, the linked server account may not be able to see the remote statistics, and that lack of visibility due to insufficient permissions may be the source of your cardinality estimation issues.
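
If permissions do turn out to be the problem, granting the remote database user that the linked server login maps to membership in db_ddladmin (or a higher-privileged role) restores visibility to the remote statistics. The principal name below is a placeholder:

-- Run in the remote database; [linked_server_user] is a placeholder for the mapped database user
ALTER ROLE db_ddladmin ADD MEMBER [linked_server_user];

-- On older versions of SQL Server:
-- EXEC sp_addrolemember N'db_ddladmin', N'linked_server_user';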

And there are others…

There are other reasons why cardinality estimates can be skewed, but I believe I've covered the most common ones. The key point is to pay attention to the skews in association with known, poorly performing queries. Don't assume that the plan was generated based on accurate row count conditions; if those numbers are skewed, that should be the first thing you troubleshoot.