The quality of an execution plan is highly dependent on the accuracy of the estimated number of rows output by each plan operator. If the estimated number of rows is significantly skewed from the actual number of rows, this can have a significant impact on the quality of a query's execution plan. Poor plan quality can be responsible for excessive I/O, inflated CPU, memory pressure, decreased throughput and reduced overall concurrency.
By "plan quality" – I'm talking about having SQL Server generate an execution plan that results in physical operator choices that reflect the current state of the data. By making such decisions based on accurate data, there is a better chance that the query will perform properly. The cardinality estimate values are used as input for operator costing, and when the values are too far off from reality, the negative impact to the execution plan can be pronounced. These estimates are fed to the various cost models associated to the query itself, and bad row estimates can impact a variety of decisions including index selection, seek vs. scan operations, parallel versus serial execution, join algorithm selection, inner vs. outer physical join selection (e.g. build vs. probe), spool generation, bookmark lookups vs. full clustered or heap table access, stream or hash aggregate selection, and whether or not a data modification uses a wide or narrow plan.
As an example, let's say you have the following
SELECT query (using the Credit database):
SELECT m.member_no, m.lastname, p.payment_no, p.payment_dt, p.payment_amt FROM dbo.member AS m INNER JOIN dbo.payment AS p ON m.member_no = p.member_no;
Based on the query logic, is the following plan shape what you would expect to see?
And what about this alternative plan, where instead of a nested loop we have a hash match?
The "correct" answer is dependent on a few other factors – but one major factor is the number of rows in each of the tables. In some cases, one physical join algorithm is more appropriate than the other – and if the initial cardinality estimate assumptions aren't correct, your query may be using a non-optimal approach.
Identifying cardinality estimate issues is relatively straightforward. If you have an actual execution plan, you can compare the estimated versus actual row count values for operators and look for skews. SQL Sentry Plan Explorer simplifies this task by allowing you to see actual versus estimated rows for all operators in a single plan tree tab versus having to hover over the individual operators in the graphical plan:
Now, skews don't always result in poor quality plans, but if you are having performance issues with a query and you see such skews in the plan, this is one area that is then worthy of further investigation.
Identification of cardinality estimate issues is relatively straightforward, but the resolution often isn't. There are a number of root causes as to why cardinality estimate issues can occur, and I'll cover ten of the more common reasons in this post.
Missing or Stale Statistics
Of all the reasons for cardinality estimate issues, this is the one that you hope to see, as it is often easiest to address. In this scenario, your statistics are either missing or out-of-date. You may have database options for automatic statistics creation and updates disabled, "no recomputed" enabled for specific statistics, or have large enough tables that your automatic statistics updates simply aren't happening frequently enough.
It may be that the precision of the statistics histogram is inadequate – for example, if you have a very large table with significant and/or frequent data skews. You may need to change your sampling from the default or if even that doesn't help – investigate using separate tables, filtered statistics or filtered indexes.
Hidden Column Correlations
The query optimizer assumes that columns within the same table are independent. For example, if you have a city and state column, we may intuitively know that these two columns are correlated, but SQL Server does not understand this unless we help it out with an associated multi-column index, or with manually-created multi-column statistics. Without helping the optimizer with correlation, the selectivity of your predicates may be exaggerated.
Below is an example of two correlated predicates:
SELECT lastname, firstname FROM dbo.member WHERE city = 'Minneapolis' AND state_prov - 'MN';
I happen to know that 10% of our 10,000 row
member table qualify for this combination, but the query optimizer is guessing that it is 1% of the 10,000 rows:
Now contrast this with the appropriate estimate that I see after adding multi-column statistics:
Intra-Table Column Comparisons
Cardinality estimation issues can occur when comparing columns within the same table. This is a known issue. If you have to do so, you can improve the cardinality estimates of the column comparisons by using computed columns instead or by re-writing the query to use self-joins or common table expressions.
Table Variable Usage
Using table variables much? Table variables show a cardinality estimate of "1" – which for just a small number of rows may not be an issue, but for large or volatile result sets can significantly impact query plan quality. Below is a screenshot of an operator's estimate of 1 row versus the actual 1,600,000 rows from the
@charge table variable:
If this is your root cause, you would be well-advised to explore alternatives like temporary tables and or permanent staging tables where possible.
Scalar and MSTV UDFs
Similar to table variables, multi-statement table-valued and scalar functions are a black-box from a cardinality estimation perspective. If you're encountering plan quality issues due to them, consider inline table functions as an alternative – or even pulling out the function reference entirely and just referencing objects directly.
Below shows an estimated versus actual plan when using a multi-statement table-valued function:
Data Type Issues
Implicit data type issues in conjunction with search and join conditions can cause cardinality estimate issues. They also can also surreptitiously eat away at server-level resources (CPU, I/O, memory), so it's important to address them whenever possible.
You’ve probably seen this pattern before – a query with a
WHERE clause that has each table column reference wrapped in various functions, concatenation operations, mathematical operations and more. And while not all function wrapping precludes proper cardinality estimates (such as for
GETDATE) there are plenty of ways to bury your predicate to the point that the query optimizer can no longer make accurate estimates.
Similar to buried predicates, are your queries extraordinarily complex? I realize “complex” is a subjective term, and your assessment may vary, but most can agree that nesting views within views within views that reference overlapping tables is likely to be non-optimal – especially when coupled with 10+ table joins, function references and buried predicates. While the query optimizer does an admirable job, it isn’t magic, and if you have significant skews, query complexity (Swiss-army knife queries) can certainly make it next to impossible to derive accurate row estimates for operators.
Are you using distributed queries with linked servers and you see significant cardinality estimate issues? If so, be sure to check the permissions associated with the linked server principal being used to access the data. Without the minimum
db_ddladmin fixed database role for the linked server account, this lack of visibility to remote statistics due to insufficient permissions may be the source for your cardinality estimation issues.
And there are others…
There are other reasons why cardinality estimates can be skewed, but I believe I've covered the most common ones. The key point is to pay attention to the skews in association with known, poorly performing queries. Don't assume that the plan was generated based on accurate row count conditions. If these numbers are skewed, you need to try to troubleshoot this first.