Paul White

Rewriting Queries to Improve Performance

Measure and improve performance for distributed workloads on Microsoft Analystics Platform Server.  More
Answers.SQLPerformance.com

Upload your tough execution plans and get helpful query tuning advice from renowned experts like Paul White.

Visit Site

Featured Author

Paul Randal, CEO of SQLskills, writes about knee-jerk performance tuning, DBCC, and SQL Server internals.

Paul’s Posts

In a perfect world, it would not matter which particular T-SQL syntax we chose to express a query. Any semantically identical construction would lead to exactly the same physical execution plan, with exactly the same performance characteristics.

To achieve that, the SQL Server query optimizer would need to know every possible logical equivalence (assuming we could ever know them all), and be given the time and resources to explore all the options. Given the enormous number of possible ways we can express the same requirement in T-SQL, and the huge number of possible transformations, the combinations quickly become unmanageable for all but the very simplest cases.

A "perfect world" with complete syntax-independence might not seem quite so perfect to users that have to wait days, weeks, or even years for a modestly-complex query to compile. So the query optimizer compromises: it explores some common equivalences and tries hard to avoid spending more time on compilation and optimization than it saves in execution time. Its goal can be summarized as trying to find a reasonable execution plan in a reasonable time, while consuming reasonable resources.

One result of all this is that execution plans are often sensitive to the written form of the query. The optimizer does have some logic to quickly transform some widely-used equivalent constructions to a common form, but these abilities are neither well documented nor (anywhere near) comprehensive.

We can certainly maximize our chances of getting a good execution plan by writing simpler queries, providing useful indexes, maintaining good statistics, and confining ourselves to more relational concepts (e.g. by avoiding cursors, explicit loops, and non-inline functions) but this is not a complete solution. Neither is it possible to say that one T-SQL construction will always produce a better execution plan that a semantically-identical alternative.

My usual advice is to start with the simplest relational query form that meets your needs, using whatever T-SQL syntax you find preferable. If the query does not perform to requirements after physical optimization (e.g. indexing), it can be worth trying to express the query in a slightly different way, while retaining the original semantics. This is the tricky part. Which part of the query should you try to rewrite? Which rewrite should you try? There is no simple one-size-fits-all answer to these questions. Some of it comes down to experience, though knowing a bit about query optimization and execution engine internals can be a useful guide as well.

Example

This example uses the AdventureWorks TransactionHistory table. The script below makes a copy of the table and creates a clustered and non-clustered index. We will not be modifying the data at all; this step is just to make the indexing clear (and to give the table a shorter name):

SELECT *
INTO dbo.TH
FROM Production.TransactionHistory;
 
CREATE UNIQUE CLUSTERED INDEX CUQ_TransactionID
ON dbo.TH (TransactionID);
 
CREATE NONCLUSTERED INDEX IX_ProductID
ON dbo.TH (ProductID);

The task is to produce a list of product and history IDs for six particular products. One way to express the query is:

SELECT ProductID, TransactionID
FROM dbo.TH
WHERE ProductID IN (520, 723, 457, 800, 943, 360);

This query returns 764 rows using the following execution plan (shown in SQL Sentry Plan Explorer):

image

This simple query qualifies for TRIVIAL plan compilation. The execution plan features six separate index seek operations in one:

image

Eagle-eyed readers will have noticed that the six seeks are listed in ascending product ID order, not in the (arbitrary) order specified in the original query's IN list. Indeed, if you run the query yourself, you are quite likely to observe results being returned in ascending product ID order. The query is not guaranteed to return results in that order of course, because we did not specify a top-level ORDER BY clause. We can however add such an ORDER BY clause, without changing the execution plan produced in this case:

SELECT ProductID, TransactionID
FROM dbo.TH
WHERE ProductID IN (520, 723, 457, 800, 943, 360)
ORDER BY ProductID;

I won't repeat the execution plan graphic, because it is exactly the same: the query still qualifies for a trivial plan, the seeking operations are exactly the same, and the two plans have exactly the same estimated cost. Adding the ORDER BY clause cost us precisely nothing, but gained us a guarantee of result set ordering.

We now have a guarantee that results will be returned in product ID order, but our query does not currently specify how rows with the same product ID will be ordered. Looking at the results, you might observe that rows for the same product ID appear to be ordered by transaction ID, ascending.

Without an explicit ORDER BY, this is just another observation (i.e. we cannot rely on this ordering), but we can modify the query to ensure rows are ordered by transaction ID within each product ID:

SELECT ProductID, TransactionID
FROM dbo.TH
WHERE ProductID IN (520, 723, 457, 800, 943, 360)
ORDER BY ProductID, TransactionID;

Again, the execution plan for this query is exactly the same as before; the same trivial plan with the same estimated cost is produced. The difference is that the results are now guaranteed to be ordered first by product ID and then by transaction ID.

Some people might be tempted to conclude that the two previous queries would also always return rows in this order, because the execution plans are the same. This is not a safe implication, because not all execution engine details are exposed in execution plans (even in the XML form). Without an explicit order by clause, SQL Server is free to return the rows in any order, even if the plan looks the same to us (it could, for example, perform the seeks in the order specified in the query text). The point is the query optimizer knows about, and can enforce, certain behaviours within the engine that are not visible to users.

In case you are wondering how our non-unique nonclustered index on Product ID can return rows in Product and Transaction ID order, the answer is that the nonclustered index key incorporates Transaction ID (the unique clustered index key). In fact, the physical structure of our nonclustered index is exactly the same, at all levels, as if we had created the index with the following definition:

CREATE UNIQUE NONCLUSTERED INDEX IX_ProductID
ON dbo.TH (ProductID, TransactionID);

We can even write the query with an explicit DISTINCT or GROUP BY and still get exactly the same execution plan:

SELECT DISTINCT ProductID, TransactionID
FROM dbo.TH
WHERE ProductID IN (520, 723, 457, 800, 943, 360)
ORDER BY ProductID, TransactionID;

To be clear, this does not require changing the original nonclustered index in any way. As a final example, note that we can also request results in descending order:

SELECT DISTINCT ProductID, TransactionID
FROM dbo.TH
WHERE ProductID IN (520, 723, 457, 800, 943, 360)
ORDER BY ProductID DESC, TransactionID DESC;

The execution plan properties now show that the index is scanned backward:

image

Aside from that, the plan is the same – it was produced at the trivial plan optimization stage, and still has the same estimated cost.

Rewriting the query

There is nothing wrong with the previous query or execution plan, but we might have chosen to express the query differently:

SELECT ProductID, TransactionID
FROM dbo.TH
WHERE ProductID = 520
OR ProductID = 723
OR ProductID = 457
OR ProductID = 800
OR ProductID = 943
OR ProductID = 360;

Clearly this form specifies exactly the same results as the original, and indeed the new query produces the same execution plan (trivial plan, multiple seek in one, same estimated cost). The OR form does perhaps make it slightly clearer that the result is a combination of the results for the six individual product IDs, which might lead us to try another variation that makes this idea even more explicit:

SELECT ProductID, TransactionID FROM dbo.TH WHERE ProductID = 520 
UNION ALL
SELECT ProductID, TransactionID FROM dbo.TH WHERE ProductID = 723 
UNION ALL
SELECT ProductID, TransactionID FROM dbo.TH WHERE ProductID = 457 
UNION ALL
SELECT ProductID, TransactionID FROM dbo.TH WHERE ProductID = 800 
UNION ALL
SELECT ProductID, TransactionID FROM dbo.TH WHERE ProductID = 943 
UNION ALL
SELECT ProductID, TransactionID FROM dbo.TH WHERE ProductID = 360;

The execution plan for the UNION ALL query is quite different:

image

Aside from the obvious visual differences, this plan required cost-based (FULL) optimization (it did not qualify for a trivial plan), and the estimated cost is (relatively speaking) quite a bit higher, around 0.02 units versus around 0.005 units before.

This goes back to my opening remarks: the query optimizer does not know about every logical equivalence, and cannot always recognize alternative queries as specifying the same results. The point I am making at this stage is that expressing this particular query using UNION ALL rather than IN resulted in a less optimal execution plan.

Second example

This example chooses a different set of six product IDs and requests results in transaction ID order:

SELECT ProductID, TransactionID
FROM dbo.TH
WHERE ProductID IN (870, 873, 921, 712, 707, 711)
ORDER BY TransactionID;

Our nonclustered index cannot provide rows in the requested order, so the query optimizer has a choice to make between seeking on the nonclustered index and sorting, or scanning the clustered index (which is keyed on transaction ID alone) and applying the product ID predicates as a residual. The product IDs listed happen to have a lower selectivity than the previous set, so the optimizer chooses a clustered index scan in this case:

image

Because there is a cost-based choice to make, this execution plan did not qualify for a trivial plan. The estimated cost of the final plan is about 0.714 units. Scanning the clustered index requires 797 logical reads at execution time.

Perhaps being surprised that the query did not use the product index, we might try forcing a seek of the nonclustered index using an index hint, or by specifying FORCESEEK:

SELECT ProductID, TransactionID
FROM dbo.TH WITH (FORCESEEK)
WHERE ProductID IN (870, 873, 921, 712, 707, 711)
ORDER BY TransactionID;

image

This results in an explicit sort by transaction ID. The new sort is estimated to make up 96% of the new plan's 1.15 unit cost. This higher estimated cost explains why the optimizer chose the apparently-cheaper clustered index scan when left to its own devices. The I/O cost of the new query is lower though: when executed, the index seek consumes only 49 logical reads (down from 797).

We might also have chosen to express this query using (the previously unsuccessful) UNION ALL idea:

SELECT ProductID, TransactionID FROM dbo.TH WHERE ProductID = 870 
UNION ALL
SELECT ProductID, TransactionID FROM dbo.TH WHERE ProductID = 873 
UNION ALL
SELECT ProductID, TransactionID FROM dbo.TH WHERE ProductID = 921 
UNION ALL
SELECT ProductID, TransactionID FROM dbo.TH WHERE ProductID = 712 
UNION ALL
SELECT ProductID, TransactionID FROM dbo.TH WHERE ProductID = 707 
UNION ALL
SELECT ProductID, TransactionID FROM dbo.TH WHERE ProductID = 711
ORDER BY TransactionID;

The produces the following execution plan (click the image to enlarge in a new window):

image

This plan may seem more complex, but it has an estimated cost of only 0.099 units, which is much lower than the clustered index scan (0.714 units) or seek plus sort (1.15 units). In addition, the new plan consumes only 49 logical reads at execution time – the same as the seek + sort plan, and much lower than the 797 needed for the clustered index scan.

This time, expressing the query using UNION ALL produced a much better plan, both in terms of estimated cost and logical reads. The source data set is a bit too small to make a truly meaningful comparison between query durations or CPU usage, but the clustered index scan takes twice as long (26ms) as the other two on my system.

The extra sort in the hinted plan is probably harmless in this simple example because it is unlikely to spill to disk, but many people will prefer the UNION ALL plan anyway because it is non-blocking, avoids a memory grant, and does not require a query hint.

Conclusion

We have seen that query syntax can affect the execution plan chosen by the optimizer, even though the queries logically specify exactly the same result set. The same rewrite (e.g. UNION ALL) will sometimes result in an improvement, and sometimes cause a worse plan to be selected.

Rewriting queries and trying alternate syntax is a valid tuning technique, but some care is needed. One risk is that future changes to the product might cause the different query form to suddenly stop producing the better plan, but one could argue that is always a risk, and mitigated by pre-upgrade testing or the use of plan guides.

There is also a risk of getting carried away with this technique:  using 'weird' or 'unusual' query constructions to obtain a better-performing plan is often a sign that a line has been crossed. Exactly where the distinction lies between valid alternate syntax and 'unusual/weird' is probably quite subjective; my own personal guide is to work with equivalent relational query forms, and to keep things as simple as possible.