Itzik Ben-Gan

Are You Sorted? Tips Concerning T-SQL Window Ordering

SentryOne eBooks

In these books, you will find useful, hand-picked articles that will help give insight into some of your most vexing performance problems. These articles were written by several of the SQL Server industry’s leading experts, including Paul White, Paul Randal, Jonathan Kehayias, Erin Stellato, Glenn Berry, Aaron Bertrand, and Joe Sack.

Free Download

Featured Author

Paul Randal, CEO of SQLskills, writes about knee-jerk performance tuning, DBCC, and SQL Server internals.

Paul’s Posts

A supporting index can potentially help avoid the need for explicit sorting in the query plan when optimizing T-SQL queries involving window functions. By a supporting index, I mean one with the window partitioning and ordering elements as the index key, and the rest of the columns that appear in the query as the index included columns. I often refer to such an indexing pattern as a POC index as an acronym for partitioning, ordering, and covering. Naturally, if a partitioning or ordering element doesn’t appear in the window function, you omit that part from the index definition.

But what about queries involving multiple window functions with different ordering needs? Similarly, what if other elements in the query besides window functions also require arranging input data as ordered in the plan, such as a presentation ORDER BY clause? These can result in different parts of the plan needing to process the input data in different orders.

In such circumstances, you’ll typically accept explicit sorting is unavoidable in the plan. You may find the syntactical arrangement of expressions in the query can affect how many explicit sort operators you get in the plan. By following some basic tips you can sometimes reduce the number of explicit sort operators, which can, of course, have a major impact on the performance of the query.

Environment for Demos

In my examples, I’ll use the sample database PerformanceV5. You can download the source code to create and populate this database here.

I ran all the examples on SQL Server® 2019 Developer, where batch-mode on rowstore is available.

In this article, I want to focus on tips having to do with the potential of the window function’s calculation in the plan to rely on ordered input data without requiring an extra explicit sort activity in the plan. This is relevant when the optimizer uses a serial or parallel row-mode treatment of window functions, and when using a serial batch-mode Window Aggregate operator.

SQL Server doesn't currently support an efficient combination of a parallel order-preserving input prior to a parallel batch-mode Window Aggregate operator. So, to use a parallel batch-mode Window Aggregate operator, the optimizer has to inject an intermediary parallel batch-mode Sort operator, even when the input is already preordered.

For simplicity’s sake, you can prevent parallelism in all examples shown in this article. To achieve this without needing to add a hint to all queries, and without setting a server-wide configuration option, you can set the database scoped configuration option MAXDOP to 1, like so:

USE PerformanceV5;

ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 1;

Remember to set it back to 0 after you’re done testing the examples in this article. I’ll remind you at the end.

Alternatively, you can prevent parallelism at the session level with the undocumented DBCC OPTIMIZER_WHATIF command, like so:

DBCC OPTIMIZER_WHATIF(CPUs, 1);

To reset the option when you’re done, invoke it again with the value 0 as the number of CPUs.

When you’re done trying all of the examples in this article with parallelism disabled, I recommend enabling parallelism and trying all examples again to see what changes.

Tips 1 and 2

Before I start with the tips, let’s first look at a simple example with a window function designed to benefit from a supporting index.

Consider the following query, which I’ll refer to as Query 1:

SELECT orderid, orderdate, custid,

  SUM(orderid) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS UNBOUNDED PRECEDING) AS sum1
  
FROM dbo.Orders;

Don’t worry about the fact that the example is contrived. There’s no good business reason to compute a running total of order IDs—this table is decently sized with 1MM rows, and I wanted to show a simple example with a common window function such as one that applies a running total computation.

Following the POC indexing scheme, you create the following index to support the query:

CREATE UNIQUE NONCLUSTERED INDEX idx_nc_cid_od_oid ON dbo.Orders(custid, orderdate, orderid);

The plan for this query is shown in Figure 1.

Figure 1: Plan for Query 1

No surprises here. The plan applies an index order scan of the index you just created, providing the data ordered to the Window Aggregate operator, without the need for explicit sorting.

Next, consider the following query, which involves multiple window functions with different ordering needs, as well as a presentation ORDER BY clause:

SELECT orderid, orderdate, custid,

  SUM(orderid) OVER(PARTITION BY custid ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS sum2,

  SUM(orderid) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS UNBOUNDED PRECEDING) AS sum1,

  SUM(1.0 * orderid) OVER(ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS sum3

FROM dbo.Orders
ORDER BY custid, orderid;

I’ll refer to this query as Query 2. The plan for this query is shown in Figure 2.

Figure 2: Plan for Query 2

Notice there are four Sort operators in the plan.

If you analyze the various window functions and presentation ordering needs, you’ll find there are three distinct ordering needs:

  • custid, orderdate, orderid
  • orderid
  • custid, orderid

Given one of them (the first in the list above) can be supported by the index you created earlier, you would expect to see only two sorts in the plan. So, why does the plan have four sorts? It looks like SQL Server doesn’t try to be too sophisticated with rearranging the processing order of the functions in the plan to minimize sorts. It processes the functions in the plan in the order they appear in the query. That’s at least the case for the first occurrence of each distinct ordering need, but I’ll elaborate on this shortly.

You can remove the need for some of the sorts in the plan by applying the following two simple practices:

Tip 1: If you have an index to support some of the window functions in the query, specify those first.

Tip 2: If the query involves window functions with the same ordering need as the presentation ordering in the query, specify those functions last.

Following these tips, you rearrange the appearance order of the window functions in the query like so:

SELECT orderid, orderdate, custid,

  SUM(orderid) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS UNBOUNDED PRECEDING) AS sum1,

  SUM(1.0 * orderid) OVER(ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS sum3,

  SUM(orderid) OVER(PARTITION BY custid ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS sum2

FROM dbo.Orders
ORDER BY custid, orderid;

I’ll refer to this query as Query 3. The plan for this query is shown in Figure 3.

Figure 3: Plan for Query 3

As you can see, the plan now has only two sorts.

Tip 3

SQL Server doesn’t try to be too sophisticated in rearranging the processing order of window functions in an attempt to minimize sorts in the plan. However, it’s capable of a certain simple rearrangement. It scans the window functions based on appearance order in the query and each time it detects a new distinct ordering need, it looks ahead for additional window functions with the same ordering need and if it finds those, it groups them together with the first occurrence. In some cases, it can even use the same operator to compute multiple window functions.

Consider the following query as an example:

SELECT orderid, orderdate, custid,

  SUM(orderid) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS UNBOUNDED PRECEDING) AS sum1,

  SUM(1.0 * orderid) OVER(ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS sum3,

  SUM(orderid) OVER(PARTITION BY custid ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS sum2,

  MAX(orderid) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS UNBOUNDED PRECEDING) AS max1,

  MAX(orderid) OVER(ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS max3,

  MAX(orderid) OVER(PARTITION BY custid ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS max2,

  AVG(orderid) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS UNBOUNDED PRECEDING) AS avg1,

  AVG(1.0 * orderid) OVER(ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS avg3,

  AVG(orderid) OVER(PARTITION BY custid ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS avg2

FROM dbo.Orders
ORDER BY custid, orderid;

I’ll refer to this query as Query 4. The plan for this query is shown in Figure 4.

Figure 4: Plan for Query 4

Window functions with the same ordering needs aren’t grouped together in the query. However, there are still only two sorts in the plan. This is because what counts in terms of processing order in the plan is the first occurrence of each distinct ordering need. This leads me to the third tip.

Tip 3: Make sure to follow tips 1 and 2 for the first occurrence of each distinct ordering need. Subsequent occurrences of the same ordering need, even if nonadjacent, are identified and grouped together with the first.

Tips 4 and 5

Suppose you want to return columns resulting from windowed calculations in a certain left-to-right order in the output. But what if the order isn’t the same as the order that will minimize sorts in the plan?

For example, suppose you want the same result as the one produced by Query 2 in terms of left-to-right column order in the output (column order: other cols, sum2, sum1, sum3), but you’d rather have the same plan like the one you got for Query 3 (column order: other cols, sum1, sum3, sum2), which got two sorts instead of four.

That’s perfectly doable if you’re familiar with the fourth tip.

Tip 4: The aforementioned recommendations apply to appearance order of window functions in the code, even if within a named table expression such as a CTE or view, and even if the outer query returns the columns in a different order than in the named table expression. Therefore, if you need to return columns in a certain order in the output, and it’s different from the optimal order in terms of minimizing sorts in the plan, follow the tips in terms of appearance order within a named table expression, and return the columns in the outer query in the desired output order.

The following query, which I’ll refer to as Query 5, illustrates this technique:

WITH C AS
(
  SELECT orderid, orderdate, custid,

    SUM(orderid) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS UNBOUNDED PRECEDING) AS sum1,

    SUM(1.0 * orderid) OVER(ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS sum3,

    SUM(orderid) OVER(PARTITION BY custid ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS sum2

  FROM dbo.Orders  
)
SELECT orderid, orderdate, custid,
  sum2, sum1, sum3
FROM C
ORDER BY custid, orderid;

The plan for this query is shown in Figure 5.

Figure 5: Plan for Query 5

You still get only two sorts in the plan despite the fact that the column order in the output is: other cols, sum2, sum1, sum3, like in Query 2.

One caveat to this trick with the named table expression is if your columns in the table expression aren’t referenced by the outer query, they are excluded from the plan and therefore don’t count.

Consider the following query, which I’ll refer to as Query 6:

WITH C AS
(
  SELECT orderid, orderdate, custid,

    MAX(orderid) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS UNBOUNDED PRECEDING) AS max1,

    MAX(orderid) OVER(ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS max3,

    MAX(orderid) OVER(PARTITION BY custid ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS max2,

    AVG(orderid) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS UNBOUNDED PRECEDING) AS avg1,

    AVG(1.0 * orderid) OVER(ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS avg3,

    AVG(orderid) OVER(PARTITION BY custid ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS avg2,

    SUM(orderid) OVER(PARTITION BY custid ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS sum2,

    SUM(orderid) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS UNBOUNDED PRECEDING) AS sum1,

    SUM(1.0 * orderid) OVER(ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS sum3

  FROM dbo.Orders  
)
SELECT orderid, orderdate, custid,
  sum2, sum1, sum3,
  max2, max1, max3,
  avg2, avg1, avg3
FROM C
ORDER BY custid, orderid;

Here all table expression columns are referenced by the outer query, so optimization happens based on the first distinct occurrence of each ordering need within the table expression:

  • max1: custid, orderdate, orderid
  • max3: orderid
  • max2: custid, orderid

This results in a plan with only two sorts as shown in Figure 6.

Figure 6: Plan for Query 6

Now change only the outer query by removing the references to max2, max1, max3, avg2, avg1 and avg3, like so:

WITH C AS
(
  SELECT orderid, orderdate, custid,

    MAX(orderid) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS UNBOUNDED PRECEDING) AS max1,

    MAX(orderid) OVER(ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS max3,

    MAX(orderid) OVER(PARTITION BY custid ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS max2,

    AVG(orderid) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS UNBOUNDED PRECEDING) AS avg1,

    AVG(1.0 * orderid) OVER(ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS avg3,

    AVG(orderid) OVER(PARTITION BY custid ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS avg2,

    SUM(orderid) OVER(PARTITION BY custid ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS sum2,

    SUM(orderid) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS UNBOUNDED PRECEDING) AS sum1,

    SUM(1.0 * orderid) OVER(ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS sum3

  FROM dbo.Orders  
)
SELECT orderid, orderdate, custid,
  sum2, sum1, sum3
FROM C
ORDER BY custid, orderid;

I’ll refer to this query as Query 7. The computations of max1, max3, max2, avg1, avg3, and avg2 in the table expression are irrelevant to the outer query so they are excluded. The remaining computations involving window functions in the table expression, which are relevant to the outer query, are those of sum2, sum1, and sum3. Unfortunately, they do not appear in the table expression in optimal order in terms of minimizing sorts. As you can see in the plan for this query as shown in Figure 7, there are four sorts.

Figure 7: Plan for Query 7

If you’re thinking it’s unlikely you will have columns in the inner query you won’t refer to in the outer query, think views. Each time you query a view, you might be interested in a different subset of the columns. With this in mind, the fifth tip could help in reducing sorts in the plan.

Tip 5: In the inner query of a named table expression like a CTE or view, group all window functions with the same ordering needs together, and follow tips 1 and 2 in the order of the groups of functions.

The following code implements a view based on this recommendation:

CREATE OR ALTER VIEW dbo.MyView
AS

SELECT orderid, orderdate, custid,

  MAX(orderid) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS UNBOUNDED PRECEDING) AS max1,

  SUM(orderid) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS UNBOUNDED PRECEDING) AS sum1,

  AVG(orderid) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS UNBOUNDED PRECEDING) AS avg1,

  MAX(orderid) OVER(ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS max3,

  SUM(1.0 * orderid) OVER(ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS sum3,

  AVG(1.0 * orderid) OVER(ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS avg3,

  MAX(orderid) OVER(PARTITION BY custid ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS max2,

  AVG(orderid) OVER(PARTITION BY custid ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS avg2,

  SUM(orderid) OVER(PARTITION BY custid ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS sum2

FROM dbo.Orders;
GO

Now query the view requesting only the windowed result columns sum2, sum1, and sum3, in this order:

SELECT orderid, orderdate, custid,
  sum2, sum1, sum3
FROM dbo.MyView
ORDER BY custid, orderid;

I’ll refer to this query as Query 8. You get the plan shown in Figure 8 with only two sorts.

Figure 8: Plan for Query 8

Tip 6

When you have a query with multiple window functions with multiple distinct ordering needs, the common wisdom is you can support only one of them with preordered data via an index. This is the case even when all window functions have respective supporting indexes.

Let me demonstrate this. Recall earlier when you created the index idx_nc_cid_od_oid, which can support window functions needing the data ordered by custid, orderdate, orderid, such as the following expression:

SUM(orderid) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS UNBOUNDED PRECEDING)

Suppose, in addition to this window function, you also need the following window function in the same query:

SUM(orderid) OVER(PARTITION BY custid ORDER BY orderid ROWS UNBOUNDED PRECEDING)

This window function would benefit from the following index:

CREATE UNIQUE NONCLUSTERED INDEX idx_nc_cid_oid ON dbo.Orders(custid, orderid);

The following query, which I’ll refer to as Query 9, invokes both window functions:

SELECT orderid, orderdate, custid,

  SUM(orderid) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS UNBOUNDED PRECEDING) AS sum1,

  SUM(orderid) OVER(PARTITION BY custid ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS sum2

FROM dbo.Orders;

The plan for this query is shown in Figure 9.

Figure 9: Plan for Query 9

I get the following time stats for this query on my machine, with results discarded in SSMS:

CPU time = 3234 ms,  elapsed time = 3354 ms.

As explained earlier, SQL Server scans the windowed expressions in order of appearance in the query and figures it can support the first with an ordered scan of the index idx_nc_cid_od_oid. But then it adds a Sort operator to the plan to order the data like the second window function needs. This means the plan has N log N scaling. It doesn’t consider using the index idx_nc_cid_oid to support the second window function. You’re probably thinking it can’t, but try to think a bit outside of the box. Could you not compute each of the window functions based on its respective index order and then join the results? Theoretically, you can, and depending on the size of the data, availability of indexing, and other resources available, the join version could sometimes do better. SQL Server doesn’t consider this approach, but you certainly can implement it by writing the join yourself, like so:

WITH C1 AS
(
  SELECT orderid, orderdate, custid,

    SUM(orderid) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS UNBOUNDED PRECEDING) AS sum1

  FROM dbo.Orders
),
C2 AS
(
  SELECT orderid, custid,

    SUM(orderid) OVER(PARTITION BY custid ORDER BY orderid ROWS UNBOUNDED PRECEDING) AS sum2

  FROM dbo.Orders
)
SELECT C1.orderid, C1.orderdate, C1.custid, C1.sum1, C2.sum2
FROM C1
  INNER JOIN C2
    ON C1.orderid = C2.orderid;

I’ll refer to this query as Query 10. The plan for this query is shown in Figure 10.

Figure 10: Plan for Query 10

The plan uses ordered scans of the two indexes with no explicit sorting whatsoever, computes the window functions, and uses a hash join to join the results. This plan scales linearly compared to the previous one which has N log N scaling.

I get the following time stats for this query on my machine (again with results discarded in SSMS):

CPU time = 1000 ms,  elapsed time = 1100 ms.

To recap, here’s our sixth tip.

Tip 6: When you have multiple window functions with multiple distinct ordering needs, and you’re able to support all of them with indexes, try a join version and compare its performance to the query without the join.

Cleanup

If you disabled parallelism by setting the database scoped configuration option MAXDOP to 1, reenable parallelism by setting it to 0:

ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 0;

If you used the undocumented session option DBCC OPTIMIZER_WHATIF with the CPUs option set to 1, reenable parallelism by setting it to 0:

DBCC OPTIMIZER_WHATIF(CPUs, 0);

You can retry all examples with parallelism enabled if you like.

Use the following code to clean up the new indexes you created:

DROP INDEX IF EXISTS idx_nc_cid_od_oid ON dbo.Orders;
DROP INDEX IF EXISTS idx_nc_cid_oid ON dbo.Orders;

And the following code to remove the view:

DROP VIEW IF EXISTS dbo.MyView;

Follow the Tips to Minimize Number of Sorts

Window functions need to process the input data ordered. Indexing can help in eliminating sorting in the plan, but normally only for one distinct ordering need. Queries with multiple ordering needs typically involve some sorts in their plans. However, by following certain tips, you can minimize the number of sorts needed. Here’s a summary of the tips I mentioned in this article:

  • Tip 1: If you have an index to support some of the window functions in the query, specify those first.
  • Tip 2: If the query involves window functions with the same ordering need as the presentation ordering in the query, specify those functions last.
  • Tip 3: Make sure to follow tips 1 and 2 for the first occurrence of each distinct ordering need. Subsequent occurrences of the same ordering need, even if nonadjacent, are identified and grouped together with the first.
  • Tip 4: The aforementioned recommendations apply to appearance order of window functions in the code, even if within a named table expression such as a CTE or view, and even if the outer query returns the columns in a different order than in the named table expression. Therefore, if you need to return columns in a certain order in the output, and it’s different from the optimal order in terms of minimizing sorts in the plan, follow the tips in terms of appearance order within a named table expression, and return the columns in the outer query in the desired output order.
  • Tip 5: In the inner query of a named table expression like a CTE or view, group all window functions with the same ordering needs together, and follow tips 1 and 2 in the order of the groups of functions.
  • Tip 6: When you have multiple window functions with multiple distinct ordering needs, and you’re able to support all of them with indexes, try a join version and compare its performance to the query without the join.