The Halloween Problem - Part 4 - SQLPerformance.com

[ Part 1 | Part 2 | Part 3 | Part 4 ]

The Halloween Problem can have a number of important effects on execution plans. In this final part of the series, we look at the tricks the optimizer can employ to avoid the Halloween Problem when compiling plans for queries that add, change or delete data.

Background

Over the years, a number of approaches have been tried to avoid the Halloween Problem. One early technique was to simply avoid building any execution plans that involved reading from and writing to keys of the same index. This was not very successful from a performance point of view, not least because it often meant scanning the base table instead of using a selective nonclustered index to locate the rows to change.

A second approach was to completely separate the reading and writing phases of an update query, by first locating all rows that qualify for the change, storing them somewhere, and only then starting to perform the changes. In SQL Server, this full phase separation is achieved by placing the now-familiar Eager Table Spool on the input side of the update operator:

The spool reads all rows from its input and stores them in a hidden tempdb work table. The pages of this work table may remain in memory, or they might require physical disk space if the set of rows is large, or if the server is under memory pressure.

Full phase separation can be less than ideal because we generally want to run as much of the plan as possible as a pipeline, where each row is fully processed before moving on to the next. Pipelining has many advantages including avoiding the need for temporary storage, and only touching each row once.

The SQL Server Optimizer

SQL Server goes much further than the two techniques described so far, though it does of course include both as options. The SQL Server query optimizer detects queries that require Halloween Protection, determines how much protection is required, and uses cost-based analysis to find the cheapest method of providing that protection.

The easiest way to understand this aspect of the Halloween Problem is to look at some examples. In the following sections, the task is to add a range of numbers to an existing table – but only numbers that do not already exist:

CREATE TABLE dbo.Test
(
    pk      integer NOT NULL,

    CONSTRAINT PK_Test
        PRIMARY KEY CLUSTERED (pk)
);

5 rows

The first example processes a range of numbers from 1 to 5 inclusive:

INSERT dbo.Test (pk)
SELECT Num.n 
FROM dbo.Numbers AS Num
WHERE
    Num.n BETWEEN 1 AND 5
    AND NOT EXISTS 
    (
        SELECT NULL
        FROM dbo.Test AS t 
        WHERE t.pk = Num.n
    );

Since this query reads from and writes to the keys of the same index on the Test table, the execution plan requires Halloween Protection. In this case, the optimizer uses full phase separation using an Eager Table Spool:

50 rows

With five rows now in the Test table, we run the same query again, changing the WHERE clause to process the numbers from 1 to 50 inclusive:

This plan provides correct protection against the Halloween Problem, but it does not feature an Eager Table Spool. The optimizer recognizes that the Hash Match join operator is blocking on its build input; all rows are read into a hash table before the operator starts the matching process using rows from the probe input. As a consequence, this plan naturally provides phase separation (for the Test table only) without the need for a spool.

The optimizer chose a Hash Match join plan over the Nested Loops join seen in the 5-row plan for cost-based reasons. The 50-row Hash Match plan has a total estimated cost of 0.0347345 units. We can force the Nested Loops plan used previously with a hint to see why the optimizer did not choose nested loops:

This plan has an estimated cost of 0.0379063 units including the spool, a bit more than the Hash Match plan.

500 Rows

With 50 rows now in the Test table, we further increase the range of numbers to 500:

This time, the optimizer chooses a Merge Join, and again there is no Eager Table Spool. The Sort operator provides the necessary phase separation in this plan. It fully consumes its input before returning the first row (the sort cannot know which row sorts first until all rows have been seen). The optimizer decided that sorting 50 rows from the Test table would be cheaper than eager-spooling 450 rows just before the update operator.

The Sort plus Merge Join plan has an estimated cost of 0.0362708 units. The Hash Match and Nested Loops plan alternatives come out at 0.0385677 units and 0.112433 units respectively.

Something odd about the Sort

If you have been running these examples for yourself, you might have noticed something odd about that last example, particularly if you looked at the Plan Explorer tool tips for the Test table Seek and the Sort:

The Seek produces an ordered stream of pk values, so what is the point of sorting on the same column immediately afterward? To answer that (very reasonable) question, we start by looking at just the SELECT portion of the INSERT query:

SELECT Num.n 
FROM dbo.Numbers AS Num
WHERE
    Num.n BETWEEN 1 AND 500
    AND NOT EXISTS 
    (
        SELECT 1
        FROM dbo.Test AS t 
        WHERE t.pk = Num.n
    )
ORDER BY
    Num.n;

This query produces the execution plan below (with or without the ORDER BY I added to address certain technical objections you might have):

Notice the lack of a Sort operator. So why did the INSERT plan include a Sort? Simply to avoid the Halloween Problem. The optimizer considered that performing a redundant sort (with its built-in phase separation) was the cheapest way to execute the query and guarantee correct results. Clever.

Halloween Protection Levels and Properties

The SQL Server optimizer has specific features that allow it to reason about the level of Halloween Protection (HP) required at each point in the query plan, and the detailed effect each operator has. These extra features are incorporated into the same property framework the optimizer uses to keep track of hundreds of other important bits of information during its search activities.

Each operator has a required HP property and a delivered HP property. The required property indicates the level of HP needed at that point in the tree for correct results. The delivered property reflects the HP provided by the current operator and the cumulative HP effects provided by its subtree.

The optimizer contains logic to determine how each physical operator (for example, a Compute Scalar) affects the HP level. By exploring a wide range of plan alternatives and rejecting plans where the delivered HP is less than the required HP at the update operator, the optimizer has a flexible way to find correct, efficient plans that do not always require an Eager Table Spool.

Plan changes for Halloween Protection

We saw the optimizer add a redundant sort for Halloween Protection in the previous Merge Join example. How can we be sure this is more efficient than a simple Eager Table Spool? And how can we know which features of an update plan are only there for Halloween Protection?

Both questions can be answered (in a test environment, naturally) using undocumented trace flag 8692, which forces the optimizer to use an Eager Table Spool for Halloween Protection. Recall that the Merge Join plan with the redundant sort had an estimated cost of 0.0362708 magic optimizer units. We can compare that to the Eager Table Spool alternative by recompiling the query with trace flag 8692 enabled:

INSERT dbo.Test (pk)
SELECT Num.n 
FROM dbo.Numbers AS Num
WHERE
    Num.n BETWEEN 1 AND 500
    AND NOT EXISTS 
    (
        SELECT 1
        FROM dbo.Test AS t 
        WHERE t.pk = Num.n
    )
OPTION (QUERYTRACEON 8692);

The Eager Spool plan has an estimated cost of 0.0378719 units (up from 0.0362708 with the redundant sort). The cost differences shown here are not very significant due to the trivial nature of the task and the small size of the rows. Real-world update queries with complex trees and larger row counts often produce plans that are much more efficient thanks to the SQL Server optimizer’s ability to think deeply about Halloween Protection.

Other non-spool options

Positioning a blocking operator optimally within a plan is not the only strategy open to the optimizer to minimize the cost of providing protection against the Halloween Problem. It can also reason about the range of values being processed, as the following example demonstrates:

CREATE TABLE #Test
(
    pk          integer IDENTITY PRIMARY KEY,
    some_value  integer
);

CREATE INDEX i ON #Test (some_value);

-- Pretend the table has lots of data in it
UPDATE STATISTICS #Test
WITH ROWCOUNT = 123456, PAGECOUNT = 1234;

UPDATE #Test
SET some_value = 10
WHERE some_value = 5;

The execution plan shows no need for Halloween Protection, despite the fact we are reading from and updating the keys of a common index:

The optimizer can see that changing ‘some_value’ from 5 to 10 could never cause an updated row to be seen a second time by the Index Seek (which is only looking for rows where some_value is 5). This reasoning is only possible where literal values are used in the query, or where the query specifies OPTION (RECOMPILE), allowing the optimizer to sniff the values of the parameters for a one-off execution plan.

Even with literal values in the query, the optimizer may be prevented from applying this logic if the database option FORCED PARAMETERIZATION is ON. In that case, the literal values in the query are replaced by parameters, and the optimizer can no longer be sure that Halloween Protection is not required (or will not be required when the plan is reused with different parameter values):

In case you are wondering what happens if FORCED PARAMETERIZATION is enabled and the query specifies OPTION (RECOMPILE), the answer is that the optimizer compiles a plan for the sniffed values, and so can apply the optimization. As always with OPTION (RECOMPILE), the specific-value query plan is not cached for reuse.

Top

This last example shows how the Top operator can remove the need for Halloween Protection:

UPDATE TOP (1) t
SET some_value += 1
FROM #Test AS t
WHERE some_value <= 10;

No protection is required because we are only updating one row. The updated value cannot be encountered by the Index Seek, because the processing pipeline stops as soon as the first row is updated. Again, this optimization can only be applied if a constant literal value is used in the TOP, or if a variable returning the value ‘1’ is sniffed using OPTION (RECOMPILE).

If we change the TOP (1) in the query to a TOP (2), the optimizer chooses a Clustered Index Scan instead of the Index Seek:

We are not updating the keys of the clustered index, so this plan does not require Halloween Protection. Forcing the use of the nonclustered index with a hint in the TOP (2) query makes the cost of the protection apparent:

The optimizer estimated the Clustered Index Scan would be cheaper than this plan (with its extra Halloween Protection).

Odds and Ends

There are a couple of other points I want to make about Halloween Protection that have not found a natural place in the series before now. The first is the question of Halloween Protection when a row-versioning isolation level is in use.

Row Versioning

SQL Server provides two isolation levels, READ COMMITTED SNAPSHOT and SNAPSHOT ISOLATION that use a version store in tempdb to provide a statement- or transaction-level consistent view of the database. SQL Server could avoid Halloween Protection completely under these isolation levels, since the version store can provide data unaffected by any changes the currently executing statement might have made so far. This idea is currently not implemented in a released version of SQL Server, though Microsoft has filed a patent describing how this would work, so perhaps a future version will incorporate this technology.

Heaps and Forwarded Records

If you are familiar with the internals of heap structures, you might be wondering if a particular Halloween Problem might occur when forwarded records are generated in a heap table. In case this is new to you, a heap record will be forwarded if an existing row is updated such that it no longer fits on the original data page. The engine leaves behind a forwarding stub, and moves the expanded record to another page.

A problem could occur if a plan containing a heap scan updates a record such that it is forwarded. The heap scan might encounter the row again when the scan position reaches the page with the forwarded record. In SQL Server, this issue is avoided because the Storage Engine guarantees to always follow forwarding pointers immediately. If the scan encounters a record that has been forwarded, it ignores it. With this safeguard in place, the query optimizer does not have to worry about this scenario.

SCHEMABINDING and T-SQL Scalar Functions

There are very few occasions when using a T-SQL scalar function is a good idea, but if you must use one you should be aware of an important effect it can have regarding Halloween Protection. Unless a scalar function is declared with the SCHEMABINDING option, SQL Server assumes the function accesses tables. To illustrate, consider the simple T-SQL scalar function below:

CREATE FUNCTION dbo.ReturnInput
(
    @value integer
)
RETURNS integer
AS
BEGIN
	RETURN @value;
END;

This function does not access any tables; in fact it does nothing except return the parameter value passed to it. Now look at the following INSERT query:

DECLARE @T AS TABLE (ProductID integer PRIMARY KEY);

INSERT @T (ProductID)
SELECT p.ProductID
FROM AdventureWorks2012.Production.Product AS p;

The execution plan is exactly as we would expect, with no Halloween Protection needed:

Adding our do-nothing function has a dramatic effect, however:

DECLARE @T AS TABLE (ProductID integer PRIMARY KEY);

INSERT @T (ProductID)
SELECT dbo.ReturnInput(p.ProductID)
FROM AdventureWorks2012.Production.Product AS p;

The execution plan now includes an Eager Table Spool for Halloween Protection. SQL Server assumes the function accesses data, which might include reading from the Product table again. As you may recall, an INSERT plan that contains a reference to the target table on the reading side of the plan requires full Halloween Protection, and as far as the optimizer knows, that might be the case here.

Adding the SCHEMABINDING option to the function definition means SQL Server examines the body of the function to determine which tables it accesses. It finds no such access, and so does not add any Halloween Protection:

ALTER FUNCTION dbo.ReturnInput
(
    @value integer
)
RETURNS integer
WITH SCHEMABINDING
AS
BEGIN
	RETURN @value;
END;
GO
DECLARE @T AS TABLE (ProductID int PRIMARY KEY);

INSERT @T (ProductID)
SELECT p.ProductID
FROM AdventureWorks2012.Production.Product AS p;

This issue with T-SQL scalar functions affects all update queries – INSERT, UPDATE, DELETE, and MERGE. Knowing when you are hitting this problem is made more difficult because unnecessary Halloween Protection will not always show up as an extra Eager Table Spool, and scalar function calls may be hidden in views or computed column definitions, for example.

[ Part 1 | Part 2 | Part 3 | Part 4 ]

15 thoughts on “The Halloween Problem – Part 4”

tobi says:

February 21, 2013 at 8:25 PM

I would have missed the edge case with a scalar UDF querying tables! What an insidious edge case.
Fabiano Amorim says:

February 23, 2013 at 8:47 PM

Brilliant articles as always,
I wish it would have some trick or a TF to avoid a boring spool to protect from HP in a remote query… I mean:
I’m trying to insert some data from a remote query into a table and it insists to protect it from a halloween problem.

For instance:
INSERT INTO SomeOtherTable(Col1, Col2)
SELECT Col1, Col2
FROM OPENQUERY (sql2012, 'SELECT Col1, Col2 FROM Database.dbo.SomeTable')

I don’t want to pay for the expensive spool …

I realized that if I use SELECT INTO it don’t uses the spool … but this is not what I want… I can’t use SELECT INTO in my customer query.

It is a shame there isn't anything we can do to avoid it… :-(, are you aware of how to do it?

Regards
Paul White says:

February 24, 2013 at 1:31 AM

Hi Fabiano,

I'm not aware of any magic that would avoid a spool or other phase-separation in that case. Using INSERT…EXEC with a procedure looks like it avoids the issue, but the results are stored in a hidden tempdb worktable (parameter table scan), which is just as bad as the spool. The only times I have encountered this issue myself (when the size of the task meant the spool was important), I was able to use SELECT INTO followed by a partition SWITCH, or use SQLCLR and the SqlBulkCopy interface to replace the remote query.

Paul
Rajesh Kumaran says:

March 2, 2013 at 7:25 PM

Hi Paul ,
As always excellent article.

By the way , is there any way to avoid the table spool when referencing the same table for a delete? . In this case UserCookieID is the PK
DELETE a
FROM UserCookies a
WHERE UserCookieID IN (
SELECT TOP (50000) UserCookieID
FROM dbo.UserCookies WITH (NOLOCK)
WHERE DateModified < DateAdd(dd, – 80, getdate())
)

Paul White says:

March 3, 2013 at 12:11 AM

Hi Rajesh,

Yes, because there is no need for a self-reference here:

DECLARE @Cookies AS TABLE
(
    CookieID        integer PRIMARY KEY,
    DateModified    datetime2 NOT NULL UNIQUE
);

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

DELETE TOP (5000) c
FROM @Cookies AS c
WHERE c.DateModified < DATEADD(DAY, -80, CURRENT_TIMESTAMP);

Paul

Rajesh Kumaran says:

March 4, 2013 at 12:35 PM

Ahh.. good catch Paul :)
Thanks very much

Thanks
Rajesh
Itzik Ben-Gan says:

September 5, 2013 at 3:45 PM

Hi Paul,

Regarding the Hash Match and Sort operators providing Halloween protection without the need for an Eager Spool. That's clear that phase separation is achieved with serial execution since the operators block until the input is consumed. I was wondering, do we get the same guarantee with parallel plans? Or perhaps a Gather Streams exchange is responsible for that. Finally, do you know of a way to see in the plan whether a certain operator is blocking or not, or is this just something you need to know about it?

Thanks!
Itzik
Paul White says:

September 6, 2013 at 10:27 AM

Hi Itzik,

Parallel fully blocking (stop-and-go) iterators provide the same level of HP as they do in a serial plan.

Fully blocking iterators consume their entire input set in the Open() method call. The Gather Streams' Open() call completes Open() calls on all subtasks attached to worker threads at the exchange. All instances of a blocking iterator in a parallel zone are guaranteed to read their entire input (at Open) before any GetRow() calls occur. Therefore, blocking iterators in parallel branches provide complete phase separation, just as they do in serial plans.

As far as identifying blocking iterators is concerned, no, this information is not exposed in show plan (it would be nice though). There aren't that many fully blocking iterators; they include Sort, Eager Hash Distinct (not Flow Distinct), the various Eager Spools, UDX, and Scalar (no group by) Aggregate. Hash join is only blocking on its build input.
Itzik Ben-Gan says:

September 6, 2013 at 12:54 PM

Thanks, Paul. That's good to know.
Vladimir Moldovanenko says:

January 8, 2014 at 4:17 PM

Hi Paul,
Wonderful article!
I am too wondering and learning about spool operator.
I am attempting recursive WHILE loop in multi-statement TVF. I need temp table data to process it further.
I noticed spool operator and its cost.
I posted a question on T-SQL forum and got some explanations.
http://social.msdn.microsoft.com/Forums/sqlserver/en-US/f043e7d9-09ca-47e9-b34d-3b713f94ee63/how-to-remove-table-spooleager-spool-from-query-plan-of-a-function?forum=transactsql

However, I do believe that there cannot be Halloween issue in my code. Yet, optimizer thinks so.
Or am I wrong here?

Any explanation(s) as to why spool is used here? I want to understand it.

Thank you so much.
Paul White says:

January 9, 2014 at 2:11 AM

Hi Vladimir,

I see you got an answer to your question by reading the other parts in this series and Itzik's article (http://sqlmag.com/t-sql/divide-and-conquer-halloween). I up-voted Tom Cooper's answers to you on that thread as they were particularly helpful. Many of the other suggestions and comments were not correct, unfortunately.

Paul
Martin says:

March 4, 2014 at 6:23 PM

The 2012 internals book (p. 674) also mentions that a special form of compute scalar can sometimes be used to provide this protection but unfortunately doesn't provide any more details of this.
Paul White says:

March 4, 2014 at 6:38 PM

Hi Martin,

You're right, it doesn't provide any more details :) Sorry to disappoint (as I know this will), but the 'specialness' referred to is very limited. In general, a compute scalar has a choice between using a pointer to a value or making a copy. For single-row update plans, whether a copy is made or a reference used can have implications for Halloween Protection. In very specific cases, choosing a copying compute scalar can avoid the need for another HP implementation, e.g. an eager spool. The type of compute scalar is not exposed in query plans.

Paul
Martin says:

March 4, 2014 at 7:42 PM

Thanks for the explanation. I hadn't heard of that one before.
Alex Friedman says:

January 31, 2017 at 1:26 PM

Re-reading this posts series again. Great stuff! Thank you so much.

Comments are closed.