Paul White

The Halloween Problem – Part 3

February 18, 2013 by in SQL Performance, SQL Plan | 4 Comments
SentryOne eBooks

In these books, you will find useful, hand-picked articles that will help give insight into some of your most vexing performance problems. These articles were written by several of the SQL Server industry’s leading experts, including Paul White, Paul Randal, Jonathan Kehayias, Erin Stellato, Glenn Berry, Aaron Bertrand, and Joe Sack.

Free Download

Featured Author

Paul White is an independent SQL Server consultant specializing in performance tuning, execution plans, and the query optimizer.

Paul’s Posts

[ Part 1 | Part 2 | Part 3 | Part 4 ]

The MERGE statement (introduced in SQL Server 2008) allows us to perform a mixture of INSERT, UPDATE, and DELETE operations using a single statement. The Halloween Protection issues for MERGE are mostly a combination of the requirements of the individual operations, but there are some important differences and a couple of interesting optimizations that apply only to MERGE.

Avoiding the Halloween Problem with MERGE

We start by looking again at the Demo and Staging example from part two:

CREATE TABLE dbo.Demo
(
    SomeKey integer NOT NULL,

    CONSTRAINT PK_Demo
        PRIMARY KEY (SomeKey)
);

CREATE TABLE dbo.Staging
(
    SomeKey integer NOT NULL
);

INSERT dbo.Staging
    (SomeKey)
VALUES
    (1234),
    (1234);

CREATE NONCLUSTERED INDEX c 
ON dbo.Staging (SomeKey);

INSERT dbo.Demo
SELECT s.SomeKey
FROM dbo.Staging AS s
WHERE NOT EXISTS
(
    SELECT 1
    FROM dbo.Demo AS d
    WHERE d.SomeKey = s.SomeKey
);

As you may recall, this example was used to show that an INSERT requires Halloween Protection when the insert target table is also referenced in the SELECT part of the query (the EXISTS clause in this case). The correct behaviour for the INSERT statement above is to try to add both 1234 values, and to consequently fail with a PRIMARY KEY violation. Without phase separation, the INSERT would incorrectly add one value, completing without an error being thrown.

The INSERT execution plan

The code above has one difference from that used in part two; a nonclustered index on the Staging table has been added. The INSERT execution plan still requires Halloween Protection though:

image_thumb20

The MERGE execution plan

Now try the same logical insert expressed using MERGE syntax:

MERGE dbo.Demo AS d
USING dbo.Staging AS s ON
    s.SomeKey = d.SomeKey
WHEN NOT MATCHED BY TARGET THEN
    INSERT (SomeKey)
    VALUES (s.SomeKey);

In case you are not familiar with the syntax, the logic there is to compare rows in the Staging and Demo tables on the SomeKey value, and if no matching row is found in the target (Demo) table, we insert a new row. This has exactly the same semantics as the previous INSERT...WHERE NOT EXISTS code, of course. The execution plan is quite different however:

image_thumb24

Notice the lack of an Eager Table Spool in this plan. Despite that, the query still produces the correct error message. It seems SQL Server has found a way to execute the MERGE plan iteratively while respecting the logical phase separation required by the SQL standard.

The hole-filling optimization

In the right circumstances, the SQL Server optimizer can recognize that the MERGE statement is hole-filling, which is just another way of saying that the statement only adds rows where there is an existing gap in the target table’s key.

For this optimization to be applied, the values used in the WHEN NOT MATCHED BY TARGET clause must exactly match the ON part of the USING clause. Also, the target table must have a unique key (a requirement satisfied by the PRIMARY KEY in the present case). Where these requirements are met, the MERGE statement does not require protection from the Halloween Problem.

Of course, the MERGE statement is logically no more or less hole-filling than the original INSERT...WHERE NOT EXISTS syntax. The difference is that the optimizer has complete control over implementing the MERGE statement, whereas the INSERT syntax would require it to reason about the wider semantics of the query. A human can easily see that the INSERT is also hole-filling, but the optimizer does not think about things in the same way we do.

To illustrate the exact matching requirement I mentioned, consider the following query syntax, which does not benefit from the hole-filling optimization. The result is full Halloween Protection provided by an Eager Table Spool:

MERGE dbo.Demo AS d
USING dbo.Staging AS s ON
    s.SomeKey = d.SomeKey
WHEN NOT MATCHED THEN
    INSERT (SomeKey)
    VALUES (s.SomeKey * 1);

image_thumb27

The only difference there is the multiplication by one in the VALUES clause – something which does not change the logic of the query, but which is enough to prevent the hole-filling optimization being applied.

Hole-filling with Nested Loops

In the previous example, the optimizer chose to join the tables using a Merge join. The hole-filling optimization can also be applied where a Nested Loops join is chosen, but this requires an extra uniqueness guarantee on the source table, and an index seek on the inner side of the join. To see this in action, we can clear out the existing staging data, add uniqueness to the nonclustered index, and try the MERGE again:

-- Remove existing duplicate rows
TRUNCATE TABLE dbo.Staging;

-- Convert index to unique
CREATE UNIQUE NONCLUSTERED INDEX c 
ON dbo.Staging (SomeKey)
WITH (DROP_EXISTING = ON);

-- Sample data
INSERT dbo.Staging
    (SomeKey)
VALUES
    (1234),
    (5678);

-- Hole-filling merge
MERGE dbo.Demo AS d
USING dbo.Staging AS s ON
    s.SomeKey = d.SomeKey
WHEN NOT MATCHED THEN
    INSERT (SomeKey)
    VALUES (s.SomeKey);

The resulting execution plan again uses the hole-filling optimization to avoid Halloween Protection, using a nested loops join and an inner-side seek into the target table:

Nested loops merge hole filling

Avoiding unnecessary index traversals

Where the hole-filling optimization applies, the engine may also apply a further optimization. It can remember the current index position while reading the target table (processing one row at a time, remember) and reuse that information when performing the insert, instead of seeking down the b-tree to find the insert location. The reasoning is that the current read position is very likely to be on the same page where the new row should be inserted. Checking that the row does in fact belong on this page is very fast, since it involves checking only the lowest and highest keys currently stored there.

The combination of eliminating the Eager Table Spool and saving an index navigation per row can provide a significant benefit in OLTP workloads, provided the execution plan is retrieved from cache. The compilation cost for MERGE statements is rather higher than for INSERT, UPDATE and DELETE, so plan reuse is an important consideration. It is also helpful to ensure that pages have sufficient free space to accommodate new rows, avoiding page splits. This is typically achieved through normal index maintenance and the assignment of a suitable FILLFACTOR.

I mention OLTP workloads, which typically feature a large number of relatively small changes, because the MERGE optimizations may not be a good choice where a large number of are rows processed per statement. Other optimizations like minimally-logged INSERTs cannot currently be combined with hole-filling. As always, the performance characteristics should be benchmarked to ensure the expected benefits are realized.

The hole-filling optimization for MERGE inserts may be combined with updates and deletes using additional MERGE clauses; each data-changing operation is assessed separately for the Halloween Problem.

Avoiding the join

The final optimization we will look at can be applied where the MERGE statement contains update and delete operations as well as a hole-filling insert, and the target table has a unique clustered index. The following example shows a common MERGE pattern where unmatched rows are inserted, and matching rows are updated or deleted depending on an additional condition:

CREATE TABLE #T
(
    col1 integer NOT NULL,
    col2 integer NOT NULL,

    CONSTRAINT PK_T
        PRIMARY KEY (col1)
);

CREATE TABLE #S
(
    col1 integer NOT NULL,
    col2 integer NOT NULL,

    CONSTRAINT PK_S
        PRIMARY KEY (col1)
);

INSERT #T
    (col1, col2)
VALUES
    (1, 50),
    (3, 90);

INSERT #S
    (col1, col2)
VALUES
    (1, 40),
    (2, 80),
    (3, 90);

The MERGE statement required to make all the required changes is remarkably compact:

 

MERGE #T AS t
USING #S AS s ON t.col1 = s.col1
WHEN NOT MATCHED THEN INSERT VALUES (s.col1, s.col2)
WHEN MATCHED AND t.col2 - s.col2 = 0 THEN DELETE
WHEN MATCHED THEN UPDATE SET t.col2 -= s.col2;

The execution plan is quite surprising:

Insert then Merge plan

No Halloween Protection, no join between the source and target tables, and it’s not often you will see a Clustered Index Insert operator followed by a Clustered Index Merge to the same table. This is another optimization targeted at OLTP workloads with high plan reuse and suitable indexing.

The idea is to read a row from the source table and immediately try to insert it into the target. If a key violation results, the error is suppressed, the Insert operator outputs the conflicting row it found, and that row is then processed for an update or delete operation using the Merge plan operator as normal.

If the original insert succeeds (without a key violation) processing continues with the next row from the source (the Merge operator only processes updates and deletes). This optimization primarily benefits MERGE queries where most source rows result in an insert. Again, careful benchmarking is required to ensure performance is better than using separate statements.

Summary

The MERGE statement provides several unique optimization opportunities. In the right circumstances, it can avoid the need to add explicit Halloween Protection compared with an equivalent INSERT operation, or perhaps even a combination of INSERT, UPDATE, and DELETE statements. Additional MERGE-specific optimizations can avoid the index b-tree traversal that is usually needed to locate the insert position for a new row, and may also avoid the need to join the source and target tables completely.

In the final part of this series, we will look at how the query optimizer reasons about the need for Halloween protection, and identify some more tricks it can employ to avoid the need to add Eager Table Spools to execution plans that change data.

[ Part 1 | Part 2 | Part 3 | Part 4 ]