Aaron Bertrand

If you are using indexed views and MERGE, please read this!

SentryOne Newsletters

The SQLPerformance.com bi-weekly newsletter keeps you up to speed on the most recent blog posts and forum discussions in the SQL Server community.

eNews is a bi-monthly newsletter with fun information about SentryOne, tips to help improve your productivity, and much more.

Subscribe

Featured Author

Paul Randal, CEO of SQLskills, writes about knee-jerk performance tuning, DBCC, and SQL Server internals.

Paul’s Posts

Fellow MVP Jamie Thomson recently pointed out that there is a "wrong results" bug in SQL Server that can manifest itself when the following conditions are true:

  • You have an indexed view that joins at least two tables;
  • those tables are constrained in either direction by a single-column foreign key;
  • you perform updates to the base table(s) using MERGE which includes both UPDATE and (DELETE or INSERT) actions; and,
  • you subsequently issue queries that reference the index on the view (intentionally or not).

Unfortunately, the Knowledge Base article describing the problem (KB #2756471) is pretty light on details. They don't tell you how to reproduce the problem, or even what, specifically, you should be looking for to see if this affects you; and they don't even make mention of MERGE (which is actually the core of the problem, not NOEXPAND, and not a simple update). There are some additional details in the Connect item that brought about the fix; hopefully the KB article will be updated with more details before long.

In the meantime, the result you may see is incorrect data – or better put, stale data: The query may show you the old version of the updated row(s)! I spent a few minutes trying to reproduce this scenario in AdventureWorks, and failed miserably. Thankfully, Paul White (blog | @SQL_Kiwi) wrote an outstanding post describing the scenario and showing a full repro of the issue.

I don't think I can emphasize how serious this is.

Surely millions of customers are using indexed views, many of those have migrated their DML code to use MERGE, and a large number of those are on Enterprise Edition (or aren't but are using the NOEXPAND hint or are referencing the index directly). Paul was quick to point out that NOEXPAND is not required to reproduce the problem in Enterprise Edition, and also discovered many of the other details required to reproduce the bug.

This post is not intended to steal any thunder from Jamie's or Paul's posts; just an attempt to reiterate the concern and to raise awareness of this issue. If you're in the habit of ignoring Cumulative Updates, opting to wait for Service Packs, and there is any chance that this issue could be affecting you right now, you owe it to yourself, not to mention your stakeholders and customers, to take this issue seriously.

So what should you do?

Well, what you do next depends on what version and edition of SQL Server you're running, and whether the bug actually affects you (or could).

    SQL Server 2008 SP3
    SQL Server 2008 R2 SP1/SP2
    SQL Server 2012 RTM/SP1

    Your options if you are on one of these builds:

    1. You should update to the latest Cumulative Update for your branch:
      Branch Fixed in CU Build Minimum build required
      to apply update

      KB Article
      (Download)
      2008 Service Pack 3 CU #8 10.00.5828 10.00.5500 KB #2771833
      2008 R2 Service Pack 1 CU #10 10.50.2868 10.50.2500 KB #2783135
      2008 R2 Service Pack 2 CU #4 10.50.4270 10.00.4000 KB #2777358
      2012 RTM CU #5 11.00.2395 11.00.2100 KB #2777772
      2012 Service Pack 1 CU #2 11.00.3339 11.00.3000 KB #2790947

      Table 1 : Builds that contain the fix

    2. If you don't apply the fix, then you need to test all the references to your views to validate that they return correct results in all cases – including after you've updated the base tables using MERGE. If they don't (or you suspect that they might later be affected), then you should rebuild the clustered index on all affected views (or repair the indexed view(s) using DBCC CHECKTABLE, as Paul has described in his post), and stop using MERGE against these tables until you've applied the fix. If you continue using MERGE against the base tables, prepare to continue repairing the views in order to avoid the problem.
       
    3. A quicker fix would be to prevent the damaged indexed view from being used at all, by using any of the following methods required:
       
      • apply the query hint OPTION (EXPAND VIEWS) to all relevant queries;
      • remove any explicit references to the index on the view;
      • in Standard or other editions where indexed views are not matched automatically, remove all instances of NOEXPAND.
         

      But this, of course, would largely defeat the purpose of the indexed view – may as well just drop the index. That said, it is usually better to get the right results slowly, than to get the wrong results quickly; so maybe that's okay.

    SQL Server 2008 RTM/SP1/SP2
    SQL Server 2008 R2 RTM

    Unfortunately, you're on a build that is no longer in mainstream support, and it is unlikely this issue will be fixed for you (unless you are on extended support and you make a lot of noise). So your options are limited here – either move to a supported branch per the table above, and apply the Cumulative Update, or choose one of the other options mentioned previously.

    SQL Server 2000
    SQL Server 2005

    Well, the bad news is that you're also on a build that is no longer supported. The good news is that in this specific case it doesn't matter – you can't use MERGE anyway, so this bug can't affect you.

Other MERGE issues

Sadly, this is far from the first bug we've seen with MERGE, and it likely won't be the last. Here is a quick selection of a dozen MERGE bugs that are still marked as active on Connect:

Now, it may be the case that some of these bugs have actually been fixed, but their status is wrong because the loop back to Connect has not been closed. Even if that is the case, it can't be the true for all of them (and potentially for others that I did not uncover).

In addition, it has been demonstrated by Dan Guzman that MERGE is not immune from race conditions and other concurrency issues. The workaround is to use HOLDLOCK (or a higher isolation level); however, it is a common misconception that MERGE is completely atomic and not prone to this problem at all. Therefore I'll wonder aloud: how many MERGE statements out there include HOLDLOCK (or are being executed under SERIALIZABLE)? How many have been thoroughly tested for issues relating to concurrency?

Conclusion

Personally, I think the syntax is great (albeit daunting to learn), but every time an issue comes up, it erodes my confidence in the practicality of replacing existing DML with the new construct.

With that in mind, not to be Chicken Little, but I would not feel comfortable recommending anyone to use MERGE unless they implement extremely comprehensive testing. Some of these issues are also present with standard UPSERT methodologies, but the problems are more obvious there. MERGE, merely through its single-statement nature, makes you want to believe in magic. Maybe someday it will deliver, but right now I know it's not going to be able to saw a person in half without some serious help.