Paul Randal

Tracking Synchronous Statistics Updates

SentryOne Newsletters

The SQLPerformance.com bi-weekly newsletter keeps you up to speed on the most recent blog posts and forum discussions in the SQL Server community.

eNews is a bi-monthly newsletter with fun information about SentryOne, tips to help improve your productivity, and much more.

Subscribe

Featured Author

Jonathan Kehayias is a Principal Consultant with SQLskills and the youngest MCM ever.

Jonathan’s Posts

Introduction

The SQL Server query optimizer makes use of statistics during query compilation to help determine the optimal query plan. By default, if the optimizer notices a statistic is out-of-date because of too many changes to a table, it will update the statistic immediately before query compilation can continue (only the statistics it needs, not all the statistics for the table).

Note that “too many” is non-specific because it varies by version and whether trace flag 2371 is enabled – see the AUTO_UPDATE_STATISTICS section of this page for details.

The Problem With Synchronous Statistics Updates

Synchronously updating statistics before compilation obviously introduces a delay and makes the query take longer to compile and execute. Just how big of a delay depends on several factors, including:

  • How many tables involved in the query have reached the “too many changes” threshold
  • How many statistics for each of those tables have to be updated because they’re needed for compilation
  • How many rows there are in the tables involved
  • The options specified when each statistic was created (e.g., FULLSCAN and PERSIST_SAMPLE_PERCENT=ON)

So, there can be a seemingly random delay, which might cause problems in some scenarios, especially if an application has a very low query timeout set.

Avoiding Synchronous Statistics Updates

There are various ways to avoid synchronous statistics updates, such as:

  • Setting AUTO_UPDATE_STATISTICS to OFF, which turns off all automatic updates and means you’ll need to perform your own statistics maintenance to avoid the possibility of sub-optimal query plans from out-of-date statistics.
  • Setting AUTO_UPDATE_STATISTICS_ASYNC to ON, so when the optimizer notices a statistic needs to be updated, it continues with compilation, and a background task updates the statistic a little bit later. This only works if you also have AUTO_UPDATE_STATISTICS set to ON.
  • Perform regular statistics maintenance, so automatic synchronous or asynchronous statistics updates don’t happen at all.

There’s a lot of debate in the SQL Server community around whether to enable asynchronous statistics updates. I asked my lovely wife, Kimberly L. Tripp, what her opinion is, and she always recommends enabling it, and she’s forgotten more about statistics than I’ll ever know, so I believe her. ☺

Tracking Synchronous Statistics Updates

There has never been an obvious way to tell whether a query was taking a long time because it was waiting for a synchronous statistics update. You could tell *after* the statistics update had completed if you had an Extended Event session already running watching for the auto_stats event and filtering on the async column being set to 0. However, that column in the event output was only added in SQL Server 2017, and you’d also have to configure an action that captured something to identify the query involved.

Now in SQL Server 2019, there’s the WAIT_ON_SYNCHRONOUS_STATISTICS_UPDATE wait type, and at first glance it seems like it would easily allow you to see if a query is waiting for a synchronous statistics update by just looking in sys.dm_os_waiting_tasks to see what the query is currently waiting for.

Unfortunately, that’s not the case.

The term “waiting” is a bit misleading here as in this case the thread isn’t actually waiting. This new wait type is an example of what’s called a “preemptive” wait, where the thread switches to a mode where it remains on the processor until it’s finished its work. Most preemptive waits are when a thread makes a call outside of SQL Server (e.g., to get security information from a Domain Controller), but sometimes a thread is doing something inside SQL Server and needs to complete it before potentially being forced to yield the processor because its 4ms thread quantum has expired. Neither of those things are what’s happening here. In this case, the thread registers the start of a preemptive wait with the new wait type and then does the statistics update, probably incurring other *real* waits like PAGEIOLATCH_SH along the way. It’s not until the statistics update has completed that the preemptive wait ends and is accounted for in the wait statistics metrics.

Why is this a big deal? Well, the DMV sys.dm_os_waiting_tasks shows the wait types for all the threads that are *really* waiting, i.e., on the waiting tasks list of a scheduler, so if the synchronous statistics update thread isn’t waiting for WAIT_ON_SYNCHRONOUS_STATISTICS_UPDATE, that wait type will not show up in the output of the DMV. The new wait type cannot be used to see whether a query is currently waiting for a statistics update.

You can easily prove this to yourself by doing the following:

  • Create a table with a few hundred thousand rows
  • Create a statistic on a table column, and specify FULLSCAN and PERSIST_SAMPLE_PERCENT = ON as options, forcing the entire table to be read every time the statistic is updated
  • Update twenty thousand rows
  • Checkpoint the database and execute DBCC DROPCLEANBUFFERS
  • Do a SELECT statement with a WHERE clause on the column with the statistic you created
  • Look in sys.dm_os_waiting_tasks DMV for the session ID of the SELECT, and you’ll see it’s likely waiting for PAGEIOLATCH_SH as the statistics update reads through the table

With that disappoint aside, there’s a trick to be able to see if a query is waiting for a synchronous statistics update. When a statistics update happens, a command called STATMAN runs, and you can see it happening in the output from sys.dm_exec_requests: the status will be “suspended” (even though the thread is running, as I described above), and the command will be “SELECT (STATMAN).”

What Use Is the New Wait Type?

Although the new wait type can’t be used as an immediate way to tell a query is waiting for a synchronous statistics update, if it shows up in your regular wait statistics analysis, you know some queries in the workload may be suffering from these delays. But that’s about the limit of its usefulness as far as I’m concerned. Unless the average wait time shows up as a concerning percentage of your average query execution time or you’re continually capturing waits over small periods of time to allow proper analysis, you don’t know for sure whether there’s a problem.

This is a wait type where the wait time may vary wildly, depending on the factors I mentioned earlier. Therefore, I would use just the presence of this wait type to be alerted to potential problems, and I’d want to implement an Extended Event session as described above to capture instances of synchronous statistics updates to see if their duration is long enough to merit taking some corrective action.

Summary

I’m not sure that the addition of the WAIT_ON_SYNCHRONOUS_STATISTICS_UPDATE wait type is going to change whether people configure asynchronous statistics updates or simply do all the statistics maintenance themselves, but at least now you’ll be able to tell whether queries are waiting for synchronous statistics updates and take some further action.

Until next time, happy performance troubleshooting!