Introduction
The SQL Server query optimizer makes use of statistics during query compilation to help determine the optimal query plan. By default, if the optimizer notices a statistic is out-of-date because of too many changes to a table, it will update the statistic immediately before query compilation can continue (only the statistics it needs, not all the statistics for the table).
Note that “too many” is non-specific because it varies by version and whether trace flag 2371 is enabled – see the AUTO_UPDATE_STATISTICS section of this page for details.
The Problem With Synchronous Statistics Updates
Synchronously updating statistics before compilation obviously introduces a delay and makes the query take longer to compile and execute. Just how big of a delay depends on several factors, including:
- How many tables involved in the query have reached the “too many changes” threshold
- How many statistics for each of those tables have to be updated because they’re needed for compilation
- How many rows there are in the tables involved
- The options specified when each statistic was created (e.g., FULLSCAN and PERSIST_SAMPLE_PERCENT=ON)
So, there can be a seemingly random delay, which might cause problems in some scenarios, especially if an application has a very low query timeout set.
Avoiding Synchronous Statistics Updates
There are various ways to avoid synchronous statistics updates, such as:
- Setting AUTO_UPDATE_STATISTICS to OFF, which turns off all automatic updates and means you’ll need to perform your own statistics maintenance to avoid the possibility of sub-optimal query plans from out-of-date statistics.
- Setting AUTO_UPDATE_STATISTICS_ASYNC to ON, so when the optimizer notices a statistic needs to be updated, it continues with compilation, and a background task updates the statistic a little bit later. This only works if you also have AUTO_UPDATE_STATISTICS set to ON.
- Perform regular statistics maintenance, so automatic synchronous or asynchronous statistics updates don’t happen at all.
There’s a lot of debate in the SQL Server community around whether to enable asynchronous statistics updates. I asked my lovely wife, Kimberly L. Tripp, what her opinion is, and she always recommends enabling it, and she’s forgotten more about statistics than I’ll ever know, so I believe her. ☺
Tracking Synchronous Statistics Updates
There has never been an obvious way to tell whether a query was taking a long time because it was waiting for a synchronous statistics update. You could tell *after* the statistics update had completed if you had an Extended Event session already running watching for the auto_stats event and filtering on the async column being set to 0. However, that column in the event output was only added in SQL Server 2017, and you’d also have to configure an action that captured something to identify the query involved.
Now in SQL Server 2019, there’s the WAIT_ON_SYNCHRONOUS_STATISTICS_UPDATE wait type, and at first glance it seems like it would easily allow you to see if a query is waiting for a synchronous statistics update by just looking in sys.dm_os_waiting_tasks to see what the query is currently waiting for.
Unfortunately, that’s not the case.
The term “waiting” is a bit misleading here as in this case the thread isn’t actually waiting. This new wait type is an example of what’s called a “preemptive” wait, where the thread switches to a mode where it remains on the processor until it’s finished its work. Most preemptive waits are when a thread makes a call outside of SQL Server (e.g., to get security information from a Domain Controller), but sometimes a thread is doing something inside SQL Server and needs to complete it before potentially being forced to yield the processor because its 4ms thread quantum has expired. Neither of those things are what’s happening here. In this case, the thread registers the start of a preemptive wait with the new wait type and then does the statistics update, probably incurring other *real* waits like PAGEIOLATCH_SH along the way. It’s not until the statistics update has completed that the preemptive wait ends and is accounted for in the wait statistics metrics.
Why is this a big deal? Well, the DMV sys.dm_os_waiting_tasks shows the wait types for all the threads that are *really* waiting, i.e., on the waiting tasks list of a scheduler, so if the synchronous statistics update thread isn’t waiting for WAIT_ON_SYNCHRONOUS_STATISTICS_UPDATE, that wait type will not show up in the output of the DMV. The new wait type cannot be used to see whether a query is currently waiting for a statistics update.
You can easily prove this to yourself by doing the following:
- Create a table with a few hundred thousand rows
- Create a statistic on a table column, and specify FULLSCAN and PERSIST_SAMPLE_PERCENT = ON as options, forcing the entire table to be read every time the statistic is updated
- Update twenty thousand rows
- Checkpoint the database and execute DBCC DROPCLEANBUFFERS
- Do a SELECT statement with a WHERE clause on the column with the statistic you created
- Look in sys.dm_os_waiting_tasks DMV for the session ID of the SELECT, and you’ll see it’s likely waiting for PAGEIOLATCH_SH as the statistics update reads through the table
With that disappoint aside, there’s a trick to be able to see if a query is waiting for a synchronous statistics update. When a statistics update happens, a command called STATMAN runs, and you can see it happening in the output from sys.dm_exec_requests: the status will be “suspended” (even though the thread is running, as I described above), and the command will be “SELECT (STATMAN).”
What Use Is the New Wait Type?
Although the new wait type can’t be used as an immediate way to tell a query is waiting for a synchronous statistics update, if it shows up in your regular wait statistics analysis, you know some queries in the workload may be suffering from these delays. But that’s about the limit of its usefulness as far as I’m concerned. Unless the average wait time shows up as a concerning percentage of your average query execution time or you’re continually capturing waits over small periods of time to allow proper analysis, you don’t know for sure whether there’s a problem.
This is a wait type where the wait time may vary wildly, depending on the factors I mentioned earlier. Therefore, I would use just the presence of this wait type to be alerted to potential problems, and I’d want to implement an Extended Event session as described above to capture instances of synchronous statistics updates to see if their duration is long enough to merit taking some corrective action.
Summary
I’m not sure that the addition of the WAIT_ON_SYNCHRONOUS_STATISTICS_UPDATE wait type is going to change whether people configure asynchronous statistics updates or simply do all the statistics maintenance themselves, but at least now you’ll be able to tell whether queries are waiting for synchronous statistics updates and take some further action.
Until next time, happy performance troubleshooting!
Thanks Paul, that new wait type will be useful.
If I don't have problem with AUTO UPDATE STATISTICS – I will not change default configuration. That should be rule for everything. Generally, I am sure that all changes/issues/bugs on SQL engine are more and better tested (in Microsoft) in combination with default values of other configurations, and because of relation between functionalities and configurations I will keep default values if they work well.
But, we change AUTO_UPDATE_STATISTICS_ASYNC to ON (to non default value), on our huge OLTP database, and we do that (a few years ago) because one specific issue immediately after migration on SQL Server 2016, because of unexpected behavior:
During procedure recompilation, when Engine decided to update statistic during compilation time (synchronous, before query execution), although we maintain statistics on three days, in that specific situation it lasted too long. It couldn't to finish that update stats in 30 seconds, and all requests went to timeout before query execution starts. It was very complicate situation and this is not place to write about details, but we didn't have a problem before with the same procedure on SQL Server 2014. That configuration change wasn't solution, that was a workaround.
I have a big question? And I didn't get an answer from Microsoft, after 6 months of correspondence. If engine during compile time finds out-of-dated statistic (AUTO_UPDATE_STATISTICS_ASYNC is OFF), why there are no mechanism that engine refuse that update if concludes it will take too long? For me, on OLTP system, everything longer than few seconds for update stats during compilation time is unacceptable.
Before migration we didn't have that problem
That's just the way it works – what should be the value for 'too long'? And that would require knowing buffer pool contents – what proportion of the required data file pages are already in memory. Hence the 'sledgehammer' approach of turning auto-update off or using async updates.
Many thanks for the blog! Always good to read a refresher.
Not sure if this is still the case, but it certianly appead to be an issue we had a while back for async:
Does the Async Statistic update work if the query that generates the request for the update fails?
If query compilation gets as far as noticing the need for the async stats update and adding it to the background task queue, it will happen regardless of how much further compilation and execution proceeds.