Erin Stellato

Tracking Automatic Updates to Statistics

SentryOne Newsletters

The SQLPerformance.com bi-weekly newsletter keeps you up to speed on the most recent blog posts and forum discussions in the SQL Server community.

eNews is a bi-monthly newsletter with fun information about SentryOne, tips to help improve your productivity, and much more.

Subscribe

Featured Author

Erin Stellato is a Principal Consultant with SQLskills and a Microsoft Data Platform MVP.

Erin’s Posts

When you create a new database in SQL Server, the Auto Update Statistics option is enabled by default. It is generally recommended to leave this option enabled. Ideally, statistics are managed by a scheduled job, and the automatic option is used as a safety net – available to update statistics in the event a scheduled update does not occur, or accidentally does not include all existing statistics.

Some DBAs rely solely on automatic updates to manage statistics, and as long as no performance problems related to out-of-date or poorly sampled statistics exist, this is acceptable. If you are relying on this option to manage your statistics, and you have some very large tables, it might be worth implementing trace flag 2371. As with any trace flag, make sure you test with a representative workload before implementing it in production. You may already be aware that there are times when an automatic update can affect system performance. For example, the update to a statistic can introduce a spike in CPU or I/O as the table or index data is read, as well as delay query execution while the update occurs. There is another database option you can enable to address that query delay, and I’ll cover that later in this post.

The question I'm often asked is, "How do you know if automatic updates to statistics are causing performance problems?" One option is to track them, and tie the updates to a change in performance. Many options exist for tracking updates, and in this post we'll review a few of the available methods so you can choose and then implement the option that fits best with your existing method of monitoring for performance problems.

SQL Trace

If you are running SQL Server 2008 R2 or below in your environment, SQL Trace is one method you can use to capture automatic updates. The trace definition script used below only captures the Auto Stats event, which catches when a statistic auto-updates, and when a statistic auto-creates. After this trace has run for a while in your environment, you can load it into a table and then query the output to see what updates occurred. The included script below walks through an example using the baseball stats sample database.

Extended Events

If you are running SQL Server 2012 or higher, I recommend using Extended Events to capture automatic updates. Like the SQL Trace script, the included session definition script only captures the Auto Stats events – again, both auto-updates and auto-creates. Once the XE session has run for a while, you can load the output into a table through the UI and then query it to see what updates occurred. The included script walks through the same example as before, but this time using Extended Events to collect the data.

sys.dm_db_stats_properties

A third option that you could consider to monitor statistics updates is the sys.dm_db_stats_properties DMF, which is only available in SQL Server 2008 R2 SP2 and higher, and SQL Server 2012 SP1 and higher. As much as I love this DMF, this solution is trickier in terms of making sure you captured all the data, and reviewing the output takes more work. With the DMF, every auto-update is not tracked, we just trend statistics update information to see when updates occur.

It's a simple task: you create a table to hold the stats information, and then snapshot information from the DMF to the table on a regular basis. The key here is to figure out how often to capture the data. Every hour is probably overkill, once a day might not be frequent enough. I recommend starting with a SQL Agent job that snapshots the DMF data every four hours. Let that run for a few days, then check your data. If statistics are updating once a day at most, then you can increase the interval to every eight or twelve hours. If statistics are easily updating every four hours, then drop your interval to every two hours – you want to make sure you're capturing each update. For this reason, for some systems, sys.dm_db_stats_properties might not be worth the effort; an XE session or Trace might be simpler.

A final sample script walks through an example of how you would use sys.dm_db_stats_properties to trend updates to statistics. Be aware that this script only captures statistics information for one table. If you alter the script to capture every table in the database, there will be a lot more data to analyze.

Next Steps

Download the sample scripts, and decide which method you should use to track statistics updates.

Once you have the data that shows when automatic updates occur, you need to tie that back to known performance issues. As such, if you're not tracking any performance metrics, then the auto-stats update data isn't going to help with any kind of correlation. Assuming you have timestamps for any performance issue, you can compare it to the StartTime and EndTime from Trace, the timestamp from XE, or last_updated from the sys.dm_db_stats_properties DMF, to determine if the automatic update did affect system performance.

If you cannot make any correlation between the updates and performance issues, then you can rule out updates as the cause of the problem, and focus on another area. If the updates are the root cause, then you have two options: disable the Auto Update Statistics option, or enable the Auto Update Statistics Asynchronously option. Both have pros and cons that you, as the DBA, must consider.

Disabling Auto-Update Statistics

If you choose to disable the auto-update statistics option, the two most important things to know are:

  1. You absolutely must manage your statistics via a maintenance task or custom job.
  2. Queries will not recompile when you update statistics in SQL Server 2008 R2 and below.

I view the second item as a bigger challenge – I'm a big advocate for managing statistics and expect that it's something DBAs are doing anyway. The larger issue is that, even though updating statistics normally causes queries to recompile (to take advantage of the updated stats), this does not occur when you have the auto-update statistics option disabled. I've written about this previously, and recommend reviewing this information if you're not familiar with this behavior. Also see a follow-up post for options to address it.

In general, this is not the path that I recommend. There are very specific edge-cases where this might be appropriate, but I would rather see a DBA perform manual updates (through scheduled jobs) to avoid the automatic updates, and leave the option enabled as a safety measure.

Enabling Auto Update Statistics Asynchronously

When you enable the Auto Update Statistics Asynchronously option, if a statistic has been invalidated and a query that uses that statistic is run, the statistic will not be updated until after the query has completed – the update is asynchronous. The benefit here is that the update will not affect the query that was run; the drawback is that the query will use the existing plan, which may no longer be the optimal plan. Whether the plan is still optimal will depend on your workload (e.g. a reporting workload with long-running queries). As a DBA, you have to weigh the pros and cons of enabling this option, and determine what's best for your database. Note that if you are running SQL Server 2008 through 2012, there is a memory leak associated with this setting. Microsoft does have Cumulative Updates available that provide a fix, but if you don't already have them applied, you face another decision: apply the CU so you can enable the option, or don't apply the CU and don’t enable the option.

Summary

The only way to know if automatic updates to statistics are affecting query performance is to either see an update occur at the same time as a problem, or capture when updates occur and correlate the data to additional information you're capturing about performance problems. The latter option allows you to be proactive – even if you're not having performance issues, it might be a good idea to know how often automatic updates occur. Frequent updates may mean you need to revisit the Agent job that manually manages statistics. In general, leave the option to automatically update statistics enabled, but have a method to manage statistics and use the option as a safety net.