In SQL Server 2016, statistics updates using sample mode now run in parallel under compatibility level 130, and this is the default behavior for all automatic and manual statistics updates. This is explained briefly here:
(The documentation has also been updated, both the Compatibility Level topic and the UPDATE STATISTICS topic.)
Wouldn't it be nice, though, to be able to specify how many CPUs can actually be used for these operations (rather than just accepting the cap of 16)? I think that being able to limit this to 4 or 8 would be an obvious and logical thing to support – especially for customers running systems with 16 or fewer cores, or multiple instances on a box, who can't rely on Enterprise features like Resource Governor (which most Enterprise customers can't be bothered to use anyway, IMHO).
The business justification for this would be the same as the justifications used for adding MAXDOP support to index REBUILD, DBCC CHECKDB and its family of maintenance operations, and so on. You want to prevent this type of activity from taking over all the cores, without doing something as drastic as turning off auto-updates or setting an instance-wide MAXDOP, because not everybody has the luxury of maintenance windows.
And in this case, instance-wide MAXDOP won't help anyway, because SQL Server 2016 RTM has a bug where MAXDOP is ignored for sampled statistics updates. A fix is forthcoming, but I thought you should know; if this is causing you an issue, one option is to use a lower compatibility level.
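For reference, the instance-wide setting in question is the familiar sp_configure option, which (bug aside) throttles every parallel query on the instance, not just statistics operations:

```sql
-- Instance-wide MAXDOP; in SQL Server 2016 RTM this is ignored for
-- sampled statistics updates, and it also caps every other parallel query.
EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE WITH OVERRIDE;
EXEC sys.sp_configure N'max degree of parallelism', 4;
RECONFIGURE WITH OVERRIDE;
```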
But I will reiterate something I say often: Compatibility level is getting far too crowded. If I want parallel sampled stats on my database but I have enough cardinality estimation regressions to require the old CE, I have to pick one or the other.
And another thing: Resource Governor is overkill for this use case, and limiting core usage from statistics updates shouldn't really be an Enterprise feature (just like the REBUILD and CHECKDB mentioned above). Please don't tell me that RG is an acceptable alternative, because it's only possible for users with Enterprise Edition *and* workload classifications that should be constrained by MAXDOP all the time. I should be able to limit this by specific operation (or, say, for only my biggest/problem tables), not by constraining a login's entire session.
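To illustrate why Resource Governor is too coarse a tool here: a workload group's MAX_DOP cap applies to every query the classifier function routes into that group, for the life of the session. A minimal sketch (Enterprise Edition only; the classifier function that assigns sessions to the group is assumed and not shown):

```sql
-- Everything classified into this group is capped at DOP 4 --
-- every query the session runs, not just statistics updates.
CREATE WORKLOAD GROUP LimitedDOP WITH (MAX_DOP = 4);

-- A classifier function (not shown) would route whole sessions here,
-- which is exactly the all-or-nothing constraint described above.
ALTER RESOURCE GOVERNOR RECONFIGURE;
```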
How I wish they would do it
Ideally, we would be able to set this at the database level, using the new DATABASE SCOPED CONFIGURATION option, and at the statement level, using the familiar OPTION (MAXDOP n) syntax. Statement level would win, and any sample mode statistics updates (including automatic) without an explicit MAXDOP hint would fall back to the database level setting. This would allow me, for example, to set a MAXDOP of 4 for all automatic statistics updates that happen at unpredictable times, but 8 or 16 for manual operations during known maintenance windows.
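If the feature existed as described, it might look something like the following. This is purely hypothetical syntax for illustration: the AUTO_STATS_MAXDOP scoped configuration name is my invention, and UPDATE STATISTICS does not currently accept a MAXDOP option.

```sql
-- Hypothetical: database-level default for all sample mode statistics
-- updates, including automatic ones triggered at unpredictable times.
ALTER DATABASE SCOPED CONFIGURATION SET AUTO_STATS_MAXDOP = 4;

-- Hypothetical: statement-level override that would win during a
-- known maintenance window.
UPDATE STATISTICS dbo.BigTable WITH SAMPLE 25 PERCENT, MAXDOP = 8;
```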
If you want to vote for this, please see the following Connect item, and add a business justification for this (a la Michael Campbell):
Of course, that item has been there since 2010, so there is no mention at all about the DATABASE SCOPED CONFIGURATION avenue, which is why I left a comment, too.
In the meantime, if you want to disable parallelism for sample mode, there is a trace flag to effectively return to the older behavior (you can also do this by reverting to a compatibility level less than 130, but I don't recommend this because it affects a lot of other things). I will update this space when I've been given the okay to disclose the trace flag publicly, but right now, Microsoft is holding it close to the chest.
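For completeness, this is the compatibility level change I'm advising against (the database name here is just a placeholder):

```sql
-- Reverting below 130 disables parallel sampled stats updates, but it
-- also changes the cardinality estimator and many other behaviors.
ALTER DATABASE [YourDatabase] SET COMPATIBILITY_LEVEL = 120;
```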
3 thoughts on “A potential improvement for statistics updates : MAXDOP”
The Resource Governor isn't a viable option to control AUTOMATED parallel stats operations because EVEN IF I've got Enterprise Edition and have set up the Resource Governor, I've done that for 'unruly' or problematic users/applications. If I've got 2 pools for 'problematic' users/apps or things I want to constrain, but have mission-critical systems I don't want to constrain at all, what happens when a query there triggers a stats operation? Do I all of a sudden burn up 16 cores to run the sampled stats operation?
In other words, you're right – there needs to be a better way to manage these details (and your ideas for how this could be tackled make perfect sense to me).
It's sad if it takes a blog post from a known community member in a very visible place to get an obvious feature implemented. It must be super cheap to add this feature. Stats updates run through the query processor anyway. They just have to declare the syntax and pass an integer through to the query processor.
Sometimes I'm not convinced the SQL Server product team understands the idea of a cost/benefit analysis.
Truly agree. We should have this option. Even for the new optimizer issues, I fought with the SQL team for almost 2 months with no fixes.
I would also love to see this implemented in an upcoming CU or SP for the 2014 version itself, so I don't have to wait until the next major version for this simple fix.
Hope SQL team listens one day.