On November 16, 2016, Microsoft announced some very significant changes for SQL Server 2016 Standard Edition, which were implemented in SQL Server 2016 Service Pack 1 (Build 13.0.4001.0). Many very useful programmability-related features that were previously only available in Enterprise Edition will now be available in Standard Edition (and also in Web Edition and even Express Edition).
Once you have a database application using SQL Server 2016 Standard Edition Service Pack 1 (or even a lower edition), you can just perform an edition upgrade to Enterprise Edition to get even more scalability and performance, taking advantage of the higher license limits for sockets, cores, and memory in Enterprise Edition, as detailed here.
You will also get the many other intrinsic performance benefits that are present in Enterprise Edition, along with multiple manageability improvements that make your life so much easier as a DBA.
Columnstore Indexes
If you are using Columnstore indexes, you get the following performance benefits automatically, when you use Enterprise Edition:
- Aggregate Pushdown: This performance feature often gives a 2X-4X query performance gain by pushing qualifying aggregates to the SCAN node, which reduces the number of rows coming out of that iterator.
- Index Build/Rebuild: Enterprise Edition can build/rebuild columnstore indexes with multiple processor cores, while Standard Edition only uses one processor core. This has a pretty significant effect on elapsed times for these operations, depending on your hardware.
- Local Aggregates: Enterprise Edition can use local aggregations to filter the number of rows passing out of a SCAN node, reducing the amount of work that needs to be done by subsequent query nodes. You can confirm this by looking for the “ActualLocallyAggregatedRows” attribute in the XML of the execution plan for the query.
- Single Instruction Multiple Data (SIMD) Optimizations: This feature uses a set of hardware instructions which are capable of processing an array of data in a single instruction, dramatically speeding up aggregate operations. These hardware instructions are present on all modern processors (that have AVX support), but they are only used by Enterprise Edition.
- String Predicate Pushdown: This performance feature can improve performance of queries using predicate(s) on string columns by pushing these predicates to the SCAN node. This can greatly reduce the amount of work that needs to be done by subsequent nodes.
- Degree of Parallelism: Batch mode queries are limited to MAXDOP = 2 on Standard Edition. Enterprise Edition can use all of the cores that are present for the instance. This can be very significant on larger queries on typical, modern server hardware.
- Memory limits: The Columnstore object pool is limited to 32GB per instance on Standard Edition. Enterprise Edition does not have any memory limitation for the Columnstore object pool.
In order to test these performance assertions, I ran some fairly simple tests on the Microsoft ContosoRetailDW database on my Intel Core i7-6700K workstation. I have two named instances of SQL Server 2016 SP1 installed, one using Standard Edition, and the other using Developer Edition (which is equivalent to Enterprise Edition).
All instance-level and database-level configurations and properties are identical between the two instances, and the user and tempdb database file locations are in separate directories on the same, separate flash storage device for each instance. The database compatibility level was changed to 130 in both cases, and the underlying Windows configuration and hardware is the same for both instances. The only difference here is the Edition of each instance.
The first test is a simple query (adapted from Niko Neugebauer) that allows SQL Server 2016 to use aggregate pushdown on the FactOnlineSales table. The results are shown in Table 1.
Edition | Elapsed Time (ms) |
---|---|
Standard Edition | 30 |
Developer Edition | 1 |
Time Difference | 29 |
% Improvement | 96.7% |
Table 1: Aggregate Pushdown Comparison
The next test is timing how long it takes to build a clustered columnstore index on the 12.6 million row FactOnlineSales table. The results are shown in Table 2.
Edition | Elapsed Time (ms) |
---|---|
Standard Edition | 42,197 |
Developer Edition | 14,384 |
Time Difference | 27,813 |
% Improvement | 65.9% |
Table 2: Building Clustered Columnstore Index Comparison
The next test is timing how long it takes to rebuild a clustered columnstore index on the same FactOnlineSales table. The results are shown in Table 3.
Edition | Elapsed Time (ms) |
---|---|
Standard Edition | 33,105 |
Developer Edition | 11,460 |
Time Difference | 21,645 |
% Improvement | 65.4% |
Table 3: Rebuilding Clustered Columnstore Index Comparison
The next test is another simple query that allows SQL Server 2016 to use local aggregation on the FactOnlineSales table. The results are shown in Table 4.
Edition | Elapsed Time (ms) |
---|---|
Standard Edition | 122 |
Developer Edition | 83 |
Time Difference | 39 |
% Improvement | 32.0% |
Table 4: Local Aggregation Comparison
The next test is another simple query that allows SQL Server 2016 to use string predicate pushdown on the FactOnlineSales and DimPromotion tables. The results are shown in Table 5.
Edition | Elapsed Time (ms) |
---|---|
Standard Edition | 2,683 |
Developer Edition | 1,221 |
Time Difference | 1,466 |
% Improvement | 54.6% |
Table 5: String Predicate Pushdown Comparison
These are just some simple examples of the built-in performance advantages for Columnstore indexes in SQL Server 2016 Enterprise Edition compared to SQL Server 2016 Standard Edition on the same hardware. If you want to really dive into Columnstore indexes (which can be a very effective feature for some workloads), you should bookmark and read Niko Neugebauer’s long series of posts at columnstore.net.
DBCC CHECKDB Performance
Another manageability performance improvement that is present on SQL Server 2016 Enterprise Edition is DBCC CHECKDB performance. On Standard Edition, DBCC CHECKDB only uses one processor core, while it can use all available cores on Enterprise Edition. This behavior is unchanged from previous versions of SQL Server. SQL Server 2016 does allow you to restrict the number of cores that DBCC CHECKDB can use with a new WITH (MAXDOP = x) option.
Running DBCC CHECKDB with the WITH PHYSICAL_ONLY option on a somewhat larger database (about 38GB) that I have, yielded the results shown in Table 6.
Edition | Elapsed Time (ms) |
---|---|
Standard Edition | 58,492 |
Developer Edition | 24,897 |
Time Difference | 33,595 |
% Improvement | 57.4% |
Table 6: DBCC CHECKDB WITH PHYSICAL_ONLY Comparison
Running a standard DBCC CHECKDB on the same database yielded the results shown in Table 7.
Edition | Elapsed Time (ms) |
---|---|
Standard Edition | 435,039 |
Developer Edition | 119,767 |
Time Difference | 315,272 |
% Improvement | 72.5% |
Table 7: DBCC CHECKDB Comparison
A very important factor in DBCC CHECKDB performance is the sequential read performance from all of the LUNs where your database data file(s) are located. You can easily check this by running a BACKUP DATABASE command to a NUL device (making sure to use the COPY_ONLY and NO_COMPRESSION options). This will show you your effective sequential read performance, as shown in this example from my workstation:
Keep in mind that all of this testing was done on a single, quad-core desktop processor. A multi-socket server with many more total processor cores will show even more of a performance improvement in many of these tests.
The point of all of this is to show a few tangible examples of the performance improvements you can see, simply by upgrading from SQL Server 2016 Standard Edition SP1 to SQL Server 2016 Enterprise Edition SP1, on the same hardware, making no database or application changes. This list is by no means exhaustive, as there are many other benefits as well.
I always thought developer edition is same as enterprise,with the exception of not able to use in production..
That's right, the only difference is the licensing (and what gets returned from @@VERSION, SERVERPROPERTY(), etc).
Thank you Aaron,i misread the comparisons
Not only the licensing, but the memory and cpu limitation for the Standard edition.
If you have one good way to do things and one better way to do things. Why. .. other than money. … do you want to do it on a good way if you can do it on a better way? Then money spend on hardware become waste of money. Hardware people should be against it.
Is it me, or is there an error in the percentage in Table 1: Aggregate Pushdown Comparison?
I don't think so, taking something from 30 seconds to 1 second is a savings of 29 seconds, and 29/30 = 0.96666666666…
What are you trying to say here, Gilberto?
In an ideal world, where money was no object, everyone would be using SQL Server 2016 Enterprise Edition since it has so many useful features that are not available in other Editions. It also has a number of built-in performance advantages, which was the main point of my article.
I understand the concept aaron, however using that calculation, I would say you "reduced" execution time by 96%. Either way, it's good news :)
Still not sure I understand the problem. Glenn's original wording was "improvement" – do you not consider a "reduction" in execution time an "improvement"?
The formula of (baseline – new) / baseline is great when you are tracking throughput. For example, 100 txns took 10 sec, that is 10 txns/sec. You made a change, now 100 txns took 3 sec, that is 33.3 txns/sec.
To calculate the change in throughput, it is (33.3 – 10) / 10 *100 = 233% improvement. Similarly, to calculate the change in execution time, it is ((10 / 3) – 1) * 100 = 233%. This way, you have ability to get the same number, no matter which metric you choose to calculate.
In the cases above, with the calculation being used for execution time, you will never get better than 100%, if you use the word improvement.
IMO, SQL Enterprise is being short changed based on semantics :)
Okay. I don't think the concept of calling "reducing runtime by 97%" an "improvement" is going to confuse anyone.
Great article!
On the minor issue of the calculated value of the "improvement", I agree with the other commenters.
The formula that led to 97% (meaning how much of the original execution time was shoved using the new process) might not be the most appropriate when comparing processing speeds.
For many people, "improvement" of the execution duration means how much "faster" is the new value (1s) relative to the reference value (30s here).
E.g., dropping the execution time from 30s to 1s gives a 30 x increase in processing speed. I take that as a 3000% "improvement" rather than 97%.
When reading the article, a "% Improvement of 96.7%" made me think that "the new process is almost twice as fast as the reference one" which would be misleading here.
Thanks and have a great 2017!
It all depends if you're trying to improve the duration or the speed. I focus on spending less time doing something. If I can spread a serial task across multiple processors, for example, the CPUs are not doing anything faster, they're just each doing less work at the same time. That's an improvement in duration, not speed.
If people are really concerned about these duration and improvement semantics, they are certainly free to do their own testing and/or calculations. Again, the main point here is to help make the case for why Enterprise Edition is a good investment for performance and scalability reasons, over and above other feature and management advantages.
Great article and excellent information!
However, I’m in agreeance with others as well. It doesn’t make sense to do the (Time Diff) / (Standard Edition) calculation, even if you’re talking about duration. In your last example Developer Edition actually performed the work (regardless of how) 363.2% faster.
It’s also a lot more math to estimate what someone’s production load would look like if they were to move to Enterprise Edition. So if for no other reason than in the interest of simplicity, I would certainly update those mathematical methods.
Hi Glenn,
Wanted to clarify one point for others who will read this excellent post.
You wrote: "The Columnstore object pool is limited to 32GB per instance on Standard Edition".
That's true for columnstore on disk-based tables only.
For memory-optimized tables, the memory used by columnstore counts towards the per-db limit of 32GB. It's a subtle but important distinction.
Excellent article, really interesting to see the differences and why enterprise edition needs the big bucks! :)