Several years ago, Microsoft published a very useful Knowledge Base article about how to configure SQL Server 2012 and SQL Server 2014 for the best performance with heavy workloads on larger sized, modern server hardware. My colleague Aaron Bertrand and I both had a small role in the vetting of the original KB article. This KB article has been kept pretty well up to date since then, and it is still a great reference for useful configuration options for SQL Server 2012/2014.
Since then, Microsoft has also back ported a number of very useful performance improvements from SQL Server 2016 into to both SQL Server 2012 and SQL Server 2014, as long as you are on a new enough build of either of these older versions of SQL Server. Microsoft recommends that you proactively deploy both Service Packs and Cumulative Updates in order to minimize the chance of encountering issues that were corrected in later builds of SQL Server.
As an added bonus, you will also get new features and other enhancements such as I describe in this article. In most cases, you will have to enable a trace flag to get the benefit of these performance improvements.
Dirty Page Manager
First is the use of Dirty Page Manager (DPM) in conjunction with enabling indirect checkpoints for your user and system databases in SQL Server 2012 and SQL Server 2014. Indirect checkpoints are the default for new databases created on SQL Server 2016, and it is the recommended configuration for SQL Server 2016 instances.
If you enable global Trace Flag 3449 (and you are on SQL Server 2012 SP3 CU3 or later or SQL Server 2014 SP1 CU7 or later), you will get much better performance by avoiding a FlushCache call in a number of different common scenarios, such as backup database, backup transaction log, create database, add a file to a database, recover a database, shrink a database file, and during a SQL Server "graceful" shutdown. Trace Flag 3449 takes effect immediately, with no restart required. You should also set TF 3449 as a startup trace flag.
Avoiding these FlushCache calls is particularly important when you have a large amount of RAM (more than 256GB) in your database server, with the benefit increasing with the size of your buffer pool that is in use. You can read about this enhancement in more detail in this blog post:
Next, if you are on SQL Server 2012 SP4 or SQL Server 2014 SP2 (or later), you will get a number of other new performance enhancements, such as automatic Soft NUMA partitioning and dynamic memory object scaling that were originally introduced in SQL Server 2016.
Automatic Soft NUMA Partitioning
In both SQL Server 2012 SP4 and SQL Server 2014 SP2, when you set Trace Flag 8079 as a startup trace flag, SQL Server will scan the hardware layout during Engine startup and automatically configure Soft NUMA on systems reporting 8 or more physical cores per NUMA node. The automatic soft NUMA behavior is Hyperthreading (HT/logical processor) aware, which means that it works with both Intel Hyper-Threading and AMD SMT. When determining the optimal soft NUMA node layout, the logical CPU information is queried and used to prevent groupings of logical only and physical only nodes which could lead to performance variations across the soft NUMA nodes.
Current server processors can have have up to 32 physical cores in a single NUMA node which can expose SMP-like scalability issues within a single hardware NUMA node. Microsoft claims that they see a noticeable performance improvement from this feature using their internal SQL Server 2016 testing harness:
"With HT aware auto soft-NUMA, we get up-to 30% gain in query performance when DOP is set to the number of physical cores on a socket (12 in this case) using Automatic Soft NUMA."
As always, it is a good idea test the performance of your workload with Automatic Soft NUMA before using it in production.
Dynamic Memory Object Scaling
In both SQL Server 2012 SP4 and SQL Server 2014 SP2, SQL Server will dynamically partition memory objects based on the number of NUMA nodes and logical processor cores to scale better on modern server hardware. The goal of dynamic promotion is to automatically partition a thread safe memory object (CMEMTHREAD) if it becomes a bottleneck.
Unpartitioned memory objects will be dynamically promoted to be partitioned by NUMA node (the number of partitions equals the number of NUMA nodes) based on the workload and bottleneck, and memory objects partitioned by NUMA node can be further promoted to be partitioned by logical CPU cores (the number of partitions equals the number of logical CPU cores). This enhancement eliminates the need for Trace Flag 8048 once you have SQL 2014 SP2 or SQL Server 2012 SP4 installed.
SOS_RWLock Spinlock Improvement
In SQL Server 2014 SP2, there are improvements (again back ported from SQL Server 2016) that remove the need for spinlocks for SOS_RWLock operations. Microsoft describes this improvement in more detail like this:
"The SOS_RWLock is a synchronization primitive used in various places throughout the SQL Server code base. As the name implies the code can have multiple shared (readers) or single (writer) ownership. This improvement removes the need for spinlock for SOS_RWLock and instead uses lock-free techniques similar to in-memory OLTP. With this change, many threads can read a data structure protected by SOS_RWLock in parallel without blocking each other and thereby providing increased scalability. Before this change, the older spinlock implementation allowed only one thread to acquire the SOS_RWLock at a time even to read a data structure."
You will also get a number of useful manageability improvements in both SQL Server 2012 SP4 and SQL Server 2014 SP2. You can read about these improvements in more detail in these blog posts:
The key takeaway from all of this is that if you are running SQL Server 2012 (which fell out of mainstream support on July 11, 2017), you want to be on SQL Server 2012 SP4, while if you are are running SQL Server 2014, you want to be on SQL Server 2014 SP2 or newer. You also will need to enable the applicable trace flags and set the relevant database properties to take advantage of these improvements.