Three Easy SQL Server Performance Wins

As any veteran production DBA knows, you are often under significant pressure to diagnose and alleviate database performance issues as quickly as possible. Here are three things you might be able take advantage of, depending on your workload and infrastructure, to have a very noticeable positive impact on your database performance.

Basic Row Store Index Tuning

Most SQL Server instances that I have encountered in my career have had some relatively easy row store index tuning opportunities. One nice thing about row store index tuning is that it is more often under your direct control as a DBA, especially compared to query or stored procedure tuning, which are often under the control of developers or 3^rd party vendors.

Some DBAs are reluctant to do any index tuning (especially on 3^rd party databases) because they are worried about breaking something or jeopardizing vendor support for the database or application. Obviously, you need to be more careful with 3^rd party databases, and try to reach out to the vendor before making any index changes yourself, but in some situations, you may have no other viable alternative (besides throwing faster hardware and storage at the problem).

You can run a few key queries from my SQL Server Diagnostic Information Queries to get a good idea if you might have some easy index tuning opportunities on your instance or database. You should be on the lookout for missing index requests, missing index warnings, under-used or non-used non-clustered indexes, and possible data compression opportunities.

It does take some experience, good judgement, and knowledge of your workload to do proper index tuning. It is all too common to see people do incorrect index tuning, by rashly making many index changes without doing the proper analysis.

Here are some queries that I like to use, at the database level:

-- Missing Indexes for current database by Index Advantage  (Query 1) (Missing Indexes)

SELECT DISTINCT CONVERT(decimal(18,2), user_seeks * avg_total_user_cost * (avg_user_impact * 0.01)) AS [index_advantage], 
  migs.last_user_seek, mid.[statement] AS [Database.Schema.Table],
  mid.equality_columns, mid.inequality_columns, mid.included_columns,
  migs.unique_compiles, migs.user_seeks, migs.avg_total_user_cost, migs.avg_user_impact,
  OBJECT_NAME(mid.[object_id]) AS [Table Name], p.rows AS [Table Rows]
  FROM sys.dm_db_missing_index_group_stats AS migs WITH (NOLOCK)
  INNER JOIN sys.dm_db_missing_index_groups AS mig WITH (NOLOCK)
  ON migs.group_handle = mig.index_group_handle
  INNER JOIN sys.dm_db_missing_index_details AS mid WITH (NOLOCK)
  ON mig.index_handle = mid.index_handle
  INNER JOIN sys.partitions AS p WITH (NOLOCK)
  ON p.[object_id] = mid.[object_id]
  WHERE mid.database_id = DB_ID()
  AND p.index_id < 2 
  ORDER BY index_advantage DESC OPTION (RECOMPILE);

  ------
  -- Look at index advantage, last user seek time, number of user seeks to help determine source and importance
  -- SQL Server is overly eager to add included columns, so beware
  -- Do not just blindly add indexes that show up from this query!!!

  -- Find missing index warnings for cached plans in the current database  (Query 2) (Missing Index Warnings)
  -- Note: This query could take some time on a busy instance

  SELECT TOP(25) OBJECT_NAME(objectid) AS [ObjectName], 
                 cp.objtype, cp.usecounts, cp.size_in_bytes, query_plan
  FROM sys.dm_exec_cached_plans AS cp WITH (NOLOCK)
  CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
  WHERE CAST(query_plan AS NVARCHAR(MAX)) LIKE N'%MissingIndex%'
  AND dbid = DB_ID()
  ORDER BY cp.usecounts DESC OPTION (RECOMPILE);

  ------
  -- Helps you connect missing indexes to specific stored procedures or queries
  -- This can help you decide whether to add them or not
  -- Possible Bad NC Indexes (writes >= reads)  (Query 3) (Bad NC Indexes)

  SELECT OBJECT_NAME(s.[object_id]) AS [Table Name], i.name AS [Index Name], i.index_id, 
  i.is_disabled, i.is_hypothetical, i.has_filter, i.fill_factor,
  s.user_updates AS [Total Writes], s.user_seeks + s.user_scans + s.user_lookups AS [Total Reads],
  s.user_updates - (s.user_seeks + s.user_scans + s.user_lookups) AS [Difference]
  FROM sys.dm_db_index_usage_stats AS s WITH (NOLOCK)
  INNER JOIN sys.indexes AS i WITH (NOLOCK)
  ON s.[object_id] = i.[object_id]
  AND i.index_id = s.index_id
  WHERE OBJECTPROPERTY(s.[object_id],'IsUserTable') = 1
  AND s.database_id = DB_ID()
  AND s.user_updates > (s.user_seeks + s.user_scans + s.user_lookups)
  AND i.index_id > 1 AND i.[type_desc] = N'NONCLUSTERED'
  AND i.is_primary_key = 0 AND i.is_unique_constraint = 0 AND i.is_unique = 0
  ORDER BY [Difference] DESC, [Total Writes] DESC, [Total Reads] ASC OPTION (RECOMPILE);

  ------
  -- Look for indexes with high numbers of writes and zero or very low numbers of reads
  -- Consider your complete workload, and how long your instance has been running
  -- Investigate further before dropping an index!
  -- Breaks down buffers used by current database by object (table, index) in the buffer cache  (Query 4) (Buffer Usage)
  -- Note: This query could take some time on a busy instance
  SELECT OBJECT_NAME(p.[object_id]) AS [Object Name], p.index_id, 
  CAST(COUNT(*)/128.0 AS DECIMAL(10, 2)) AS [Buffer size(MB)],  
  COUNT(*) AS [BufferCount], p.[Rows] AS [Row Count],
  p.data_compression_desc AS [Compression Type]
  FROM sys.allocation_units AS a WITH (NOLOCK)
  INNER JOIN sys.dm_os_buffer_descriptors AS b WITH (NOLOCK)
  ON a.allocation_unit_id = b.allocation_unit_id
  INNER JOIN sys.partitions AS p WITH (NOLOCK)
  ON a.container_id = p.hobt_id
  WHERE b.database_id = CONVERT(int, DB_ID())
  AND p.[object_id] > 100
  AND OBJECT_NAME(p.[object_id]) NOT LIKE N'plan_%'
  AND OBJECT_NAME(p.[object_id]) NOT LIKE N'sys%'
  AND OBJECT_NAME(p.[object_id]) NOT LIKE N'xml_index_nodes%'
  GROUP BY p.[object_id], p.index_id, p.data_compression_desc, p.[Rows]
  ORDER BY [BufferCount] DESC OPTION (RECOMPILE);

  ------
  -- Tells you what tables and indexes are using the most memory in the buffer cache
  -- It can help identify possible candidates for data compression

Using Delayed Durability

The delayed durability feature was added to the product in SQL Server 2014, so it has been available for quite some time. Delayed durable transaction commits are asynchronous and report a transaction commit as successful before the log records for the transaction are actually written to the storage subsystem. Delayed durable transactions don’t actually become durable until the transaction log entries are flushed to disk.

This feature is available in all editions of SQL Server. Despite this, I rarely see it being used when I look at client databases. Delayed durability does open up the possibility of some data loss, up to an entire log buffer in a worst case scenario (as explained by Paul Randal here), so it is definitely not appropriate for an RPO scenario where absolutely no data loss is acceptable.

Delayed durability reduces transaction latency because it doesn’t wait for log IO to finish and return control back to the client, and it also reduces locking and disk contention for concurrent transactions. These two benefits can often have a very positive effect on your query and application performance with the appropriate very write-heavy workload.

Delayed durability will most often help heavy OLTP-type workloads that have very frequent, small write transactions where you are seeing high file-level write latency from sys.dm_io_virtual_file_stats on the transaction log file and/or you are seeing high WRITELOG waits from sys.dm_os_wait_stats.

You can easily force SQL Server 2014 or newer to use delayed durability for all transactions (with no code changes) by running the following command:

ALTER DATABASE AdventureWorks2014 SET DELAYED_DURABILITY = FORCED;

I have had clients who programmatically toggle delayed durability on and off at different times of the day (such as during scheduled ETL or maintenance activity). I have also had clients who use delayed durability at all times since they have an appropriate workload and data loss risk tolerance.

Finally, I have had clients who would never consider using delayed durability, or simply don’t need it with their workload. If you suspect that your workload might benefit from using delayed durability, but you are concerned about the possible data loss, then there are other alternatives that you can consider.

One alternative is the persisted log buffer feature in SQL Server 2016 SP1, where you can create a second, 20MB transaction log file on a direct access mode (DAX) storage volume that is hosted on an NV-DIMM persistent memory device. This extra transaction log file is used to cache the tail of the log, with byte-level access that bypasses the conventional block-level storage stack.

If you think that your workload might benefit from using the persisted log buffer feature, you can experiment with temporarily using delayed durability to see if there is an actual performance benefit with your workload before you spend the money on the NV-DIMM persistent memory that you would need to use the persisted log buffer feature.

Moving tempdb to Intel Optane DC P4800X Storage

I have had great success with several recent clients who moved their tempdb database files from some other type of storage to a logical drive that was backed by a couple of Intel Optane DC P4800X PCIe NVMe storage cards (in a software RAID 1 array).

These storage cards are available in 375GB, 750GB, and 1.5TB capacities (although the 1.5TB capacity is brand new and still hard to find). They have extremely low latency (much lower than any type of NAND flash storage), excellent random I/O performance at low queue depths (much better than NAND flash storage), with consistent read response times under a very heavy write workload.

They also have higher write endurance than “write-intensive” enterprise NAND flash storage, and their performance does not deteriorate as they are close to being full. These characteristics make these cards extremely well-suited for many heavy tempdb workloads, particularly heavy OLTP workloads and situations where you are using RCSI in your user databases (which puts the resulting version store workload on tempdb).

It is also very common to see high file-level write latency on tempdb data files from the sys.dm_io_virtual_file_stats DMV, so moving your tempdb data files to Optane storage is one way to directly address that issue, that might be quicker and easier than conventional workload tuning.

Another possible use for Optane storage cards is as a home for your transaction log file(s). You can also use Optane storage with legacy versions of SQL Server (as long as your OS and hardware support it). It is a possible alternative to using delayed durability (which requires SQL Server 2014) or using the persisted log buffer feature (which requires SQL Server 2016 SP1).

Conclusion

I discussed three techniques for scoring a quick performance win with SQL Server:

Conventional row store index tuning is applicable to all versions of SQL Server, and it is one of the best tools in your arsenal.
Delayed durability is available in SQL Server 2014 and newer, and it can be very beneficial with some workload types (and RPO requirements). Persisted log buffer is available in SQL Server 2016 SP1, and it gives similar benefits as delayed durability, without the danger of data loss.
Moving certain types of database files to Intel Optane storage can help alleviate performance issues with tempdb or with user database transaction log files. You can use Optane storage with legacy versions of SQL Server, and there are no code or configuration changes required.

Three Easy SQL Server Performance Wins

Basic Row Store Index Tuning

Using Delayed Durability

Moving tempdb to Intel Optane DC P4800X Storage

Conclusion

1 thought on “Three Easy SQL Server Performance Wins”