In my last post, I started to outline the process I go through when tuning queries – specifically when I discover that I need to add a new index, or modify an existing one. Up to this point we’ve identified the problematic query, the index I need, what indexes currently exist on the table, and whether or not those indexes are being used. Once we have that data, we can move on to the next steps in the process.
Step 5: What Uses an Index
In addition to seeing how often an index is used (or not), it’s beneficial to know what queries use an index, particularly if I’m looking to merge it with another index. Fortunately, Jonathan Kehayias has already written a query to help identify which plans use a specific index. His version works against the plan cache – the challenge there is that the cache is transient, so you may not capture every query that uses a particular index. Query Store can help with that – I’ve modified his query to get the same information from the plans in Query Store:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

DECLARE @IndexName AS NVARCHAR(128) = N'[IX_Sales_OrderLines_AllocatedStockItems]',
        @lb AS NCHAR(1) = N'[',
        @rb AS NCHAR(1) = N']';

-- Make sure the name passed is appropriately quoted
IF (LEFT(@IndexName, 1) <> @lb AND RIGHT(@IndexName, 1) <> @rb)
    SET @IndexName = QUOTENAME(@IndexName);

-- Handle the case where the left or right was quoted manually but not the opposite side
IF LEFT(@IndexName, 1) <> @lb
    SET @IndexName = @lb + @IndexName;
IF RIGHT(@IndexName, 1) <> @rb
    SET @IndexName = @IndexName + @rb;

;WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT
    stmt.value('(@StatementText)', 'varchar(max)') AS SQL_Text,
    obj.value('(@Database)', 'varchar(128)') AS DatabaseName,
    obj.value('(@Schema)', 'varchar(128)') AS SchemaName,
    obj.value('(@Table)', 'varchar(128)') AS TableName,
    obj.value('(@Index)', 'varchar(128)') AS IndexName,
    obj.value('(@IndexKind)', 'varchar(128)') AS IndexKind,
    query_plan
FROM (
    SELECT query_plan
    FROM (
        SELECT TRY_CONVERT(XML, [qsp].[query_plan]) AS [query_plan]
        FROM sys.query_store_plan [qsp]
    ) tp
) AS tab (query_plan)
CROSS APPLY query_plan.nodes('/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS batch(stmt)
CROSS APPLY stmt.nodes('.//IndexScan/Object[@Index=sql:variable("@IndexName")]') AS idx(obj)
OPTION (MAXDOP 1, RECOMPILE);
It’s worth noting that this is another point where I can find myself very deep in a rabbit hole, depending on the number of indexes I’m reviewing and the number of queries that use them. If possible, I’ll also consider execution counts (from Query Store or the plan cache) to not just understand what query uses an index, but how frequently that query executes. This is where index tuning becomes an art. I can collect a ridiculous amount of data… but I don’t have infinite time for analysis, so I have to make a judgement call as to how many queries I’m going to review.
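Once the query above has identified a query of interest, execution counts and average durations can be pulled from Query Store to help prioritize which queries are worth reviewing. A minimal sketch, using the standard Query Store catalog views; the `query_id` filter value is a placeholder for an id returned by the search above:

```sql
-- Sketch: execution counts and average duration for one query from Query Store.
-- The query_id value (42) is hypothetical - substitute an id found earlier.
SELECT
    [qsqt].[query_sql_text],
    [qsq].[query_id],
    SUM([qsrs].[count_executions]) AS [total_executions],
    AVG([qsrs].[avg_duration]) AS [avg_duration_us]
FROM sys.query_store_query [qsq]
JOIN sys.query_store_query_text [qsqt]
    ON [qsq].[query_text_id] = [qsqt].[query_text_id]
JOIN sys.query_store_plan [qsp]
    ON [qsq].[query_id] = [qsp].[query_id]
JOIN sys.query_store_runtime_stats [qsrs]
    ON [qsp].[plan_id] = [qsrs].[plan_id]
WHERE [qsq].[query_id] = 42
GROUP BY [qsqt].[query_sql_text], [qsq].[query_id];
```

Sorting a variation of this (without the `query_id` filter) by `total_executions` descending is one quick way to decide where to spend limited analysis time.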
Step 6: Testing
At its simplest, testing an index means taking the problematic query and capturing the plan and performance data (duration, IO, CPU, etc.), and then creating the index, re-running the query and capturing the same information. If performance improves, you’re good to go!
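For a single query, one simple way to capture that baseline is `SET STATISTICS IO, TIME`. A sketch below uses `Sales.OrderLines` from the WideWorldImporters sample database (matching the index named earlier); the predicate is illustrative:

```sql
-- Capture IO and CPU/duration for the baseline run of the problematic query.
-- Sales.OrderLines is from WideWorldImporters; the WHERE clause is a
-- hypothetical example predicate.
SET STATISTICS IO, TIME ON;

SELECT [OrderLineID], [StockItemID], [PickedQuantity]
FROM [Sales].[OrderLines]
WHERE [PickingCompletedWhen] IS NULL;

SET STATISTICS IO, TIME OFF;
```

Run the query once before and once after creating the index, with the actual execution plan enabled, and compare logical reads, CPU time, and elapsed time.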
It’s rarely that simple.
To begin with, I often have at least two variations of an index that I want to test, sometimes more. I start with my baseline, then I create all the index variations, clear the plan cache, and see what SQL Server chooses. Then I roll through and force each index with a hint, capturing the plan and performance metrics for each execution. Note: this assumes I have enough disk space for all the indexes… if not, I create them one at a time and test. Finally, I compare the numbers. If I am just adding a new index, I am almost done. But if I’m modifying an index or merging a couple together, it can get complicated.
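Forcing a specific index is done with a table hint, so every candidate gets exercised regardless of which one the optimizer would pick on its own. A sketch, again assuming the WideWorldImporters index and an illustrative predicate:

```sql
-- Force one index variation with a table hint; repeat per candidate index,
-- capturing the plan and metrics each time. The predicate is illustrative.
SELECT [OrderLineID], [StockItemID], [PickedQuantity]
FROM [Sales].[OrderLines]
    WITH (INDEX ([IX_Sales_OrderLines_AllocatedStockItems]))
WHERE [PickingCompletedWhen] IS NULL;
```

Note that an index hint is a testing tool here, not something to leave in production code: it overrides the optimizer and will fail outright if the hinted index is later dropped.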
In an ideal world, if I modify an existing index, I find the most frequent/important queries that use the current index and get their plans and performance metrics (this is easy with Query Store). Then I change the index, run all those queries again, and see if I get significant changes in plan shape and/or performance.
If I merge two indexes, I do the same thing, but with all queries that use either index, and then re-test with the merged index.
If I’m adding/changing/merging multiple indexes for a table, then I need to get all relevant queries and their plans and metrics, change the indexes, then get all the information again and compare. This can be extremely time-consuming, depending on how many different queries there are. This is where it’s an art form and you have to determine how many queries you really need to test. It’s a function of frequency of execution, query importance/relevance, and the time I have available/allocated.
Finally, if I add an index to a table, and I don’t remove any existing ones, then I have added overhead for INSERTs, DELETEs, and potentially UPDATEs. Performance testing this change is possible, but you need a test environment and the ability to run a load test and capture pre- and post-change metrics related to duration, IO, and CPU.
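Even without a full load test, the read-versus-write profile of each index gives a rough sense of that overhead. A sketch using `sys.dm_db_index_usage_stats` (keeping in mind, from the earlier posts, that these counters reset on instance restart):

```sql
-- Sketch: compare read activity vs. write (maintenance) activity per index
-- on one table, to gauge which indexes cost more than they return.
SELECT
    OBJECT_NAME([ius].[object_id]) AS [table_name],
    [i].[name] AS [index_name],
    [ius].[user_seeks] + [ius].[user_scans] + [ius].[user_lookups] AS [reads],
    [ius].[user_updates] AS [writes]
FROM sys.dm_db_index_usage_stats [ius]
JOIN sys.indexes [i]
    ON [ius].[object_id] = [i].[object_id]
    AND [ius].[index_id] = [i].[index_id]
WHERE [ius].[database_id] = DB_ID()
    AND [ius].[object_id] = OBJECT_ID(N'Sales.OrderLines')
ORDER BY [writes] DESC;
```

An index with a high `writes` count and near-zero `reads` is a strong candidate for consolidation or removal; it is paying maintenance cost on every modification without helping any queries.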
It’s a lot, friends, which is why it’s ironic that I initially thought about stating that index tuning was easy. It may not always be simple, but it is possible. It’s a matter of diligence and keeping track of it all.
Step 7: Implementation
After I’ve vetted the new index(es) as much as possible, we are ready for production. I admit that I view index changes as low risk, particularly new ones. If a new index is a problem, you can drop it immediately and revert to the original state. With a modify/merge/drop scenario, you want everything scripted, so you can drop and recreate indexes as needed to return to the original state. I always recommend initially disabling indexes instead of dropping them, as then you don’t have to worry about the definition – if you need to add the index back you simply rebuild it.
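The disable/rebuild pattern looks like this; the index and table names are the WideWorldImporters examples from earlier:

```sql
-- Disabling keeps the index definition in metadata (while removing its data),
-- so rolling back does not require the original CREATE INDEX script.
ALTER INDEX [IX_Sales_OrderLines_AllocatedStockItems]
    ON [Sales].[OrderLines] DISABLE;

-- If the change needs to be rolled back, a rebuild re-enables the index:
ALTER INDEX [IX_Sales_OrderLines_AllocatedStockItems]
    ON [Sales].[OrderLines] REBUILD;
```

One caveat: disabling a clustered index (or an index enforcing a primary key or unique constraint) blocks access to the table or disables the constraint, so this pattern is really for nonclustered, non-constraint indexes.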
Your method for adding and/or consolidating indexes may be different! Just like query tuning, there is no perfect process. For anyone new to index tuning, this hopefully provides a starter of items to review and important considerations. It’s impossible to add indexes without adding some amount of overhead – and again this is where the art comes in: you have to determine if the benefit of the index outweighs its cost for modifications.
Index tuning is a perpetual, iterative process – I don’t think you’re ever done, because code changes, new tables or functionality are added, and data in the tables changes. Kimberly has two posts (https://www.sqlskills.com/blogs/kimberly/spring-cleaning-your-indexes-part-i/ and https://www.sqlskills.com/blogs/kimberly/spring-cleaning-your-indexes-part-ii/) that talk about cleaning up your indexes — now is as good a time as any to get started! And finally, whenever someone asks, “how many indexes should there be for a table?” I answer with something like, “the fewest number you need to satisfy as many queries as possible.” There is no magic number — I have seen tables with zero indexes, and I have seen tables with over 100 (I am sure some of you have seen higher counts). Neither zero nor 100 is good, but the “right” number is one you have to figure out using the data available and your experience.