One of the first things I look at when I'm troubleshooting a performance issue is wait statistics via the sys.dm_os_wait_stats DMV. To see what SQL Server is waiting on, I use the query from Glenn Berry’s current set of SQL Server Diagnostic Queries. Depending on the output, I start digging into specific areas within SQL Server.
As an example, if I see high CXPACKET waits I check the number of cores on the server, the number of NUMA nodes, and the values for max degree of parallelism and cost threshold for parallelism. This is background information that I use to understand the configuration. Before I even consider making any changes, I gather more quantitative data, as a system with CXPACKET waits does not necessarily have an incorrect setting for max degree of parallelism.
Similarly, a system that has high waits for I/O-related wait types such as PAGEIOLATCH_XX, WRITELOG, and IO_COMPLETION does not necessarily have an inferior storage subsystem. When I see I/O-related wait types as the top waits, I immediately want to understand more about the underlying storage. Is it direct-attached storage or a SAN? What is the RAID level, how many disks exist in the array, and what are the speed of the disks? I also want to know if other files or databases share the storage. And while it's important to understand the configuration, a logical next step is to look at virtual file stats via the sys.dm_io_virtual_file_stats DMV.
Introduced in SQL Server 2005, this DMV is a replacement for the fn_virtualfilestats function that those of you who ran on SQL Server 2000 and earlier probably know and love. The DMV contains cumulative I/O information for each database file, but the data resets on instance restart, when a database is closed, taken offline, detached and reattached, etc. It's critical to understand that virtual file stats data is not representative of current performance – it is a snapshot that is an aggregation of I/O data since the last clearing by one of the aforementioned events. Even though the data is not point-in-time, it can still be useful. If the highest waits for an instance are I/O-related, but the average wait time is less than 10 ms, storage is probably not an issue – but correlating the output with what you see in sys.dm_io_virtual_stats is still worthwhile to confirm low latencies. Further, even if you see high latencies in sys.dm_io_virtual_stats, you still haven’t proven that storage is a problem.
To look at virtual file stats, I set up two copies of the AdventureWorks2012 database, which you can download from Codeplex. For the first copy, hereafter known as EX_AdventureWorks2012, I ran Jonathan Kehayias' script to expand the Sales.SalesOrderHeader and Sales.SalesOrderDetail tables to 1.2 million and 4.9 million rows, respectively. For the second database, BIG_AdventureWorks2012, I used the script from my previous partitioning post to create a copy of the Sales.SalesOrderHeader table with 123 million rows. Both databases were stored on an external USB drive (Seagate Slim 500GB), with tempdb on my local disk (SSD).
Before testing, I created four custom stored procedures in each database (Create_Custom_SPs.zip), which would serve as my "normal" workload. My testing process was as follows for each database:
- Restart the instance.
- Capture virtual file stats.
- Run the "normal" workload for two minutes (procedures called repeatedly via a PowerShell script).
- Capture virtual file stats.
- Rebuild all indexes for the appropriate SalesOrder table(s).
- Capture virtual file stats.
To capture virtual file stats, I created a table to hold historical information, and then used a variation of Jimmy May's query from his DMV All-Stars script for the snapshot:
USE [msdb]; GO CREATE TABLE [dbo].[SQLskills_FileLatency] ( [RowID] [INT] IDENTITY(1,1) NOT NULL, [CaptureID] [INT] NOT NULL, [CaptureDate] [DATETIME2](7) NULL, [ReadLatency] [BIGINT] NULL, [WriteLatency] [BIGINT] NULL, [Latency] [BIGINT] NULL, [AvgBPerRead] [BIGINT] NULL, [AvgBPerWrite] [BIGINT] NULL, [AvgBPerTransfer] [BIGINT] NULL, [Drive] [NVARCHAR](2) NULL, [DB] [NVARCHAR](128) NULL, [database_id] [SMALLINT] NOT NULL, [file_id] [SMALLINT] NOT NULL, [sample_ms] [INT] NOT NULL, [num_of_reads] [BIGINT] NOT NULL, [num_of_bytes_read] [BIGINT] NOT NULL, [io_stall_read_ms] [BIGINT] NOT NULL, [num_of_writes] [BIGINT] NOT NULL, [num_of_bytes_written] [BIGINT] NOT NULL, [io_stall_write_ms] [BIGINT] NOT NULL, [io_stall] [BIGINT] NOT NULL, [size_on_disk_MB] [NUMERIC](25, 6) NULL, [file_handle] [VARBINARY](8) NOT NULL, [physical_name] [NVARCHAR](260) NOT NULL ) ON [PRIMARY]; GO CREATE CLUSTERED INDEX CI_SQLskills_FileLatency ON [dbo].[SQLskills_FileLatency] ([CaptureDate], [RowID]); CREATE NONCLUSTERED INDEX NCI_SQLskills_FileLatency ON [dbo].[SQLskills_FileLatency] ([CaptureID]); DECLARE @CaptureID INT; SELECT @CaptureID = MAX(CaptureID) FROM [msdb].[dbo].[SQLskills_FileLatency]; PRINT (@CaptureID); IF @CaptureID IS NULL BEGIN SET @CaptureID = 1; END ELSE BEGIN SET @CaptureID = @CaptureID + 1; END INSERT INTO [msdb].[dbo].[SQLskills_FileLatency] ( [CaptureID], [CaptureDate], [ReadLatency], [WriteLatency], [Latency], [AvgBPerRead], [AvgBPerWrite], [AvgBPerTransfer], [Drive], [DB], [database_id], [file_id], [sample_ms], [num_of_reads], [num_of_bytes_read], [io_stall_read_ms], [num_of_writes], [num_of_bytes_written], [io_stall_write_ms], [io_stall], [size_on_disk_MB], [file_handle], [physical_name] ) SELECT --virtual file latency @CaptureID, GETDATE(), CASE WHEN [num_of_reads] = 0 THEN 0 ELSE ([io_stall_read_ms]/[num_of_reads]) END [ReadLatency], CASE WHEN [io_stall_write_ms] = 0 THEN 0 ELSE ([io_stall_write_ms]/[num_of_writes]) END [WriteLatency], CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0) THEN 0 ELSE ([io_stall]/([num_of_reads] + [num_of_writes])) END [Latency], --avg bytes per IOP CASE WHEN [num_of_reads] = 0 THEN 0 ELSE ([num_of_bytes_read]/[num_of_reads]) END [AvgBPerRead], CASE WHEN [io_stall_write_ms] = 0 THEN 0 ELSE ([num_of_bytes_written]/[num_of_writes]) END [AvgBPerWrite], CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0) THEN 0 ELSE (([num_of_bytes_read] + [num_of_bytes_written])/([num_of_reads] + [num_of_writes])) END [AvgBPerTransfer], LEFT([mf].[physical_name],2) [Drive], DB_NAME([vfs].[database_id]) [DB], [vfs].[database_id], [vfs].[file_id], [vfs].[sample_ms], [vfs].[num_of_reads], [vfs].[num_of_bytes_read], [vfs].[io_stall_read_ms], [vfs].[num_of_writes], [vfs].[num_of_bytes_written], [vfs].[io_stall_write_ms], [vfs].[io_stall], [vfs].[size_on_disk_bytes]/1024/1024. [size_on_disk_MB], [vfs].[file_handle], [mf].[physical_name] FROM [sys].[dm_io_virtual_file_stats](NULL,NULL) AS vfs JOIN [sys].[master_files] [mf] ON [vfs].[database_id] = [mf].[database_id] AND [vfs].[file_id] = [mf].[file_id] ORDER BY [Latency] DESC;
I restarted the instance and then immediately captured file stats. When I filtered the output to only view the EX_AdventureWorks2012 and tempdb database files, only tempdb data was captured as no data had been requested from the EX_AdventureWorks2012 database:
I then ran the "normal" workload for two minutes (the number of executions of each stored procedure varied slightly), and after it completed captured file stats again:
We see a latency of 57ms for the EX_AdventureWorks2012 data file. Not ideal, but over time with my normal workload, this would probably even out. There is minimal latency for tempdb, which is expected as the workload I ran doesn’t generate much tempdb activity. Next I rebuilt all indexes for the Sales.SalesOrderHeaderEnlarged and Sales.SalesOrderDetailEnlarged tables:
USE [EX_AdventureWorks2012]; GO ALTER INDEX ALL ON Sales.SalesOrderHeaderEnlarged REBUILD; ALTER INDEX ALL ON Sales.SalesOrderDetailEnlarged REBUILD;
The rebuilds took less than a minute, and notice the spike in read latency for the EX_AdventureWorks2012 data file, and the spikes in write latency for the EX_AdventureWorks2012 data and log files:
According to that snapshot of file stats, latency is horrible; over 600ms for writes! If I saw this value for a production system it would be easy to immediately suspect problems with storage. However, it's also worth noting that AvgBPerWrite increased as well, and bigger block writes take longer to complete. The AvgBPerWrite increase is expected for the index rebuild task.
Understand that as you look at this data, you are not getting a complete picture. A better way to look at latencies using virtual file stats is to take snapshots and then calculate latency for the elapsed time period. For example, the script below uses two snapshots (Current and Previous) and then calculates the number of reads and writes in that time period, the difference in io_stall_read_ms and io_stall_write_ms values, and then divides io_stall_read_ms delta by number of reads, and io_stall_write_ms delta by number of writes. With this method, we calculate the amount of time SQL Server was waiting on I/O for reads or writes, and then divide it by the number of reads or writes to determine latency.
DECLARE @CurrentID INT, @PreviousID INT; SET @CurrentID = 3; SET @PreviousID = @CurrentID - 1; WITH [p] AS ( SELECT [CaptureDate], [database_id], [file_id], [ReadLatency], [WriteLatency], [num_of_reads], [io_stall_read_ms], [num_of_writes], [io_stall_write_ms] FROM [msdb].[dbo].[SQLskills_FileLatency] WHERE [CaptureID] = @PreviousID ) SELECT [c].[CaptureDate] [CurrentCaptureDate], [p].[CaptureDate] [PreviousCaptureDate], DATEDIFF(MINUTE, [p].[CaptureDate], [c].[CaptureDate]) [MinBetweenCaptures], [c].[DB], [c].[physical_name], [c].[ReadLatency] [CurrentReadLatency], [p].[ReadLatency] [PreviousReadLatency], [c].[WriteLatency] [CurrentWriteLatency], [p].[WriteLatency] [PreviousWriteLatency], [c].[io_stall_read_ms]- [p].[io_stall_read_ms] [delta_io_stall_read], [c].[num_of_reads] - [p].[num_of_reads] [delta_num_of_reads], [c].[io_stall_write_ms] - [p].[io_stall_write_ms] [delta_io_stall_write], [c].[num_of_writes] - [p].[num_of_writes] [delta_num_of_writes], CASE WHEN ([c].[num_of_reads] - [p].[num_of_reads]) = 0 THEN NULL ELSE ([c].[io_stall_read_ms] - [p].[io_stall_read_ms])/([c].[num_of_reads] - [p].[num_of_reads]) END [IntervalReadLatency], CASE WHEN ([c].[num_of_writes] - [p].[num_of_writes]) = 0 THEN NULL ELSE ([c].[io_stall_write_ms] - [p].[io_stall_write_ms])/([c].[num_of_writes] - [p].[num_of_writes]) END [IntervalWriteLatency] FROM [msdb].[dbo].[SQLskills_FileLatency] [c] JOIN [p] ON [c].[database_id] = [p].[database_id] AND [c].[file_id] = [p].[file_id] WHERE [c].[CaptureID] = @CurrentID AND [c].[database_id] IN (2, 11);
When we execute this to calculate latency during the index rebuild, we get the following:
Now we can see that the actual latency during that time was high – which we would expect. And if we then went back to our normal workload and ran it for a few hours, the average values calculated from virtual file stats would decrease over time. In fact, if we look at PerfMon data which was captured during the test (and then processed through PAL), we see significant spikes in Avg. Disk sec/Read and Avg. Disk sec/Write which correlates to the time that the index rebuild was running. But at other times, the latency values are well below acceptable values:
You can see the same behavior for the BIG_AdventureWorks 2012 database. Here is the latency information based on the virtual file stats snapshot before the index rebuild and after:
And Performance Monitor data shows the same spikes during the rebuild:
Virtual file stats are a great starting point when you want to understand I/O performance for a SQL Server instance. If you see I/O-related waits when looking at wait statistics, looking at sys.dm_io_virtual_file_stats is a logical next step. However, understand that the data you're viewing is an aggregate since the stats last cleared by one of the associated events (instance restart, offline of database, etc). If you see low latencies, then the I/O subsystem is keeping up with the performance load. However, if you see high latencies, it's not a foregone conclusion that storage is a problem. To really know what's going on you can start to snapshot file stats, as shown here, or you can simply use Performance Monitor to look at latency in real time. It's very easy to create a Data Collector Set in PerfMon that captures the Physical Disk counters Avg. Disk Sec/Read and Avg. Disk Sec/Read for all disks that host database files. Schedule the Data Collector to start and stop on a regular basis, and sample every n seconds (e.g. 15), and once you've captured PerfMon data for an appropriate time, run it through PAL to examine latency over time.
If you do find that I/O latency occurs during your normal workload, and not just during maintenance tasks that drive I/O, you still cannot point to storage as the underlying problem. Storage latency can exist for a variety of reasons, such as:
- SQL Server has to read too much data as a result of inefficient query plans or missing indexes
- Too little memory is allocated to the instance and the same data is read from disk over and over because it cannot stay in memory
- Implicit conversions cause index or table scans
- Queries perform SELECT * when not all columns are required
- Forwarded record problems in heaps cause additional I/O
- Low page densities from index fragmentation, page splits, or incorrect fill factor settings cause additional I/O
Whatever the root cause, what's essential to understand about performance – particularly as it relates to I/O – is that there is rarely one data point that you can use to pinpoint the problem. Finding the true issue takes multiple facts that, when pieced together, help you uncover the problem.
Finally, note that in some cases the storage latency may be completely acceptable. Before you demand faster storage or changes to code, review the workload patterns and Service Level Agreement (SLA) for the database. In the case of a Data Warehouse that services reports to users, the SLA for queries is probably not the same sub-second values you would expect for a high-volume OLTP system. In the DW solution, I/O latencies greater than one second might be perfectly acceptable and expected. Understand the expectations of the business and its users, and then determine what action, if any, to take. And if changes are required, gather the quantitative data you need to support your argument, namely wait statistics, virtual file statistics, and latencies from Performance Monitor.