Jonathan Kehayias

Looking at Database Snapshot Performance

SentryOne eBooks

In these books, you will find useful, hand-picked articles that will help give insight into some of your most vexing performance problems. These articles were written by several of the SQL Server industry’s leading experts, including Paul White, Paul Randal, Jonathan Kehayias, Erin Stellato, Glenn Berry, Aaron Bertrand, and Joe Sack.

Free Download

Featured Author

Paul Randal, CEO of SQLskills, writes about knee-jerk performance tuning, DBCC, and SQL Server internals.

Paul’s Posts

A database snapshot provides a read-only view of a SQL Server database that is transactionally consistent with the state of the source database at the time that the database snapshot was created. A number of reasons exist for using database snapshots, for example reporting against a mirrored database, and DBCC CHECKDB also uses internal database snapshots from SQL Server 2005 onwards.

Database snapshots also provide the ability to roll back all of the changes that have occurred to a database since the database snapshot was created, but with a nasty side effect on the database’s transaction log that Paul blogged about here.

One of the things that isn’t commonly considered or shown around database snapshots is the performance impact that the snapshot has for the database write workload. The SQLCAT team published a whitepaper for SQL Server 2005, Database Snapshot Performance Considerations under I/O-Intensive Workloads, that investigated the performance impacts of database snapshots, and after working with a client recently where database snapshots resulted in performance problems, I wanted to test SQL Server 2012 and determine if there were any changes to the overhead of database snapshots seven years and three SQL Server releases later.

Test Configuration

To perform the testing of the effect of database snapshots on write workload performance, I used our Dell R720 performing a 1,000,000 row insert into a new table in an enlarged version of the AdventureWorks2012 database.  The AdventureWorks2012 database was created with 8 data files spread across two Fusion-io ioDrive Duo 640GB SSDs that were each setup as two individual 320GB disks in Windows, presenting a total of 4 disks.  To simplify explaining the configuration, the storage layout used for these tests is shown in the table below:

Disk Configuration Usage
K 15K RAID 5 – 6 Disk Snapshot
L Fusion-io Card2 – Side B Log File
M Fusion-io Card2 – Side A 4 Data Files
N Fusion-io Card1 – Side A 4 Data Files
Q Fusion-io Card1 – Side B Tempdb
R LSI Nytro BLP4-1600 Snapshot

Table 1 – Server Disk Layout and Usage

The storage for the database snapshot was either a RAID-5 array of six 15k RPM SAS drives connected through iSCSI, or an LSI Nytro BLP4-1600 PCI-E card.

The test workload used the following SELECT INTO statement to generate a 1,000,000 row table which was dropped between each of the tests.

SELECT TOP 1000000 *
INTO tmp_SalesOrderHeader
FROM Sales.SalesOrderHeaderEnlarged;

The tests were timed to measure the duration without a database snapshot, and then the duration with a database snapshot created on each of the storage devices to measure the performance degradation caused by writing page changes to the database snapshot sparse file.  The tests were also run using two database snapshots on the same storage device to ascertain what the overhead of having additional database snapshots might be for the duplicated write operations that potentially have to be performed.

Results

Each test configuration was executed ten times and the average duration, converted from milliseconds to seconds for easier viewing, is shown in Figure 1, for 0, 1, or 2 database snapshots.

Snapshot Duration
Figure 1 – Snapshot Duration

The baseline tests with no database snapshots executed on average in 1.8 seconds, and even when the storage for the database snapshot files was equivalent in performance, the existence of a single database snapshot imposed overhead to the write performance for the database.  The overhead of the second database snapshot is lower than having the first database snapshot in each of the tests, though the 15K RPM disks had a much more difficult time keeping up with the added write workload from the second database snapshot for the database.

The performance on the LSI Nytro card initially surprised me since it was also a PCI-X SSD.  However, after discussing the results with Glenn, he mentioned that the Sandforce controller compression and slower write performance for random, low-compression data from his past tests on the drive. However, it still easily outclassed the spinning media.

Before running the tests I was interested to know what wait types would occur during the tests, so as a part of the test configuration, I cleared sys.dm_os_wait_stats with DBCC SQLPERF and captured the output from the DMV for each test run into a table.  The top waits for the single snapshot configurations were PREEMPTIVE_OS_WRITEFILE and WRITE_COMPLETION as shown in Figure 2, for 1 or 2 database snapshots.

Snapshot Top Waits
Figure 2 – Snapshot Top Waits

One of the interesting items was the addition of FCB_REPLICA_WRITE waits when a second snapshot was created. After reviewing the single database snapshot wait results, and re-running a couple of rounds of tests, this wait never occurs for a single snapshot and only occurs when more than one snapshot exists and is associated with copying the pages into the database snapshot files. The wait times for the PREEMPTIVE_OS_WRITEFILE waits trend closely with the increases in execution duration for each of the configurations.

With these results in mind, when reviewing a system using the Waits and Queues methodology, seeing this wait type with higher values might be worth investigating whether or not database snapshots exist for any of the databases on the server.

Conclusion

When using database snapshots, even in SQL Server 2012, there is an overhead associated with the additional writes required for copying data pages to the sparse files for the snapshots. If using database snapshots is a part of your general configuration, I would really be careful about planning the I/O subsystem to meet the workload requirements for concurrent I/O activity to the database snapshot sparse files.

From the results of these tests I’d even consider placing database snapshots on SSDs ahead of tempdb for the write performance, and also for lower performance impact from the snapshot maintenance.

As always, your mileage may vary, and you will certainly want to test the performance of any configuration before placing it into production usage.