Delayed Durability in SQL Server 2014

Delayed Durability is a late-breaking but interesting feature in SQL Server 2014; the high-level elevator pitch of the feature is, quite simply:

"Trade durability for performance."

Some background first. By default, SQL Server uses a write-ahead log (WAL), which means that changes are written to the log before they are allowed to be committed. In systems where transaction log writes become the bottleneck, and where there is a moderate tolerance for data loss, you now have the option to temporarily suspend the requirement to wait for the log flush and acknowledgement. This happens to quite literally take the D out of ACID, at least for a small portion of data (more on this later).

You kind of already make this sacrifice now. In full recovery mode, there is always some risk of data loss, it's just measured in terms of time rather than size. For example, if you back up the transaction log every five minutes, you could lose up to just under 5 minutes of data if something catastrophic happened. I'm not talking simple failover here, but let's say the server literally catches on fire or someone trips over the power cord – the database may very well be unrecoverable and you may have to go back to the point in time of the last log backup. And that's assuming you are even testing your backups by restoring them somewhere – in the event of a critical failure you may not have the recovery point you think you have. We tend not to think about this scenario, of course, because we never expect bad things™ to happen.

How it works

Delayed durability enables write transactions to continue running as if the log had been flushed to disk; in reality, the writes to disk have been grouped and deferred, to be handled in the background. The transaction is optimistic; it assumes that the log flush will happen. The system uses a 60KB chunk of log buffer, and attempts to flush the log to disk when this 60KB block is full (at the latest – it can and often will happen before that). You can set this option at the database level, at the individual transaction level, or – in the case of natively compiled procedures in In-Memory OLTP – at the procedure level. The database setting wins in the case of a conflict; for example, if the database is set to disabled, trying to commit a transaction using the delayed option will simply be ignored, with no error message. Also, some transactions are always fully durable, regardless of database settings or commit settings; for example, system transactions, cross-database transactions, and operations involving FileTable, Change Tracking, Change Data Capture, and Replication.

At the database level, you can use:

ALTER DATABASE dbname SET DELAYED_DURABILITY = DISABLED | ALLOWED | FORCED;

If you set it to ALLOWED, this means that any individual transaction can use Delayed Durability; FORCED means that all transactions that can use Delayed Durability will (the exceptions above are still relevant in this case). You will likely want to use ALLOWED rather than FORCED – but the latter can be useful in the case of an existing application where you want to use this option throughout and also minimize the amount of code that has to be touched. An important thing to note about ALLOWED is that fully durable transactions may have to wait longer, as they will force the flush of any delayed durable transactions first.

At the transaction level, you can say:

COMMIT TRANSACTION WITH (DELAYED_DURABILITY = ON);

And in an In-Memory OLTP natively-compiled procedure, you can add the following option to the BEGIN ATOMIC block:

BEGIN ATOMIC WITH (DELAYED_DURABILITY = ON, ...)

A common question is around what happens with locking and isolation semantics. Nothing changes, really. Locking and blocking still happen, and transactions are committed in the same way and with the same rules. The only difference is that, by allowing the commit to occur without waiting for the log to flush to disk, any related locks are released that much sooner.

When You Should Use It

In addition to the benefit you get from allowing the transactions to proceed without waiting for the log write to happen, you also get fewer log writes of larger sizes. This can work out very well if your system has a high proportion of transactions that are actually smaller than 60KB, and particularly when the log disk is slow (though I found similar benefits on SSD and traditional HDD). It doesn't work out so well if your transactions are, for the most part, larger than 60KB, if they are typically long-running, or if you have high throughput and high concurrency. What can happen here is that you can fill the entire log buffer before the flush finishes, which just means transferring your waits to a different resource and, ultimately, not improving the perceived performance by the users of the application.

In other words, if your transaction log is not currently a bottleneck, don't turn this feature on. How can you tell if your transaction log is currently a bottleneck? The first indicator would be high WRITELOG waits, particularly when coupled with PAGEIOLATCH_**. Paul Randal (@PaulRandal) has a great four-part series on identifying transaction log problems, as well as configuring for optimal performance:

Also see this blog post from Kimberly Tripp (@KimberlyLTripp), 8 Steps to Better Transaction Log Throughput, and the SQL CAT team's blog post, Diagnosing Transaction Log Performance Issues and Limits of the Log Manager.

This investigation may lead you to the conclusion that Delayed Durability is worth looking into; it may not. Testing your workload will be the most reliable way to know for sure. Like many other additions in recent versions of SQL Server (*cough* Hekaton), this feature is NOT designed to improve every single workload – and as noted above, it can actually make some workloads worse. See this blog post by Simon Harvey for some other questions you should ask yourself about your workload to determine if it is feasible to sacrifice some durability to achieve better performance.

Potential for data loss

I'm going to mention this several times, and add emphasis every time I do: You need to be tolerant to data loss. Under a well-performing disk, the maximum you should expect to lose in a catastrophe – or even a planned and graceful shutdown – is up to one full block (60KB). However, in the case where your I/O subsystem can't keep up, it is possible that you could lose as much as the entire log buffer (~7MB).

To clarify, from the documentation (emphasis mine):

For delayed durability, there is no difference between an unexpected shutdown and an expected shutdown/restart of SQL Server. Like catastrophic events, you should plan for data loss. In a planned shutdown/restart some transactions that have not been written to disk may first be saved to disk, but you should not plan on it. Plan as though a shutdown/restart, whether planned or unplanned, loses the data the same as a catastrophic event.

So it is very important that you weigh your data loss risk with your need to alleviate transaction log performance issues. If you run a bank or anything dealing with money, it may be much safer and more appropriate for you to move your log to faster disk than to roll the dice using this feature. If you are trying to improve the response time in your Web Gamerz Chat Room application, maybe the risk is less severe.

You can control this behavior to some degree in order to minimize your risk of data loss. You can force all delayed durable transactions to be flushed to disk in one of two ways:

Commit any fully durable transaction.
Call sys.sp_flush_log manually.

This allows you to revert to controlling data loss in terms of time, rather than size; you could schedule the flush every 5 seconds, for example. But you will want to find your sweet spot here; flushing too often can offset the Delayed Durability benefit in the first place. In any case, you will still need to be tolerant to data loss, even if it is only <n> seconds' worth.

You would think that CHECKPOINT might help here, but this operation actually does not technically guarantee the log will be flushed to disk.

Interaction with HA/DR

You might wonder how Delayed Durablity functions with HA/DR features such as log shipping, replication, and Availability Groups. With most of these it works unchanged. Log shipping and replication will replay the log records that have been hardened, so the same potential for data loss exists there. With AGs in asynchronous mode, we're not waiting for the secondary acknowledge anyway, so it will behave the same as today. With synchronous, however, we can't commit on the primary until the transaction is committed and hardened to the remote log. Even in that scenario we may have some benefit locally by not having to wait for the local log to write, we still have to wait for the remote activity. So in that scenario there is less benefit, and potentially none; except perhaps in the rare scenario where the primary's log disk is really slow and the secondary's log disk is really fast. I suspect the same conditions hold true for sync/async mirroring, but you won't get any official commitment from me about how a shiny new feature works with a deprecated one. :-)

Performance Observations

This wouldn't be much of a post here if I didn't show some actual performance observations. I set up 8 databases to test the effects of two different workload patterns with the following attributes:

Recovery model: simple vs. full
Log location: SSD vs. HDD
Durability: delayed vs. fully durable

I am really, really, really ~~lazy~~efficient about this kind of thing. Since I want to avoid repeating the same operations within each database, I created the following table temporarily in model:

USE model;
GO

CREATE TABLE dbo.TheTable
(
  TheID INT IDENTITY(1,1) PRIMARY KEY,
  TheDate DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
  RowGuid UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID()
);

Then I built a set of dynamic SQL command to builds these 8 databases, rather than create the databases individually and then muck with the settings:

-- C and D are SSD, G is HDD

DECLARE @sql NVARCHAR(MAX) = N'';

;WITH l AS (SELECT l FROM (VALUES('D'),('G')) AS l(l)),
r AS (SELECT r FROM (VALUES('FULL'),('SIMPLE')) AS r(r)),
d AS (SELECT d FROM (VALUES('FORCED'),('DISABLED')) AS d(d)),
x AS (SELECT l.l, r.r, d.d, n = CONVERT(CHAR(1),ROW_NUMBER() OVER 
  (ORDER BY d.d DESC, l.l)) FROM l CROSS JOIN r CROSS JOIN d)
SELECT @sql += N'
CREATE DATABASE dd' + n + ' ON '
+ '(name = ''dd' + n + '_data'','
+ ' filename = ''C:\SQLData\dd' + n + '.mdf'', size = 1024MB)
LOG ON (name = ''dd' + n + '_log'','
+ ' filename = ''' + l + ':\SQLLog\dd' + n + '.ldf'', size = 1024MB);
  ALTER DATABASE dd' + n + ' SET RECOVERY ' + r  + ';
  ALTER DATABASE dd' + n + ' SET DELAYED_DURABILITY = ' + d + ';'
FROM x ORDER BY d, l;

PRINT @sql;
-- EXEC sp_executesql @sql;

Feel free to run this code yourself (with the EXEC still commented out) to see that this would create 4 databases with Delayed Durability OFF (two in FULL recovery, two in SIMPLE, one of each with log on slow disk, and one of each with log on SSD). Repeat that pattern for 4 databases with Delayed Durability FORCED – I did this to simplify the code in the test, rather than to reflect what I would do in real life (where I would likely want to treat some transactions as critical, and some as, well, less than critical).

For sanity checking, I ran the following query to ensure that the databases had the right matrix of attributes:

SELECT d.name, d.recovery_model_desc, d.delayed_durability_desc, 
  log_disk = CASE WHEN mf.physical_name LIKE N'D%' THEN 'SSD' else 'HDD' END
FROM sys.databases AS d
INNER JOIN sys.master_files AS mf
ON d.database_id = mf.database_id
WHERE d.name LIKE N'dd[1-8]'
AND mf.[type] = 1; -- log

Results:

name	recovery_model	delayed_durability	log_disk
dd1	FULL	FORCED	SSD
dd2	SIMPLE	FORCED	SSD
dd3	FULL	FORCED	HDD
dd4	SIMPLE	FORCED	HDD
dd5	FULL	DISABLED	SSD
dd6	SIMPLE	DISABLED	SSD
dd7	FULL	DISABLED	HDD
dd8	SIMPLE	DISABLED	HDD

Relevant configuration of the 8 test databases

I also ran the test cleanly multiple times to ensure that a 1 GB data file and 1 GB log file would be sufficient to run the entire set of workloads without introducing any autogrowth events into the equation. As a best practice, I routinely go out of my way to ensure customers' systems have enough allocated space (and proper alerts built in) such that no growth event ever occurs at an unexpected time. In the real world I know this doesn't always happen, but it is ideal.

I set up the system to be monitored with SQL Sentry – this would allow me to easily show most of the performance metrics I wanted to highlight. But I also created a temporary table to store batch metrics including duration and very specific output from sys.dm_io_virtual_file_stats:

SELECT test = 1, cycle = 1, start_time = GETDATE(), * 
INTO #Metrics 
FROM sys.dm_io_virtual_file_stats(DB_ID('dd1'), 2) WHERE 1 = 0;

This would allow me to record the start and finish time of each individual batch, and measure deltas in the DMV between start time and end time (only reliable in this case because I know I'm the only user on the system).

Lots of small transactions

The first test I wanted to perform was a lot of small transactions. For each database, I wanted to end up with 500,000 separate batches of a single insert each:

INSERT #Metrics SELECT 1, 1, GETDATE(), * 
  FROM sys.dm_io_virtual_file_stats(DB_ID('dd1'), 2);
GO
INSERT dbo.TheTable DEFAULT VALUES;
GO 500000
INSERT #Metrics SELECT 1, 2, GETDATE(), * 
  FROM sys.dm_io_virtual_file_stats(DB_ID('dd1'), 2);

Remember, I try to be ~~lazy~~efficient about this kind of thing. So to generate the code for all 8 databases, I ran this:

;WITH x AS 
(
  SELECT TOP (8) number FROM master..spt_values 
  WHERE type = N'P' ORDER BY number
)
SELECT CONVERT(NVARCHAR(MAX), N'') + N'
INSERT #Metrics SELECT 1, 1, GETDATE(), * 
  FROM sys.dm_io_virtual_file_stats(DB_ID(''dd' + RTRIM(number+1) + '''), 2);
GO
INSERT dbo.TheTable DEFAULT VALUES;
GO 500000
INSERT #Metrics SELECT 1, 2, GETDATE(), * 
  FROM sys.dm_io_virtual_file_stats(DB_ID(''dd' + RTRIM(number+1) + '''), 2);'
FROM x;

I ran this test and then looked at the #Metrics table with the following query:

SELECT 
  [database] = db_name(m1.database_id),
  num_writes = m2.num_of_writes - m1.num_of_writes, 
  write_bytes = m2.num_of_bytes_written - m1.num_of_bytes_written,
  bytes_per_write = (m2.num_of_bytes_written - m1.num_of_bytes_written)*1.0
    /(m2.num_of_writes - m1.num_of_writes),
  io_stall_ms = m2.io_stall_write_ms - m1.io_stall_write_ms,
  m1.start_time,
  end_time = m2.start_time,
  duration = DATEDIFF(SECOND, m1.start_time, m2.start_time)
FROM #Metrics AS m1
INNER JOIN #Metrics AS m2
ON m1.database_id = m2.database_id
WHERE m1.cycle = 1 AND m2.cycle = 2
AND m1.test = 1 AND m2.test = 1;

This yielded the following results (and I confirmed through multiple tests that the results were consistent):

database	writes	bytes	bytes/write	io_stall_ms	start_time	end_time	duration (seconds)
dd1	8,068	261,894,656	32,460.91	6,232	2014-04-26 17:20:00	2014-04-26 17:21:08	68
dd2	8,072	261,682,688	32,418.56	2,740	2014-04-26 17:21:08	2014-04-26 17:22:16	68
dd3	8,246	262,254,592	31,803.85	3,996	2014-04-26 17:22:16	2014-04-26 17:23:24	68
dd4	8,055	261,688,320	32,487.68	4,231	2014-04-26 17:23:24	2014-04-26 17:24:32	68
dd5	500,012	526,448,640	1,052.87	35,593	2014-04-26 17:24:32	2014-04-26 17:26:32	120
dd6	500,014	525,870,080	1,051.71	35,435	2014-04-26 17:26:32	2014-04-26 17:28:31	119
dd7	500,015	526,120,448	1,052.20	50,857	2014-04-26 17:28:31	2014-04-26 17:30:45	134
dd8	500,017	525,886,976	1,051.73	49,680	2014-04-26 17:30:45	2014-04-26 17:32:58	133

Small transactions: Duration and results from sys.dm_io_virtual_file_stats

Definitely some interesting observations here:

Number of individual write operations was very small for the Delayed Durability databases (~60X for traditional).
Total number of bytes written was cut in half using Delayed Durability (I presume because all of the writes in the traditional case contained a lot of wasted space).
The number of bytes per write was a lot higher for Delayed Durability. This was not overly surprising, since the whole purpose of the feature is to bundle writes together in larger batches.
The total duration of I/O stalls was volatile, but roughly an order of magnitude lower for Delayed Durability. The stalls under fully durable transactions were much more sensitive to the type of disk.
If anything hasn't convinced you so far, the duration column is very telling. Fully durable batches that take two minutes or more are cut almost in half.

The start/end time columns allowed me to focus on the Performance Advisor dashboard for the precise period where these transactions were happening, where we can draw a lot of additional visual indicators:

SQL Sentry dashboard – click to enlarge

Further observations here:

On several graphs, you can clearly see exactly when the non-Delayed Durability portion of the batch took over (~5:24:32 PM).
There is no observable impact to CPU or memory when using Delayed Durability.
You can see a tremendous impact to batches/transactions per second in the first graph under SQL Server Activity.
SQL Server waits go through the roof when the fully durable transactions started. These were comprised almost exclusively of WRITELOG waits, with a small number of PAGEIOLOATCH_EX and PAGEIOLATCH_UP waits for good measure.
The total number of log flushes throughout the Delayed Durability operations was quite small (low 100s/sec), while this jumped to over 4,000/sec for the traditional behavior (and slightly lower for the HDD duration of the test).

Fewer, larger transactions

For the next test, I wanted to see what would happen if we performed fewer operations, but made sure each statement affected a larger amount of data. I wanted this batch to run against each database:

CREATE TABLE dbo.Rnd
(
  batch TINYINT,
  TheID INT
);

INSERT dbo.Rnd SELECT TOP (1000) 1, TheID FROM dbo.TheTable ORDER BY NEWID();
INSERT dbo.Rnd SELECT TOP (10)   2, TheID FROM dbo.TheTable ORDER BY NEWID();
INSERT dbo.Rnd SELECT TOP (300)  3, TheID FROM dbo.TheTable ORDER BY NEWID();
GO

INSERT #Metrics SELECT 1, GETDATE(), * 
  FROM sys.dm_io_virtual_file_stats(DB_ID('dd1'), 2);
GO
UPDATE t SET TheDate = DATEADD(MINUTE, 1, TheDate)
  FROM dbo.TheTable AS t
  INNER JOIN dbo.Rnd AS r
  ON t.TheID = r.TheID
  WHERE r.batch = 1;
GO 10000
UPDATE t SET RowGuid = NEWID()
  FROM dbo.TheTable AS t
  INNER JOIN dbo.Rnd AS r
  ON t.TheID = r.TheID
  WHERE r.batch = 2;
GO 10000
DELETE dbo.TheTable WHERE TheID IN (SELECT TheID   FROM dbo.Rnd WHERE batch = 3);
DELETE dbo.TheTable WHERE TheID IN (SELECT TheID+1 FROM dbo.Rnd WHERE batch = 3);
DELETE dbo.TheTable WHERE TheID IN (SELECT TheID-1 FROM dbo.Rnd WHERE batch = 3);
GO
INSERT #Metrics SELECT 2, GETDATE(), * 
  FROM sys.dm_io_virtual_file_stats(DB_ID('dd1'), 2);

So again I used the lazy method to produce 8 copies of this script, one per database:

;WITH x AS (SELECT TOP (8) number FROM master..spt_values WHERE type = N'P' ORDER BY number)
SELECT N'
USE dd' + RTRIM(Number+1) + ';
GO

CREATE TABLE dbo.Rnd
(
  batch TINYINT,
  TheID INT
);

INSERT dbo.Rnd SELECT TOP (1000) 1, TheID FROM dbo.TheTable ORDER BY NEWID();
INSERT dbo.Rnd SELECT TOP (10)   2, TheID FROM dbo.TheTable ORDER BY NEWID();
INSERT dbo.Rnd SELECT TOP (300)  3, TheID FROM dbo.TheTable ORDER BY NEWID();
GO

INSERT #Metrics SELECT 2, 1, GETDATE(), * 
  FROM sys.dm_io_virtual_file_stats(DB_ID(''dd' + RTRIM(number+1) + ''', 2);
GO
UPDATE t SET TheDate = DATEADD(MINUTE, 1, TheDate)
  FROM dbo.TheTable AS t
  INNER JOIN dbo.rnd AS r
  ON t.TheID = r.TheID
  WHERE r.cycle = 1;
GO 10000
UPDATE t SET RowGuid = NEWID()
  FROM dbo.TheTable AS t
  INNER JOIN dbo.rnd AS r
  ON t.TheID = r.TheID
  WHERE r.cycle = 2;
GO 10000
DELETE dbo.TheTable WHERE TheID IN (SELECT TheID   FROM dbo.rnd WHERE cycle = 3);
DELETE dbo.TheTable WHERE TheID IN (SELECT TheID+1 FROM dbo.rnd WHERE cycle = 3);
DELETE dbo.TheTable WHERE TheID IN (SELECT TheID-1 FROM dbo.rnd WHERE cycle = 3);
GO
INSERT #Metrics SELECT 2, 2, GETDATE(), * 
  FROM sys.dm_io_virtual_file_stats(DB_ID(''dd' + RTRIM(number+1) + '''), 2);'
FROM x;

I ran this batch, then changed the query against #Metrics above to look at the second test instead of the first. The results:

database	writes	bytes	bytes/write	io_stall_ms	start_time	end_time	duration (seconds)
dd1	20,970	1,271,911,936	60,653.88	12,577	2014-04-26 17:41:21	2014-04-26 17:43:46	145
dd2	20,997	1,272,145,408	60,587.00	14,698	2014-04-26 17:43:46	2014-04-26 17:46:11	145
dd3	20,973	1,272,982,016	60,696.22	12,085	2014-04-26 17:46:11	2014-04-26 17:48:33	142
dd4	20,958	1,272,064,512	60,695.89	11,795	2014-04-26 17:48:33	2014-04-26 17:50:56	143
dd5	30,138	1,282,231,808	42,545.35	7,402	2014-04-26 17:50:56	2014-04-26 17:53:23	147
dd6	30,138	1,282,260,992	42,546.31	7,806	2014-04-26 17:53:23	2014-04-26 17:55:53	150
dd7	30,129	1,281,575,424	42,536.27	9,888	2014-04-26 17:55:53	2014-04-26 17:58:25	152
dd8	30,130	1,281,449,472	42,530.68	11,452	2014-04-26 17:58:25	2014-04-26 18:00:55	150

Larger transactions: Duration and results from sys.dm_io_virtual_file_stats

This time, the impact of Delayed Durability is much less noticeable. We see a slightly smaller number of write operations, at a slightly larger number of bytes per write, with the total bytes written almost identical. In this case we actually see the I/O stalls are higher for Delayed Durability, and this likely accounts for the fact that the durations were almost identical as well.

From the Performance Advisor dashboard, we some similarities with the previous test, and some stark differences as well:

SQL Sentry dashboard – click to enlarge

One of the big differences to point out here is that the delta in wait stats is not quite as pronounced as with the previous test – there is still a much higher frequency of WRITELOG waits for the fully durable batches, but nowhere near the levels seen with the smaller transactions. Another thing you can spot immediately is that the previously observed impact on batches and transactions per second is no longer present. And finally, while there are more log flushes with fully durable transactions than when delayed, this disparity is far less pronounced than with the smaller transactions.

Conclusion

It should be clear that there are certain workload types that may benefit greatly from Delayed Durability – provided, of course, that you have a tolerance for data loss. This feature is not restricted to In-Memory OLTP, is available on all editions of SQL Server 2014, and can be implemented with little to no code changes. It can certainly be a powerful technique if your workload can support it. But again, you will need to test your workload to be sure that it will benefit from this feature, and also strongly consider whether this increases your exposure to the risk of data loss.

As an aside, this may seem to the SQL Server crowd like a fresh new idea, but in truth Oracle introduced this as "Asynchronous Commit" in 2006 (see COMMIT WRITE ... NOWAIT as documented here and blogged about in 2007). And the idea itself has been around for nearly 3 decades; see Hal Berenson's brief chronicle of its history.

Next Time

One idea that I have batted around is to try to improve the performance of tempdb by forcing Delayed Durability there. One special property of tempdb that makes it such a tempting candidate is that it is transient by nature – anything in tempdb is designed, explicitly, to be tossable in the wake of a wide variety of system events. I am saying this now without having any idea if there is a workload shape where this will work out well; but I do plan to try it out, and if I find anything interesting, you can be sure I will post about it here.

19 thoughts on “Delayed Durability in SQL Server 2014”

Mike says:

June 30, 2014 at 4:27 PM

Really nice write up Aaron. The tempdb angle has really piqued my curiosity.
manu says:

July 15, 2014 at 1:55 AM

Very nicely written. Thanks.
LondonDBA says:

July 18, 2014 at 8:04 AM

Thanks for the great article, Aaron. Under the section 'Potential for data loss' it mentions that you could lose up to 7MB, the size of the entire log buffers. I understand that each log buffer is 60KB so that works out as about 120 individual log buffers making up the entire set of log buffers (the object that's referred to the log cache I guess). Is there any Microsoft article (MSDN or CSS) that mentions that the total number of log buffers totals to 7MB ?

Also, I was under the impression that as soon as a 60KB log buffer is full it is written to disk so I was thinking that the maximum data loss would be 60KB. But I'm guessing the maximum data loss scenario you're referring to is when all 120 individual log buffers are not completely full so they're not flushed to disk and it's then that the SQL Server goes down. Just wondering if I'm correctly outlining this scenario.

Thanks again.
Aaron Bertrand says:

July 22, 2014 at 1:39 AM

@LondonDBA you are right, the ~7MB refers to 120 log buffer blocks at just under 60kb each, assuming that the worst case scenario would be a really slow I/O subsystem and the lucky coincidence of having 59kb or so in every single log buffer block at the time disaster struck. This is a lot like winning the lottery, and I doubt the data loss would ever be even close to that severe, but it can't be guaranteed to be any better. I don't know that the 7MB figure is called out in any explicit documentation on MSDN, but I do believe I saw it referenced on a slide deck at TechEd or the PASS Community Summit. I don't think they will officially document it at this time because the potential for data loss could change (at any point – under a trace flag, via a CU, etc. they could change the number of log blocks in the buffer or the size of the blocks themselves).
@retracement says:

August 1, 2014 at 12:52 PM

The 7MB is referenced in the SQLPASS 2013 Summit session 'DBA-406-M SQL Server Transaction Log Internals' presented by Tim Chapman and Denzil Ribeiro so that might have been where you saw it Aaron.
Aaron Bertrand says:

August 1, 2014 at 12:55 PM

Thanks Mark, I think you're right!
Rajeev Chaudhary says:

May 29, 2015 at 7:35 PM

Hi Aaron

I did some testing to reproduce data loss with delayed durability. I
1. Took a test system with 4 CPUs and 64 GB RAM. Actually, it was a dev system just handed over for deployment, which I decided to use for little testing. Installed SQL 2014(ver 12.0.2402) on it.
2. Created a database with just one table in it that has one column of char(2). Enabled "Forced" Delayed Durability on the database.
3. Created a script file for DISKPART to take log and data drives offline.
4. Created statement to insert only one record with "(DELAYED_DURABILITY = ON)" within the transaction.
5. Created statement to run Diskpart using XP_CMDSHELL
6. Created statement to kill(taskkill) SQL process using XP_CMDSHELL
7. Selected the transaction for insert and the statements in step 5 and 6 and executed them in on go. This inserted one record in the table with delayed durability, took the log and data drives offline, and killed SQL process.
8. Brought the disks online after confirming that SQL was down
9. Brought SQL online and checked records in the table.
To my surprise, the table shows the new record that should have been lost. I have tested this scenario many times with same result.

There is nothing else running on this test box. SQL agent is stopped throughout the test. After restart, I have verified that the ;pg record fixed length and log record length are not large enough to fill the log buffer(60KB).

Any ideas on why the transaction is always making it into the log file?
Aaron Bertrand says:

May 29, 2015 at 7:38 PM

I think there is a misunderstanding – the claim is that data loss is *possible*, not that it is likely, never mind *guaranteed.* I'd say the feature is working better than you should have expected. :-)

Now, try making sure that your insert transaction exceeds 60kb (or 8MB if you want to have some real fun).
Rajeev Chaudhary says:

May 30, 2015 at 10:42 PM

Thanks Aaron for your reply!
It appears that SQL tries to flush log as soon as it can without waiting for 60KB log block to get full even in case of delayed durable transactions just like it does in case of durable transactions. The only thing that has happened with delayed durability is decoupling of transaction commit from log flush.

I have read on several blogs about SQL flushing log only after 60KB log block is full unless a fully durable transaction comes in or someone runs log flush command manually. This statement does not appear to be accurate.

Please let me know if I am mistaken.
Aaron Bertrand says:

May 31, 2015 at 9:06 PM

I must not have read any of the articles you're talking about, because I don't recall anyone claiming that SQL Server will sit around and wait until a 60kb log block is full before writing. The disclaimer is that you *could* lose that much data if a perfect storm happens – you've committed some data and lightning strikes before SQL Server has had a chance to write to disk. This could happen regardless of whether you've committed 1kb, 30kb, or 59kb. It could be even more than one log block depending on your I/O subsystem. That doesn't mean it is easy to reproduce that scenario – you just need to keep in mind that it is possible.
Mohammed Owais says:

September 8, 2015 at 9:45 AM

Hey Aaron,

I love this article and keep coming back to reference it. I just noticed your last paragraph about tempdb, and I was wondering about it myself. I tried changing the setting and saw _no_ improvement in performance. Take a look at this short but very interesting post from a SQL Server development team member that explains why: http://blogs.msdn.com/b/ialonso/archive/2014/07/23/applying-delayed-durability-forced-on-tempdb.aspx . It turns out that even though you cannot apply Delayed Durability on tempdb, the way it flushes the log cache is already very similar to the DD setting. Apparently, it has been this way for a while now.
Simon Harvey says:

February 1, 2016 at 10:35 AM

Great article Aaron.

Really appreciate some of the detail you went into on what is actually being logged with DD on and off. Should give people some good insight into when and why DD might make a difference
Aaron Bertrand says:

February 1, 2016 at 3:37 PM

Thanks Simon, my colleague Melissa Connors also wrote a post on DD recently that you might find interesting:
- Delayed Durability while Purging Data
Hany Helmy says:

March 29, 2017 at 2:30 PM

Hi Aaron,

Great article, very informative, 1 point to add, Delayed Durability is not supported with Transactional Replication, Microsoft Docs:

https://docs.microsoft.com/en-us/sql/relational-databases/logs/control-transaction-durability

I have tested myself, if it is enabled on a replicated database, good bye to replication, it will give the famous error: "The row was not found on the subscriber", then you will have to generate a new Snapshot to overcome this issue. Sad.

Regards,
Hany Helmy
Manish says:

September 5, 2017 at 3:06 AM

Yes already mentioned by Aaron as CDC is fully durable. FYI, CDC uses log reader the same in transaction replication.
George Walkey says:

June 14, 2018 at 11:33 AM

Did you ever try this on TempDB?
ScottPletcher says:

June 14, 2018 at 7:58 PM

The new COMPRESS() function is great but is rather slow.
Does anyone know if delayed durability on an INSERT with a COMPRESS() function means that the COMPRESS() itself does not need to complete before the INSERT is considered "complete" and control is sent back to the caller / the code doing the "INSERT"?
hany says:

June 20, 2018 at 6:36 AM

@Manish, CDC is not considered as replication at all even if it uses log reader the same in transaction replication. I was talking about replication technology only.
andy says:

July 10, 2018 at 8:41 AM

Great article – on conducting some tests we can see the improvement DD has on writes to our Process_Log table that happens when a certain step is processed in an SP. However to officially implement this i could with help around its ability to work within the full structure of our system

let me try an explain it :

We have an SP (SP#1) that executes another SP (SP#2) called "activity_log.sql" at each important step

SP#1 has a step called 'Write client data to table' within a transaction

SP #2 is called at this step and this writes the step details into the Activity_Log table recording a success of fail within another transaction

The transaction is SP#2 is essentially nested in the transaction in SP#1

The Delayed Durability = 'On' is set within the Commit on the transaction in SP#2

Will this affect any further processing in original SP#1? Will DD remain on for the duration of SP#1 or only during the write to Activity_log table in SP#2?

There are lots of SP#1's so simply moving the DD to SP#1 is not an possibity

Comments are closed.