There is a persistent myth in the SQL Server world that both the DROP TABLE and TRUNCATE TABLE commands are non-logged.
They're not. They're both fully logged, but efficiently logged.
You can easily prove this to yourself. Run the following code to set up a test database and table, and show we have 10,000 rows in our table:
CREATE DATABASE [TruncateTest];
GO
ALTER DATABASE [TruncateTest] SET RECOVERY SIMPLE;
GO
USE [TruncateTest];
GO
CREATE TABLE [TestTable] (
[c1] INT IDENTITY,
[c2] CHAR (8000) DEFAULT 'a');
GO
SET NOCOUNT ON;
GO
INSERT INTO [TestTable] DEFAULT VALUES;
GO 10000
SELECT
COUNT (*) AS N'RowCount'
FROM
[TestTable];
GO
Results:
———–
10000
And then the following code, which truncates the table in a transaction and checks the row count:
BEGIN TRAN;
GO
TRUNCATE TABLE [TestTable];
GO
SELECT
COUNT (*) AS N'RowCount'
FROM
[TestTable];
GO
Results:
———–
0
Now the table is empty. But I can roll back the transaction and put all the data back again:
ROLLBACK TRAN;
GO
SELECT
COUNT (*) AS N'RowCount'
FROM
[TestTable];
GO
Results:
———–
10000
Clearly the TRUNCATE operation must be logged, otherwise the rollback would not work.
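The same test works for DROP TABLE, which is also transactional. A quick sketch, reusing the test table from above:

```sql
BEGIN TRAN;
GO
DROP TABLE [TestTable];
GO
-- Inside the transaction, the table no longer exists
ROLLBACK TRAN;
GO
-- After the rollback, the table and all 10,000 rows are back again
SELECT COUNT (*) AS N'RowCount' FROM [TestTable];
GO
```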
So where does the misconception come from?
It comes from the behavior of DROP and TRUNCATE operations on large tables. They complete almost instantaneously, and if you look in the transaction log using fn_dblog right afterwards, you'll only see a small number of log records generated by the operation. That small number doesn't correlate with the size of the table being truncated or dropped, so it seems as if DROP and TRUNCATE operations are non-logged.
But they’re fully logged, as I demonstrated above. So where are the log records for the operations?
The answer is that the log records will be created, just not immediately, by a mechanism called ‘deferred drop’, which was added in SQL Server 2000 SP3.
When a table is dropped or truncated, all the data file pages allocated to the table must be deallocated. Before SQL Server 2000 SP3, the mechanism was as follows. For each extent allocated to the table:
- Acquire an eXclusive allocation lock on the extent
- Probe the page lock for each page in the extent (acquire each lock in eXclusive mode and immediately drop it, making sure no one else has the page locked)
- Do NOT release the extent lock, guaranteeing that no one else can use that extent
- Move to the next extent
As all the extent locks were held until the end of the operation, and each lock takes a small amount of memory, it was possible for the lock manager to run out of memory when a DROP or TRUNCATE of a very large table occurred. Some SQL Server customers started to run into out-of-memory conditions on SQL Server 2000, as tables grew very large and vastly outstripped the growth in system memory.
The deferred-drop mechanism simulates the DROP or TRUNCATE operation completing immediately, by unhooking the allocations for the table and putting them on the 'deferred-drop queue' for later processing by a background task. This unhook-and-transfer operation only generates a handful of log records, and it is this operation that is being done and rolled back in my code example above.
The ‘deferred-drop background task’ spins up every few seconds and deallocates all the pages and extents on the deferred-drop queue in small batches, guaranteeing that the operation will not run out of memory. These deallocations are all fully-logged, but remember that deallocating a page full of data or index records does not log individual deletes of those records; instead the entire page is just marked as deallocated in the relevant PFS (Page Free Space) allocation byte-map.
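You can watch what the background task writes by aggregating the fn_dblog output. This is just a sketch, run in the TruncateTest database after a truncate; the exact operation and context names (e.g. LOP_MODIFY_ROW against LCX_PFS for the PFS byte changes, or LOP_SET_BITS against LCX_GAM for the extent deallocations) can vary by version:

```sql
SELECT
    [Operation],
    [Context],
    COUNT (*) AS N'LogRecCount'
FROM
    fn_dblog (NULL, NULL)
GROUP BY
    [Operation], [Context]
ORDER BY
    COUNT (*) DESC;
GO
```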
From SQL Server 2000 SP3 onwards, when you perform a DROP or TRUNCATE of a table, you'll only see a few log records being generated immediately. If you wait a minute or so and then look in the transaction log again, you'll see thousands of log records have been generated by the deferred-drop operation, each deallocating a page or extent. The operation is fully, and efficiently, logged.
Here’s an example, using the scenario we created above:
CHECKPOINT;
GO
TRUNCATE TABLE [TestTable];
GO
SELECT
COUNT (*) AS N'LogRecCount'
FROM
fn_dblog (NULL, NULL);
GO
Results:
———–
25
As you can see, there clearly aren't enough log records to account for deallocating the 10,000 pages, plus 1,250 extents, in the TestTable table.
If I wait a few seconds, and then run the fn_dblog
code again, I get:
———–
3811
You might wonder why there aren't at least 10,000 log records – one for each page being deallocated. That's because even the page deallocations are logged efficiently: a single log record reflects the PFS allocation-status changes for 8 consecutive data file pages, instead of one log record per page. That works out to roughly 10,000 / 8 = 1,250 log records for the PFS changes, with the remainder covering the extent deallocations and other system metadata changes.
SQL Server always tries to produce as little transaction log as possible, while still adhering to the rules about full or minimal logging based on the current recovery model. If you want to look at the actual log records generated by the unhook-and-transfer and deferred-drop mechanisms, simply substitute * for COUNT (*) in the fn_dblog code above and look for a transaction with the Transaction Name set to DeferredAllocUnitDrop::Process.
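For example, something like this (a sketch) pulls out just the log records belonging to that deferred-drop transaction:

```sql
SELECT
    [Current LSN],
    [Operation],
    [Context],
    [AllocUnitName]
FROM
    fn_dblog (NULL, NULL)
WHERE
    [Transaction ID] IN (
        SELECT [Transaction ID]
        FROM fn_dblog (NULL, NULL)
        WHERE [Transaction Name] = N'DeferredAllocUnitDrop::Process');
GO
```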
In future posts I’ll discuss the internals that underpin other persistent myths around performance aspects of the SQL Server Storage Engine.
Think about this too: if truncate were not logged, you could never recover a database after a truncate operation. You'd never be able to log ship a database that had a truncated table either.
If a transactional replication is set up, at least in SQL Server 2005, I don't think you can truncate a table from the publisher.
Thanks. Useful Info.
Thanks for nice & clear information. Explanation with code really help me a lot to understand this concept :-)
Regards,
Yashwant Vishwakarma | http://www.sqlocean.com
Thanks for sharing the knowledge…
This was an interview question that I got totally wrong when I was a new DBA back in 2002. But to be fair, it is easy to understand why this myth perpetuates since MSDN publishes the definition listed below:
https://msdn.microsoft.com/en-us/library/ms177570.aspx
"Removes all rows from a table or specified partitions of a table, without logging the individual row deletions. TRUNCATE TABLE is similar to the DELETE statement with no WHERE clause; however, TRUNCATE TABLE is faster and uses fewer system and transaction log resources."
As you can see, the MSDN definition above does not say that TRUNCATE TABLE IS a logged event, just that it uses fewer transaction log resources – which, I admit, SHOULD lead to the next logical conclusion. But my lizard brain (14 years ago) did not take the time to see anything but the words "Truncate ... without logging".
Lannie,
You haven't found the most interesting document yet. Here it is:
https://msdn.microsoft.com/en-us/library/ms151740(v=sql.110).aspx
where Microsoft directly declares:
"TRUNCATE TABLE is a non-logged operation that does not fire triggers."
The name of the myth's creator is Microsoft.
The question is: if truncate is logged, even as a deferred action, why does replication break?
Yup – it's really confusing to many people. What they should say is that the individual rows are not deleted, so they are not logged.
I was going to drop a large table (300+ GB of images) that is part of an AG, and after talking to a colleague I want to know whether that is feasible (i.e. won't destroy the latency or take forever to catch up), or whether this needs to be done in batched deletes. I was led here by a forum post on sqlservercentral.com (https://www.sqlservercentral.com/Forums/Topic1571504-2799-1.aspx) that mentioned this article, and I was wondering if there is any more detail on what happens with an instantaneous drop of a large table on a primary that needs to sync to a read-only secondary.
It's going to do the operation in the exact same way as I describe in the post – with one log record for each extent in the table, in small batches. If your network and disk hardware can keep up, you should be fine, but if you're concerned, I'd try it on a test system first (or at the very least, perform the operation in the quietest period of the day/week).
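If you want to watch how well the secondary keeps up during such a test, a sketch along these lines, using the sys.dm_hadr_database_replica_states DMV, shows the log send and redo queue sizes:

```sql
SELECT
    ar.replica_server_name,
    DB_NAME (drs.database_id) AS N'DatabaseName',
    drs.log_send_queue_size,  -- KB of log not yet sent to this replica
    drs.redo_queue_size       -- KB of log not yet redone on this replica
FROM
    sys.dm_hadr_database_replica_states AS drs
JOIN
    sys.availability_replicas AS ar
    ON ar.replica_id = drs.replica_id;
GO
```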
Thanks Paul for the quick reply!
Our production environment is in Azure with the Primary in the East DC and the Secondary in the West DC (standard VNETs – No expressroute) and we are using Premium Storage. But like you said it may be best to do a small 5-10GB test DROP TABLE run and see how the AG keeps up!
The reason you were able to rollback the truncation was that you wrapped it in a transaction. It's this transaction that got logged, and which you then rolled back.
The contents of the truncation itself have no record in the log.
No, you are easily-demonstrably incorrect. Regardless of whether I start an explicit transaction, SQL Server logs the truncate operation, as a transaction – either right away, or as a deferred drop operation. It has nothing to do with whether I started an explicit transaction or not. That's the whole point of debunking this myth. The deallocation of the pages and extents involved absolutely must be logged under all circumstances – that's an invariant in the way the allocation/deallocation code in the Storage Engine works, and has been that way since at least SQL Server 7.0.
The truncation command is logged, but the contents of the truncation is not.
The deallocations of the pages and extents are logged, as I stated clearly in the blog post. Which part of the post are you disagreeing with?
And again, it's not my transaction that caused any logging – the truncate is logged. Logged doesn't mean that individual row deletes are logged, as I explained in the post.
If the drop is asynchronous and continues after the commit, what would happen if SQL Server crashed before all the page deallocations are logged?
The entry in the deferred drop background task is also logged and won't be cleared until the deferred drop completes – so a crash isn't a problem.