There is a persistent myth in the SQL Server world that both the DROP TABLE and TRUNCATE TABLE commands are non-logged. They’re not. They’re both fully logged, but efficiently logged.
You can easily prove this to yourself. Run the following code to set up a test database and table, and show we have 10,000 rows in our table:
```sql
CREATE DATABASE [TruncateTest];
GO
ALTER DATABASE [TruncateTest] SET RECOVERY SIMPLE;
GO
USE [TruncateTest];
GO
CREATE TABLE [TestTable] (
    [c1] INT IDENTITY,
    [c2] CHAR (8000) DEFAULT 'a');
GO
SET NOCOUNT ON;
GO
INSERT INTO [TestTable] DEFAULT VALUES;
GO 10000
SELECT COUNT (*) AS N'RowCount' FROM [TestTable];
GO
```
And then the following code, which truncates the table in a transaction and checks the row count:
```sql
BEGIN TRAN;
GO
TRUNCATE TABLE [TestTable];
GO
SELECT COUNT (*) AS N'RowCount' FROM [TestTable];
GO
```
Now the table is empty. But I can roll back the transaction and put all the data back again:
```sql
ROLLBACK TRAN;
GO
SELECT COUNT (*) AS N'RowCount' FROM [TestTable];
GO
```
The TRUNCATE operation must have been logged, otherwise the rollback could not have put the data back.
So where does the misconception come from?
It comes from the behavior of TRUNCATE operations on large tables. They complete almost instantaneously, and if you look in the transaction log using fn_dblog right afterwards, you’ll only see a small number of log records generated by the operation. That small number doesn’t correlate with the size of the table being truncated or dropped, so it seems as if TRUNCATE operations are non-logged.
But they’re fully logged, as I demonstrated above. So where are the log records for the operations?
The answer is that the log records will be created, just not immediately, by a mechanism called ‘deferred drop’, which was added in SQL Server 2000 SP3.
When a table is dropped or truncated, all the data file pages allocated to the table must be deallocated. Before SQL Server 2000 SP3, the mechanism was as follows, for each extent in the table:

- Acquire an eXclusive allocation lock on the extent
- Probe the page lock for each page in the extent (acquire the lock in eXclusive mode, and immediately drop it, making sure no-one else has the page locked)
- Do NOT release the extent lock, guaranteeing that no-one else can use that extent
- Move to the next extent
As all the extent locks were held until the end of the operation, and each lock takes a small amount of memory, it was possible for the lock manager to run out of memory when a TRUNCATE of a very large table occurred. Some SQL Server customers started to run into out-of-memory conditions on SQL Server 2000, as tables grew very large and vastly outstripped the growth in system memory.
The deferred-drop mechanism simulates the TRUNCATE operation completing immediately, by unhooking the allocations for the table and putting them on the ‘deferred-drop queue’ for later processing by a background task. This unhook-and-transfer operation generates only a handful of log records, and it is this operation that is performed and rolled back in my code example above.
The ‘deferred-drop background task’ spins up every few seconds and deallocates all the pages and extents on the deferred-drop queue in small batches, guaranteeing that the operation will not run out of memory. These deallocations are all fully-logged, but remember that deallocating a page full of data or index records does not log individual deletes of those records; instead the entire page is just marked as deallocated in the relevant PFS (Page Free Space) allocation byte-map.
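You can watch the deferred-drop task’s work in the log yourself. As a sketch, run the following in the TruncateTest database a few seconds after a TRUNCATE (this assumes the example database from above, and that fn_dblog exposes the [Operation] and [Context] columns, which it does in current versions):

```sql
-- Group the log records generated since the last checkpoint by
-- operation and context. After the deferred-drop task has run, the
-- deallocations typically show up as changes in allocation-bitmap
-- contexts such as PFS, GAM, and IAM.
SELECT
    [Operation],
    [Context],
    COUNT (*) AS N'LogRecCount'
FROM fn_dblog (NULL, NULL)
GROUP BY [Operation], [Context]
ORDER BY COUNT (*) DESC;
```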
From SQL Server 2000 SP3 onwards, when you perform a TRUNCATE of a table, you’ll only see a few log records being generated immediately. If you wait a minute or so and then look in the transaction log again, you’ll see thousands of log records have been generated by the deferred-drop operation, each deallocating a page or extent. The operation is fully and efficiently logged.
Here’s an example, using the scenario we created above:
```sql
CHECKPOINT;
GO
TRUNCATE TABLE [TestTable];
GO
SELECT COUNT (*) AS N'LogRecCount' FROM fn_dblog (NULL, NULL);
GO
```
The count returned is small: there clearly aren’t log records deallocating the 10,000 pages, plus 1,250 extents, allocated to the TestTable table.
If I wait a few seconds and then run the fn_dblog code again, the count jumps to many thousands of log records.
You might wonder why there aren’t at least 10,000 log records – one for each page being deallocated. That’s because the page deallocations are themselves logged efficiently: a single log record reflects the PFS allocation changes for 8 consecutive data file pages, instead of one log record for each page whose allocation status changes in the PFS page.
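You can check this efficiency directly. As a sketch (assuming the same TruncateTest database, and that fn_dblog reports PFS page changes with the context value LCX_PFS), count just the PFS-related log records after the deferred-drop task has finished:

```sql
-- Count log records that modify PFS pages. With 10,000 data pages
-- and one PFS log record covering up to 8 consecutive pages, expect
-- far fewer than 10,000 of these.
SELECT COUNT (*) AS N'PFSLogRecCount'
FROM fn_dblog (NULL, NULL)
WHERE [Context] = N'LCX_PFS';
```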
SQL Server always tries to produce as little transaction log as possible, while still adhering to the rules about full or minimal logging based on the current recovery model. If you want to look at the actual log records generated by the unhook-and-transfer and deferred-drop mechanisms, simply substitute * for COUNT (*) in the fn_dblog code above and look at the Transaction Name column to identify the transactions involved.
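A minimal way to list those transaction names, rather than scanning the full fn_dblog output (this assumes the [Transaction Name] column, which is only populated on the LOP_BEGIN_XACT record that starts each transaction):

```sql
-- List the transactions recorded in the log since the last
-- checkpoint; the deferred-drop background task's transactions
-- appear here alongside your own.
SELECT [Transaction ID], [Transaction Name]
FROM fn_dblog (NULL, NULL)
WHERE [Operation] = N'LOP_BEGIN_XACT';
```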
In future posts I’ll discuss the internals that underpin other persistent myths around performance aspects of the SQL Server Storage Engine.