Break large delete operations into chunks
SentryOne - SQL Sentry
Mar 13, 2013

Far too often I see folks complaining about how their transaction log took over their hard disk. Many times it turns out that they were performing a large delete operation, such as purging or archiving data, in one large transaction.

I wanted to run some tests to show the impact, on both duration and the transaction log, of performing the same data operation in chunks versus a single transaction. I created a database and populated it with a largish table (SalesOrderDetailEnlarged, from this AdventureWorks enlarging script from Jonathan Kehayias (blog | @SQLPoolBoy)). This table has 4.85 million rows and has a reserved space usage of 711 MB (478 MB in data and 233 MB in indexes).

After populating the table, I backed up the database, backed up the log, and ran a DBCC SHRINKFILE (don't shoot me) so that the impact on the log file could be established from a baseline (knowing full well that these operations *will* cause the transaction log to grow).
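That baseline setup looked roughly like this (a sketch; the delete_test database name and backup paths are my stand-ins for illustration):

```sql
-- Assumed names/paths for illustration
BACKUP DATABASE delete_test TO DISK = N'C:\backups\delete_test.bak' WITH INIT;
BACKUP LOG delete_test TO DISK = N'C:\backups\delete_test.trn' WITH INIT;

USE delete_test;
-- shrink the log back down so each test starts from the same baseline
DBCC SHRINKFILE (delete_test_log, 100);
```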

I purposely used a mechanical disk as opposed to an SSD. While we may start seeing a more popular trend of moving to SSD, it hasn't happened yet on a large enough scale; in many cases it's still too cost prohibitive to do so in large storage devices.

The Tests

So next I had to determine what I wanted to test for greatest impact. Since I was involved in a discussion with a co-worker just yesterday about deleting data in chunks, I chose deletes. And since the clustered index on this table is on SalesOrderID, I didn't want to use that – that would be too easy (and would very rarely match the way deletes are handled in real life). So I decided instead to go after a series of ProductID values, which would ensure I would hit a large number of pages and require a lot of logging. I determined which products to delete by the following query:

SELECT TOP (3)
  ProductID, ProductCount = COUNT(*)
FROM dbo.SalesOrderDetailEnlarged
GROUP BY ProductID
ORDER BY ProductCount DESC;

This yielded the following results:

ProductID  ProductCount
---------  ------------
870        187520
712        135280
873        134160

This would delete 456,960 rows (about 10% of the table), spread across many orders. This isn't a realistic modification in this context, since it will mess with pre-calculated order totals, and you can't really remove a product from an order that has already shipped. But using a database we all know and love, it is analogous to, say, deleting a user from a forum site, and also deleting all of their messages – a real scenario I have seen in the wild.

So one test would be to perform the following, one-shot delete:

DELETE dbo.SalesOrderDetailEnlarged WHERE ProductID IN (712, 870, 873);

I know this is going to require a massive scan and take a huge toll on the transaction log. That's kind of the point. :-)

While that was running, I put together a different script that will perform this delete in chunks: 25,000, 50,000, 75,000 and 100,000 rows at a time. Each chunk will be committed in its own transaction (so that if you need to stop the script, you can, and all previous chunks will already be committed, instead of having to start over), and depending on the recovery model, will be followed by either a CHECKPOINT or a BACKUP LOG to minimize the ongoing impact on the transaction log. (I will also test without these operations.) It will look something like this (I am not going to bother with error handling and other niceties for this test, but you shouldn't be as cavalier):

DECLARE @r INT;
SET @r = 1;
WHILE @r > 0
BEGIN
  BEGIN TRANSACTION;
  DELETE TOP (100000) -- this will change
    dbo.SalesOrderDetailEnlarged
    WHERE ProductID IN (712, 870, 873);
  SET @r = @@ROWCOUNT;
  COMMIT TRANSACTION;
  -- CHECKPOINT;    -- if simple
  -- BACKUP LOG ... -- if full
END
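In full recovery, for instance, the per-chunk log backup might be filled in like this (the database name and backup path are placeholders; in practice you'd vary the file name or use a backup device):

```sql
DECLARE @r INT = 1;
WHILE @r > 0
BEGIN
  BEGIN TRANSACTION;
  DELETE TOP (100000) dbo.SalesOrderDetailEnlarged
    WHERE ProductID IN (712, 870, 873);
  SET @r = @@ROWCOUNT;
  COMMIT TRANSACTION;
  -- allows the log space used by this chunk to be reused
  BACKUP LOG delete_test TO DISK = N'C:\backups\delete_test_chunks.trn';
END
```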

Of course, after each test, I would restore the original backup of the database WITH REPLACE, RECOVERY, set the recovery model accordingly, and run the next test.
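The reset between tests was along these lines (again, the database name and path are my stand-ins):

```sql
USE master;
RESTORE DATABASE delete_test
  FROM DISK = N'C:\backups\delete_test.bak'
  WITH REPLACE, RECOVERY;
-- switch to the recovery model for the next test
ALTER DATABASE delete_test SET RECOVERY FULL; -- or SIMPLE
```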

The Results

The outcome of the first test was not very surprising at all. To perform the delete in a single statement, it took 42 seconds in full, and 43 seconds in simple. In both cases this grew the log to 579 MB.

The next set of tests had a couple of surprises for me. One is that, while these chunking methods did significantly reduce impact to the log file, only a couple of combinations came close in duration, and none were actually faster. Another is that, in general, chunking in full recovery (without performing a log backup between steps) performed better than equivalent operations in simple recovery. Here are the results for duration and log impact:

Duration, in seconds, of various delete operations removing 457K rows

Log size, in MB, after various delete operations removing 457K rows

Again, in general, while log size is significantly reduced, duration is increased. You can use this type of scale to determine whether it's more important to reduce the impact to disk space or to minimize the amount of time spent. For a small hit in duration (and after all, most of these processes are run in the background), you can have a significant savings (up to 94%, in these tests) in log space usage.

Note that I did not try any of these tests with compression enabled (possibly a future test!), and I left the log autogrow settings at the terrible defaults (10%) – partly out of laziness and partly because many environments out there have retained this awful setting.
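If you do move off that default, a fixed growth increment is a one-liner (names assumed; pick an increment appropriate for your log throughput):

```sql
ALTER DATABASE delete_test MODIFY FILE
  (NAME = delete_test_log, FILEGROWTH = 512MB);
```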

But what if I have more data?

Next I thought I should test this on a slightly larger database. So I made another database and created a new, larger copy of dbo.SalesOrderDetailEnlarged – roughly ten times larger, in fact. This time, instead of a primary key on SalesOrderID, SalesOrderDetailID, I just made it a clustered index (to allow for duplicates), and populated it this way:

SELECT c.*
  INTO dbo.SalesOrderDetailReallyReallyEnlarged 
  FROM AdventureWorks2012.Sales.SalesOrderDetailEnlarged AS c
  CROSS JOIN 
  (
    SELECT TOP 10 Number FROM master..spt_values
  ) AS x;

CREATE CLUSTERED INDEX so ON dbo.SalesOrderDetailReallyReallyEnlarged
  (SalesOrderID, SalesOrderDetailID);
-- I also made this index non-unique:
CREATE NONCLUSTERED INDEX rg ON dbo.SalesOrderDetailReallyReallyEnlarged(rowguid);
CREATE NONCLUSTERED INDEX p ON dbo.SalesOrderDetailReallyReallyEnlarged(ProductID);

Due to disk space limitations, I had to move off of my laptop's VM for this test (and chose a 40-core box, with 128 GB of RAM, that just happened to be sitting around quasi-idle :-)), and still it was not a quick process by any means. Population of the table and creation of the indexes took ~24 minutes.

The table has 48.5 million rows and takes up 7.9 GB on disk (4.9 GB in data and 2.9 GB in indexes).

This time, my query to determine a good set of candidate ProductID values to delete:

SELECT TOP (3)
  ProductID, ProductCount = COUNT(*)
FROM dbo.SalesOrderDetailReallyReallyEnlarged
GROUP BY ProductID
ORDER BY ProductCount DESC;

Yielded the following results:

ProductID  ProductCount
---------  ------------
870        1828320
712        1318980
873        1308060

So we are going to delete 4,455,360 rows, a little under 10% of the table. Following a similar pattern to the above test, we're going to delete all in one shot, then in chunks of 500,000, 250,000 and 100,000 rows.


Duration, in seconds, of various delete operations removing 4.5MM rows

Log size, in MB, after various delete operations removing 4.5MM rows

So again, we see a significant reduction in log file size (over 97% in cases with the smallest chunk size of 100K); however, at this scale, we see a few cases where we also accomplish the delete in less time, even with all the autogrow events that must have occurred. That sounds an awful lot like win-win to me!

This time with a bigger log

Now, I was curious how these different deletes would compare with a log file pre-sized to accommodate such large operations. Sticking with our larger database, I pre-expanded the log file to 6 GB, backed it up, then ran the tests again:

ALTER DATABASE delete_test MODIFY FILE
(NAME=delete_test_log, SIZE=6000MB);

Results, comparing duration with a fixed log file to the case where the file had to autogrow continuously:

Duration, in seconds, of various delete operations removing 4.5MM rows, comparing fixed log size and autogrow

Again we see that the methods that chunk deletes into batches, and do *not* perform a log backup or a checkpoint after each step, rival the equivalent single operation in terms of duration. In fact, we see that most actually complete in less overall time, with the added bonus that other transactions will be able to get in and out between steps. Which is a good thing, unless you want this delete operation to block all unrelated transactions.


Conclusion

It is clear that there is no single, correct answer to this problem – there are a lot of inherent "it depends" variables. It may take some experimenting to find your magic number, as there will be a balance between the overhead it takes to backup the log and how much work and time you save at different chunk sizes. But if you are planning to delete or archive a large number of rows, it is quite likely that you will be better off, overall, performing the changes in chunks, rather than in one, massive transaction – even though the duration numbers seem to make that a less attractive operation. It's not all about duration – if you don't have a sufficiently pre-allocated log file, and don't have the space to accommodate such a massive transaction, it is probably much better to minimize log file growth at the cost of duration, in which case you'll want to ignore the duration graphs above and pay attention to the log size graphs.

If you can afford the space, you still may or may not want to pre-size your transaction log accordingly. Depending on the scenario, sometimes using the default autogrow settings ended up slightly faster in my tests than using a fixed log file with plenty of room. Plus, it may be tough to guess exactly how much you'll need to accommodate a large transaction you haven't run yet. If you can't test a realistic scenario, try your best to picture your worst case scenario – then, for safety, double it. Kimberly Tripp (blog | @KimberlyLTripp) has some great advice in this post: 8 Steps to better Transaction Log throughput – in this context, specifically, look at point #6. Regardless of how you decide to calculate your log space requirements, if you're going to end up needing the space anyway, better to take it in a controlled fashion well in advance, than to halt your business processes while they wait for an autogrow (never mind multiple!).

Another very important facet of this that I did not measure explicitly is the impact to concurrency – a bunch of shorter transactions will, in theory, have less impact on concurrent operations. While a single delete took slightly less time than the longer, batched operations, it held all of its locks for that entire duration, while the chunked operations would allow for other queued transactions to sneak in between each transaction. In a future post I'll try to take a closer look at this impact (and I have plans for other deeper analysis as well).

  22 Responses to “Break large delete operations into chunks”

  1. Very valuable. I was surprised that in some cases chunking was able to outperform the big delete, as one would naively assume that a single query plan with all modifications nicely sorted per partition would be most efficient. Maybe this happened because one of the sort tables spilled to disk, or we exhausted some other cache.

    An entirely different strategy to do this would be:

    select * into NewTable from T where NOT(filter); create index ix on NewTable ...; truncate table T; alter table NewTable switch to T

  2. […] Break Large Delete Operations Into Chunks by Aaron Bertrand […]

  3. Solid gold – thanks for posting.

  4. […] for a solution to the problem, I came across this blog post on breaking large delete operations into chunks. It shows in good detail how the simple version above behaves against running a loop of a few tens […]

  5. Just wanted to say thanks for the very detailed info in your post. Saved my bacon!

  6. Great post Aaron, as always :)

  7. Thanks a lot for your explanation ;). It helped me to solve my problem.

    All the best,


  8. Hi there

    It is a very interesting article about deleting, and it gives all the info that I need and more. But I have a question about the difference
    between CHECKPOINT and BACKUP LOG in the mentioned example? I have an Oracle background :)

    Thanks in advance.

    • Both clear up log space for reuse (which can prevent the log from having to grow). CHECKPOINT is to be used in simple recovery mode because you can't back up the log there. You need to back up the log when in full recovery mode – even with manual checkpoints SQL Server still can't reuse those portions of the log until they're backed up.
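      For example, you can check what is currently preventing log reuse with this standard DMV query (only the database name is assumed):

      ```sql
      SELECT name, recovery_model_desc, log_reuse_wait_desc
        FROM sys.databases
        WHERE name = N'delete_test';
      -- LOG_BACKUP here means the log is waiting on a log backup before it can be reused
      ```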

  9. Thanks. helped to solve my problem.

  10. hi Aaron

    A very valuable article.
    But I have a doubt: why does a 'chunked delete' generate less logging? Could you explain that here? I googled a lot but found nothing reasonable.

    best regards Min Yang

    • Hi Min, it's not that it generates less logging, it's that it generates less logging at one time. If you delete 1 billion rows, you have to log that entire thing as one single atomic operation. However, if you break it up into chunks of 100,000 rows, 10,000 times, you only ever have to log 100,000 rows at one time; once you back up the log or issue a checkpoint (depending on recovery model), you can free up the log used for those first 100,000 rows to log another 100,000. This really reduces the stress on the transaction log and should eliminate most, if not all, of the growth required to accommodate your delete action.

      Now, in some cases, that might mean the whole operation will take longer, but as I mentioned in the post, there are some other benefits that might outweigh absolute clock time…

  11. In my recent project my task was to clean an entire database using SQL statements, where each table has many constraints like primary keys and foreign keys. There are more than 1,000 tables in the database, so it's not possible to write a delete query for each and every table.

    By using a stored procedure named sp_MSforeachtable, we can easily process some code against each and every table in a single database. It can run a single T-SQL command, or several different T-SQL commands, against every table in the database.

    So follow the below steps to truncate all tables in a SQL Server Database:-

    --Step 1- Disable all constraints:
    EXEC sys.sp_msforeachtable 'ALTER TABLE ? NOCHECK CONSTRAINT ALL' 
    --Step 2- Execute a Delete or truncate operation on each table:
    EXEC sys.sp_msforeachtable 'DELETE FROM ?'  
    --Step 3- Enable all constraints:
    EXEC sys.sp_msforeachtable 'ALTER TABLE ? WITH CHECK CHECK CONSTRAINT ALL'

    • This works, but you should be aware of these points:

      1. sp_MSforeachtable seems convenient, but it's undocumented, unsupported, and likely prone to the same problems inherent in sp_MSforeachdb (see here and here for background).
      2. The arguments to the procedure should have an N prefix, in case any of the tables have Unicode characters in their names.
      3. Commands may fail if you have tables with improper identifiers (start with a number, or contain spaces, dashes, apostrophes or other invalid characters).
      4. Commands may fail if they are in a schema other than the caller's default schema.

      I would rather do this:

      DECLARE @sql NVARCHAR(MAX) = N'';

      SELECT @sql += N'ALTER TABLE ' + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name)
        + N' NOCHECK CONSTRAINT ALL;'
        FROM sys.schemas AS s
        INNER JOIN sys.tables AS t
        ON s.[schema_id] = t.[schema_id];

      SELECT @sql += N'DELETE ' + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name) + N';'
        FROM sys.schemas AS s
        INNER JOIN sys.tables AS t
        ON s.[schema_id] = t.[schema_id];

      SELECT @sql += N'ALTER TABLE ' + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name)
        + N' WITH CHECK CHECK CONSTRAINT ALL;'
        FROM sys.schemas AS s
        INNER JOIN sys.tables AS t
        ON s.[schema_id] = t.[schema_id];

      PRINT @sql; -- this may be too long to properly debug through PRINT
      -- EXEC sys.sp_executesql @sql;
  12. Hi Aaron, thanks for your post on this subject.

    I have just finished writing a stored procedure to purge data from a number of tables.
    I am using a similar technique with TOP (X), but with a CTE to sort by the primary key. I would think that there could be an advantage in deleting this way, but I am not sure whether SQL Server is smart enough not to re-jig its index structures when you are deleting in the same order the data was created in the first place, as opposed to deleting/updating randomly.
    Do you have any insight on whether there is less IO when deleting data this way?

    Regards, Ignacio

    • Hi Ignacio, I purposely used TOP without ORDER BY not only because dictating order would require more cumbersome code (like the CTE you mention), but also to allow SQL Server to determine the most efficient way to delete TOP N rows. Personally, I don't really care which n rows are deleted first, next, last, etc. Depending on the WHERE clause, forcing SQL Server to scan the PK (which may or may not be clustered) might *not* be the most efficient way to find n rows from before a certain date or for a certain customer. If the most efficient way to delete based on your filter happens to match the PK, then SQL Server should do that automatically (I'd love to hear about cases where something different happens).
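      For reference, the CTE pattern Ignacio describes would look something like this sketch (using this post's table and clustered key; not one of the variations measured above):

      ```sql
      ;WITH cte AS
      (
        SELECT TOP (100000) *
          FROM dbo.SalesOrderDetailEnlarged
          WHERE ProductID IN (712, 870, 873)
          ORDER BY SalesOrderID, SalesOrderDetailID -- clustered key order
      )
      DELETE cte;
      ```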

  13. Thanks, Aaron. I frequently face the need to chunk through DELETEs as you describe. I have a question, though. You encased the DELETE statement within an explicit transaction. Since the DELETE will always include an implicit transaction, what's the purpose of the explicit one?

    Thanks, Vern Rabe

    • Hi Vern, I just prefer to be explicit, especially when I'm potentially recommending other things that rely on or interact with transaction semantics (such as checkpoint / backup log), and especially when people might borrow this code to perform deletes against multiple (potentially related) tables, add error handling and rollback capability, etc.

  14. This post really helped get some clarity in my mind about how to approach a problem I was having. Thank you.

  15. […] I suggest that everyone who has had to delete (not truncate) large amounts of data from a table should read Aaron Bertrand article on "Break large delete operations into chunks". […]
