A common requirement in ETL and various reporting scenarios is to quietly load a SQL Server staging table in the background, so users querying the data aren’t impacted by the writes and vice-versa. The trick is how and when you point users at the new, refreshed version of the data.
Simplified Example of a Staging Table: A Farmer’s Market Analogy
So, what is a staging table in SQL? A staging table can be more easily understood using a real-world example: Let’s say you have a table full of vegetables you're selling at the local farmer's market. As your vegetables sell and you bring in new inventory:
- When you bring a load of new vegetables, it's going to take you 20 minutes to clear off the table and replace the remaining stock with the newer product.
- You don't want customers to sit there and wait 20 minutes for the switch to happen, since most will get their vegetables elsewhere.
Now, what if you had a second empty table where you load the new vegetables, and while you're doing that, customers can still buy the older vegetables from the first table? (Let's pretend it's not because the older vegetables went bad or are otherwise less desirable.)
Refreshing Tables in SQL Server
There are several methods to reload entire tables while they’re actively being queried; two decades ago, I took unbridled advantage of sp_rename
— I’d play a shell game with an empty shadow copy of the table, happily reloading the shadow copy and then only performing the rename inside a transaction.
In SQL Server 2005, I started using schemas to hold shadow copies of tables I simply transferred around using the same shell game technique, which I wrote about in these two posts:
The only advantage of transferring objects between schemas over renaming them is there are no warning messages about renaming an object – which isn’t even a problem, per se, except the warning messages fill up agent history logs that much faster.
Both approaches still require a schema modification (Sch-M) lock, so they must wait for any existing transactions to release their own locks. Once they acquire their Sch-M lock, they block any subsequent queries requiring schema stability locks (Sch-S)… which is nearly every query. It can rapidly become a blocking chain nightmare, as any new queries needing Sch-S have to get in a queue behind the Sch-M. (And no, you can’t get around this by using RCSI or NOLOCK
everywhere, since even those queries still require Sch-S. You can’t acquire Sch-S with a Sch-M in place, as they’re incompatible—Michael J. Swart talks about that here.)
Kendra Little really opened my eyes about the dangers with schema transfer in her post, “Staging Data: Locking Danger with ALTER SCHEMA TRANSFER.” There she shows why schema transfer can be worse than rename. She later detailed a third and much less impactful way of swapping tables out, which I now use exclusively: partition switching. This method allows the switch to wait at a lower priority, which isn’t even an option with the rename or schema transfer techniques. Joe Sack went into detail about this enhancement added back in SQL Server 2014: “Exploring Low Priority Lock Wait Options in SQL Server 2014 CTP1.”
SQL Server Partition Switching Example
Let’s look at a basic example, following Kendra’s thorough gist here. First, we’ll create two new databases:
CREATE DATABASE NewWay;
CREATE DATABASE OldWay;
GO
In the new database, we’ll create a table to hold our vegetable inventory, and two copies of the table for our shell game:
USE NewWay;
GO
CREATE TABLE dbo.Vegetables_NewWay
(
VegetableID int,
Name sysname,
WhenPicked datetime,
BackStory nvarchar(max)
);
GO
-- we need to create two extra copies of the table.
CREATE TABLE dbo.Vegetables_NewWay_prev
(
VegetableID int,
Name sysname,
WhenPicked datetime,
BackStory nvarchar(max)
);
GO
CREATE TABLE dbo.Vegetables_NewWay_hold
(
VegetableID int,
Name sysname,
WhenPicked datetime,
BackStory nvarchar(max)
);
GO
We create a procedure that loads the staging copy of the table, then uses a transaction to switch the current copy out.
CREATE PROCEDURE dbo.DoTheVeggieSwap_NewWay
AS
BEGIN
SET NOCOUNT ON;
TRUNCATE TABLE dbo.Vegetables_NewWay_prev;
INSERT dbo.Vegetables_NewWay_prev
SELECT TOP (1000000) s.session_id, o.name, s.last_successful_logon,
LEFT(m.definition, 500)
FROM sys.dm_exec_sessions AS s
CROSS JOIN model.sys.all_objects AS o
INNER JOIN model.sys.all_sql_modules AS m
ON o.[object_id] = m.[object_id];
-- need to take Sch-M locks here:
BEGIN TRANSACTION;
ALTER TABLE dbo.Vegetables_NewWay
SWITCH TO dbo.Vegetables_NewWay_hold
WITH (WAIT_AT_LOW_PRIORITY
(MAX_DURATION = 1 MINUTES,
ABORT_AFTER_WAIT = BLOCKERS));
ALTER TABLE dbo.Vegetables_NewWay_prev
SWITCH TO dbo.Vegetables_NewWay;
COMMIT TRANSACTION;
-- and now users will query the new data in dbo
-- can switch the old copy back and truncate it
-- without interfering with other queries
ALTER TABLE dbo.Vegetables_NewWay_hold
SWITCH TO dbo.Vegetables_NewWay_prev;
TRUNCATE TABLE dbo.Vegetables_NewWay_prev;
END
GO
The beauty of WAIT_AT_LOW_PRIORITY
is you can completely control the behavior with the ABORT_AFTER_WAIT
option:
ABORT_AFTER_WAIT setting |
Description / symptoms |
---|---|
SELF | This means the switch will give up after n minutes.
For the session attempting to perform the switch, this will surface as the error message: Lock request time out period exceeded.
|
BLOCKERS | This dictates the switch will wait up to n minutes, then force itself to the front of the line by killing all the blockers ahead of it.
Sessions trying to interact with the table that get bumped by the switch operation will see some combination of these error messages: Your session has been disconnected because of a high priority DDL operation.
Cannot continue the execution because the session is in the kill state. A severe error occurred on the current command. The results, if any, should be discarded. |
NONE | This says the switch will happily wait until it gets its turn, regardless of MAX_DURATION .
This is the same behavior you’d get with rename, schema transfer, or partition switch without |
The BLOCKERS
option isn’t the friendliest way to handle things, since you’re already saying it’s okay through this staging/switch operation for users to see data that’s a little out of date. I’d likely prefer to use SELF
and have the operation try again in cases where it couldn’t get the required locks in the allotted time. I’d keep track of how often it fails, though, especially consecutive failures, because you want to make sure the data doesn’t ever get too stale.
Compared to the Old Way of Switching Between Schemas
Here’s how I would have handled the switching before:
USE OldWay;
GO
-- create two schemas and two copies of the table
CREATE SCHEMA prev AUTHORIZATION dbo;
GO
CREATE SCHEMA hold AUTHORIZATION dbo;
GO
CREATE TABLE dbo.Vegetables_OldWay
(
VegetableID int,
Name sysname,
WhenPicked datetime,
BackStory nvarchar(max)
);
GO
CREATE TABLE prev.Vegetables_OldWay
(
VegetableID int,
Name sysname,
WhenPicked datetime,
BackStory nvarchar(max)
);
GO
CREATE PROCEDURE dbo.DoTheVeggieSwap_OldWay
AS
BEGIN
SET NOCOUNT ON;
TRUNCATE TABLE prev.Vegetables_OldWay;
INSERT prev.Vegetables_OldWay
SELECT TOP (1000000) s.session_id, o.name, s.last_successful_logon,
LEFT(m.definition, 500)
FROM sys.dm_exec_sessions AS s
CROSS JOIN model.sys.all_objects AS o
INNER JOIN model.sys.all_sql_modules AS m
ON o.[object_id] = m.[object_id];
-- need to take Sch-M locks here:
BEGIN TRANSACTION;
ALTER SCHEMA hold TRANSFER dbo.Vegetables_OldWay;
ALTER SCHEMA dbo TRANSFER prev.Vegetables_OldWay;
COMMIT TRANSACTION;
-- and now users will query the new data in dbo
-- can transfer the old copy back and truncate it without
-- interfering with other queries:
ALTER SCHEMA prev TRANSFER hold.Vegetables_OldWay;
TRUNCATE TABLE prev.Vegetables_OldWay;
END
GO
I ran concurrency tests by using two windows of Erik Ejlskov Jensen's SQLQueryStress: one to repeat a call to the procedure every minute, and the other to run 16 threads like this, thousands of times:
BEGIN TRANSACTION;
UPDATE TOP (1) dbo.
Duration and Error Rates | Schema Transfer | ABORT_AFTER_WAIT: SELF | ABORT_AFTER_WAIT: BLOCKERS |
---|---|---|---|
Avg Duration – Transfer/Switch | 96.4 seconds | 68.4 seconds | 20.8 seconds |
Avg Duration – DML | 18.7 seconds | 2.7 seconds | 2.9 seconds |
Exceptions – Transfer/Switch | 0 | 0.5/minute | 0 |
Exceptions – DML | 0 | 0 | 25.5/minute |
Note the durations and exception counts will be highly dependent on your server specs and what else is going on within your environment. Also note that, while there were no exceptions for the schema transfer tests when using SQLQueryStress, you might hit some more strict timeouts depending on the consuming application. And it was so much slower on average, because the blocking piled up much more aggressively. Nobody ever wants exceptions, but when there’s a trade-off like this, you might prefer a few exceptions here and there (depending on the frequency of the refresh operation) over everyone waiting longer all the time.
Partition Switching vs. Rename/Schema Transfer to Refresh SQL Server Tables
Partition switching allows you to pick which part of your process bears the cost of concurrency. You can give preference to the switching process, so the data is more reliably fresh, but this means some of your queries will fail. Conversely, you can prioritize the queries, at the cost of a slower refresh process (and the occasional failure there). The main thrust is SQL Server partition switching is a superior method to refresh SQL Server tables compared to the previous rename/schema transfer techniques on almost all points, and you can use more robust retry logic or experiment with duration tolerances to land at the sweet spot for your workload.
Assuming you just want to truncate and reload, could you not just do
INSERT staging …
BEGIN TRAN
TRUNCATE TABLE main
ALTER TABLE staging SWITCH TO main …
COMMIT
obviating the need for a second staging table?
No really good reason except that I didn't really explain in the post that I wanted to keep the transaction free of the load *and* any truncating work. I can't do that with only one extra copy of the table.
Very interesting, thanks! And it's better than a synonyms switcheroo for the same reason, I think.