Aaron Bertrand

T-SQL Tuesday #64 : One Trigger or Many?


http://sqljudo.wordpress.com/2015/03/02/tsql-tue-64-calling-all-tuners-and-gear-heads/

It's that Tuesday of the month – you know, the one when the blogger block party known as T-SQL Tuesday happens. This month it is hosted by Russ Thomas (@SQLJudo), and the topic is, "Calling All Tuners and Gear Heads." I'm going to treat a performance-related problem here, though I do apologize that it might not be fully in line with the guidelines Russ set out in his invitation (I'm not going to use hints, trace flags or plan guides).

At SQLBits last week, I gave a presentation on triggers, and my good friend and fellow MVP Erland Sommarskog happened to attend. At one point I suggested that before creating a new trigger on a table, you should check to see if any triggers already exist, and consider combining the logic instead of adding an additional trigger. My reasons were primarily for code maintainability, but also for performance. Erland asked if I had ever tested to see if there was any additional overhead in having multiple triggers fire for the same action, and I had to admit that, no, I hadn't done anything extensive. So I'm going to do that now.
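Incidentally, that pre-check is easy to do from the catalog views. A minimal sketch (dbo.tr1 is just a stand-in table name) that lists any existing triggers on a table, along with the events they fire on:

```sql
-- List any triggers already defined on a table, and the events they fire on.
-- (dbo.tr1 is just an example target.)
SELECT tr.name, te.type_desc, tr.is_disabled
  FROM sys.triggers AS tr
  INNER JOIN sys.trigger_events AS te
  ON te.[object_id] = tr.[object_id]
  WHERE tr.parent_id = OBJECT_ID(N'dbo.tr1');
```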

In AdventureWorks2014, I created a simple set of tables that basically represent sys.all_objects (~2,700 rows) and sys.all_columns (~9,500 rows). I wanted to measure the effect on the workload of various approaches to updating both tables – essentially you have users updating the columns table, and you use a trigger to update a different column in the same table, and a few columns in the objects table.

  • T1: Baseline: Assume that you can control all data access through a stored procedure; in this case, the updates against both tables can be performed directly, with no need for triggers. (This isn't practical in the real world, because you can't reliably prohibit direct access to the tables.)
  • T2: Single trigger against other table: Assume that you can control the update statement against the affected table and add other columns, but the updates to the secondary table need to be implemented with a trigger. We'll update all three columns with one statement.
  • T3: Single trigger against both tables: In this case, we have a trigger with two statements, one that updates the other column in the affected table, and one that updates all three columns in the secondary table.
  • T4: Single trigger against both tables: Like T3, but this time, we have a trigger with four statements, one that updates the other column in the affected table, and a statement for each column updated in the secondary table. This might be the way it's handled if the requirements are added over time and a separate statement is deemed safer in terms of regression testing.
  • T5: Two triggers: One trigger updates just the affected table; the other uses a single statement to update the three columns in the secondary table. This might be the way it's done if the other triggers aren't noticed or if modifying them is prohibited.
  • T6: Four triggers: One trigger updates just the affected table; the other three update each column in the secondary table. Again, this might be the way it's done if you don't know the other triggers exist, or if you're afraid to touch the other triggers due to regression concerns.

Here is the source data we're dealing with:

-- sys.all_objects:
SELECT * INTO dbo.src FROM sys.all_objects;
CREATE UNIQUE CLUSTERED INDEX x ON dbo.src([object_id]);
GO

-- sys.all_columns:
SELECT * INTO dbo.tr1 FROM sys.all_columns;
CREATE UNIQUE CLUSTERED INDEX x ON dbo.tr1([object_id], column_id);
-- repeat 5 times: tr2, tr3, tr4, tr5, tr6
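Rather than pasting that block five more times, you could script the copies; one way (just a convenience sketch, not part of the measured workload) is a small dynamic SQL loop:

```sql
-- Create dbo.tr2 through dbo.tr6 the same way as dbo.tr1.
DECLARE @i int = 2, @sql nvarchar(max);
WHILE @i <= 6
BEGIN
  SET @sql = N'SELECT * INTO dbo.tr' + CONVERT(nvarchar(10), @i)
    + N' FROM sys.all_columns; '
    + N'CREATE UNIQUE CLUSTERED INDEX x ON dbo.tr' + CONVERT(nvarchar(10), @i)
    + N'([object_id], column_id);';
  EXEC sys.sp_executesql @sql;
  SET @i += 1;
END
```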

Now, for each of the 6 tests, we're going to run our updates 1,000 times, and measure the duration and reads of the workload.

T1: Baseline

This is the scenario where we're lucky enough to avoid triggers (again, not very realistic). In this case, we'll be measuring the reads and duration of this batch. I put /*real*/ into the query text so that I can easily pull the stats for just these statements, and not any statements from within the triggers, since ultimately the metrics roll up to the statements that invoke the triggers. Also note that the actual updates I'm making do not really make any sense, so ignore that I'm setting the collation to the server/instance name and the object's principal_id to the current session's session_id.

UPDATE /*real*/ dbo.tr1 SET name += N'',
  collation_name = @@SERVERNAME
  WHERE name LIKE '%s%';

UPDATE /*real*/ s SET modify_date = GETDATE(), is_ms_shipped = 0, principal_id = @@SPID
  FROM dbo.src AS s
  INNER JOIN dbo.tr1 AS t
  ON s.[object_id] = t.[object_id]
  WHERE t.name LIKE '%s%';

GO 1000

T2: Single Trigger

For this we need the following simple trigger, which only updates dbo.src:

CREATE TRIGGER dbo.tr_tr2
ON dbo.tr2
AFTER UPDATE
AS
BEGIN
  SET NOCOUNT ON;
  UPDATE s SET modify_date = GETDATE(), is_ms_shipped = 0, principal_id = @@SPID
    FROM dbo.src AS s
    INNER JOIN inserted AS i
    ON s.[object_id] = i.[object_id];
END
GO

Then our batch only needs to update the two columns in the primary table:

UPDATE /*real*/ dbo.tr2 SET name += N'', collation_name = @@SERVERNAME
  WHERE name LIKE '%s%';
GO 1000

T3: Single trigger against both tables

For this test, our trigger looks like this:

CREATE TRIGGER dbo.tr_tr3
ON dbo.tr3
AFTER UPDATE
AS
BEGIN
  SET NOCOUNT ON;
  UPDATE t SET collation_name = @@SERVERNAME
    FROM dbo.tr3 AS t
    INNER JOIN inserted AS i
    ON t.[object_id] = i.[object_id];

  UPDATE s SET modify_date = GETDATE(), is_ms_shipped = 0, principal_id = @@SPID
    FROM dbo.src AS s
    INNER JOIN inserted AS i
    ON s.[object_id] = i.[object_id];
END
GO

And now the batch we're testing merely has to update the original column in the primary table; the other one is handled by the trigger:

UPDATE /*real*/ dbo.tr3 SET name += N''
  WHERE name LIKE '%s%';
GO 1000

T4: Single trigger against both tables, four statements

This is just like T3, but now the trigger has four statements:

CREATE TRIGGER dbo.tr_tr4
ON dbo.tr4
AFTER UPDATE
AS
BEGIN
  SET NOCOUNT ON;
  UPDATE t SET collation_name = @@SERVERNAME
    FROM dbo.tr4 AS t
    INNER JOIN inserted AS i
    ON t.[object_id] = i.[object_id];

  UPDATE s SET modify_date = GETDATE()
    FROM dbo.src AS s
    INNER JOIN inserted AS i
    ON s.[object_id] = i.[object_id];

  UPDATE s SET is_ms_shipped = 0
    FROM dbo.src AS s
    INNER JOIN inserted AS i
    ON s.[object_id] = i.[object_id];

  UPDATE s SET principal_id = @@SPID
    FROM dbo.src AS s
    INNER JOIN inserted AS i
    ON s.[object_id] = i.[object_id];
END
GO

The test batch is unchanged:

UPDATE /*real*/ dbo.tr4 SET name += N''
  WHERE name LIKE '%s%';
GO 1000

T5: Two triggers

Here we have one trigger to update the primary table, and one trigger to update the secondary table:

CREATE TRIGGER dbo.tr_tr5_1
ON dbo.tr5
AFTER UPDATE
AS
BEGIN
  SET NOCOUNT ON;
  UPDATE t SET collation_name = @@SERVERNAME
    FROM dbo.tr5 AS t
    INNER JOIN inserted AS i
    ON t.[object_id] = i.[object_id];
END
GO

CREATE TRIGGER dbo.tr_tr5_2
ON dbo.tr5
AFTER UPDATE
AS
BEGIN
  SET NOCOUNT ON;
  UPDATE s SET modify_date = GETDATE(), is_ms_shipped = 0, principal_id = @@SPID
    FROM dbo.src AS s
    INNER JOIN inserted AS i
    ON s.[object_id] = i.[object_id];
END
GO

The test batch is again very basic:

UPDATE /*real*/ dbo.tr5 SET name += N''
  WHERE name LIKE '%s%';
GO 1000

T6: Four triggers

This time we have a trigger for each column that is affected: one on the primary table, and three on the secondary table.

CREATE TRIGGER dbo.tr_tr6_1
ON dbo.tr6
AFTER UPDATE
AS
BEGIN
  SET NOCOUNT ON;
  UPDATE t SET collation_name = @@SERVERNAME
    FROM dbo.tr6 AS t
    INNER JOIN inserted AS i
    ON t.[object_id] = i.[object_id];
END
GO

CREATE TRIGGER dbo.tr_tr6_2
ON dbo.tr6
AFTER UPDATE
AS
BEGIN
  SET NOCOUNT ON;
  UPDATE s SET modify_date = GETDATE()
    FROM dbo.src AS s
    INNER JOIN inserted AS i
    ON s.[object_id] = i.[object_id];
END
GO

CREATE TRIGGER dbo.tr_tr6_3
ON dbo.tr6
AFTER UPDATE
AS
BEGIN
  SET NOCOUNT ON;
  UPDATE s SET is_ms_shipped = 0
    FROM dbo.src AS s
    INNER JOIN inserted AS i
    ON s.[object_id] = i.[object_id];
END
GO

CREATE TRIGGER dbo.tr_tr6_4
ON dbo.tr6
AFTER UPDATE
AS
BEGIN
  SET NOCOUNT ON;
  UPDATE s SET principal_id = @@SPID
    FROM dbo.src AS s
    INNER JOIN inserted AS i
    ON s.[object_id] = i.[object_id];
END
GO

And the test batch:

UPDATE /*real*/ dbo.tr6 SET name += N''
  WHERE name LIKE '%s%';
GO 1000
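One caveat worth keeping in mind for T5 and T6: when multiple AFTER triggers exist for the same action, SQL Server makes no guarantee about the order in which they fire; you can only pin the first and last via sp_settriggerorder. For example (using the T6 trigger names above):

```sql
-- Make dbo.tr_tr6_1 fire first among the AFTER UPDATE triggers on dbo.tr6;
-- the relative order of the remaining triggers is still undefined.
EXEC sys.sp_settriggerorder
  @triggername = N'dbo.tr_tr6_1',
  @order       = N'First',
  @stmttype    = N'UPDATE';
```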

Measuring workload impact

Finally, I wrote a simple query against sys.dm_exec_query_stats to measure reads and duration for each test:

SELECT 
  [cmd] = SUBSTRING(t.text, CHARINDEX(N'U', t.text), 23), 
  avg_elapsed_time = 1.0 * total_elapsed_time / execution_count,
  total_logical_reads
FROM sys.dm_exec_query_stats AS s 
CROSS APPLY sys.dm_exec_sql_text(s.sql_handle) AS t
WHERE t.text LIKE N'%UPDATE /*real*/%'
ORDER BY cmd;
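Note that sys.dm_exec_query_stats accumulates across runs, so if you repeat a test you'll want to start from a clean slate. On a dedicated test instance (never in production), one way is to clear the plan cache, which resets the DMV as well:

```sql
-- Test instance only: flush the plan cache so that
-- sys.dm_exec_query_stats starts from zero for the next run.
DBCC FREEPROCCACHE;
```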

Results

I ran the tests 10 times, collected the results, and averaged everything. Here is how it broke down:

Test/Batch                      Average Duration (microseconds)   Total Reads (8K pages)
T1: UPDATE /*real*/ dbo.tr1 …   22,608                            205,134
T2: UPDATE /*real*/ dbo.tr2 …   32,749                            11,331,628
T3: UPDATE /*real*/ dbo.tr3 …   72,899                            22,838,308
T4: UPDATE /*real*/ dbo.tr4 …   78,372                            44,463,275
T5: UPDATE /*real*/ dbo.tr5 …   88,563                            41,514,778
T6: UPDATE /*real*/ dbo.tr6 …   127,079                           100,330,753

And here is a graphical representation of the duration:

[Chart: Average Duration (microseconds), by test]

Conclusion

It is clear that, in this case, there is some substantial overhead for each trigger that gets invoked – all of these batches ultimately affected the same number of rows, but in some cases the same rows were touched multiple times. I will probably perform further follow-on testing to measure the difference when the same row is never touched more than once – a more complicated schema, perhaps, where 5 or 10 other tables have to be touched every time, and these different statements could be in a single trigger or in multiple. My guess is that the overhead differences will be driven more by things like concurrency and the number of rows affected than by the overhead of the trigger itself – but we shall see.

Want to try the demo yourself? Download the script here.