Can comments hamper stored procedure performance? - SQLPerformance.com
SentryOne - SQL Sentry
Nov 092016
 

Every once in a while, a conversation crops up where people are convinced that comments either do or don't have an impact on performance.

In general, I will say that, no, comments do not impact performance, but there is always room for an "it depends" disclaimer. Let's create a sample database and a table full of junk:

CREATE DATABASE CommentTesting;
GO
USE CommentTesting;
GO
SELECT TOP (1000) n = NEWID(), * INTO dbo.SampleTable 
  FROM sys.all_columns ORDER BY NEWID();
GO
CREATE UNIQUE CLUSTERED INDEX x ON dbo.SampleTable(n);
GO

Now, I want to create four stored procedures – one with 20 characters of comments, one with 2000, one with 20,000, and one with 200,000. And I want to do that again where the comments are embedded *within* a query statement within the procedure, as opposed to being independent (which will have an effect on the plan XML). Finally, I repeated the process adding OPTION (RECOMPILE) to the query.

DECLARE @comments nvarchar(max) = N'', 
        @basesql  nvarchar(max),
        @sql      nvarchar(max);
 
SELECT TOP (5000) -- * 40 character strings
  @comments += N'--' + RTRIM(NEWID()) + CHAR(13) + CHAR(10)
FROM sys.all_columns;
 
SET @basesql = N'CREATE PROCEDURE dbo.$name$
AS
BEGIN
  SET NOCOUNT ON;
 
  /* $comments1$ */
 
  DECLARE @x int;
  SELECT @x = COUNT(*) /* $comments2$ */ FROM dbo.SampleTable OPTION (RECOMPILE);
END';
 
SET @sql = REPLACE(REPLACE(@basesql, N'$name$', N'Small_Separate'),      N'$comments1$', LEFT(@comments, 20));
EXEC sys.sp_executesql @sql;
 
SET @sql = REPLACE(REPLACE(@basesql, N'$name$', N'Medium_Separate'),     N'$comments1$', LEFT(@comments, 2000));
EXEC sys.sp_executesql @sql;
 
SET @sql = REPLACE(REPLACE(@basesql, N'$name$', N'Large_Separate'),      N'$comments1$', LEFT(@comments, 20000));
EXEC sys.sp_executesql @sql;
 
SET @sql = REPLACE(REPLACE(@basesql, N'$name$', N'ExtraLarge_Separate'), N'$comments1$', LEFT(@comments, 200000));
EXEC sys.sp_executesql @sql;
 
SET @sql = REPLACE(REPLACE(@basesql, N'$name$', N'Small_Embedded'),      N'$comments2$', LEFT(@comments, 20));
EXEC sys.sp_executesql @sql;
 
SET @sql = REPLACE(REPLACE(@basesql, N'$name$', N'Medium_Embedded'),     N'$comments2$', LEFT(@comments, 2000));
EXEC sys.sp_executesql @sql;
 
SET @sql = REPLACE(REPLACE(@basesql, N'$name$', N'Large_Embedded'),      N'$comments2$', LEFT(@comments, 20000));
EXEC sys.sp_executesql @sql;
 
SET @sql = REPLACE(REPLACE(@basesql, N'$name$', N'ExtraLarge_Embedded'), N'$comments2$', LEFT(@comments, 200000));
EXEC sys.sp_executesql @sql;

Now, I needed to generate the code to run each procedure 100,000 times, measure the duration from sys.dm_exec_procedure_stats, and also check the size of the plan in cache.

DECLARE @hammer nvarchar(max) = N'';
 
SELECT @hammer += N'
DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;
GO
EXEC dbo.' + [name] + N';
GO 100000
 
SELECT [size of ' + [name] + ' (b)] = DATALENGTH(definition)
  FROM sys.sql_modules
  WHERE [object_id] = ' + CONVERT(varchar(32),([object_id])) + N';
 
SELECT [size of ' + [name] + ' (b)] = size_in_bytes
  FROM sys.dm_exec_cached_plans AS p
  CROSS APPLY sys.dm_exec_sql_text(p.plan_handle) AS t
  WHERE t.objectid = ' + CONVERT(varchar(32),([object_id])) + N';
 
SELECT N''' + [name] + N''', 
  avg_dur = total_elapsed_time*1.0/execution_count
  FROM sys.dm_exec_procedure_stats
  WHERE [object_id] = ' + CONVERT(varchar(32),([object_id])) + N';'
FROM sys.procedures
WHERE [name] LIKE N'%[_]Separate' OR [name] LIKE N'%[_]Embedded';
 
PRINT @hammer;

First, let's look at the size of the procedure bodies. No surprises here, just confirming that my construction code above generated the expected size of comments in each procedure:

Procedure Size (bytes)
Small_Separate / Small_Embedded 378
Medium_Separate / Medium_Embedded 4,340
Large_Separate / Large_Separate 40,338
ExtraLarge_Separate / ExtraLarge_Separate 400,348

 
Next, how large were the plans in the cache?

Procedure Size (bytes)
Small_Separate / Small_Embedded 40,360
Medium_Separate / Medium_Embedded 40,360
Large_Separate / Large_Separate 40,360
ExtraLarge_Separate / ExtraLarge_Separate 40,360

 
Finally, what was the performance like? Without OPTION (RECOMPILE), here is the average execution time, in milliseconds – pretty consistent across all procedures:

Average duration (milliseconds) - without OPTION (RECOMPILE)
Average duration (milliseconds) – without OPTION (RECOMPILE)

With statement-level OPTION (RECOMPILE), we can see about a 50% hit in average duration across the board compared to no recompilation, but still pretty even:

Average duration (milliseconds) - with OPTION (RECOMPILE)
Average duration (milliseconds) – with OPTION (RECOMPILE)

In both cases, while the OPTION (RECOMPILE) version generally ran slower, there was virtually ZERO difference in runtime, regardless of comment size in the procedure body.

What about higher compilation costs?

Next, I wanted to see if these large comments would have a huge impact on compile costs, for example if the procedures were created WITH RECOMPILE. The construction code above was easy to change to account for this. But in this case, I couldn't rely on sys.dm_exec_procedure_stats, because this doesn't work for procedures WITH RECOMPILE. So my generation code for the test was a little different, since I'd have to track average duration manually:

DECLARE @hammer nvarchar(max) = N'';
 
SELECT @hammer += N'
DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;
SELECT SYSDATETIME();
GO
EXEC dbo.' + [name] + N';
GO 100000
SELECT SYSDATETIME();';
 
PRINT @hammer;

In this case, I couldn't check the size of the plans in cache, but I was able to determine average runtime of the procedures, and there was a difference based on comment size (or, perhaps, just procedure body size):

Average duration (milliseconds) - WITH RECOMPILE at procedure level
Average duration (milliseconds) – WITH RECOMPILE at procedure level

If we put them all together on a graph, it's clear how much more expensive the WITH RECOMPILE usage can be:

Average duration (milliseconds) - comparing all three methods
Average duration (milliseconds) – comparing all three methods

I will probably take a closer look at this at a later time to see exactly where that hockey stick comes into play – I envision testing in 10,000-character increments. For now, though, I'm pretty satisfied that I have answered the question.

Summary

Comments seem to be completely unrelated to actual, observable stored procedure performance, except in the case where the procedure is defined WITH RECOMPILE. Personally, I don't see this being used in the wild anymore, but YMMV. For the subtle differences between this option and statement-level OPTION (RECOMPILE), see Paul White's article, "Parameter Sniffing, Embedding, and the RECOMPILE Options."

Personally, I think comments can be extremely valuable for anyone who has to review, maintain, or troubleshoot your code. This includes future you. I highly recommend against worrying about the performance impact of a reasonable amount of comments, and instead focus on prioritizing the usefulness of context that the comments provide. As someone on Twitter said, there is a limit. If your comments amount to the abridged version of War and Peace, you might consider – at the risk of decoupling the code from its documentation – putting that documentation elsewhere, and reference the link in the procedure body comments.

To minimize the risk of decoupling, or the documentation and code otherwise becoming out of sync over time, you could create a second procedure, with the suffix _documentation or _comments, and putting the comments (or a commented version of the code) there. Maybe put it in a different schema to keep it out of the main sort lists. At least the documentation stays with the database wherever it goes, though it doesn't guarantee it will be maintained. It's unfortunate that a normal procedure can't be created WITH SCHEMABINDING, in which case you could explicitly tie the comment procedure to the source.

  13 Responses to “Can comments hamper stored procedure performance?”

  1. This is a cool fundamentals post! Thanks for sharing.

  2. Wow, that's some perfectionist job:)

  3. Great analyses!

  4. Awesome post! Thanks.

  5. Awesome.

  6. I wonder if the impact occurs when the definition hits the 4k/8k character limit and has to be stored in BLOB?

  7. This is a great analysis that explains it well. Thank you.

  8. Never thought of decoupling comments as described above. Thanks!

  9. Great article, as always!

    I would never expect comments to effect the final query plan. But I would think it would have to effect the total pool space used by that proc. Thus, my concern with excessive comments, and I've seen some extreme examples, is that they indirectly cause more recompiles and other (minor?) performance issues because less pool buffer space is available for executable code, given too much extraneous commenting.

    • Thanks Scott! Remember that the plan cache stores XML versions of the plan itself and a hash of the query text. From everything I can tell, comments are stripped from the query text, which is why you won't see a huge difference… Actually that makes me think that a good follow-on test would be to create a whole bunch of different procedures and see total plan cache impact depending on whether the procedures have small or large comments. I focused here on performance and not plan cache usage.

  10. Is there any performance overhead if you have highly attributed tables and procedures with a lot of extended property meta data ?

  11. Great job! Thank you.

 Leave a Reply

(required)

(required)