Aaron Bertrand

SQL Server 2016 : sys.dm_exec_function_stats

SentryOne Newsletters

The SQLPerformance.com bi-weekly newsletter keeps you up to speed on the most recent blog posts and forum discussions in the SQL Server community.

eNews is a bi-monthly newsletter with fun information about SentryOne, tips to help improve your productivity, and much more.

Subscribe

Featured Author

Paul Randal, CEO of SQLskills, writes about knee-jerk performance tuning, DBCC, and SQL Server internals.

Paul’s Posts

In SQL Server 2016 CTP 2.1, there is one new object that appeared after CTP 2.0: sys.dm_exec_function_stats. This is intended to provide similar functionality to sys.dm_exec_procedure_stats, sys.dm_exec_query_stats, and sys.dm_exec_trigger_stats. SQL Server 2016 : sys.dm_exec_function_stats So it is now possible to track aggregate runtime metrics for user-defined functions.

Or is it?

In CTP 2.1 at least, I could only derive any meaningful metrics here for regular scalar functions – nothing was registered for inline or multi-statement TVFs. I am not surprised about the inline functions, since they are essentially expanded before execution anyway. But since multi-statement TVFs are often performance problems, I was hoping they would show up too. They do still appear in sys.dm_exec_query_stats, so you can still derive their performance metrics from there, but it can get tricky to perform aggregations when you really do have multiple statements that perform some share of the work – nothing is rolled up for you.

Let's take a quick look at how this plays out. Let's say we have a simple table with 100,000 rows:

SELECT TOP (100000) o1.[object_id], o1.create_date
  INTO dbo.src
  FROM sys.all_objects AS o1
  CROSS JOIN sys.all_objects AS o2
  ORDER BY o1.[object_id];
GO
CREATE CLUSTERED INDEX x ON dbo.src([object_id]);
GO
-- prime the cache
SELECT [object_id], create_date FROM dbo.src;

I wanted to compare what happens when we investigate scalar UDFs, multi-statement table-valued functions, and inline table-valued functions, and how we see what work was done in each case. First, imagine something trivial that we can do in the SELECT clause, but that we may want to compartmentalize away, like formatting a date as a string:

CREATE PROCEDURE dbo.p_dt_Standard
  @dt_ CHAR(10) = NULL
AS
BEGIN
  SET NOCOUNT ON;
  SELECT @dt_ = CONVERT(CHAR(10), create_date, 120)
    FROM dbo.src
    ORDER BY [object_id];
END
GO

(I assign the output to a variable, which forces the entire table to be scanned, but prevents the performance metrics from being influenced by the efforts of SSMS to consume and render the output. Thanks for the reminder, Mikael Eriksson.)

A lot of times you'll see people putting that conversion into a function, and it can be scalar or TVF, like these:

CREATE FUNCTION dbo.dt_Inline(@dt_ DATETIME)
RETURNS TABLE
AS
  RETURN (SELECT dt_ = CONVERT(CHAR(10), @dt_, 120));
GO

CREATE FUNCTION dbo.dt_Multi(@dt_ DATETIME)
RETURNS @t TABLE(dt_ CHAR(10))
AS
BEGIN
  INSERT @t(dt_) SELECT CONVERT(CHAR(10), @dt_, 120);
  RETURN;
END
GO

CREATE FUNCTION dbo.dt_Scalar(@dt_ DATETIME)
RETURNS CHAR(10)
AS
BEGIN
  RETURN (SELECT CONVERT(CHAR(10), @dt_, 120));
END
GO

I created procedure wrappers around these functions as follows:

CREATE PROCEDURE dbo.p_dt_Inline
  @dt_ CHAR(10) = NULL
AS
BEGIN
  SET NOCOUNT ON;
  SELECT @dt_ = dt.dt_
    FROM dbo.src AS o
    CROSS APPLY dbo.dt_Inline(o.create_date) AS dt
    ORDER BY o.[object_id];
END
GO

CREATE PROCEDURE dbo.p_dt_Multi
  @dt_ CHAR(10) = NULL
AS
BEGIN
  SET NOCOUNT ON;
  SELECT @dt_ = dt.dt_
    FROM dbo.src
    CROSS APPLY dbo.dt_Multi(create_date) AS dt
    ORDER BY [object_id];
END
GO

CREATE PROCEDURE dbo.p_dt_Scalar
  @dt_ CHAR(10) = NULL
AS
BEGIN
  SET NOCOUNT ON;
  SELECT @dt_ = dt = dbo.dt_Scalar(create_date)
    FROM dbo.src
    ORDER BY [object_id];
END
GO

(And no, the dt_ convention you're seeing is not some new thing I think is a good idea, it was just the simplest way I could isolate all of these queries in the DMVs from everything else being collected. It also made it easy to append suffixes to easily distinguish between the query inside the stored procedure and the ad hoc version.)

Next, I created a #temp table to store timings, and repeated this process (both executing the stored procedure twice, and executing the body of the procedure as an isolated ad hoc query twice, and tracking the timing of each one):

CREATE TABLE #t
(
  ID INT IDENTITY(1,1), 
  q VARCHAR(32), 
  s DATETIME2, 
  e DATETIME2
);
GO

INSERT #t(q,s) VALUES('p Standard',SYSDATETIME());
GO

EXEC dbo.p_dt_Standard;
GO 2

UPDATE #t SET e = SYSDATETIME() WHERE ID = 1;
GO

INSERT #t(q,s) VALUES('ad hoc Standard',SYSDATETIME());
GO

DECLARE @dt_st CHAR(10);
  SELECT @dt_st = CONVERT(CHAR(10), create_date, 120)
    FROM dbo.src
    ORDER BY [object_id];
GO 2

UPDATE #t SET e = SYSDATETIME() WHERE ID = 2;
GO
-- repeat for inline, multi and scalar versions

Then I ran some diagnostic queries, and here were the results:

sys.dm_exec_function_stats

SELECT name = OBJECT_NAME(object_id), 
  execution_count,
  time_milliseconds = total_elapsed_time/1000
FROM sys.dm_exec_function_stats
WHERE database_id = DB_ID()
ORDER BY name;

Results:

name        execution_count    time_milliseconds
---------   ---------------    -----------------
dt_Scalar   400000             1116

That is not a typo; only the scalar UDF shows any presence in the new DMV.

sys.dm_exec_procedure_stats

SELECT name = OBJECT_NAME(object_id), 
  execution_count,
  time_milliseconds = total_elapsed_time/1000
FROM sys.dm_exec_procedure_stats
WHERE database_id = DB_ID()
ORDER BY name;

Results:

name            execution_count    time_milliseconds
-------------   ---------------    -----------------
p_dt_Inline     2                  74
p_dt_Multi      2                  269
p_dt_Scalar     2                  1063
p_dt_Standard   2                  75

This is not a surprising result: using a scalar function leads to an order-of-magnitude performance penalty, while the multi-statement TVF was only about 4x worse. Over multiple tests, the inline function was always as fast or a millisecond or two faster than no function at all.

sys.dm_exec_query_stats

SELECT 
  query = SUBSTRING([text],s,e), 
  execution_count, 
  time_milliseconds
FROM
(
  SELECT t.[text],
    s = s.statement_start_offset/2 + 1,
    e = COALESCE(NULLIF(s.statement_end_offset,-1),8000)/2,
    s.execution_count,
    time_milliseconds = s.total_elapsed_time/1000
  FROM sys.dm_exec_query_stats AS s
  OUTER APPLY sys.dm_exec_sql_text(s.[sql_handle]) AS t
  WHERE t.[text] LIKE N'%dt[_]%' 
) AS x;

Truncated results, re-ordered manually:

query (truncated)                                                       execution_count    time_milliseconds
--------------------------------------------------------------------    ---------------    -----------------
-- p Standard:
SELECT @dt_ = CONVERT(CHAR(10), create_date, 120) ...                   2                  75
-- ad hoc Standard:
SELECT @dt_st = CONVERT(CHAR(10), create_date, 120) ...                 2                  72

-- p Inline:
SELECT @dt_ = dt.dt_ FROM dbo.src AS o CROSS APPLY dbo.dt_Inline...     2                  74
-- ad hoc Inline:
SELECT @dt_in = dt.dt_ FROM dbo.src AS o CROSS APPLY dbo.dt_Inline...   2                  72

-- all Multi:
INSERT @t(dt_) SELECT CONVERT(CHAR(10), @dt_, 120);                     184                5
-- p Multi:
SELECT @dt_ = dt.dt_ FROM dbo.src CROSS APPLY dbo.dt_Multi...           2                  270
-- ad hoc Multi:
SELECT @dt_m = dt.dt_ FROM dbo.src AS o CROSS APPLY dbo.dt_Multi...     2                  257

-- all scalar:
RETURN (SELECT CONVERT(CHAR(10), @dt_, 120));                           400000             581
-- p Scalar:
SELECT @dt_ = dbo.dt_Scalar(create_date)...                             2                  986
-- ad hoc Scalar:
SELECT @dt_sc = dbo.dt_Scalar(create_date)...                           2                  902

An important thing to note here is that the time in milliseconds for the INSERT in the multi-statement TVF and the RETURN statement in the scalar function are also accounted for within the individual SELECTs, so it does not make sense to just add up all of the timings.

Manual timings

And then finally, the timings from the #temp table:

SELECT query = q, 
    time_milliseconds = DATEDIFF(millisecond, s, e) 
  FROM #t 
  ORDER BY ID;

Results:

query             time_milliseconds
---------------   -----------------
p Standard        107
ad hoc Standard   78
p Inline          80
ad hoc Inline     78
p Multi           351
ad hoc Multi      263
p Scalar          992
ad hoc Scalar     907

Additional interesting results here: the procedure wrapper always had some overhead, though how significant that is might be truly subjective.

Summary

My point here today was merely to show the new DMV in action, and set expectations correctly – some performance metrics for functions will still be misleading, and some will still not be available at all (or at least be very tedious to piece together for yourself).

I do think this new DMV covers one of the biggest pieces of query monitoring that SQL Server was missing before, though: that scalar functions are sometimes invisible performance killers, because the only reliable way to identify their usage was to parse the query text, which is far from foolproof. Never mind the fact that that won't allow you to isolate their impact on performance, or that you'd have to have known to be looking for scalar UDFs in the query text in the first place.

Appendix

I've attached the script: DMExecFunctionStats.zip

Also, as of CTP1, here is the set of columns:

database_id object_id type type_desc
sql_handle plan_handle cached_time last_execution_time execution_count
total_worker_time last_worker_time min_worker_time max_worker_time
total_physical_reads last_physical_reads min_physical_reads max_physical_reads
total_logical_writes last_logical_writes min_logical_writes max_logical_writes
total_logical_reads last_logical_reads min_logical_reads max_logical_reads
total_elapsed_time last_elapsed_time min_elapsed_time max_elapsed_time

Columns currently in sys.dm_exec_function_stats