In the first post in this series, I showed the analysis I used to determine that the default trace is not for us. While looking at what information we actually needed to collect in its place (file size changes), and how this should be exposed to users, I considered the following points about the default trace:
- it only captures automatic events;
- it does not capture the "culprit" batch that caused the event, unless you're lucky enough that it was also captured for another reason (e.g. DDL); and,
- it captures events using local time (our servers are Eastern and obey DST).
In its defense, it does capture a lot of important information about those automatic events. After we disable the default trace, we may still want to review events that happened before the change and were captured in those files. But once the default trace is disabled, the row no longer exists in sys.traces
, so you can't determine the path to the .trc
files from there. Here is where the inflexibility of the default trace actually provides a benefit: the files are hardcoded to reside in the same folder as SERVERPROPERTY(N'ErrorLogFileName')
. So, even if the default trace is disabled, we can still pull data from the files using the following query (with adjustments to show the data in UTC):
;WITH dst AS
(
SELECT s,e,d
FROM (VALUES ('20190310','20191103',240),('20191103','20200308',300),
('20200308','20201101',240),('20201101','20210314',300),
('20210314','20211107',240)) AS dst(s,e,d)
), -- will add 2022, 2023, etc. later (if DST is still a thing then)
p AS
(
SELECT TracePath = REVERSE(SUBSTRING(p, CHARINDEX(N'\', p), 260)) + N'log.trc' FROM
(SELECT REVERSE((CONVERT(nvarchar(max), SERVERPROPERTY(N'ErrorLogFileName'))))) AS s(p)
),
trc AS
(
SELECT src = 'trc',
t.DatabaseName,
t.[FileName],
DurationSeconds = CONVERT(decimal(18,3),Duration/1000000.0),
StartTimeUTC = DATEADD(MINUTE, COALESCE(st1.d,0), t.StartTime),
EndTimeUTC = DATEADD(MINUTE, COALESCE(st2.d,0), t.EndTime),
FileType = CASE t.EventClass WHEN 92 THEN 'Data' WHEN 93 THEN 'Log' END,
Culprit = TextData,
IsAutomatic = 'true',
ChangeMB = CONVERT(bigint, IntegerData)*8/1024,
Principal = t.LoginName,
t.HostName,
App = CASE WHEN ApplicationName LIKE N'%Management Studio%Query%'
THEN N'SSMS - Query Window'
WHEN ApplicationName LIKE N'%Management Studio%'
THEN N'SSMS - GUI!'
ELSE ApplicationName END
FROM p CROSS APPLY sys.fn_trace_gettable(p.TracePath, DEFAULT) AS t
LEFT OUTER JOIN dst AS st1 ON t.StartTime >= DATEADD(HOUR,2,st1.s)
AND t.StartTime < DATEADD(HOUR,2,st1.e)
LEFT OUTER JOIN dst AS st2 ON t.EndTime >= DATEADD(HOUR,2,st2.s)
AND t.EndTime < DATEADD(HOUR,2,st2.e)
WHERE t.EventClass IN (92,93)
)
SELECT *
FROM trc
ORDER BY StartTimeUTC DESC;
Sample results from one server, where there were definitely some manual and automatic events that happened (click to enlarge):
Writing a replacement
The Extended Events session I formulated to replace this, which would also capture manual file size changes and the query text that caused automatic events, is as follows:
CREATE EVENT SESSION FileSizeChanges ON SERVER
ADD EVENT sqlserver.database_file_size_change
(
ACTION
(
sqlserver.sql_text,
sqlserver.client_app_name,
sqlserver.client_hostname,
sqlserver.username,
sqlserver.server_principal_name
)
)
ADD TARGET package0.event_file
(
SET filename = N'W:\SomePath\FileSizeChanges.xel',
MAX_FILE_SIZE = 100, -- MB
MAX_ROLLOVER_FILES = 100 -- 100 MB x 100 = max 10 GB
)
WITH
(
MAX_MEMORY = 8192 KB,
EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
MAX_DISPATCH_LATENCY = 30 SECONDS,
MAX_EVENT_SIZE = 0 KB,
TRACK_CAUSALITY = OFF,
STARTUP_STATE = ON
);
ALTER EVENT SESSION FileSizeChanges ON SERVER STATE = START;
While it looks like username
and server_principal_name
may be redundant, I actually found cases where the latter was NULL
(and it seems this problem has been around for a while).
Checking the results
I enabled that session on February 22nd, so it is missing the events the default trace captured on the 12th, but it captured more than the single autogrowth event from the 23rd. I ran the following query to get a result of the same shape as above:
;WITH FileInfo(XEPath) AS
(
SELECT LEFT(BasePath,COALESCE(NULLIF(CHARINDEX(SessionName,BasePath)-1,-1),0))
+ SessionName + N'*.xel'
FROM
(
SELECT xmlsrc.data.value(N'(@name)[1]', N'nvarchar(max)'), SessionName
FROM
(
SELECT CONVERT(xml,target_data), s.[name]
FROM sys.dm_xe_session_targets AS t
INNER JOIN sys.dm_xe_sessions AS s
ON s.[address] = t.event_session_address
WHERE s.[name] = N'FileSizeChanges'
) AS xefile (TargetData, SessionName)
CROSS APPLY TargetData.nodes(N'//EventFileTarget/File') AS xmlsrc(data)
) AS InnerData(BasePath, SessionName)
),
SessionData(EventData) AS
(
SELECT CONVERT(xml, TargetData.event_data) FROM FileInfo
CROSS APPLY sys.fn_xe_file_target_read_file(FileInfo.XEPath, NULL, NULL, NULL) AS TargetData
),
src AS
(
SELECT
EndTimeUTC = x.d.value(N'(@timestamp)[1]', N'datetime2'),
DatabaseID = x.d.value(N'(data [@name="database_id"]/value)[1]', N'int'),
[FileName] = x.d.value(N'(data [@name="file_name"]/value)[1]', N'sysname'),
Duration = x.d.value(N'(data [@name="duration"]/value)[1]', N'int'),
FileType = x.d.value(N'(data [@name="file_type"]/text)[1]', N'varchar(4)'),
Culprit = x.d.value(N'(action[@name="sql_text"]/value)[1]', N'nvarchar(max)'),
IsAutomatic = x.d.value(N'(data [@name="is_automatic"]/value)[1]', N'varchar(5)'),
ChangeKB = x.d.value(N'(data [@name="size_change_kb"]/value)[1]', N'bigint'),
Principal = x.d.value(N'(action[@name="server_principal_name"]/value)[1]', N'sysname'),
username = x.d.value(N'(action[@name="username"]/value)[1]', N'sysname'),
AppName = x.d.value(N'(action[@name="client_app_name"]/value)[1]', N'sysname'),
HostName = x.d.value(N'(action[@name="client_hostname"]/value)[1]', N'sysname')
FROM SessionData CROSS APPLY EventData.nodes(N'/event') AS x(d)
)
SELECT
src = 'xe',
DatabaseName = DB_NAME(DatabaseID),
[FileName],
DurationSeconds = CONVERT(decimal(18,3), Duration/1000000.0),
StartTimeUTC = DATEADD(MICROSECOND, -Duration, EndTimeUTC),
EndTimeUTC,
FileType,
Culprit,
IsAutomatic,
ChangeMB = CONVERT(decimal(18,3), ChangeKB / 1024.0),
Principal = COALESCE([Principal], COALESCE(NULLIF(username,N''), N'?')),
HostName,
App = CASE WHEN AppName LIKE N'%Management Studio%Query%'
THEN N'SSMS - Query Window'
WHEN AppName LIKE N'%Management Studio%'
THEN N'SSMS - GUI!'
ELSE AppName END
FROM src
ORDER BY StartTimeUTC DESC;
The results show the additional fun I had, including (gasp!) running a "Shrink Database" task from the UI (click to enlarge):
Deploying it everywhere
Confident that I could now get a more complete picture of file size change events on any server, it was time to deploy. I used a CMS query window to first disable the default trace everywhere, and set the show advanced options
back the way I found it:
IF EXISTS
(
SELECT 1 FROM sys.configurations
WHERE name = N'default trace enabled'
AND value_in_use = 1
)
BEGIN
DECLARE @OriginalAdvancedOptions bit =
(
SELECT CONVERT(bit, value_in_use)
FROM sys.configurations
WHERE name = N'show advanced options'
);
IF @OriginalAdvancedOptions = 0 -- need to turn this on if it's not already
BEGIN
EXEC sys.sp_configure @configname = N'show advanced options', @configvalue = 1;
RECONFIGURE WITH OVERRIDE;
END
EXEC sys.sp_configure @configname = N'default trace enabled', @configvalue = 0;
IF @OriginalAdvancedOptions = 0 -- need to set it back (else leave it)
BEGIN
EXEC sys.sp_configure @configname = N'show advanced options', @configvalue = 0;
END
RECONFIGURE WITH OVERRIDE;
END
GO
Then, to create the Extended Event session, I need to use dynamic SQL, because some servers might have different paths for SERVERPROPERTY(N'ErrorLogFileName')
and that argument can't be parameterized.
DECLARE @path nvarchar(max) = (SELECT REVERSE(SUBSTRING(p, CHARINDEX(N'\', p), 4000))
FROM (SELECT REVERSE((CONVERT(nvarchar(max), SERVERPROPERTY(N'ErrorLogFileName'))))) AS s(p));
IF EXISTS (SELECT 1 FROM sys.dm_xe_sessions WHERE name = N'FileSizeChanges')
BEGIN
EXEC sys.sp_executesql N'DROP EVENT SESSION FileSizeChanges ON SERVER;';
END
DECLARE @sql nvarchar(max) = N' CREATE EVENT SESSION FileSizeChanges ON SERVER
ADD EVENT sqlserver.database_file_size_change
(
ACTION
(
sqlserver.sql_text,
sqlserver.client_app_name,
sqlserver.client_hostname,
sqlserver.username,
sqlserver.server_principal_name
)
)
ADD TARGET package0.event_file
(
SET filename = ''' + @path + N'FileSizeChanges.xel'',
MAX_FILE_SIZE = 100, -- MB
MAX_ROLLOVER_FILES = 100 -- 100 MB x 100 = max 10 GB
)
WITH
(
MAX_MEMORY = 8192 KB,
EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
MAX_DISPATCH_LATENCY = 30 SECONDS,
MAX_EVENT_SIZE = 0 KB,
TRACK_CAUSALITY = OFF,
STARTUP_STATE = ON
);
ALTER EVENT SESSION FileSizeChanges ON SERVER STATE = START;';
EXEC sys.sp_executesql @sql;
The proof is in the pudding
I created a mock workload that was intentionally heavy with things that would log events to the default trace, then ran it multiple times with and without the default trace enabled, to show that observer effect can have an impact on the workload.
SELECT [starting] = sysdatetime();
GO
EXEC sys.sp_executesql N'CREATE OR ALTER PROCEDURE dbo.dostuff
AS
BEGIN
SET NOCOUNT ON;
SELECT DISTINCT TOP (1000) object_id, column_id INTO #blat FROM sys.all_columns;
ALTER TABLE #blat ADD CONSTRAINT PK_wahoo PRIMARY KEY (object_id, column_id);
ALTER TABLE #blat ADD CONSTRAINT DF_what DEFAULT(1) FOR object_id;
CREATE INDEX IX_what ON #blat(column_id);
DROP TABLE #blat;
END';
EXEC dbo.dostuff;
CREATE USER smidgen WITHOUT LOGIN;
ALTER ROLE db_owner ADD MEMBER smidgen;
DBCC TRACEON(2861) WITH NO_INFOMSGS;
DBCC TRACEOFF(2861) WITH NO_INFOMSGS;
DROP USER smidgen;
GO 5000
SELECT [finished] = sysdatetime();
GO
Average runtime:
Default trace | Average workload time |
---|---|
Enabled |
147.4s
|
Disabled |
131.6s
|
A 10-11% runtime reduction is certainly not huge in isolation, but it's a big win if you think about the cumulative impact across an entire fleet of 200+ servers.
What's next?
Don't do this yet. We still have to talk about some side effects of disabling the default trace, and create views so that users can easily consume the session data without becoming XQuery experts. Stay tuned!