Erin Stellato

A Look At DBCC CHECKCONSTRAINTS and I/O

SentryOne eBooks

In these books, you will find useful, hand-picked articles that will help give insight into some of your most vexing performance problems. These articles were written by several of the SQL Server industry’s leading experts, including Paul White, Paul Randal, Jonathan Kehayias, Erin Stellato, Glenn Berry, Aaron Bertrand, and Joe Sack.

Free Download

Featured Author

Erin Stellato is a Principal Consultant with SQLskills and a Microsoft Data Platform MVP.

Erin’s Posts

A common element used in database design is the constraint. Constraints come in a variety of flavors (e.g. default, unique) and enforce the integrity of the column(s) on which they exist. When implemented well, constraints are a powerful component in the design of a database because they prevent data that doesn’t meet set criteria from getting into a database. However, constraints can be violated using commands such as WITH NOCHECK and IGNORE_CONSTRAINTS. In addition, when using the REPAIR_ALLOW_DATA_LOSS option with any DBCC CHECK command to repair database corruption, constraints are not considered.

Consequently, it is possible to have invalid data in the database – either data that doesn’t adhere to a constraint, or data that no longer maintains the expected primary-foreign key relationship. SQL Server includes the DBCC CHECKCONSTRAINTS statement to find data that violates constraints. After any repair option executes, run DBCC CHECKCONSTRAINTS for the entire database to ensure there are no issues, and there may be times when it’s appropriate to run CHECKCONSTRAINTS for a select constraint or a table. Maintaining data integrity is critical, and while it’s not typical to run DBCC CHECKCONSTRAINTS on a scheduled basis to find invalid data, when you do need to run it, it’s a good idea to understand the performance impact it may create.

DBCC CHECKCONSTRAINTS can execute for a single constraint, a table, or the entire database. Like other check commands, it can take substantial time to complete and will consume system resources, particularly for larger databases. Unlike other check commands, CHECKCONSTRAINTS does not use a database snapshot.

With Extended Events we can examine resource usage when we execute DBCC CHECKCONSTRAINTS for the table. To better show the impact, I ran the Create Enlarged AdventureWorks Tables.sql script from Jonathan Kehayias (blog | @SQLPoolBoy) to create larger tables. Jonathan’s script only creates the indexes for the tables, so the statements below are necessary to add a few selected constraints:

USE [AdventureWorks2012];
GO

ALTER TABLE [Sales].[SalesOrderDetailEnlarged]
WITH CHECK ADD CONSTRAINT [FK_SalesOrderDetailEnlarged_SalesOrderHeaderEnlarged_SalesOrderID]
FOREIGN KEY([SalesOrderID])
REFERENCES [Sales].[SalesOrderHeaderEnlarged] ([SalesOrderID])
ON DELETE CASCADE;
GO

ALTER TABLE [Sales].[SalesOrderDetailEnlarged]
WITH CHECK ADD CONSTRAINT [CK_SalesOrderDetailEnlarged_OrderQty]
CHECK (([OrderQty]>(0)))
GO

ALTER TABLE [Sales].[SalesOrderDetailEnlarged]
WITH CHECK ADD CONSTRAINT [CK_SalesOrderDetailEnlarged_UnitPrice]
CHECK (([UnitPrice]>=(0.00)));
GO

ALTER TABLE [Sales].[SalesOrderHeaderEnlarged]
WITH CHECK ADD CONSTRAINT [CK_SalesOrderHeaderEnlarged_DueDate]
CHECK (([DueDate]>=[OrderDate]))
GO

ALTER TABLE [Sales].[SalesOrderHeaderEnlarged]
WITH CHECK ADD CONSTRAINT [CK_SalesOrderHeaderEnlarged_Freight]
CHECK (([Freight]>=(0.00)))
GO

We can verify what constraints exist using sp_helpconstraint:

EXEC sp_helpconstraint '[Sales].[SalesOrderDetailEnlarged]';
GO

checkc1
sp_helpconstraint output

Once the constraints exist, we can compare the resource usage for DBCC CHECKCONSTRAINTS for a single constraint, a table, and the entire database using Extended Events.  First we’ll create a session that simply captures sp_statement_completed events, includes the sql_text action, and sends the output to the ring_buffer:

CREATE EVENT SESSION [Constraint_Performance] ON SERVER
ADD EVENT sqlserver.sp_statement_completed
(
  ACTION(sqlserver.database_id,sqlserver.sql_text)
)
ADD TARGET package0.ring_buffer
(
  SET max_events_limit=(5000)
)
WITH 
(
    MAX_MEMORY=32768 KB, EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
    MAX_DISPATCH_LATENCY=30 SECONDS, MAX_EVENT_SIZE=0 KB,
    MEMORY_PARTITION_MODE=NONE, TRACK_CAUSALITY=OFF, STARTUP_STATE=OFF
);
GO

Next we’ll start the session and run each of the DBCC CHECKCONSTRAINT commands, then output the ring buffer to a temp table to manipulate.  Note that DBCC DROPCLEANBUFFERS executes before each check so that each starts from cold cache, keeping a level testing field.

ALTER EVENT SESSION [Constraint_Performance]
ON SERVER
STATE=START;
GO

USE [AdventureWorks2012];
GO

DBCC DROPCLEANBUFFERS;
GO
DBCC CHECKCONSTRAINTS ('[Sales].[CK_SalesOrderDetailEnlarged_OrderQty]') WITH NO_INFOMSGS;
GO
DBCC DROPCLEANBUFFERS;
GO
DBCC CHECKCONSTRAINTS ('[Sales].[FK_SalesOrderDetailEnlarged_SalesOrderHeaderEnlarged_SalesOrderID]') WITH NO_INFOMSGS;
GO
DBCC DROPCLEANBUFFERS;
GO
DBCC CHECKCONSTRAINTS ('[Sales].[SalesOrderDetailEnlarged]') WITH NO_INFOMSGS;
GO
DBCC DROPCLEANBUFFERS;
GO
DBCC CHECKCONSTRAINTS WITH ALL_CONSTRAINTS, NO_INFOMSGS;
GO

DECLARE @target_data XML;

SELECT @target_data = CAST(target_data AS XML)
  FROM sys.dm_xe_sessions AS s
  INNER JOIN sys.dm_xe_session_targets AS t 
  ON t.event_session_address = s.[address]
  WHERE s.name = N'Constraint_Performance'
  AND t.target_name = N'ring_buffer';

SELECT
  n.value('(@name)[1]', 'varchar(50)') AS event_name,
  DATEADD(HOUR ,DATEDIFF(HOUR, SYSUTCDATETIME(), SYSDATETIME()),n.value('(@timestamp)[1]', 'datetime2')) AS [timestamp],
  n.value('(data[@name="duration"]/value)[1]', 'bigint') AS duration,
  n.value('(data[@name="physical_reads"]/value)[1]', 'bigint') AS physical_reads,
  n.value('(data[@name="logical_reads"]/value)[1]', 'bigint') AS logical_reads,
  n.value('(action[@name="sql_text"]/value)[1]', 'varchar(max)') AS sql_text,
  n.value('(data[@name="statement"]/value)[1]', 'varchar(max)') AS [statement]
INTO #EventData
FROM @target_data.nodes('RingBufferTarget/event[@name=''sp_statement_completed'']') AS q(n);
GO

ALTER EVENT SESSION [Constraint_Performance]
ON SERVER
STATE=STOP;
GO

Parsing the ring_buffer into a temp table may take some additional time (about 20 seconds on my machine), but repeated querying of the data is faster from a temp table than via the ring_buffer.  If we look at the output we see there are several statements executed for each DBCC CHECKCONSTRAINTS:

SELECT *
FROM #EventData
WHERE [sql_text] LIKE 'DBCC%';

Extended Events Output
Extended Events Output

Using Extended Events to dig into the inner workings of CHECKCONSTRAINTS is an interesting task, but what we’re really interested here is resource consumption – specifically I/O.  We can aggregate the physical_reads for each check command to compare the I/O:

SELECT [sql_text], SUM([physical_reads]) AS [Total Reads]
FROM #EventData
WHERE [sql_text] LIKE 'DBCC%'
GROUP BY [sql_text];

Aggregated I/O for Checks
Aggregated I/O for Checks

In order to check a constraint, SQL Server has to read through the data to find any rows that might violate the constraint.  The definition of the CK_SalesOrderDetailEnlarged_OrderQty constraint is [OrderQty] > 0.  The foreign key constraint, FK_SalesOrderDetailEnlarged_SalesOrderHeaderEnlarged_SalesOrderID, establishes a relationship on SalesOrderID between the [Sales].[SalesOrderHeaderEnlarged] and [Sales].[SalesOrderDetailEnlarged] tables. Intuitively it might seem as though checking the foreign key constraint would require more I/O, as SQL Server must read data from two tables.  However, [SalesOrderID] exists in the leaf level of the IX_SalesOrderHeaderEnlarged_SalesPersonID nonclustered index on the [Sales].[SalesOrderHeaderEnlarged] table, and in the IX_SalesOrderDetailEnlarged_ProductID index on the [Sales].[SalesOrderDetailEnlarged] table.  As such, SQL Server scans those two indexes to compare the [SalesOrderID] values between the two tables.  This requires just over 19,000 reads.  In the case of the CK_SalesOrderDetailEnlarged_OrderQty constraint, the [OrderQty] column is not included in any index, so a full scan of the clustered index occurs, which requires over 72,000 reads.

When all the constraints for a table are checked, the I/O requirements are higher than if a single constraint is checked, and they increase again when the entire database is checked.  In the example above, the [Sales].[SalesOrderHeaderEnlarged] and [Sales].[SalesOrderDetailEnlarged] tables are disproportionally larger than other tables in the database.  This is not uncommon in real-world scenarios; very often databases have several large tables which comprise a large portion of the database.  When running CHECKCONSTRAINTS for these tables, be aware of the potential resource consumption required for the check.  Run checks during off hours when possible to minimize user impact.  In the event that checks must be running during normal business hours, understanding what constraints exist, and what indexes exist to support validation, can help gauge the effect of the check.  You can execute checks in a test or development environment first to understand the performance impact, but variations may then exist based on hardware, comparable data, etc.  And finally, remember that any time you run a check command that includes the REPAIR_ALLOW_DATA_LOSS option, follow the repair with DBCC CHECKCONSTRAINTS.  Database repair does not take any constraints into account as corruption is fixed, so in addition to potentially losing data, you may end up with data that violates one or more constraints in your database.