Paul White

Simple Parameterization and Trivial Plans — Part 4

SentryOne eBooks

In these books, you will find useful, hand-picked articles that will help give insight into some of your most vexing performance problems. These articles were written by several of the SQL Server industry’s leading experts, including Paul White, Paul Randal, Jonathan Kehayias, Erin Stellato, Glenn Berry, Aaron Bertrand, and Joe Sack.

Free Download

Featured Author

Jonathan Kehayias is a Principal Consultant with SQLskills and the youngest MCM ever.

Jonathan’s Posts

[ This series:  Part 1 | Part 2 | Part 3 | Part 4 | Part 5 | Part 6 ]

The Compilation Process

The most important things to understand about server-side parameterization are it doesn’t happen all at once and a final decision to parameterize isn’t made until the end of the process.

Multiple compilation components are involved in successfully turning a SQL statement containing constants into a parameterized prepared statement with typed parameters. Each component is active at a different time during compilation and performs its own parameterization-related tasks using the information available at that time.

The scope and timing of each component’s activity explain the curious things seen in the previous parts of this series. Even when the parameterization attempt isn’t successful, there can still be effects from preparations for parameterization visible in the final execution plan.

There are a number of details to explore to gain a complete understanding. Let’s follow the main compilation stages noting the effects on parameterization as we go.

As in previous parts, most code examples use the Stack Overflow 2010 database on SQL Server 2019 CU 16 with the following additional nonclustered index:

CREATE INDEX [IX dbo.Users Reputation (DisplayName)] 
ON dbo.Users (Reputation) 
INCLUDE (DisplayName);

Database compatibility is set to 150, and the cost threshold for parallelism is set to 50 to avoid parallelism for the time being:

ALTER DATABASE StackOverflow2010
SET COMPATIBILITY_LEVEL = 150;
GO
EXECUTE sys.sp_configure
    @configname = 'show advanced options',
    @configvalue = 1;
RECONFIGURE;
GO
EXECUTE sys.sp_configure
    @configname = 'cost threshold for parallelism',
    @configvalue = 50;
RECONFIGURE;

Statement Parsing

The parser identifies constants in the statement and marks them as potential parameters if they are in an allowable context. For example, the constants in TOP (50) and CONVERT(varchar(11, x) are not parameterizable, but the literal values in WHERE x = 123 and WHERE y IN (3, 8) are.

Decisions about parameterization at this early stage are generic and conservative because the only information available is the statement itself. Names of tables and columns haven’t been resolved to database objects yet, and no type information is available.

Imagine someone handing you a query written for an unknown SQL Server database and asking you to identify parameterizable constants. That’s roughly the task facing the parser.

Parsing marks constants as potential parameters unless the current clause context forbids it, accommodating both simple and forced parameterization rules.

For example, AND x = 100 + 50 is not acceptable to forced parameterization, but simple parameterization allows it, so the parser marks both constants as potential parameters.

In contrast, the earlier example WHERE y IN (3, 8) is acceptable to forced parameterization, but not simple. Again, the parser marks both constants as potential parameters just in case.

Initial Data Types

Under simple parameterization, constants are assigned an initial data type based on the textual representation (see part two for details). The data type may then be refined depending on the context.

For example, in WHERE x = 5 the constant is initially parsed as an integer because the textual form is not surrounded by quotation marks and has no decimal point. The context is a comparison operator (equals), so the data type is shrunk to a tinyint. This is the smallest integer type able to contain the value 5.

As a second example, in the expression 123 + 456 both constants are initially typed as integer based on the textual representation. Neither is shrunk to a smaller integer subtype because the context is an arithmetical operation, not a comparison. This explains why the constant 7 was typed as an integer rather than tinyint in the arithmetic operators section of part two.

These rules might seem odd or arbitrary but they were created for SQL Server 7.0 where simple parameterization was a new feature, then called “automatic parameters” or “auto-parameterization”. The engine’s ability to match indexes and reason through implicit conversions has improved markedly since then but the parsing rules remain the same for compatibility.

CAST and CONVERT

In part two I described how these very specific inferred parameter data types could prevent plan reuse—a particular problem with numeric and decimal data types:

Separate prepared statements

An obvious solution would be to provide an explicit type for each constant, but T-SQL doesn’t provide a way to do this for all constant values. The best we can do sometimes is add CAST or CONVERT, but this does not work well with simple parameterization for reasons I will now set out.

In principle, the parser could incorporate the CAST or CONVERT in its parameter data typing decision, but this would require either a call into the expression services component or an early round of constant folding.

Neither of these facilities are available to the parser. It’s simply too early in the process. For example, constant folding expects an operator tree that doesn’t exist yet. All things are possible with enough engineering effort of course, but as things stand the parser cannot use a wrapping CAST or CONVERT to determine the data type of a constant literal.

The end result today is a parameter with the original parser-derived type surrounded by an explicit CAST or CONVERT. This doesn’t solve the problem of plan reuse at all.

Example 1

This is a slightly simplified version of the decimal example from part two. An explicit CONVERT matching the column data type has been added around each constant in an attempt to promote plan reuse:

ALTER DATABASE SCOPED CONFIGURATION 
    CLEAR PROCEDURE_CACHE;
GO
DROP TABLE IF EXISTS dbo.Test;
GO
CREATE TABLE dbo.Test
(
    SomeValue decimal(19,8) NOT NULL
);
GO
SELECT T.SomeValue 
FROM dbo.Test AS T 
WHERE T.SomeValue = CONVERT(decimal(19,8), 1.23);
GO
SELECT T.SomeValue 
FROM dbo.Test AS T 
WHERE T.SomeValue = CONVERT(decimal(19,8), 12.345);
GO
SELECT T.SomeValue 
FROM dbo.Test AS T 
WHERE T.SomeValue = CONVERT(decimal(19,8), 123.4567);
GO

Let’s look at the plan cache:

SELECT
    CP.usecounts,
    CP.objtype,
    ST.[text]
FROM sys.dm_exec_cached_plans AS CP
CROSS APPLY sys.dm_exec_sql_text (CP.plan_handle) AS ST
WHERE 
    ST.[text] NOT LIKE '%dm_exec_cached_plans%'
    AND ST.[text] LIKE '%SomeValue%Test%'
ORDER BY 
    CP.objtype ASC;

It shows a prepared statement for each query:

Separate prepared statements again

This is the same outcome as before we added the CONVERT. The parameter data types are still different, so separate plans are cached and no plan reuse occurs.

Example 2

This is the other example from part two with a CONVERT added to match the integer type of the Reputation column:

ALTER DATABASE SCOPED CONFIGURATION 
    CLEAR PROCEDURE_CACHE;
GO
SELECT U.DisplayName
FROM dbo.Users AS U 
WHERE U.Reputation = CONVERT(integer, 252);
GO
SELECT U.DisplayName
FROM dbo.Users AS U 
WHERE U.Reputation = CONVERT(integer, 25221);
GO
SELECT U.DisplayName
FROM dbo.Users AS U 
WHERE U.Reputation = CONVERT(integer, 252552);
GO

As a reminder, without the CONVERT these statements resulted in three separate prepared cached plans due to the parser’s typing rules:

Different guessed types

Let’s look at the plan cache after running the statements with the CONVERT:

SELECT
    CP.usecounts,
    CP.objtype,
    ST.[text]
FROM sys.dm_exec_cached_plans AS CP
CROSS APPLY sys.dm_exec_sql_text (CP.plan_handle) AS ST
WHERE 
    ST.[text] NOT LIKE '%dm_exec_cached_plans%'
    AND ST.[text] LIKE '%DisplayName%Users%'
ORDER BY 
    CP.objtype ASC;

We see one prepared statement used three times:

enter image description here

This result should surprise you. Didn’t we just establish adding a CAST or CONVERT doesn’t help plan reuse?

This case is slightly different though. The constants in these statements were initially typed by the parser as integer then shrunk to the smallest possible integer subtype (smallint or tinyint) capable of holding the specific value. The shrinking caused different prepared statements without the CONVERT.

Remember from earlier this shrinking happens under simple parameterization only when the parse context is a comparison operator. Without the CONVERT the immediate context for the constant is the equality comparison operator, so shrinking is applied.

With the CONVERT the immediate context is the conversion, which is not a comparison operator so no shrinking occurs. All three constants remain typed as integer resulting in a single prepared statement used three times.

As an aside, notice the explicit CONVERT to integer remains in the prepared statement text even though the parameter is typed as integer.

A CAST or CONVERT isn’t the only operator capable of preventing integer type shrinking by the parser. Anything that gets between the constant and a comparison operator will do the job, as long as the extra item is acceptable for simple parameterization.

For example, we could use FLOOR or ABS around the constant value—but not CEILING. The list of intrinsic functions compatible with simple parameterization is quite limited and undocumented.

Parameterization Attempts

If the parser encounters a syntax element that always prevents simple parameterization it sets a flag so components involved in later stages can avoid wasted effort.

These syntax checks are not exhaustive. For example, the presence of a subquery, TOP clause, or query hint is sufficient to set the flag, but an IN clause, constant-to-constant comparison, or disallowed intrinsic function like LOWER or CEILING is not.

There is a partial list of the syntax elements that prevent simple parameterization in Appendix A of the Microsoft Technical Paper Plan Caching and Recompilation in SQL Server 2012. The list is not complete or maintained, and doesn’t detail why each item excludes parameterization, or at what stage of the compilation process the test is applied.

When the parser decides simple parameterization is impossible, none of the auto-parameterization performance counters mentioned in part 3 are incremented.

In particular, the Auto-Param Attempts/sec counter of the SQL Statistics object is not incremented. This is the primary way to detect a statement with constants was determined unsuitable for simple parameterization by the parser.

If Auto-Param Attempts/sec is incremented, it means the parser was satisfied simple parameterization might succeed. Later components will determine the eventual outcome of the parameterization attempt, either Failed, Safe, or Unsafe. I will cover these details later in this series.

In either case, the parser performs the lightweight work to identify potential parameters and assign an initial data type. Partly this is due to the streaming nature of the parser—it might encounter constants in the token stream before anything that disallows simple parameterization. The work might still prove useful if forced parameterization is active, either at the database level, via a plan guide, or due to undocumented trace flag 144.

End of Part 4

In the next part of this series, I’ll continue the compilation process at the algebrization and normalization stages, showing how these components explain some of the curious things we’ve seen with simple parameterization and trivial plans.

[ This series:  Part 1 | Part 2 | Part 3 | Part 4 | Part 5 | Part 6 ]