The Compilation Process
The most important things to understand about server-side parameterization are it doesn’t happen all at once and a final decision to parameterize isn’t made until the end of the process.
Multiple compilation components are involved in successfully turning a SQL statement containing constants into a parameterized prepared statement with typed parameters. Each component is active at a different time during compilation and performs its own parameterization-related tasks using the information available at that time.
The scope and timing of each component’s activity explain the curious things seen in the previous parts of this series. Even when the parameterization attempt isn’t successful, there can still be effects from preparations for parameterization visible in the final execution plan.
There are a number of details to explore to gain a complete understanding. Let’s follow the main compilation stages noting the effects on parameterization as we go.
As in previous parts, most code examples use the Stack Overflow 2010 database on SQL Server 2019 CU 16 with the following additional nonclustered index:
CREATE INDEX [IX dbo.Users Reputation (DisplayName)] ON dbo.Users (Reputation) INCLUDE (DisplayName);
Database compatibility is set to 150, and the cost threshold for parallelism is set to 50 to avoid parallelism for the time being:
ALTER DATABASE StackOverflow2010 SET COMPATIBILITY_LEVEL = 150; GO EXECUTE sys.sp_configure @configname = 'show advanced options', @configvalue = 1; RECONFIGURE; GO EXECUTE sys.sp_configure @configname = 'cost threshold for parallelism', @configvalue = 50; RECONFIGURE;
The parser identifies constants in the statement and marks them as potential parameters if they are in an allowable context. For example, the constants in
TOP (50) and
CONVERT(varchar(11, x) are not parameterizable, but the literal values in
WHERE x = 123 and
WHERE y IN (3, 8) are.
Decisions about parameterization at this early stage are generic and conservative because the only information available is the statement itself. Names of tables and columns haven’t been resolved to database objects yet, and no type information is available.
Imagine someone handing you a query written for an unknown SQL Server database and asking you to identify parameterizable constants. That’s roughly the task facing the parser.
Parsing marks constants as potential parameters unless the current clause context forbids it, accommodating both simple and forced parameterization rules.
AND x = 100 + 50 is not acceptable to forced parameterization, but simple parameterization allows it, so the parser marks both constants as potential parameters.
In contrast, the earlier example
WHERE y IN (3, 8) is acceptable to forced parameterization, but not simple. Again, the parser marks both constants as potential parameters just in case.
Initial Data Types
Under simple parameterization, constants are assigned an initial data type based on the textual representation (see part two for details). The data type may then be refined depending on the context.
For example, in
WHERE x = 5 the constant is initially parsed as an
integer because the textual form is not surrounded by quotation marks and has no decimal point. The context is a comparison operator (equals), so the data type is shrunk to a
tinyint. This is the smallest integer type able to contain the value 5.
As a second example, in the expression
123 + 456 both constants are initially typed as
integer based on the textual representation. Neither is shrunk to a smaller integer subtype because the context is an arithmetical operation, not a comparison. This explains why the constant 7 was typed as an
integer rather than
tinyint in the arithmetic operators section of part two.
These rules might seem odd or arbitrary but they were created for SQL Server 7.0 where simple parameterization was a new feature, then called “automatic parameters” or “auto-parameterization”. The engine’s ability to match indexes and reason through implicit conversions has improved markedly since then but the parsing rules remain the same for compatibility.
CAST and CONVERT
In part two I described how these very specific inferred parameter data types could prevent plan reuse—a particular problem with
decimal data types:
An obvious solution would be to provide an explicit type for each constant, but T-SQL doesn’t provide a way to do this for all constant values. The best we can do sometimes is add
CONVERT, but this does not work well with simple parameterization for reasons I will now set out.
In principle, the parser could incorporate the
CONVERT in its parameter data typing decision, but this would require either a call into the expression services component or an early round of constant folding.
Neither of these facilities are available to the parser. It’s simply too early in the process. For example, constant folding expects an operator tree that doesn’t exist yet. All things are possible with enough engineering effort of course, but as things stand the parser cannot use a wrapping
CONVERT to determine the data type of a constant literal.
The end result today is a parameter with the original parser-derived type surrounded by an explicit
CONVERT. This doesn’t solve the problem of plan reuse at all.
This is a slightly simplified version of the
decimal example from part two. An explicit
CONVERT matching the column data type has been added around each constant in an attempt to promote plan reuse:
ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE; GO DROP TABLE IF EXISTS dbo.Test; GO CREATE TABLE dbo.Test ( SomeValue decimal(19,8) NOT NULL ); GO SELECT T.SomeValue FROM dbo.Test AS T WHERE T.SomeValue = CONVERT(decimal(19,8), 1.23); GO SELECT T.SomeValue FROM dbo.Test AS T WHERE T.SomeValue = CONVERT(decimal(19,8), 12.345); GO SELECT T.SomeValue FROM dbo.Test AS T WHERE T.SomeValue = CONVERT(decimal(19,8), 123.4567); GO
Let’s look at the plan cache:
SELECT CP.usecounts, CP.objtype, ST.[text] FROM sys.dm_exec_cached_plans AS CP CROSS APPLY sys.dm_exec_sql_text (CP.plan_handle) AS ST WHERE ST.[text] NOT LIKE '%dm_exec_cached_plans%' AND ST.[text] LIKE '%SomeValue%Test%' ORDER BY CP.objtype ASC;
It shows a prepared statement for each query:
This is the same outcome as before we added the
CONVERT. The parameter data types are still different, so separate plans are cached and no plan reuse occurs.
This is the other example from part two with a
CONVERT added to match the
integer type of the Reputation column:
ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE; GO SELECT U.DisplayName FROM dbo.Users AS U WHERE U.Reputation = CONVERT(integer, 252); GO SELECT U.DisplayName FROM dbo.Users AS U WHERE U.Reputation = CONVERT(integer, 25221); GO SELECT U.DisplayName FROM dbo.Users AS U WHERE U.Reputation = CONVERT(integer, 252552); GO
As a reminder, without the
CONVERT these statements resulted in three separate prepared cached plans due to the parser’s typing rules:
Let’s look at the plan cache after running the statements with the
SELECT CP.usecounts, CP.objtype, ST.[text] FROM sys.dm_exec_cached_plans AS CP CROSS APPLY sys.dm_exec_sql_text (CP.plan_handle) AS ST WHERE ST.[text] NOT LIKE '%dm_exec_cached_plans%' AND ST.[text] LIKE '%DisplayName%Users%' ORDER BY CP.objtype ASC;
We see one prepared statement used three times:
This result should surprise you. Didn’t we just establish adding a
CONVERT doesn’t help plan reuse?
This case is slightly different though. The constants in these statements were initially typed by the parser as
integer then shrunk to the smallest possible integer subtype (
tinyint) capable of holding the specific value. The shrinking caused different prepared statements without the
Remember from earlier this shrinking happens under simple parameterization only when the parse context is a comparison operator. Without the
CONVERT the immediate context for the constant is the equality comparison operator, so shrinking is applied.
CONVERT the immediate context is the conversion, which is not a comparison operator so no shrinking occurs. All three constants remain typed as
integer resulting in a single prepared statement used three times.
As an aside, notice the explicit
integer remains in the prepared statement text even though the parameter is typed as
CONVERT isn’t the only operator capable of preventing integer type shrinking by the parser. Anything that gets between the constant and a comparison operator will do the job, as long as the extra item is acceptable for simple parameterization.
For example, we could use
ABS around the constant value—but not
CEILING. The list of intrinsic functions compatible with simple parameterization is quite limited and undocumented.
If the parser encounters a syntax element that always prevents simple parameterization it sets a flag so components involved in later stages can avoid wasted effort.
These syntax checks are not exhaustive. For example, the presence of a subquery,
TOP clause, or query hint is sufficient to set the flag, but an
IN clause, constant-to-constant comparison, or disallowed intrinsic function like
CEILING is not.
There is a partial list of the syntax elements that prevent simple parameterization in Appendix A of the Microsoft Technical Paper Plan Caching and Recompilation in SQL Server 2012. The list is not complete or maintained, and doesn’t detail why each item excludes parameterization, or at what stage of the compilation process the test is applied.
When the parser decides simple parameterization is impossible, none of the auto-parameterization performance counters mentioned in part 3 are incremented.
In particular, the
Auto-Param Attempts/sec counter of the SQL Statistics object is not incremented. This is the primary way to detect a statement with constants was determined unsuitable for simple parameterization by the parser.
Auto-Param Attempts/sec is incremented, it means the parser was satisfied simple parameterization might succeed. Later components will determine the eventual outcome of the parameterization attempt, either
Unsafe. I will cover these details later in this series.
In either case, the parser performs the lightweight work to identify potential parameters and assign an initial data type. Partly this is due to the streaming nature of the parser—it might encounter constants in the token stream before anything that disallows simple parameterization. The work might still prove useful if forced parameterization is active, either at the database level, via a plan guide, or due to undocumented trace flag 144.
End of Part 4
In the next part of this series, I’ll continue the compilation process at the algebrization and normalization stages, showing how these components explain some of the curious things we’ve seen with simple parameterization and trivial plans.