[ This series: Part 1 | Part 2 | Part 3 | Part 4 | Part 5 | Part 6 ]
The Compilation Process
The most important things to understand about server-side parameterization are it doesn’t happen all at once and a final decision to parameterize isn’t made until the end of the process.
Multiple compilation components are involved in successfully turning a SQL statement containing constants into a parameterized prepared statement with typed parameters. Each component is active at a different time during compilation and performs its own parameterization-related tasks using the information available at that time.
The scope and timing of each component’s activity explain the curious things seen in the previous parts of this series. Even when the parameterization attempt isn’t successful, there can still be effects from preparations for parameterization visible in the final execution plan.
There are a number of details to explore to gain a complete understanding. Let’s follow the main compilation stages noting the effects on parameterization as we go.
As in previous parts, most code examples use the Stack Overflow 2010 database on SQL Server 2019 CU 16 with the following additional nonclustered index:
CREATE INDEX [IX dbo.Users Reputation (DisplayName)]
ON dbo.Users (Reputation)
INCLUDE (DisplayName);
Database compatibility is set to 150, and the cost threshold for parallelism is set to 50 to avoid parallelism for the time being:
ALTER DATABASE StackOverflow2010
SET COMPATIBILITY_LEVEL = 150;
GO
EXECUTE sys.sp_configure
@configname = 'show advanced options',
@configvalue = 1;
RECONFIGURE;
GO
EXECUTE sys.sp_configure
@configname = 'cost threshold for parallelism',
@configvalue = 50;
RECONFIGURE;
Statement Parsing
The parser identifies constants in the statement and marks them as potential parameters if they are in an allowable context. For example, the constants in TOP (50)
and CONVERT(varchar(11, x)
are not parameterizable, but the literal values in WHERE x = 123
and WHERE y IN (3, 8)
are.
Decisions about parameterization at this early stage are generic and conservative because the only information available is the statement itself. Names of tables and columns haven’t been resolved to database objects yet, and no type information is available.
Imagine someone handing you a query written for an unknown SQL Server database and asking you to identify parameterizable constants. That’s roughly the task facing the parser.
Parsing marks constants as potential parameters unless the current clause context forbids it, accommodating both simple and forced parameterization rules.
For example, AND x = 100 + 50
is not acceptable to forced parameterization, but simple parameterization allows it, so the parser marks both constants as potential parameters.
In contrast, the earlier example WHERE y IN (3, 8)
is acceptable to forced parameterization, but not simple. Again, the parser marks both constants as potential parameters just in case.
Initial Data Types
Under simple parameterization, constants are assigned an initial data type based on the textual representation (see part two for details). The data type may then be refined depending on the context.
For example, in WHERE x = 5
the constant is initially parsed as an integer
because the textual form is not surrounded by quotation marks and has no decimal point. The context is a comparison operator (equals), so the data type is shrunk to a tinyint
. This is the smallest integer type able to contain the value 5.
As a second example, in the expression 123 + 456
both constants are initially typed as integer
based on the textual representation. Neither is shrunk to a smaller integer subtype because the context is an arithmetical operation, not a comparison. This explains why the constant 7 was typed as an integer
rather than tinyint
in the arithmetic operators section of part two.
These rules might seem odd or arbitrary but they were created for SQL Server 7.0 where simple parameterization was a new feature, then called “automatic parameters” or “auto-parameterization”. The engine’s ability to match indexes and reason through implicit conversions has improved markedly since then but the parsing rules remain the same for compatibility.
CAST and CONVERT
In part two I described how these very specific inferred parameter data types could prevent plan reuse—a particular problem with numeric
and decimal
data types:
An obvious solution would be to provide an explicit type for each constant, but T-SQL doesn’t provide a way to do this for all constant values. The best we can do sometimes is add CAST
or CONVERT
, but this does not work well with simple parameterization for reasons I will now set out.
In principle, the parser could incorporate the CAST
or CONVERT
in its parameter data typing decision, but this would require either a call into the expression services component or an early round of constant folding.
Neither of these facilities are available to the parser. It’s simply too early in the process. For example, constant folding expects an operator tree that doesn’t exist yet. All things are possible with enough engineering effort of course, but as things stand the parser cannot use a wrapping CAST
or CONVERT
to determine the data type of a constant literal.
The end result today is a parameter with the original parser-derived type surrounded by an explicit CAST
or CONVERT
. This doesn’t solve the problem of plan reuse at all.
Example 1
This is a slightly simplified version of the decimal
example from part two. An explicit CONVERT
matching the column data type has been added around each constant in an attempt to promote plan reuse:
ALTER DATABASE SCOPED CONFIGURATION
CLEAR PROCEDURE_CACHE;
GO
DROP TABLE IF EXISTS dbo.Test;
GO
CREATE TABLE dbo.Test
(
SomeValue decimal(19,8) NOT NULL
);
GO
SELECT T.SomeValue
FROM dbo.Test AS T
WHERE T.SomeValue = CONVERT(decimal(19,8), 1.23);
GO
SELECT T.SomeValue
FROM dbo.Test AS T
WHERE T.SomeValue = CONVERT(decimal(19,8), 12.345);
GO
SELECT T.SomeValue
FROM dbo.Test AS T
WHERE T.SomeValue = CONVERT(decimal(19,8), 123.4567);
GO
Let’s look at the plan cache:
SELECT
CP.usecounts,
CP.objtype,
ST.[text]
FROM sys.dm_exec_cached_plans AS CP
CROSS APPLY sys.dm_exec_sql_text (CP.plan_handle) AS ST
WHERE
ST.[text] NOT LIKE '%dm_exec_cached_plans%'
AND ST.[text] LIKE '%SomeValue%Test%'
ORDER BY
CP.objtype ASC;
It shows a prepared statement for each query:
This is the same outcome as before we added the CONVERT
. The parameter data types are still different, so separate plans are cached and no plan reuse occurs.
Example 2
This is the other example from part two with a CONVERT
added to match the integer
type of the Reputation column:
ALTER DATABASE SCOPED CONFIGURATION
CLEAR PROCEDURE_CACHE;
GO
SELECT U.DisplayName
FROM dbo.Users AS U
WHERE U.Reputation = CONVERT(integer, 252);
GO
SELECT U.DisplayName
FROM dbo.Users AS U
WHERE U.Reputation = CONVERT(integer, 25221);
GO
SELECT U.DisplayName
FROM dbo.Users AS U
WHERE U.Reputation = CONVERT(integer, 252552);
GO
As a reminder, without the CONVERT
these statements resulted in three separate prepared cached plans due to the parser’s typing rules:
Let’s look at the plan cache after running the statements with the CONVERT
:
SELECT
CP.usecounts,
CP.objtype,
ST.[text]
FROM sys.dm_exec_cached_plans AS CP
CROSS APPLY sys.dm_exec_sql_text (CP.plan_handle) AS ST
WHERE
ST.[text] NOT LIKE '%dm_exec_cached_plans%'
AND ST.[text] LIKE '%DisplayName%Users%'
ORDER BY
CP.objtype ASC;
We see one prepared statement used three times:
This result should surprise you. Didn’t we just establish adding a CAST
or CONVERT
doesn’t help plan reuse?
This case is slightly different though. The constants in these statements were initially typed by the parser as integer
then shrunk to the smallest possible integer subtype (smallint
or tinyint
) capable of holding the specific value. The shrinking caused different prepared statements without the CONVERT
.
Remember from earlier this shrinking happens under simple parameterization only when the parse context is a comparison operator. Without the CONVERT
the immediate context for the constant is the equality comparison operator, so shrinking is applied.
With the CONVERT
the immediate context is the conversion, which is not a comparison operator so no shrinking occurs. All three constants remain typed as integer
resulting in a single prepared statement used three times.
As an aside, notice the explicit CONVERT
to integer
remains in the prepared statement text even though the parameter is typed as integer
.
A CAST
or CONVERT
isn’t the only operator capable of preventing integer type shrinking by the parser. Anything that gets between the constant and a comparison operator will do the job, as long as the extra item is acceptable for simple parameterization.
For example, we could use FLOOR
or ABS
around the constant value—but not CEILING
. The list of intrinsic functions compatible with simple parameterization is quite limited and undocumented.
Parameterization Attempts
If the parser encounters a syntax element that always prevents simple parameterization it sets a flag so components involved in later stages can avoid wasted effort.
These syntax checks are not exhaustive. For example, the presence of a subquery, TOP
clause, or query hint is sufficient to set the flag, but an IN
clause, constant-to-constant comparison, or disallowed intrinsic function like LOWER
or CEILING
is not.
There is a partial list of the syntax elements that prevent simple parameterization in Appendix A of the Microsoft Technical Paper Plan Caching and Recompilation in SQL Server 2012. The list is not complete or maintained, and doesn’t detail why each item excludes parameterization, or at what stage of the compilation process the test is applied.
When the parser decides simple parameterization is impossible, none of the auto-parameterization performance counters mentioned in part 3 are incremented.
In particular, the Auto-Param Attempts/sec
counter of the SQL Statistics object is not incremented. This is the primary way to detect a statement with constants was determined unsuitable for simple parameterization by the parser.
If Auto-Param Attempts/sec
is incremented, it means the parser was satisfied simple parameterization might succeed. Later components will determine the eventual outcome of the parameterization attempt, either Failed
, Safe
, or Unsafe
. I will cover these details later in this series.
In either case, the parser performs the lightweight work to identify potential parameters and assign an initial data type. Partly this is due to the streaming nature of the parser—it might encounter constants in the token stream before anything that disallows simple parameterization. The work might still prove useful if forced parameterization is active, either at the database level, via a plan guide, or due to undocumented trace flag 144.
End of Part 4
In the next part of this series, I’ll continue the compilation process at the algebrization and normalization stages, showing how these components explain some of the curious things we’ve seen with simple parameterization and trivial plans.
[ This series: Part 1 | Part 2 | Part 3 | Part 4 | Part 5 | Part 6 ]
Hi Paul, Interesting point regarding cast and convert. Are cast and convert therefore reliable when it comes to comparison predicate being SARG-able, to ensure the index is used based on a specific data type? At which point is it determined?
Hi Kamil,
Whether an index can be used to support a predicate depends (among other things) on the final types being compared, so any
CAST
orCONVERT
contributes.The point about the inferred parameter data types primarily affects how many plans are cached.
Cheers,
Paul