Query parameterization promotes the reuse of cached execution plans, thereby avoiding unnecessary compilations, and reducing the number of ad-hoc queries in the plan cache. These are all good things, provided the query being parameterized really ought to use the same cached execution plan for different parameter values. An execution plan that is efficient for one parameter value may not be a good choice for other possible parameter values.
When parameter sniffing is enabled (the default), SQL Server chooses an execution plan based on the particular parameter values that exist at compilation time. The implicit assumption is that parameterized statements are most commonly executed with the most common parameter values. This sounds reasonable enough (even obvious) and indeed it often works well.
A problem can occur when an automatic recompilation of the cached plan occurs. A recompilation may be triggered for all sorts of reasons, for example because an index used by the cached plan has been dropped (a correctness recompilation) or because statistical information has changed (an optimality recompile). Whatever the exact cause of the plan recompilation, there is a chance that an atypical value is being passed as a parameter at the time the new plan is generated.
This can result in a new cached plan (based on the sniffed atypical parameter value) that is not good for the majority of executions for which it will be reused. It is not easy to predict when a particular execution plan will be recompiled (for example, because statistics have changed sufficiently) resulting in a situation where a good-quality reusable plan can be suddenly replaced by a quite different plan optimized for atypical parameter values.
One such scenario occurs when the atypical value is highly selective, resulting in a plan optimized for a small number of rows. Such plans will often use single-threaded execution, nested loops joins, and lookups. Serious performance issues can arise when this plan is reused for different parameter values that generate a much larger number of rows.
Disabling Parameter Sniffing
Parameter sniffing can be disabled using the documented and supported trace flag 4136. This trace flag is also supported for per-query use via the QUERYTRACEON hint. Both apply from SQL Server 2005 Service Pack 4 onward (and slightly earlier if you are willing to apply cumulative updates to Service Pack 3).
When parameter sniffing is disabled, SQL Server uses average distribution statistics to choose an execution plan. This also sounds like a reasonable approach (and can help avoid the situation where the plan is optimized for an unusually selective parameter value) but it is not a perfect strategy either. A plan optimized for an 'average' value might well end up being seriously sub-optimal for the commonly-seen parameter values.
Consider an execution plan that contains memory-consuming operators like sorts and hashes. Because memory is reserved before query execution starts, a parameterized plan based on average distribution values can spill to tempdb for common parameter values that produce more data than the optimizer expected. Memory reservations cannot usually grow during query execution, regardless of how much free memory the server may have.
Certain applications do benefit from turning parameter sniffing off (see this post by the Dynamics AX Performance Team for an example). However, for most workloads, disabling parameter sniffing entirely is the wrong solution, and may even be a disaster. Parameter sniffing is a heuristic optimization: it works better than using average values on most systems, most of the time.
SQL Server provides a range of query hints and other options to tune the behaviour of parameter sniffing:
OPTIMIZE FOR (@parameter = value)query hint builds a reusable plan based on a specific value
OPTIMIZE FOR (@parameter UNKNOWN)uses average distribution statistics for a particular parameter
OPTIMIZE FOR UNKNOWNuses average distribution for all parameters (same effect as trace flag 4136)
WITH RECOMPILEstored procedure option compiles a fresh procedure plan for every execution
OPTION (RECOMPILE)query hint compiles a fresh plan for an individual statement
The old technique of "parameter hiding" (assigning procedure parameters to local variables, and referencing the variables instead) has the same effect as specifying
OPTIMIZE FOR UNKNOWN. It can be useful on instances earlier than SQL Server 2008 (the
OPTIMIZE FOR hint was new for 2008).
It could be argued that every parameterized statement should be checked for sensitivity to parameter values, and either left alone (if the default behaviour works well) or explicitly hinted using one of the options above. This is rarely done in practice, partly because performing a comprehensive analysis for all possible parameter values can be time-consuming and requires quite advanced skills.
Most often, no such analysis is performed and parameter-sensitivity problems are addressed as and when they occur in production. This lack of prior analysis is probably one main reason parameter sniffing has a poor reputation. It pays to be aware of the potential for problems to arise, and to perform at least a quick analysis on statements that are likely to cause performance problems when recompiled with an atypical parameter value.
What is a parameter?
Some would say that a
SELECT statement referencing a local variable is a "parameterized statement" of sorts, but that is not the definition SQL Server uses. A reasonable indication that a statement uses parameters can be found by looking at the plan properties (see the Parameters tab in SQL Sentry Plan Explorer, or click the query plan root node in SSMS, open the Properties window, and expand the Parameter List node):
The 'compiled value' shows the sniffed value of the parameter used to compile the cached plan. The 'runtime value' shows the value of the parameter on the particular execution captured in the plan. Either of these properties may be blank or missing in different circumstances. If a query is not parameterized, the properties will simply all be missing.
Just because nothing is ever simple in SQL Server, there are situations where the parameter list can be populated, but the statement is still not parameterized. This can occur when SQL Server attempts simple parameterization (discussed later) but decides the attempt is "unsafe". In that case, parameter markers will be shown but the execution plan is not in fact parameterized.
Sniffing is not just for Stored Procedures
Parameter sniffing also occurs when a batch is explicitly parameterized for reuse using sp_executesql. For example:
EXECUTE sys.sp_executesql N' SELECT P.ProductID, P.Name, TotalQty = SUM(TH.Quantity) FROM Production.Product AS P JOIN Production.TransactionHistory AS TH ON TH.ProductID = P.ProductID WHERE P.Name LIKE @NameLike GROUP BY P.ProductID, P.Name; ', N'@NameLike nvarchar(50)', @NameLike = N'K%';
The optimizer chooses an execution plan based on the sniffed value of the
@NameLike parameter. The parameter value "K%" is estimated to match very few rows in the Product table, so the optimizer chooses a nested loop join and key lookup strategy:
Executing the statement again with a parameter value of "[H-R]%" (which will match many more rows) reuses the cached parameterized plan:
EXECUTE sys.sp_executesql N' SELECT P.ProductID, P.Name, TotalQty = SUM(TH.Quantity) FROM Production.Product AS P JOIN Production.TransactionHistory AS TH ON TH.ProductID = P.ProductID WHERE P.Name LIKE @NameLike GROUP BY P.ProductID, P.Name; ', N'@NameLike nvarchar(50)', @NameLike = N'[H-R]%';
The AdventureWorks sample database is too small to make this a performance disaster, but this plan is certainly not optimal for the second parameter value. We can see the plan the optimizer would have chosen by clearing the plan cache and executing the second query again:
With a larger number of matches expected, the optimizer determines that a hash join and hash aggregation are better strategies.
Parameter sniffing occurs with T-SQL functions as well, though the way execution plans are generated can make this more difficult to see. There are good reasons to avoid T-SQL scalar and multi-statement functions in general, so for educational purposes only, here is a T-SQL multi-statement table-valued function version of our test query:
CREATE FUNCTION dbo.F (@NameLike nvarchar(50)) RETURNS @Result TABLE ( ProductID integer NOT NULL PRIMARY KEY, Name nvarchar(50) NOT NULL, TotalQty integer NOT NULL ) WITH SCHEMABINDING AS BEGIN INSERT @Result SELECT P.ProductID, P.Name, TotalQty = SUM(TH.Quantity) FROM Production.Product AS P JOIN Production.TransactionHistory AS TH ON TH.ProductID = P.ProductID WHERE P.Name LIKE @NameLike GROUP BY P.ProductID, P.Name; RETURN; END;
The following query uses the function to display information for product names starting with 'K':
SELECT Result.ProductID, Result.Name, Result.TotalQty FROM dbo.F(N'K%') AS Result;
Seeing parameter sniffing with an embedded function is more difficult because SQL Server does not return a separate post-execution (actual) query plan for each function invocation. The function could be called many times within a single statement, and users would not be impressed if SSMS tried to display a million function call plans for a single query. As a result of this design decision, the actual plan returned by SQL Server for our test query is not very helpful:
Nevertheless, there are ways to see parameter sniffing in action with embedded functions. The method I have chosen to use here is to inspect the plan cache:
SELECT DEQS.plan_generation_num, DEQS.execution_count, DEQS.last_logical_reads, DEQS.last_elapsed_time, DEQS.last_rows, DEQP.query_plan FROM sys.dm_exec_query_stats AS DEQS CROSS APPLY sys.dm_exec_sql_text(DEQS.plan_handle) AS DEST CROSS APPLY sys.dm_exec_query_plan(DEQS.plan_handle) AS DEQP WHERE DEST.objectid = OBJECT_ID(N'dbo.F', N'TF');
This result shows that the function plan has been executed once, at a cost of 201 logical reads with 2891 microseconds elapsed time, and the most recent execution returned one row. The XML plan returned shows that the parameter value was sniffed:
Now run the statement again, with a different parameter:
SELECT Result.ProductID, Result.Name, Result.TotalQty FROM dbo.F(N'[H-R]%') AS Result;
The post-execution plan just shows that 306 rows were returned by the function:
The plan cache query shows the cached execution plan for the function has been reused (execution_count = 2):
It also shows a much higher number of logical reads, and a longer elapsed time compared with the previous run. This is consistent with reusing a nested loops and lookup plan, but to be completely sure, the post-execution function plan can be captured using Extended Events or the SQL Server Profiler tool:
Because parameter sniffing applies to functions, these modules can suffer from the same unexpected changes in performance commonly associated with stored procedures. For example, the first time a function is referenced, a plan might be cached that does not use parallelism. Subsequent executions with parameter values that would benefit from parallelism (but reuse the cached serial plan) will show unexpectedly poor performance.
This issue can be tricky to identify because SQL Server does not return separate post-execution plans for function calls as we have seen. Using Extended Events or Profiler to routinely capture post-execution plans is extremely resource-intensive in current versions of SQL Server, so it only makes sense to use that technique in a very targeted fashion. The difficulties around debugging function parameter-sensitivity issues mean it is even more worthwhile doing an analysis (and coding defensively) before the function hits production.
Parameter-sniffing works exactly the same way with embedded T-SQL scalar user-defined functions. In-line table-valued functions do not generate a separate execution plan for each invocation, because (as the name says) these are in-lined into the calling query before compilation.
Beware Sniffed NULLs
Clear the plan cache and request an estimated plan for the test query:
SELECT Result.ProductID, Result.Name, Result.TotalQty FROM dbo.F(N'K%') AS Result;
You will see two execution plans, the second of which is for the function call:
A limitation of parameter sniffing with embedded functions in estimated plans means the parameter value is sniffed as
NULL (not "K%"):
In versions of SQL Server before 2012, this plan (optimized for a
NULL parameter) is cached for reuse. This is unfortunate, because
NULL is unlikely to be a representative parameter value, and it was certainly not the value specified in the query. SQL Server 2012 does not cache any plans resulting from an "estimated plan" request, though it will still show a function plan optimized for a
NULL parameter value.
Simple and Forced Parameterization
An ad-hoc T-SQL statement containing constant literal values can be parameterized by SQL Server, either because the query qualifies for simple parameterization or because the database option for forced parameterization is enabled. A statement parameterized in this way is also subject to parameter sniffing. The following query qualifies for simple parameterization:
SELECT A.AddressLine1, A.City, A.PostalCode FROM Person.Address AS A WHERE A.AddressLine1 = N'Heidestieg Straße 8664';
The estimated execution plan shows an estimate of 2.5 rows based on the sniffed parameter value:
In fact the query returns 7 rows (cardinality estimation is not perfect, even where values are sniffed):
At this point you may be wondering where the evidence is that this query was parameterized, and the resulting parameter value sniffed. Run the query a second time with a different value:
SELECT A.AddressLine1, A.City, A.PostalCode FROM Person.Address AS A WHERE A.AddressLine1 = N'Winter der Böck 8550';
The query returns one row:
The execution plan shows the second execution reused the parameterized plan that was compiled using a sniffed value:
Parameterization and sniffing are separate activities
An ad-hoc statement can be parameterized by SQL Server without parameter values being sniffed. To demonstrate, we can use trace flag 4136 to disable parameter sniffing for a batch that will be parameterized by the server:
DBCC FREEPROCCACHE; DBCC TRACEON (4136); GO SELECT A.AddressLine1, A.City, A.PostalCode FROM Person.Address AS A WHERE A.AddressLine1 = N'Heidestieg Straße 8664'; GO SELECT A.AddressLine1, A.City, A.PostalCode FROM Person.Address AS A WHERE A.AddressLine1 = N'Winter der Böck 8550'; GO DBCC TRACEOFF (4136);
The script results in statements that are parameterized, but the parameter value is not sniffed for cardinality estimation purposes. To see this, we can inspect the plan cache:
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan') SELECT DECP.cacheobjtype, DECP.objtype, DECP.usecounts, DECP.plan_handle, parameterized_plan_handle = DEQP.query_plan.value ( '(//StmtSimple)/@ParameterizedPlanHandle', 'NVARCHAR(100)' ) FROM sys.dm_exec_cached_plans AS DECP CROSS APPLY sys.dm_exec_sql_text(DECP.plan_handle) AS DEST CROSS APPLY sys.dm_exec_query_plan(DECP.plan_handle) AS DEQP WHERE DEST.[text] LIKE N'%AddressLine1%' AND DEST.[text] NOT LIKE N'%XMLNAMESPACES%';
The results show two cache entries for the ad-hoc queries, linked to the parameterized (prepared) query plan by the parameterized plan handle. The parameterized plan is used twice (click the image to enlarge in a new window to see the details):
The execution plan shows a different cardinality estimate now that parameter sniffing is disabled:
Compare the estimate of 1.44571 rows with the 2.5 row estimate used when parameter sniffing was enabled. With sniffing disabled, the 1.44571 estimate comes from average frequency information about the
AddressLine1 column. An extract of the
DBCC SHOW_STATISTICS output for the index in question shows how this number was calculated. Multiplying the number of rows in the table (19,614) by the density (7.370826e-5) gives the 1.44571 row estimate.
It is widely believed that only integer comparisons using a unique index can qualify for simple parameterization. I deliberately chose this example (a string comparison using a non-unique index) to refute that.
WITH RECOMPILE and OPTION (RECOMPILE)
When a parameter-sensitivity problem is encountered, a common piece of advice on forums and Q&A sites is to "use recompile" (assuming the other tuning options presented earlier are unsuitable). Unfortunately, that advice is often misinterpreted to mean adding
WITH RECOMPILE option to the stored procedure.
WITH RECOMPILE effectively returns us to SQL Server 2000 behaviour, where the entire stored procedure is recompiled on every execution. A better alternative, on SQL Server 2005 and later, is to use the
OPTION (RECOMPILE) query hint on just the statement that suffers from the parameter-sniffing problem. This query hint results in a recompilation of the problematic statement only; execution plans for other statements within the stored procedure are cached and reused as normal.
WITH RECOMPILE also means the compiled plan for the stored procedure is not cached. As a result, no performance information is maintained in DMVs such as sys.dm_exec_query_stats. Using the query hint instead means that a compiled plan can be cached, and performance information is available in the DMVs (though it is limited to the most recent execution, for the affected statement only).
For instances running at least SQL Server 2008 build 2746 (Service Pack 1 with Cumulative Update 5), using
OPTION (RECOMPILE) has another significant advantage over
WITH RECOMPILE: only
OPTION (RECOMPILE) enables the Parameter Embedding Optimization.
The Parameter Embedding Optimization
Sniffing parameter values allows the optimizer to use the parameter value to derive cardinality estimates. Both
WITH RECOMPILE and
OPTION (RECOMPILE) result in query plans with estimates calculated from the actual parameter values on each execution.
The Parameter Embedding Optimization takes this process a step further: query parameters are replaced with literal constant values during query parsing. The parser is capable of surprisingly complex simplifications, and subsequent query optimization may refine things even further. Consider the following stored procedure, which features the
WITH RECOMPILE option:
CREATE PROCEDURE dbo.P @NameLike nvarchar(50), @Sort tinyint WITH RECOMPILE AS BEGIN SELECT TOP (5) ProductID, Name FROM Production.Product WHERE @NameLike IS NULL OR Name LIKE @NameLike ORDER BY CASE WHEN @Sort = 1 THEN ProductID ELSE NULL END ASC, CASE WHEN @Sort = 2 THEN ProductID ELSE NULL END DESC, CASE WHEN @Sort = 3 THEN Name ELSE NULL END ASC, CASE WHEN @Sort = 4 THEN Name ELSE NULL END DESC; END;
The procedure is executed twice, with the following parameter values:
EXECUTE dbo.P @NameLike = N'K%', @Sort = 1; GO EXECUTE dbo.P @NameLike = N'[H-R]%', @Sort = 4;
WITH RECOMPILE is used, the procedure is fully recompiled on every execution. The parameter values are sniffed each time and used by the optimizer to calculate cardinality estimates. The plan for the first procedure execution is exactly correct, estimating 1 row:
The second execution estimates 360 rows, very close to the 366 seen at run time:
Both plans use the same general execution strategy: scan all rows in an index applying the
WHERE clause predicate as a residual; compute the
CASE expression used in the
ORDER BY clause; and perform a Top N Sort on the result of the
Now re-create the stored procedure using an
OPTION (RECOMPILE) query hint instead of
CREATE PROCEDURE dbo.P @NameLike nvarchar(50), @Sort tinyint AS BEGIN SELECT TOP (5) ProductID, Name FROM Production.Product WHERE @NameLike IS NULL OR Name LIKE @NameLike ORDER BY CASE WHEN @Sort = 1 THEN ProductID ELSE NULL END ASC, CASE WHEN @Sort = 2 THEN ProductID ELSE NULL END DESC, CASE WHEN @Sort = 3 THEN Name ELSE NULL END ASC, CASE WHEN @Sort = 4 THEN Name ELSE NULL END DESC OPTION (RECOMPILE); END;
Executing the stored procedure twice with the same parameter values as before produces dramatically different execution plans. This is the first execution plan (with parameters requesting names starting with "K", ordered by
The parser embeds the parameter values in the query text, resulting in the following intermediate form:
SELECT TOP (5) ProductID, Name FROM Production.Product WHERE 'K%' IS NULL OR Name LIKE 'K%' ORDER BY CASE WHEN 1 = 1 THEN ProductID ELSE NULL END ASC, CASE WHEN 1 = 2 THEN ProductID ELSE NULL END DESC, CASE WHEN 1 = 3 THEN Name ELSE NULL END ASC, CASE WHEN 1 = 4 THEN Name ELSE NULL END DESC;
The parser then goes further, removing contradictions and fully evaluating the CASE expressions. This results in:
SELECT TOP (5) ProductID, Name FROM Production.Product WHERE Name LIKE 'K%' ORDER BY ProductID ASC, NULL DESC, NULL ASC, NULL DESC;
You will get an error message if you try to submit that query directly to SQL Server, because ordering by a constant value is not allowed. Nevertheless, this is the form produced by the parser; it is allowed internally because it arose as a result of applying the parameter embedding optimization. The simplified query makes life a lot easier for the query optimizer:
The Clustered Index Scan applies the
LIKE predicate as a residual; the Compute Scalar provides the constant
NULL values, and the Top returns the first 5 rows in the order provided by the Clustered Index (avoiding a sort). In a perfect world, the query optimizer would also remove the Compute Scalar that define the
NULLs, as they are not used during query execution.
The second execution follows exactly the same process, resulting in a query plan (for names beginning with letters "H" to "R", ordered by
Name descending) like this:
This plan features a Nonclustered Index Seek that covers the
LIKE range, a residual
LIKE predicate, the constant
NULLs as before, and a Top (5). The query optimizer chooses to perform a
BACKWARD range scan in the Index Seek to once again avoid a sort. Compare the plan above with the one produced using
WITH RECOMPILE, which cannot use the parameter embedding optimization:
This simple example might have been better implemented as a series of
IF statements in the procedure (one for each combination of parameter values). This could provide similar query plan benefits, without incurring a statement compilation each time. In more complex scenarios, the statement-level recompile with parameter embedding provided by
OPTION (RECOMPILE) can be an extremely useful optimization technique.
An Embedding Restriction
There is one scenario where using
OPTION (RECOMPILE) will not result in the parameter embedding optimization being applied. If the statement assigns to a variable, parameter values are not embedded:
CREATE PROCEDURE dbo.P @NameLike nvarchar(50), @Sort tinyint AS BEGIN DECLARE @ProductID integer, @Name nvarchar(50); SELECT TOP (1) @ProductID = ProductID, @Name = Name FROM Production.Product WHERE @NameLike IS NULL OR Name LIKE @NameLike ORDER BY CASE WHEN @Sort = 1 THEN ProductID ELSE NULL END ASC, CASE WHEN @Sort = 2 THEN ProductID ELSE NULL END DESC, CASE WHEN @Sort = 3 THEN Name ELSE NULL END ASC, CASE WHEN @Sort = 4 THEN Name ELSE NULL END DESC OPTION (RECOMPILE); END;
SELECT statement now assigns to a variable, the query plans produced are the same as when
WITH RECOMPILE was used. Parameter values are still sniffed and used by the query optimizer for cardinality estimation, and
OPTION (RECOMPILE) still only compiles the single statement, only the benefit of parameter value embedding is lost.