After blogging about how filtered indexes could be more powerful, and more recently about how they can be rendered useless by forced parameterization, I'm revisiting the filtered indexes/parameterization topic. A seemingly-too-simple solution came up at work recently, and I had to share.
Take the following example, where we have a sales database containing a table of orders. Sometimes we just want a list (or a count) of only the yet-to-be-shipped orders — which, over time, (hopefully!) represent a smaller and smaller percentage of the overall table:
CREATE DATABASE Sales;
GO
USE Sales;
GO
-- simplified, obviously:
CREATE TABLE dbo.Orders
(
OrderID int IDENTITY(1,1) PRIMARY KEY,
OrderDate datetime NOT NULL,
filler char(500) NOT NULL DEFAULT '',
IsShipped bit NOT NULL DEFAULT 0
);
GO
-- let's put some data in there; 7,000 shipped orders, and 50 unshipped:
INSERT dbo.Orders(OrderDate, IsShipped)
-- random dates over two years
SELECT TOP (7000) DATEADD(DAY, ABS(object_id % 730), '20171101'), 1
FROM sys.all_columns
UNION ALL
-- random dates from this month
SELECT TOP (50) DATEADD(DAY, ABS(object_id % 30), '20191201'), 0
FROM sys.all_columns;
It might make sense in this scenario to create a filtered index like this (which makes quick work of any queries that are trying to get at those unshipped orders):
CREATE INDEX ix_OrdersNotShipped
ON dbo.Orders(IsShipped, OrderDate)
WHERE IsShipped = 0;
We can run a quick query like this to see how it uses the filtered index:
SELECT OrderID, OrderDate FROM dbo.Orders WHERE IsShipped = 0;
The execution plan is fairly simple, but there is a warning about UnmatchedIndexes:
The name of the warning is slightly misleading — the optimizer was ultimately able to use the index, but is suggesting it would be "better" without parameters (which we didn't explicitly use), even though the statement looks like it was parameterized:
If you really want, you can eliminate the warning, with no difference in actual performance (it would just be cosmetic). One way is to add a zero-impact predicate, like AND (1 > 0)
:
SELECT wadd = OrderID, OrderDate FROM dbo.Orders WHERE IsShipped = 0 AND (1 > 0);
Another (probably more common) is to add OPTION (RECOMPILE)
:
SELECT wrecomp = OrderID, OrderDate FROM dbo.Orders WHERE IsShipped = 0 OPTION (RECOMPILE);
Both of these options yield the same plan (a seek with no warnings):
So far, so good; our filtered index is being used (as expected). These aren't the only tricks, of course; see the comments below for others that readers have already submitted.
Then, the complication
Because the database is subject to a large number of ad hoc queries, someone turns on forced parameterization, attempting to reduce compilation and eliminate low- and single-use plans from polluting the plan cache:
ALTER DATABASE Sales SET PARAMETERIZATION FORCED;
Now our original query can't use the filtered index; it's forced to scan the clustered index:
SELECT OrderID, OrderDate FROM dbo.Orders WHERE IsShipped = 0;
The warning about unmatched indexes returns, and we get new warnings about residual I/O. Note that the statement is parameterized, but it looks a bit different:
This is by design, since the whole purpose of forced parameterization is to parameterize queries like this. But it defeats the purpose of our filtered index, since that is meant to support a single value in the predicate, not a parameter that can change.
Tomfoolery
Our "trick" query that uses the additional predicate is also unable to use the filtered index, and ends up with a slightly more complicated plan to boot:
SELECT OrderID, OrderDate FROM dbo.Orders WHERE IsShipped = 0 AND (1 > 0);
OPTION (RECOMPILE)
The typical reaction in this case, just like with removing the warning earlier, is to add OPTION (RECOMPILE)
to the statement. This works, and allows the filtered index to be chosen for an efficient seek…
SELECT OrderID, OrderDate FROM dbo.Orders WHERE IsShipped = 0 OPTION (RECOMPILE);
…but adding OPTION (RECOMPILE)
and taking this additional compilation hit against every execution of the query is not always going to be acceptable in high-volume environments (especially if they're already CPU-bound).
Hints
Someone suggested explicitly hinting the filtered index to avoid the costs of recompile. In general, this is rather brittle, because it relies on the index outliving the code; I tend to use this is as a last resort. In this case it isn't valid anyway. When parameterization rules prevent the optimizer from picking the filtered index automatically, they also prevent you from picking it manually. Same issue with a generic FORCESEEK
hint:
SELECT OrderID, OrderDate FROM dbo.Orders WITH (INDEX (ix_OrdersNotShipped)) WHERE IsShipped = 0;
SELECT OrderID, OrderDate FROM dbo.Orders WITH (FORCESEEK) WHERE IsShipped = 0;
Both yield this error:
Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any hints and without using SET FORCEPLAN.
And this makes sense, because there's no way to know that the unknown value for the IsShipped
parameter will match the filtered index (or support a seek operation on any index).
Dynamic SQL?
I suggested you could use dynamic SQL, to at least only pay that recompile hit when you know you want to hit the smaller index:
DECLARE @IsShipped bit = 0;
DECLARE @sql nvarchar(max) = N'SELECT dynsql = OrderID, OrderDate FROM dbo.Orders'
+ CASE WHEN @IsShipped IS NOT NULL THEN N' WHERE IsShipped = @IsShipped'
ELSE N'' END
+ CASE WHEN @IsShipped = 0 THEN N' OPTION (RECOMPILE)' ELSE N'' END;
EXEC sys.sp_executesql @sql, N'@IsShipped bit', @IsShipped;
This leads to the same efficient plan as above. If you changed the variable to @IsShipped = 1
, then you get the more expensive clustered index scan you should expect:
But nobody likes using dynamic SQL in an edge case like this — it makes code harder to read and maintain, and even if this code were out in the application, it's still additional logic that would have to be added there, making it less than desirable.
Something simpler
We talked briefly about implementing a plan guide, which is certainly not simpler, but then a colleague suggested that you could fool the optimizer by "hiding" the parameterized statement inside a stored procedure, view, or inline table-valued function. It was so simple, I didn't believe it would work.
But then I tried it:
CREATE PROCEDURE dbo.GetUnshippedOrders
AS
BEGIN
SET NOCOUNT ON;
SELECT OrderID, OrderDate FROM dbo.Orders WHERE IsShipped = 0;
END
GO
CREATE VIEW dbo.vUnshippedOrders
AS
SELECT OrderID, OrderDate FROM dbo.Orders WHERE IsShipped = 0;
GO
CREATE FUNCTION dbo.fnUnshippedOrders()
RETURNS TABLE
AS
RETURN (SELECT OrderID, OrderDate FROM dbo.Orders WHERE IsShipped = 0);
GO
All three of these queries perform the efficient seek against the filtered index:
EXEC dbo.GetUnshippedOrders;
GO
SELECT OrderID, OrderDate FROM dbo.vUnshippedOrders;
GO
SELECT OrderID, OrderDate FROM dbo.fnUnshippedOrders();
Conclusion
I was surprised this was so effective. Of course, this requires you to change the application; if you can't change the app code to call a stored procedure or reference the view or function (or even add OPTION (RECOMPILE)
), you'll have to keep looking for other options. But if you can change the application code, stuffing the predicate into another module may just be the way to go.
On a related note, if you've got something like a stored procedure where the value actually IS a dynamic parameter like @IsShipped, then you can still use the filtered index where it helps by doing branching in the proc, and calling a child stored proc when you want to get filtered index usage. A little easier than dealing with dynamic SQL (especially for complex filters), but still ugly.
There is also:
See https://docs.microsoft.com/en-us/previous-versions/sql/sql-server-2008-r2/ms175037(v=sql.105) for other exceptions.
Thanks Brent! I've also done something a bit more advanced like this, where we had multiple filtered indexes (for a handful of uncommon states among 20+ more common ones, rather than a single bit). You can have a stored procedure like you wrote, but it also takes the state as a parameter and adds OPTION (RECOMPILE) – this way you don't pay for the RECOMPILE on the more common states (or when the state parameter is left out), and you also get "current" estimates per uncommon state. Ugly, as you said, and I actually prefer the ugly dynamic SQL just for containment, but it can work very well.
Very nice, I totally forgot about that, too. It's too bad they didn't carry forward a new version of these docs. I was going to suggest a couple of changes, but I guess they don't want to maintain them…
It has been folded into https://docs.microsoft.com/en-us/sql/relational-databases/query-processing-architecture-guide?view=sql-server-ver15#ForcedParam
Thanks Aaron, great post!
Just a small correction: you wrote that "the query was auto-parameterized, even under simple parameterization".
The optimizer tried to parameterize the query, but failed to do so, and the query was actually not parameterized. This is what the warning means, in a way. If you try to execute the same query with IsShipped = 1, you will not get a reuse of the parameterized plan, because there is no parameterized plan.
The warning, in this case, has no impact, as you wrote, and there is no reason to eliminate it, other than to have a clean plan without warnings.
You're right! That sentence was just me being a bonehead. Will correct, thanks!