After blogging about how filtered indexes could be more powerful, and more recently about how they can be rendered useless by forced parameterization, I'm revisiting the filtered indexes/parameterization topic. A seemingly-too-simple solution came up at work recently, and I had to share.
Take the following example, where we have a sales database containing a table of orders. Sometimes we just want a list (or a count) of only the yet-to-be-shipped orders — which, over time, (hopefully!) represent a smaller and smaller percentage of the overall table:
CREATE DATABASE Sales; GO USE Sales; GO -- simplified, obviously: CREATE TABLE dbo.Orders ( OrderID int IDENTITY(1,1) PRIMARY KEY, OrderDate datetime NOT NULL, filler char(500) NOT NULL DEFAULT '', IsShipped bit NOT NULL DEFAULT 0 ); GO -- let's put some data in there; 7,000 shipped orders, and 50 unshipped: INSERT dbo.Orders(OrderDate, IsShipped) -- random dates over two years SELECT TOP (7000) DATEADD(DAY, ABS(object_id % 730), '20171101'), 1 FROM sys.all_columns UNION ALL -- random dates from this month SELECT TOP (50) DATEADD(DAY, ABS(object_id % 30), '20191201'), 0 FROM sys.all_columns;
It might make sense in this scenario to create a filtered index like this (which makes quick work of any queries that are trying to get at those unshipped orders):
CREATE INDEX ix_OrdersNotShipped ON dbo.Orders(IsShipped, OrderDate) WHERE IsShipped = 0;
We can run a quick query like this to see how it uses the filtered index:
SELECT OrderID, OrderDate FROM dbo.Orders WHERE IsShipped = 0;
The execution plan is fairly simple, but there is a warning about UnmatchedIndexes:
The name of the warning is slightly misleading — the optimizer was ultimately able to use the index, but is suggesting it would be "better" without parameters (which we didn't explicitly use), even though the statement looks like it was parameterized:
If you really want, you can eliminate the warning, with no difference in actual performance (it would just be cosmetic). One way is to add a zero-impact predicate, like
AND (1 > 0):
SELECT wadd = OrderID, OrderDate FROM dbo.Orders WHERE IsShipped = 0 AND (1 > 0);
Another (probably more common) is to add
SELECT wrecomp = OrderID, OrderDate FROM dbo.Orders WHERE IsShipped = 0 OPTION (RECOMPILE);
Both of these options yield the same plan (a seek with no warnings):
So far, so good; our filtered index is being used (as expected). These aren't the only tricks, of course; see the comments below for others that readers have already submitted.
Then, the complication
Because the database is subject to a large number of ad hoc queries, someone turns on forced parameterization, attempting to reduce compilation and eliminate low- and single-use plans from polluting the plan cache:
ALTER DATABASE Sales SET PARAMETERIZATION FORCED;
Now our original query can't use the filtered index; it's forced to scan the clustered index:
SELECT OrderID, OrderDate FROM dbo.Orders WHERE IsShipped = 0;
The warning about unmatched indexes returns, and we get new warnings about residual I/O. Note that the statement is parameterized, but it looks a bit different:
This is by design, since the whole purpose of forced parameterization is to parameterize queries like this. But it defeats the purpose of our filtered index, since that is meant to support a single value in the predicate, not a parameter that can change.
Our "trick" query that uses the additional predicate is also unable to use the filtered index, and ends up with a slightly more complicated plan to boot:
SELECT OrderID, OrderDate FROM dbo.Orders WHERE IsShipped = 0 AND (1 > 0);
The typical reaction in this case, just like with removing the warning earlier, is to add
OPTION (RECOMPILE) to the statement. This works, and allows the filtered index to be chosen for an efficient seek…
SELECT OrderID, OrderDate FROM dbo.Orders WHERE IsShipped = 0 OPTION (RECOMPILE);
OPTION (RECOMPILE) and taking this additional compilation hit against every execution of the query is not always going to be acceptable in high-volume environments (especially if they're already CPU-bound).
Someone suggested explicitly hinting the filtered index to avoid the costs of recompile. In general, this is rather brittle, because it relies on the index outliving the code; I tend to use this is as a last resort. In this case it isn't valid anyway. When parameterization rules prevent the optimizer from picking the filtered index automatically, they also prevent you from picking it manually. Same issue with a generic
SELECT OrderID, OrderDate FROM dbo.Orders WITH (INDEX (ix_OrdersNotShipped)) WHERE IsShipped = 0; SELECT OrderID, OrderDate FROM dbo.Orders WITH (FORCESEEK) WHERE IsShipped = 0;
Both yield this error:
Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any hints and without using SET FORCEPLAN.
And this makes sense, because there's no way to know that the unknown value for the
IsShipped parameter will match the filtered index (or support a seek operation on any index).
I suggested you could use dynamic SQL, to at least only pay that recompile hit when you know you want to hit the smaller index:
DECLARE @IsShipped bit = 0; DECLARE @sql nvarchar(max) = N'SELECT dynsql = OrderID, OrderDate FROM dbo.Orders' + CASE WHEN @IsShipped IS NOT NULL THEN N' WHERE IsShipped = @IsShipped' ELSE N'' END + CASE WHEN @IsShipped = 0 THEN N' OPTION (RECOMPILE)' ELSE N'' END; EXEC sys.sp_executesql @sql, N'@IsShipped bit', @IsShipped;
This leads to the same efficient plan as above. If you changed the variable to
@IsShipped = 1, then you get the more expensive clustered index scan you should expect:
But nobody likes using dynamic SQL in an edge case like this — it makes code harder to read and maintain, and even if this code were out in the application, it's still additional logic that would have to be added there, making it less than desirable.
We talked briefly about implementing a plan guide, which is certainly not simpler, but then a colleague suggested that you could fool the optimizer by "hiding" the parameterized statement inside a stored procedure, view, or inline table-valued function. It was so simple, I didn't believe it would work.
But then I tried it:
CREATE PROCEDURE dbo.GetUnshippedOrders AS BEGIN SET NOCOUNT ON; SELECT OrderID, OrderDate FROM dbo.Orders WHERE IsShipped = 0; END GO CREATE VIEW dbo.vUnshippedOrders AS SELECT OrderID, OrderDate FROM dbo.Orders WHERE IsShipped = 0; GO CREATE FUNCTION dbo.fnUnshippedOrders() RETURNS TABLE AS RETURN (SELECT OrderID, OrderDate FROM dbo.Orders WHERE IsShipped = 0); GO
All three of these queries perform the efficient seek against the filtered index:
EXEC dbo.GetUnshippedOrders; GO SELECT OrderID, OrderDate FROM dbo.vUnshippedOrders; GO SELECT OrderID, OrderDate FROM dbo.fnUnshippedOrders();
I was surprised this was so effective. Of course, this requires you to change the application; if you can't change the app code to call a stored procedure or reference the view or function (or even add
OPTION (RECOMPILE)), you'll have to keep looking for other options. But if you can change the application code, stuffing the predicate into another module may just be the way to go.