Aaron Bertrand

Follow-up #1 on leading wildcard seeks

SentryOne Newsletters

The SQLPerformance.com bi-weekly newsletter keeps you up to speed on the most recent blog posts and forum discussions in the SQL Server community.

eNews is a bi-monthly newsletter with fun information about SentryOne, tips to help improve your productivity, and much more.

Subscribe

Featured Author

Jonathan Kehayias is a Principal Consultant with SQLskills and the youngest MCM ever.

Jonathan’s Posts

Follow-up #1 on leading wildcard seeksIn my last post, "One way to get an index seek for a leading wildcard," I mentioned that you would need triggers to deal with maintaining the fragments I recommended. A couple of people have contacted me to ask if I could demonstrate those triggers.

To simplify from the previous post, let's assume we have the following tables – a set of Companies, and then a CompanyNameFragments table that allows for pseudo-wildcard searching against any substring of the company name:

CREATE TABLE dbo.Companies
(
  CompanyID  int CONSTRAINT PK_Companies PRIMARY KEY,
  Name       nvarchar(100) NOT NULL
);
GO

CREATE TABLE dbo.CompanyNameFragments
(
  CompanyID int NOT NULL,
  Fragment  nvarchar(100) NOT NULL
);

CREATE CLUSTERED INDEX CIX_CNF ON dbo.CompanyNameFragments(Fragment, CompanyID);

Given this function for generating fragments (the only change from the original article is I've increased @input to support 100 characters):

CREATE FUNCTION dbo.CreateStringFragments( @input nvarchar(100) )
RETURNS TABLE WITH SCHEMABINDING
AS
  RETURN 
  (
    WITH x(x) AS 
    (
      SELECT 1 UNION ALL SELECT x+1 FROM x WHERE x < (LEN(@input))
    )
    SELECT Fragment = SUBSTRING(@input, x, LEN(@input)) FROM x
  );
GO

We can create a single trigger that can handle all three operations:

CREATE TRIGGER dbo.Company_MaintainFragments
ON dbo.Companies
FOR INSERT, UPDATE, DELETE
AS
BEGIN
  SET NOCOUNT ON;

  DELETE f FROM dbo.CompanyNameFragments AS f
    INNER JOIN deleted AS d 
    ON f.CompanyID = d.CompanyID;

  INSERT dbo.CompanyNameFragments(CompanyID, Fragment)
    SELECT i.CompanyID, fn.Fragment
    FROM inserted AS i 
    CROSS APPLY dbo.CreateStringFragments(i.Name) AS fn;
END
GO

This works without any checking for which type of operation happened because:

  • For an UPDATE or a DELETE, the DELETE will happen - for an UPDATE, we are not going to bother trying to match fragments that will remain the same; we're just going to blow them all away, so they can be replaced en masse. For an INSERT, the DELETE statement will have no effect, because there will be no rows in deleted.
  • For an INSERT or an UPDATE, the INSERT will happen. For a DELETE, the INSERT statement will have no effect, because there will be no rows in inserted.

Now, just to make sure it works, let's perform some changes to the Companies table and then inspect our two tables.

-- First, let's insert two companies 
-- (table contents after insert shown in figure 1 below)

INSERT dbo.Companies(Name) VALUES(N'Banana'), (N'Acme Corp');

-- Now, let's update company 2 to 'Orange' 
-- (table contents after update shown in figure 2 below):

UPDATE dbo.Companies SET Name = N'Orange' WHERE CompanyID = 2;

-- Finally, delete company #1 
-- (table contents after delete shown in figure 3 below):

DELETE dbo.Companies WHERE CompanyID = 1;
Wildcard supporting fragments 1Figure 1: Initial table contents Wildcard supporting fragments 2Figure 2: Table contents after update Wildcard supporting fragments 3Figure 3: Table contents after delete

Caveat (for the referential integrity folks)

Note that if you set up proper foreign keys between these two tables, you'll have to use an instead of trigger to handle deletes, otherwise you'll have a chicken and egg problem - you can't wait until *after* the parent row is deleted to remove the child rows. So you would have to set up ON DELETE CASCADE (which I personally don't tend to like), or your two triggers would look like this (the after trigger would still have to perform a DELETE/INSERT pair in the case of an UPDATE):

CREATE TRIGGER dbo.Company_DeleteFragments
ON dbo.Companies
INSTEAD OF DELETE
AS
BEGIN
  SET NOCOUNT ON;

  DELETE f FROM dbo.CompanyNameFragments AS f
    INNER JOIN deleted AS d
    ON f.CompanyID = d.CompanyID;

  DELETE c FROM dbo.Companies AS c
    INNER JOIN deleted AS d
    ON c.CompanyID = d.CompanyID;
END
GO

CREATE TRIGGER dbo.Company_MaintainFragments
ON dbo.Companies
FOR INSERT, UPDATE
AS
BEGIN
  SET NOCOUNT ON;

  DELETE f FROM dbo.CompanyNameFragments AS f
    INNER JOIN deleted AS d
    ON f.CompanyID = d.CompanyID;

  INSERT dbo.CompanyNameFragments(CompanyID, Fragment)
    SELECT i.CompanyID, fn.Fragment
    FROM inserted AS i
    CROSS APPLY dbo.CreateStringFragments(i.Name) AS fn;
END
GO

Summary

This post was aimed at showing how easy it is to set up triggers that will maintain seekable string fragments to improve wildcard searches, at least for moderately-sized strings. Now, I still know that this kind of comes across as a wacky idea, but I keep talking about it because I am convinced there are good use cases out there.

In my next post, I will show how to see the impact of this choice: You can easily set up representative workloads to compare the resource costs of maintaining the fragments against the performance savings at query time. I will look at varying string lengths as well as different workload balances (mostly read vs. mostly write) and try to find sweet spots and danger zones.