;
As an example adapted from the aforementioned technical report, suppose that you’re given a table dbo.Ticker, with columns symbol, tradedate and price. Use the following code to create the table, populate it with some sample data, and query it:
SET NOCOUNT ON;
USE tempdb;
DROP TABLE IF EXISTS dbo.Ticker;
CREATE TABLE dbo.Ticker
(
symbol VARCHAR(10) NOT NULL,
tradedate DATE NOT NULL,
price NUMERIC(12, 2) NOT NULL,
CONSTRAINT PK_Ticker
PRIMARY KEY (symbol, tradedate)
);
GO
INSERT INTO dbo.Ticker(symbol, tradedate, price) VALUES
('STOCK1', '20190212', 150.00), ('STOCK1', '20190213', 151.00), ('STOCK1', '20190214', 148.00),
('STOCK1', '20190215', 146.00), ('STOCK1', '20190218', 142.00), ('STOCK1', '20190219', 144.00),
('STOCK1', '20190220', 152.00), ('STOCK1', '20190221', 152.00), ('STOCK1', '20190222', 153.00),
('STOCK1', '20190225', 154.00), ('STOCK1', '20190226', 154.00), ('STOCK1', '20190227', 154.00),
('STOCK1', '20190228', 153.00), ('STOCK1', '20190301', 145.00), ('STOCK1', '20190304', 140.00),
('STOCK1', '20190305', 142.00), ('STOCK1', '20190306', 143.00), ('STOCK1', '20190307', 142.00),
('STOCK1', '20190308', 140.00), ('STOCK1', '20190311', 138.00), ('STOCK2', '20190212', 330.00),
('STOCK2', '20190213', 329.00), ('STOCK2', '20190214', 329.00), ('STOCK2', '20190215', 326.00),
('STOCK2', '20190218', 325.00), ('STOCK2', '20190219', 326.00), ('STOCK2', '20190220', 328.00),
('STOCK2', '20190221', 326.00), ('STOCK2', '20190222', 320.00), ('STOCK2', '20190225', 317.00),
('STOCK2', '20190226', 319.00), ('STOCK2', '20190227', 325.00), ('STOCK2', '20190228', 322.00),
('STOCK2', '20190301', 324.00), ('STOCK2', '20190304', 321.00), ('STOCK2', '20190305', 319.00),
('STOCK2', '20190306', 322.00), ('STOCK2', '20190307', 326.00), ('STOCK2', '20190308', 326.00),
('STOCK2', '20190311', 324.00);
SELECT symbol, tradedate, price FROM dbo.Ticker;
This code generates the following output:
symbol tradedate price
------ ---------- ------
STOCK1 2019-02-12 150.00
STOCK1 2019-02-13 151.00
STOCK1 2019-02-14 148.00
STOCK1 2019-02-15 146.00
STOCK1 2019-02-18 142.00
STOCK1 2019-02-19 144.00
STOCK1 2019-02-20 152.00
STOCK1 2019-02-21 152.00
STOCK1 2019-02-22 153.00
STOCK1 2019-02-25 154.00
STOCK1 2019-02-26 154.00
STOCK1 2019-02-27 154.00
STOCK1 2019-02-28 153.00
STOCK1 2019-03-01 145.00
STOCK1 2019-03-04 140.00
STOCK1 2019-03-05 142.00
STOCK1 2019-03-06 143.00
STOCK1 2019-03-07 142.00
STOCK1 2019-03-08 140.00
STOCK1 2019-03-11 138.00
STOCK2 2019-02-12 330.00
STOCK2 2019-02-13 329.00
STOCK2 2019-02-14 329.00
STOCK2 2019-02-15 326.00
STOCK2 2019-02-18 325.00
STOCK2 2019-02-19 326.00
STOCK2 2019-02-20 328.00
STOCK2 2019-02-21 326.00
STOCK2 2019-02-22 320.00
STOCK2 2019-02-25 317.00
STOCK2 2019-02-26 319.00
STOCK2 2019-02-27 325.00
STOCK2 2019-02-28 322.00
STOCK2 2019-03-01 324.00
STOCK2 2019-03-04 321.00
STOCK2 2019-03-05 319.00
STOCK2 2019-03-06 322.00
STOCK2 2019-03-07 326.00
STOCK2 2019-03-08 326.00
STOCK2 2019-03-11 324.00
40 row(s) affected.
The following query identifies patterns representing V shapes in the stock price (a period with strictly falling price followed by a period with strictly increasing price), using ONE ROW PER MATCH as the row pattern rows per match option:
SELECT
MR.symbol, MR.matchnum, MR.startdate, MR.startprice,
MR.bottomdate, MR.bottomprice, MR.enddate, MR.endprice, MR.maxprice
FROM dbo.Ticker
MATCH_RECOGNIZE
(
PARTITION BY symbol
ORDER BY tradedate
MEASURES
MATCH_NUMBER() AS matchnum,
A.tradedate AS startdate,
A.price AS startprice,
LAST(B.tradedate) AS bottomdate,
LAST(B.price) AS bottomprice,
LAST(C.tradedate) AS enddate, -- same as LAST(tradedate)
LAST(C.price) AS endprice,
MAX(U.price) AS maxprice -- same as MAX(price)
ONE ROW PER MATCH -- default
AFTER MATCH SKIP PAST LAST ROW -- default
PATTERN (A B+ C+)
SUBSET U = (A, B, C)
DEFINE
-- A defaults to True, matches any row, same as explicitly defining A AS 1 = 1
B AS B.price < PREV(B.price),
C AS C.price > PREV(C.price)
) AS MR;
The PARTITION BY clause defines that you want to handle each stock symbol separately.
The ORDER BY clause defines ordering based on trade date.
The DEFINE clause defines row pattern variables representing the different subsequences of rows in the pattern. In the above example A represents any row as a starting point, B represents a subsequence of decreasing prices (B.price < PREV(B.price)), and C represents a subsequence of increasing prices (C AS C.price > PREV(C.price)).
The PATTERN clause uses regular expressions to identify a pattern. In the above query the pattern is (A B+ C+), meaning (any row, followed by one or more rows with decreasing prices, followed by one or more rows with increasing prices). Following are regular expression pattern quantifiers that you can use:
* — zero (0) or more matches
+ — one (1) or more matches
? — no match or one (1) match, optional
{ n } — exactly n matches
{ n, } — n or more matches
{ n, m } — between n and m (inclusive) matches
{ , m } — between zero (0) and m (inclusive) matches
{- Variable -}, e.g., {- A -} — indicates that matching rows are to be excluded from the output (useful only if ALL ROW PER MATCH specified)
|, e.g., A | B — alternation
(), e.g., (A | B) — grouping
^, e.g., ^A{1, 3} — start of a row pattern partition
$, e.g., A{1, 3}$ — end of a row pattern partition
By default the quantifiers are greedy, but you can define them to be reluctant.
The SUBSET clause allows you to define a named subset list of variables.
The MEASURES clause defines measures related to the pattern. You can apply calculations to pattern variables and to subsets. The MATCH_NUMBER() function assigns sequential integers starting with 1 for the matches within the partition. You can use operations like FIRST, LAST, PREV and NEXT, as well as aggregate calculations.
This query uses ONE ROW PER MATCH as the row pattern rows per match option. This means that the result table will have one row per pattern match, similar to the outcome of grouping. The alternative is ALL ROWS PER MATCH where you want the detail rows returned per pattern match (example to follow shortly).
This query uses AFTER MATCH SKIP PAST LAST ROW as the AFTER MATCH <skip to option>. This means that once a match is found, you want the next attempt to start after the last row of the current match. There are other alternatives like looking for the next match in the row following the first row of the current match (SKIP TO NEXT ROW), or to skip to a position relative to a row pattern variable.
Here’s the expected output of this query:
symbol matchnum startdate startprice bottomdat bottomprice enddate endprice maxprice
------ -------- ---------- ---------- ---------- ----------- ---------- -------- --------
STOCK1 1 2019-02-13 151.00 2019-02-18 142.00 2019-02-20 152.00 152.00
STOCK1 2 2019-02-27 154.00 2019-03-04 140.00 2019-03-06 143.00 154.00
STOCK2 1 2019-02-14 329.00 2019-02-18 325.00 2019-02-20 328.00 329.00
STOCK2 2 2019-02-21 326.00 2019-02-25 317.00 2019-02-27 325.00 326.00
STOCK2 3 2019-03-01 324.00 2019-03-05 319.00 2019-03-07 326.00 326.00
Here’s a slightly modified version of the query using the ALL ROWS PER MATCH option:
SELECT
MR.symbol, MR.tradedate, MR.price, MR.matchnum, MR.classy,
MR.startdate, MR.startprice, MR.bottomdate, MR.bottomprice,
MR.enddate, MR.endprice, MR.maxprice
FROM dbo.Ticker
MATCH_RECOGNIZE
(
PARTITION BY symbol
ORDER BY tradedate
MEASURES
MATCH_NUMBER() AS matchnum,
CLASSIFIER() AS classy,
A.tradedate AS startdate,
A.price AS startprice,
LAST(B.tradedate) AS bottomdate,
LAST(B.price) AS bottomprice,
LAST(C.tradedate) AS enddate,
LAST(C.price) AS endprice,
MAX(U.price) AS maxprice
ALL ROWS PER MATCH
AFTER MATCH SKIP PAST LAST ROW
PATTERN (A B+ C+)
SUBSET U = (A, B, C)
DEFINE
B AS B.price < PREV(B.price),
C AS C.price > PREV(C.price)
) AS MR;
Here’s the expected output of this query:
symbol tradedate price matchnum classy startdate startprice bottomdate bottomprice enddate endprice maxprice
------ ---------- ------ -------- ------ ---------- ---------- ---------- ----------- ---------- --------- --------
STOCK1 2019-02-13 151.00 1 A 2019-02-13 151.00 NULL NULL NULL NULL 151.00
STOCK1 2019-02-14 148.00 1 B 2019-02-13 151.00 2019-02-14 148.00 NULL NULL 151.00
STOCK1 2019-02-15 146.00 1 B 2019-02-13 151.00 2019-02-15 146.00 NULL NULL 151.00
STOCK1 2019-02-18 142.00 1 B 2019-02-13 151.00 2019-02-18 142.00 NULL NULL 151.00
STOCK1 2019-02-19 144.00 1 C 2019-02-13 151.00 2019-02-18 142.00 2019-02-19 144.00 151.00
STOCK1 2019-02-20 152.00 1 C 2019-02-13 151.00 2019-02-18 142.00 2019-02-20 152.00 152.00
STOCK1 2019-02-27 154.00 2 A 2019-02-27 154.00 NULL NULL NULL NULL 154.00
STOCK1 2019-02-28 153.00 2 B 2019-02-27 154.00 2019-02-28 153.00 NULL NULL 154.00
STOCK1 2019-03-01 145.00 2 B 2019-02-27 154.00 2019-03-01 145.00 NULL NULL 154.00
STOCK1 2019-03-04 140.00 2 B 2019-02-27 154.00 2019-03-04 140.00 NULL NULL 154.00
STOCK1 2019-03-05 142.00 2 C 2019-02-27 154.00 2019-03-04 140.00 2019-03-05 142.00 154.00
STOCK1 2019-03-06 143.00 2 C 2019-02-27 154.00 2019-03-04 140.00 2019-03-06 143.00 154.00
STOCK2 2019-02-14 329.00 1 A 2019-02-14 329.00 NULL NULL NULL NULL 329.00
STOCK2 2019-02-15 326.00 1 B 2019-02-14 329.00 2019-02-15 326.00 NULL NULL 329.00
STOCK2 2019-02-18 325.00 1 B 2019-02-14 329.00 2019-02-18 325.00 NULL NULL 329.00
STOCK2 2019-02-19 326.00 1 C 2019-02-14 329.00 2019-02-18 325.00 2019-02-19 326.00 329.00
STOCK2 2019-02-20 328.00 1 C 2019-02-14 329.00 2019-02-18 325.00 2019-02-20 328.00 329.00
STOCK2 2019-02-21 326.00 2 A 2019-02-21 326.00 NULL NULL NULL NULL 326.00
STOCK2 2019-02-22 320.00 2 B 2019-02-21 326.00 2019-02-22 320.00 NULL NULL 326.00
STOCK2 2019-02-25 317.00 2 B 2019-02-21 326.00 2019-02-25 317.00 NULL NULL 326.00
STOCK2 2019-02-26 319.00 2 C 2019-02-21 326.00 2019-02-25 317.00 2019-02-26 319.00 326.00
STOCK2 2019-02-27 325.00 2 C 2019-02-21 326.00 2019-02-25 317.00 2019-02-27 325.00 326.00
STOCK2 2019-03-01 324.00 3 A 2019-03-01 324.00 NULL NULL NULL NULL 324.00
STOCK2 2019-03-04 321.00 3 B 2019-03-01 324.00 2019-03-04 321.00 NULL NULL 324.00
STOCK2 2019-03-05 319.00 3 B 2019-03-01 324.00 2019-03-05 319.00 NULL NULL 324.00
STOCK2 2019-03-06 322.00 3 C 2019-03-01 324.00 2019-03-05 319.00 2019-03-06 322.00 324.00
STOCK2 2019-03-07 326.00 3 C 2019-03-01 324.00 2019-03-05 319.00 2019-03-07 326.00 326.00
27 row(s) affected.
Notice the addition of the measure classy based on the function CLASSIFIER. This function returns a string representing the row pattern variable that the result row is associated with (in our case, A, B, or C).
Feature R020, “Row pattern recognition: WINDOW clause”
Feature R020 uses row pattern recognition as part of the windowing OVER clause (or WINDOW clause when naming a window specification) to further restrict a window frame. Just like the window partition restricts the rows of the input table expression (FROM… WHERE… GROUP BY… HAVING), and the window frame further restricts the window partition, with feature R020 you further restrict the full window frame to a reduced window frame made of the subsequence of rows constituting the pattern match. Here’s a query demonstrating this feature with the WINDOW clause, using similar row pattern matching specification to the one used in the first query in this article:
SELECT T.symbol, T.tradedate, T.price,
startdate OVER W, startprice OVER W,
bottomdate OVER W, bottomprice OVER W,
enddate OVER W, endprice OVER W,
maxprice OVER W
FROM dbo.Ticker T
WINDOW W AS
(
PARTITION BY symbol
ORDER BY tradedate
MEASURES
A.tradedate AS startdate,
A.price AS startprice,
LAST(B.tradedate) AS bottomdate,
LAST(B.price) AS bottomprice,
LAST(C.tradedate) AS enddate,
LAST(C.price) AS endprice,
MAX(U.price) AS maxprice
ROWS BETWEEN CURRENT ROW
AND UNBOUNDED FOLLOWING
AFTER MATCH SKIP PAST LAST ROW
INITIAL -- pattern must start at first row of full window frame; alternative is SEEK
PATTERN (A B+ C+)
SUBSET U = (A, B, C)
DEFINE
B AS B.price < PREV(B.price),
C AS C.price > PREV(C.price)
);
When using row pattern recognition with windowing, the full window frame must start at the current row. Notice the use of the INITIAL option in this query. This option means that you get a match only if the pattern starts with the current row. The alternative is SEEK, meaning that a search for a match starts with the current row, but is permitted through to the end of the full window frame. Either way, if a match is found, the reduced window frame consists of only the pattern match rows, otherwise, the reduced window frame is empty. Only one row pattern match per full window frame is sought.
Observe in the query’s SELECT list that you can return measures defined in the MEASURES clause, computed over W, which is the reduced window frame.
Recall the result of the first query in this article, using row pattern recognition in the FROM clause, with the ONE ROW PER MATCH option:
symbol matchnum startdate startprice bottomdat bottomprice enddate endprice maxprice
------ -------- ---------- ---------- ---------- ----------- ---------- -------- --------
STOCK1 1 2019-02-13 151.00 2019-02-18 142.00 2019-02-20 152.00 152.00
STOCK1 2 2019-02-27 154.00 2019-03-04 140.00 2019-03-06 143.00 154.00
STOCK2 1 2019-02-14 329.00 2019-02-18 325.00 2019-02-20 328.00 329.00
STOCK2 2 2019-02-21 326.00 2019-02-25 317.00 2019-02-27 325.00 326.00
STOCK2 3 2019-03-01 324.00 2019-03-05 319.00 2019-03-07 326.00 326.00
Here’s the expected output of our last query, using row pattern recognition in the WINDOW clause:
symbol tradedate price startdate startprice bottomdate bottomprice enddate endprice maxprice
------ ---------- ------ ---------- ---------- ---------- ----------- ---------- -------- --------
STOCK1 2019-02-12 150.00 NULL NULL NULL NULL NULL NULL NULL
STOCK1 2019-02-13 151.00 2019-02-13 151.00 2019-02-18 142.00 2019-02-20 152.00 152.00
STOCK1 2019-02-14 148.00 NULL NULL NULL NULL NULL NULL NULL
STOCK1 2019-02-15 146.00 NULL NULL NULL NULL NULL NULL NULL
STOCK1 2019-02-18 142.00 NULL NULL NULL NULL NULL NULL NULL
STOCK1 2019-02-19 144.00 NULL NULL NULL NULL NULL NULL NULL
STOCK1 2019-02-20 152.00 NULL NULL NULL NULL NULL NULL NULL
STOCK1 2019-02-21 152.00 NULL NULL NULL NULL NULL NULL NULL
STOCK1 2019-02-22 153.00 NULL NULL NULL NULL NULL NULL NULL
STOCK1 2019-02-25 154.00 NULL NULL NULL NULL NULL NULL NULL
STOCK1 2019-02-26 154.00 NULL NULL NULL NULL NULL NULL NULL
STOCK1 2019-02-27 154.00 2019-02-27 154.00 2019-03-04 140.00 2019-03-06 143.00 154.00
STOCK1 2019-02-28 153.00 NULL NULL NULL NULL NULL NULL NULL
STOCK1 2019-03-01 145.00 NULL NULL NULL NULL NULL NULL NULL
STOCK1 2019-03-04 140.00 NULL NULL NULL NULL NULL NULL NULL
STOCK1 2019-03-05 142.00 NULL NULL NULL NULL NULL NULL NULL
STOCK1 2019-03-06 143.00 NULL NULL NULL NULL NULL NULL NULL
STOCK1 2019-03-07 142.00 NULL NULL NULL NULL NULL NULL NULL
STOCK1 2019-03-08 140.00 NULL NULL NULL NULL NULL NULL NULL
STOCK1 2019-03-11 138.00 NULL NULL NULL NULL NULL NULL NULL
STOCK2 2019-02-12 330.00 NULL NULL NULL NULL NULL NULL NULL
STOCK2 2019-02-13 329.00 NULL NULL NULL NULL NULL NULL NULL
STOCK2 2019-02-14 329.00 2019-02-14 329.00 2019-02-18 325.00 2019-02-20 328.00 329.00
STOCK2 2019-02-15 326.00 NULL NULL NULL NULL NULL NULL NULL
STOCK2 2019-02-18 325.00 NULL NULL NULL NULL NULL NULL NULL
STOCK2 2019-02-19 326.00 NULL NULL NULL NULL NULL NULL NULL
STOCK2 2019-02-20 328.00 NULL NULL NULL NULL NULL NULL NULL
STOCK2 2019-02-21 326.00 2019-02-21 326.00 2019-02-25 317.00 2019-02-27 325.00 326.00
STOCK2 2019-02-22 320.00 NULL NULL NULL NULL NULL NULL NULL
STOCK2 2019-02-25 317.00 NULL NULL NULL NULL NULL NULL NULL
STOCK2 2019-02-26 319.00 NULL NULL NULL NULL NULL NULL NULL
STOCK2 2019-02-27 325.00 NULL NULL NULL NULL NULL NULL NULL
STOCK2 2019-02-28 322.00 NULL NULL NULL NULL NULL NULL NULL
STOCK2 2019-03-01 324.00 2019-03-01 324.00 2019-03-05 319.00 2019-03-07 326.00 326.00
STOCK2 2019-03-04 321.00 NULL NULL NULL NULL NULL NULL NULL
STOCK2 2019-03-05 319.00 NULL NULL NULL NULL NULL NULL NULL
STOCK2 2019-03-06 322.00 NULL NULL NULL NULL NULL NULL NULL
STOCK2 2019-03-07 326.00 NULL NULL NULL NULL NULL NULL NULL
STOCK2 2019-03-08 326.00 NULL NULL NULL NULL NULL NULL NULL
STOCK2 2019-03-11 324.00 NULL NULL NULL NULL NULL NULL NULL
40 row(s) affected.
Observe that you get all detailed rows in the output, and where a pattern match starts, you get the result of the requested row pattern measures against the reduced window frame.
Cast your vote
It’s completely understood that for Microsoft to add a feature to T-SQL—especially such a substantial one—is a pretty significant investment. But what’s great about features that are added to T-SQL is that they pretty much stay there forever. There’s a huge community hungry for T-SQL improvements such as this one.
If you feel that row pattern recognition is an important addition to SQL Server, make sure you cast your vote . Also, Microsoft is more likely to prioritize a proposed feature if they know of customers and use cases that could benefit from it, and that at the moment such customers are either using other products or more complex solutions. If you or your customers see RPR as beneficial to you, and have use cases that you can share, make sure that you add comments to the feedback item and let Microsoft know.