Itzik Ben-Gan

Row Pattern Recognition in SQL

SentryOne Newsletters

The SQLPerformance.com bi-weekly newsletter keeps you up to speed on the most recent blog posts and forum discussions in the SQL Server community.

eNews is a bi-monthly newsletter with fun information about SentryOne, tips to help improve your productivity, and much more.

Subscribe

Featured Author

Paul Randal, CEO of SQLskills, writes about knee-jerk performance tuning, DBCC, and SQL Server internals.

Paul’s Posts

The ISO/IEC 9075:2016 standard, or SQL:2016 in short, introduces support for a profound concept called Row Pattern Recognition (RPR) in SQL. I posted a suggestion to improve SQL Server by adding support for RPR in T-SQL, but I was quite limited in the space that I had to provide the feedback. The purpose of this article is to provide more details about the proposal, and hopefully you will be convinced of its importance and add your vote.

Background

Similar to using regular expressions to identify patterns in a string, with RPR you use regular expressions to identify patterns in a sequence of rows. For each pattern match, you can either return a summary row like in grouping, or the detailed rows, with computed measures against subsequences of the match. RPR has limitless practical applications, including identifying patterns in stock market activity (both valid patterns with trading value, and potentially illegal or suspicious patterns), handling time series, fraud detection, material handling, shipping applications, DNA sequencing, gaps and islands, top N per group, and many others.

To me, RPR is the next step in the evolution of window functions, with a higher level of sophistication, and extended utility. If you think that window functions are profound and useful, RPR is really going to bake your noodle. Similar to window functions, RPR supports partitioning and ordering. You will typically be looking for pattern matches within each partition independently, based on the indicated ordering. Also similar to window functions, RPR lends itself to good optimization, with the ability to rely on index ordering to avoid sorting the data.

You can find coverage of RPR, including illustrated examples, in a 90-page technical report ISO/IEC TR 19075-5 (available for free).

You can also find it as part of the document ISO/IEC 9075-2:2016, Information technology — Database languages — SQL — Part 2: Foundation (SQL/Foundation) (available for purchase).

The SQL:2016 standard provides two main RPR-related features:

  • Feature R010, “Row pattern recognition: FROM clause”
  • Feature R020, “Row pattern recognition: WINDOW clause”

The standard also mentions feature R030, “Row pattern recognition: full aggregate support,” without which aggregate functions are not allowed to specify DISTINCT or <filter clause>.

So far, the only platform I know of that implemented feature R010 is Oracle. I don’t know of any platform that implemented R020 yet.

Feature R010, “Row pattern recognition: FROM clause”

Feature R010 defines a clause/table operator called MATCH_RECOGNIZE, which you use in the FROM clause. The input is a table or table expression, and the output is a virtual table. The context is similar to that of other table operators like JOIN, APPLY, PIVOT and UNPIVOT. Here’s the syntax of a query using this feature:

SELECT