Snowflake MATCH_RECOGNIZE for finding transactions that do NOT match a pattern


I was looking through some audit data. In this audit data all audited transactions were first INITIALIZED, and then they may be REVERSED and MODIFIED. Not all transactions are REVERSED and MODIFIED. Only some of them. But all audited transactions should have an INIT action.

Here is sample data set (PII redacted):

Sample Data Set

Notice that all audits start with INIT action, except for transaction_id: 333. This is is anomaly we were looking for. We were looking for all the transaction audits that had a REVERSAL + MODIFICATION, but the INIT was missing.

This is where Snowflake’s REGEX powered MATCH_RECOGNIZE came in handy.

Using Regex in Snowflake MATCH_RECOGNIZE

Notice the Pattern is REVERSAL and MODIFICATION. However if just have PATTERN (REVERSAL MODIFICATION) it will also match the PATTERN where the INIT is also present. We only want to find the PATTERN where INIT is missing. To achieve that, we use the ^ regrex, which is the Start Anchor. Basically it is saying that the Pattern should start with a REVERSAL instead of a INIT. Which is exactly what we are looking for— transaction audits where the INIT is missing.

select 
 *
from scratch.saqib_ali.audit_table
match_recognize(
    partition by transaction_id
    order by audit_ts
    ALL ROWS PER MATCH
    PATTERN (^REVERSAL MODIFICATION*)
    DEFINE
          INIT as iff(ACTION='INIT',TRUE,FALSE),
          REVERSAL as iff(ACTION='REVERSAL',TRUE,FALSE),
          MODIFICATION as iff(ACTION='MODIFICATION',TRUE,FALSE)
);

Now let’s say, you wanted to find all the transaction audits where there was a REVERSAL but no MODIFICATION. You can use the Regex Ending Anchor ($) for this:

select 
 *
from audit_table
match_recognize(
    partition by transaction_id
    order by audit_ts
    ALL ROWS PER MATCH
    PATTERN (INIT REVERSAL+$)
    DEFINE
          INIT as iff(ACTION='INIT',TRUE,FALSE),
          REVERSAL as iff(ACTION='REVERSAL',TRUE,FALSE),
          MODIFICATION as iff(ACTION='MODIFICATION',TRUE,FALSE)
);