sequence_count

Aggregate function that counts the number of non-overlapping occurrences of a pattern in the event stream.

Signature

sequence_count(pattern VARCHAR, timestamp TIMESTAMP,
               cond1 BOOLEAN, cond2 BOOLEAN [, ...]) -> BIGINT

Parameters:

ParameterTypeDescription
patternVARCHARPattern string (same syntax as sequence_match)
timestampTIMESTAMPEvent timestamp
cond1..condNBOOLEANEvent conditions (2 to 32)

Returns: BIGINT -- the number of non-overlapping matches of the pattern in the event stream.

Usage

-- Count how many times a user viewed then purchased
SELECT user_id,
  sequence_count('(?1)(?2)', event_time,
    event_type = 'view',
    event_type = 'purchase'
  ) as conversion_count
FROM events
GROUP BY user_id;

Behavior

  1. Events are sorted by timestamp.
  2. The pattern is compiled and executed using the NFA engine.
  3. Each time the pattern matches, the count increments and the NFA restarts from the event following the last matched event.
  4. Matches are non-overlapping: once a set of events is consumed by a match, those events cannot participate in another match.

Example

Given events for a user with pattern (?1)(?2):

event_timecond1 (view)cond2 (purchase)
10:00truefalse
10:10falsetrue
10:20truefalse
10:30falsetrue
10:40truefalse

Result: 2

  • First match: events at 10:00 and 10:10.
  • Second match: events at 10:20 and 10:30.
  • The event at 10:40 has no subsequent cond2 event.

Pattern Syntax

Uses the same pattern syntax as sequence_match. Refer to that page for the full syntax reference.

Implementation

OperationComplexity
UpdateO(1) amortized (event append)
CombineO(m) where m = events in other state
FinalizeO(n * s) NFA execution, where n = events, s = pattern steps
SpaceO(n) -- all collected events

At benchmark scale, sequence_count processes 100 million events in 1.18 s (85 Melem/s).

See Also