sessionize

Window function that assigns monotonically increasing session IDs. A new session begins when the gap between consecutive events exceeds a configurable threshold.

Signature

sessionize(timestamp TIMESTAMP, gap INTERVAL) -> BIGINT

Parameters:

ParameterTypeDescription
timestampTIMESTAMPEvent timestamp
gapINTERVALMaximum allowed inactivity gap between events in the same session

Returns: BIGINT -- the session ID (1-indexed, monotonically increasing within each partition).

Usage

sessionize is used as a window function with OVER (PARTITION BY ... ORDER BY ...).

SELECT user_id, event_time,
  sessionize(event_time, INTERVAL '30 minutes') OVER (
    PARTITION BY user_id ORDER BY event_time
  ) as session_id
FROM events;

Behavior

  • The first event in each partition is assigned session ID 1.
  • Each subsequent event is compared to the previous event's timestamp.
  • If the gap exceeds the threshold, the session ID increments.
  • A gap exactly equal to the threshold does not start a new session; the gap must strictly exceed the threshold.

Example

Given events for a single user with a 30-minute threshold:

event_timesession_idReason
10:001First event
10:15115 min gap (within threshold)
10:25110 min gap (within threshold)
11:30265 min gap (exceeds threshold)
11:45215 min gap (within threshold)
13:00375 min gap (exceeds threshold)

Implementation

The state tracks the first timestamp, last timestamp, and the number of session boundaries (gaps exceeding the threshold). The combine operation is O(1), which enables efficient evaluation via DuckDB's segment tree windowing machinery.

OperationComplexity
UpdateO(1)
CombineO(1)
FinalizeO(1)
SpaceO(1) per partition segment

At benchmark scale, sessionize processes 1 billion rows in 1.20 seconds (830 Melem/s).

See Also

  • retention -- cohort retention analysis using boolean conditions
  • window_funnel -- conversion funnel step tracking within time windows