# Engineering Overview
This page provides a technical summary of the engineering decisions, quality
standards, and design trade-offs that underpin duckdb-behavioral. It is
intended for technical evaluators, contributors, and anyone seeking to
understand the depth and rigor of this project.
## Project Scope
duckdb-behavioral is a loadable extension for DuckDB
that implements the complete set of behavioral analytics functions found in
ClickHouse. The extension is written entirely in
Rust, compiles to a shared library (.so / .dylib), and integrates with
DuckDB via the C extension API.
The project spans several distinct engineering disciplines:
| Discipline | Scope |
|---|---|
| Systems programming | Rust FFI, raw C API callbacks, memory-safe aggregate state management, unsafe code confinement |
| Database internals | DuckDB's segment tree windowing, aggregate function lifecycle (init, update, combine, finalize, destroy), data chunk format |
| Algorithm design | NFA-based pattern matching, recursive descent parsing, greedy funnel search, bitmask-based retention analysis |
| Performance engineering | Cache-aware data structures, algorithmic complexity analysis, Criterion.rs benchmarking with confidence intervals, negative result documentation |
| Software quality | 434 unit tests, 27 E2E tests, property-based testing (proptest), mutation testing (cargo-mutants, 88.4% kill rate), zero clippy warnings under pedantic lints |
| CI/CD and release engineering | Multi-platform builds (Linux x86/ARM, macOS x86/ARM), SemVer validation, artifact attestation, reproducible builds |
| Technical writing | mdBook documentation site, function reference pages, optimization history with measured data, ClickHouse compatibility matrix |
## Architecture

### Separation of Concerns
The codebase enforces a strict separation between business logic and FFI integration:
```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#ffffff', 'primaryTextColor': '#1a1a1a', 'primaryBorderColor': '#333333', 'lineColor': '#333333', 'secondaryColor': '#f5f5f5', 'tertiaryColor': '#e0e0e0', 'textColor': '#1a1a1a', 'mainBkg': '#ffffff', 'clusterBkg': '#f5f5f5', 'clusterBorder': '#333333'}}}%%
graph TB
subgraph "DuckDB Process"
DDB[DuckDB Engine]
SEG[Segment Tree<br/>Windowing]
end
subgraph "Extension Entry Point"
EP[behavioral_init_c_api<br/>src/lib.rs]
end
subgraph "FFI Bridge — src/ffi/"
REG[register_all_raw]
FS[sessionize.rs]
FR[retention.rs]
FW[window_funnel.rs]
FQ[sequence.rs]
FE[sequence_match_events.rs]
FN[sequence_next_node.rs]
end
subgraph "Business Logic — Pure Safe Rust"
SS[sessionize.rs]
SR[retention.rs]
SW[window_funnel.rs]
SQ[sequence.rs]
SN[sequence_next_node.rs]
subgraph "Pattern Engine"
PP[parser.rs]
PE[executor.rs]
end
subgraph "Common"
EV[event.rs — 16-byte Copy]
TS[timestamp.rs]
end
end
DDB -->|LOAD extension| EP
EP --> REG
REG --> FS & FR & FW & FQ & FE & FN
SEG -->|init/update/combine/finalize| FS & FR & FW & FQ & FE & FN
FS --> SS
FR --> SR
FW --> SW
FQ --> SQ
FE --> SQ
FN --> SN
SQ --> PP & PE
SW --> EV
SQ --> EV
SN --> EV
style EP fill:#d9d9d9,stroke:#333333,stroke-width:2px,color:#1a1a1a
style DDB fill:#e8e8e8,stroke:#333333,stroke-width:2px,color:#1a1a1a
style SEG fill:#e8e8e8,stroke:#333333,stroke-width:2px,color:#1a1a1a
style REG fill:#f0f0f0,stroke:#333333,stroke-width:2px,color:#1a1a1a
style FS fill:#e0e0e0,stroke:#333333,stroke-width:2px,color:#1a1a1a
style FR fill:#e0e0e0,stroke:#333333,stroke-width:2px,color:#1a1a1a
style FW fill:#e0e0e0,stroke:#333333,stroke-width:2px,color:#1a1a1a
style FQ fill:#e0e0e0,stroke:#333333,stroke-width:2px,color:#1a1a1a
style FE fill:#e0e0e0,stroke:#333333,stroke-width:2px,color:#1a1a1a
style FN fill:#e0e0e0,stroke:#333333,stroke-width:2px,color:#1a1a1a
style SS fill:#f5f5f5,stroke:#333333,stroke-width:2px,color:#1a1a1a
style SR fill:#f5f5f5,stroke:#333333,stroke-width:2px,color:#1a1a1a
style SW fill:#f5f5f5,stroke:#333333,stroke-width:2px,color:#1a1a1a
style SQ fill:#f5f5f5,stroke:#333333,stroke-width:2px,color:#1a1a1a
style SN fill:#f5f5f5,stroke:#333333,stroke-width:2px,color:#1a1a1a
style PP fill:#ffffff,stroke:#333333,stroke-width:2px,color:#1a1a1a
style PE fill:#ffffff,stroke:#333333,stroke-width:2px,color:#1a1a1a
style EV fill:#e8e8e8,stroke:#333333,stroke-width:2px,color:#1a1a1a
style TS fill:#e8e8e8,stroke:#333333,stroke-width:2px,color:#1a1a1a
```
- **Business logic** (`src/*.rs`, `src/common/`, `src/pattern/`): Pure safe Rust. No FFI types, no `unsafe` blocks. All algorithmic work -- pattern parsing, NFA execution, funnel search, retention bitmask logic -- lives here. This layer is independently testable via `cargo test` with no DuckDB dependency.
- **FFI bridge** (`src/ffi/`): Contains all `unsafe` code. Each function has a dedicated FFI module implementing the five DuckDB aggregate callbacks (`state_size`, `init`, `update`, `combine`, `finalize`). Every `unsafe` block has a `// SAFETY:` comment documenting the invariants it relies on (see the sketch after this list).
- **Entry point** (`src/lib.rs`): A custom C entry point (`behavioral_init_c_api`) that bypasses the `duckdb` Rust crate's connection wrapper entirely, using `duckdb_connect` directly from the extension access struct. This eliminates struct layout fragility across DuckDB versions.
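To make the layering concrete, here is a minimal sketch of the FFI-bridge pattern. The state type, callback name, and simplified signature are illustrative, not the extension's actual code:

```rust
use std::ffi::c_void;

/// Business-logic state: plain safe Rust, no FFI types.
#[derive(Default)]
struct SessionizeState {
    last_ts: Option<i64>,
    session_id: u64,
}

/// FFI-side `init` callback (signature simplified for illustration):
/// DuckDB hands the bridge a raw buffer of `state_size()` bytes, and the
/// bridge must place a valid state into it.
unsafe extern "C" fn sessionize_init(state: *mut c_void) {
    // SAFETY: DuckDB guarantees `state` points to `state_size()` writable
    // bytes; `write` initializes them without reading uninitialized memory
    // or dropping a nonexistent previous value.
    unsafe { state.cast::<SessionizeState>().write(SessionizeState::default()) };
}
```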
### Why This Matters

This architecture enables:
- **Independent unit testing**: Business logic tests run in < 1 second with no DuckDB instance. All 434 tests exercise Rust structs directly (a sketch follows this list).
- **Safe evolution**: Updating the DuckDB version only requires updating `libduckdb-sys` in `Cargo.toml` and re-running the E2E tests. Business logic is decoupled from the database.
- **Auditable unsafe scope**: The `unsafe` boundary is confined to `src/ffi/` (6 files). Reviewers can audit the safety-critical code without reading the entire codebase.
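As an illustration of the first point, a business-logic test needs nothing but plain Rust. The state type below is invented for the example; only the `retention` rule it encodes (later conditions count only when the first condition held) follows ClickHouse's documented semantics:

```rust
/// Hypothetical `retention`-style state: one bit per condition that has
/// ever been true for this group. No FFI, no DuckDB instance required.
#[derive(Default)]
struct RetentionState {
    seen: u32,
}

impl RetentionState {
    fn update(&mut self, conditions: &[bool]) {
        for (i, &hit) in conditions.iter().enumerate() {
            self.seen |= u32::from(hit) << i;
        }
    }

    /// Condition N counts only if condition 1 was ever satisfied.
    fn finalize(&self, n: usize) -> Vec<bool> {
        let first = self.seen & 1 != 0;
        (0..n).map(|i| first && self.seen & (1 << i) != 0).collect()
    }
}

#[test]
fn later_conditions_require_the_first() {
    let mut state = RetentionState::default();
    state.update(&[false, true]); // condition 2 fires, condition 1 never does
    assert_eq!(state.finalize(2), vec![false, false]);
}
```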
## Testing Philosophy

### The Test Pyramid
```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#ffffff', 'primaryTextColor': '#1a1a1a', 'primaryBorderColor': '#333333', 'lineColor': '#333333', 'secondaryColor': '#f5f5f5', 'tertiaryColor': '#e0e0e0', 'textColor': '#1a1a1a', 'clusterBkg': '#f5f5f5', 'clusterBorder': '#333333'}}}%%
graph TB
subgraph "Complementary Test Levels"
L3["Mutation Testing<br/>88.4% kill rate (130/147)<br/>cargo-mutants"]
L2["E2E Tests (27)<br/>Real DuckDB CLI, SQL execution<br/>Extension load, registration, results"]
L1["Unit Tests (434)<br/>State lifecycle, edge cases, combine correctness<br/>Property-based (26 proptest), mutation-guided (51)"]
end
style L1 fill:#f5f5f5,stroke:#333333,stroke-width:2px,color:#1a1a1a
style L2 fill:#e8e8e8,stroke:#333333,stroke-width:2px,color:#1a1a1a
style L3 fill:#d9d9d9,stroke:#333333,stroke-width:2px,color:#1a1a1a
```
This project implements a rigorous multi-level testing strategy:
### Level 1: Unit Tests (434 tests)
Organized by category within each module:
- State lifecycle tests -- empty state, single update, multiple updates, finalize
- Edge cases -- threshold boundaries, NULL handling, empty inputs, overflow at type boundaries (`u32::MAX`, `i64::MIN`)
- Combine correctness -- empty-into-empty, empty-into-populated, populated-into-empty, associativity verification, configuration propagation
- Property-based tests (26 proptest) -- algebraic properties required by DuckDB's segment tree: combine associativity, commutativity, identity element, idempotency, monotonicity (see the sketch after this list)
- Mutation-testing-guided tests (51) -- tests written specifically to kill mutants that survived initial test suites
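For example, combine associativity can be stated directly as a proptest property. The state type and its `combine` body are stand-ins for the real ones:

```rust
use proptest::prelude::*;

/// Stand-in aggregate state: a bitmask accumulator.
#[derive(Clone, Debug, Default, PartialEq)]
struct State {
    bits: u32,
}

impl State {
    fn combine(mut self, other: &Self) -> Self {
        self.bits |= other.bits;
        self
    }
}

proptest! {
    // DuckDB's segment tree may merge partial aggregates in any grouping,
    // so combine must be associative: (a . b) . c == a . (b . c).
    #[test]
    fn combine_is_associative(x: u32, y: u32, z: u32) {
        let (a, b, c) = (State { bits: x }, State { bits: y }, State { bits: z });
        prop_assert_eq!(
            a.clone().combine(&b).combine(&c),
            a.combine(&b.clone().combine(&c))
        );
    }
}
```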
### Level 2: E2E Tests (27 tests)
These integration tests run against a real DuckDB CLI instance and validate the complete chain: extension loading, function registration, SQL execution, and result correctness. They caught three critical bugs that all 434 unit tests missed:
- A segmentation fault on extension load (incorrect pointer arithmetic)
- Six of seven functions silently failing to register (missing API call)
- `window_funnel` returning incorrect results (combine not propagating configuration)
### Level 3: Mutation Testing (88.4% kill rate)
cargo-mutants systematically replaces operators, removes branches, and
changes return values across the codebase. Of 147 generated mutants, 130
were caught by the test suite. The 17 survivors are documented and represent
code paths where mutations produce semantically equivalent behavior (e.g.,
OR vs XOR on non-overlapping bitmasks).
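That survivor class is easy to demonstrate: on disjoint bitmasks, OR and XOR agree, so no test over such inputs can tell the mutant from the original.

```rust
fn main() {
    let (a, b) = (0b0011_u32, 0b1100_u32); // non-overlapping bitmasks
    // OR and XOR differ only where both operands set the same bit, so on
    // disjoint masks a mutation from `|` to `^` is behaviorally invisible.
    assert_eq!(a | b, a ^ b); // both are 0b1111
}
```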
### Key Insight
Unit tests validate algorithmic correctness in isolation. E2E tests validate the integration boundary. Neither alone is sufficient. The three bugs found during E2E validation were entirely in the FFI layer -- a domain that unit tests by definition cannot exercise.
## Performance Engineering

### Methodology
Every performance claim in this project is backed by:
- Criterion.rs benchmarks with 95% confidence intervals (see the sketch after this list)
- Multiple runs (minimum 3) to establish baselines
- Throughput reporting (elements per second) alongside wall clock time
- Negative results documented honestly -- five attempted optimizations were measured, found to be regressions, reverted, and documented
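A benchmark in this style might look like the sketch below. The measured function is a stand-in; only the Criterion scaffolding (throughput reporting, parameterized inputs) reflects the methodology above:

```rust
use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};

// Stand-in workload: count gap-exceeding boundaries in a timestamp stream.
fn sessionize(timestamps: &[i64], max_gap: i64) -> u64 {
    timestamps.windows(2).filter(|w| w[1] - w[0] > max_gap).count() as u64
}

fn bench_sessionize(c: &mut Criterion) {
    let mut group = c.benchmark_group("sessionize");
    for n in [1_000_000_u64, 10_000_000] {
        let data: Vec<i64> = (0..n as i64).collect();
        // Report elements/second alongside wall-clock time.
        group.throughput(Throughput::Elements(n));
        group.bench_with_input(BenchmarkId::from_parameter(n), &data, |b, d| {
            b.iter(|| sessionize(d, 30));
        });
    }
    group.finish();
}

criterion_group!(benches, bench_sessionize);
criterion_main!(benches);
```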
### Scale
| Function Class | Max Benchmark Scale | Constraint |
|---|---|---|
| O(1) state (`sessionize`, `retention`) | 1 billion elements | Compute-bound; no per-event memory |
| Event-collecting (`window_funnel`, `sequence_*`) | 100 million elements | 1.6 GB working set at 16 bytes/event |
| String-carrying (`sequence_next_node`) | 10 million elements | 32 bytes/event with `Arc<str>` allocation |
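The 16 bytes/event budget corresponds to the `event.rs -- 16-byte Copy` record in the architecture diagram. The field layout below is an assumption that fits that budget; only the size itself comes from the documentation:

```rust
/// Assumed layout for the 16-byte Copy event record.
#[derive(Clone, Copy)]
#[repr(C)]
struct Event {
    timestamp: i64,  // 8 bytes: event time
    conditions: u32, // 4 bytes: one bit per matched condition (up to 32)
    _pad: u32,       // 4 bytes: explicit padding keeps the size at exactly 16
}

// Compile-time check that the struct stays within the 16-byte budget.
const _: () = assert!(core::mem::size_of::<Event>() == 16);
```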
### Optimization History
Fifteen rounds of measured optimization, each following a documented protocol:
- Establish baseline with 3 Criterion runs
- Implement one optimization per commit
- Measure before/after with confidence intervals
- Accept only when confidence intervals do not overlap
- Revert and document if the improvement is not statistically significant
Selected highlights:
| Optimization | Improvement | Scale |
|---|---|---|
| Event bitmask (`Vec<bool>` to `u32`) | 5--13x | All event functions |
| In-place combine (O(N^2) to O(N)) | Up to 2,436x | Combine operations at 10K states |
| NFA lazy matching (greedy to lazy) | 1,961x | `sequence_match` at 1M events |
| `Arc<str>` (`String` to reference-counted) | 2.1--5.8x | `sequence_next_node` |
| NFA fast paths (linear scan) | 39--61% | `sequence_count` |
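The first row's `Vec<bool>`-to-`u32` change amounts to packing per-event condition flags into a single machine word, roughly as below (function name invented). With at most 32 conditions, one bit per condition fits a `u32`, eliminating a heap allocation per event and making the event `Copy`:

```rust
/// Pack per-event condition flags into a bitmask (illustrative).
fn pack_conditions(flags: &[bool]) -> u32 {
    flags
        .iter()
        .enumerate()
        .fold(0u32, |acc, (i, &set)| acc | (u32::from(set) << i))
}

fn main() {
    assert_eq!(pack_conditions(&[true, false, true]), 0b101);
}
```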
Five attempted optimizations were negative results:
- LSD radix sort: 4.3x slower than pdqsort for 16-byte embedded-key structs
- Branchless sessionize: 5--10% regression; branch predictor handles 90/10 patterns efficiently
- String pool with index vectors: 10--55% slower than Arc<str> at most scales
- Compiled pattern preservation: Caching parsed patterns did not improve performance
- First-condition pre-check: Adding an early-exit branch for non-matching first conditions increased overhead
The full optimization history, with confidence intervals, is documented on the Performance page.
## Domain Significance

### Why Behavioral Analytics Matters
Behavioral analytics answers a class of questions that traditional SQL aggregations cannot express efficiently:
- **Sessionization**: Given a stream of timestamped events, partition them into logical sessions based on inactivity gaps (illustrated after this list). This is fundamental to web analytics, mobile app analysis, and IoT telemetry.
- **Retention cohorts**: Given users who first appeared at time T, what fraction returned at T+1, T+2, ..., T+N? This is the core metric for SaaS businesses, subscription services, and any product measuring long-term engagement.
- **Conversion funnels**: Given a multi-step process (browse, add to cart, checkout, purchase), how far does each user progress within a time window? This directly drives revenue optimization in e-commerce.
- **Event sequence patterns**: Given a temporal sequence of heterogeneous events, does a specific pattern occur? How many times? What happened next? This generalizes to fraud detection, clinical event monitoring, user journey analysis, and process mining.
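As a toy illustration of the sessionization rule (the other functions follow analogous event-stream logic), assuming a simple inactivity threshold in seconds:

```rust
/// Assign a session id to each event: a new session starts whenever the
/// gap between consecutive events exceeds the inactivity threshold.
fn session_ids(timestamps: &[i64], max_gap: i64) -> Vec<u64> {
    let mut id = 0u64;
    let mut prev: Option<i64> = None;
    timestamps
        .iter()
        .map(|&ts| {
            if let Some(p) = prev {
                if ts - p > max_gap {
                    id += 1; // inactivity gap exceeded: start a new session
                }
            }
            prev = Some(ts);
            id
        })
        .collect()
}

fn main() {
    // With a 30-minute (1800 s) threshold, the third event starts session 1.
    assert_eq!(session_ids(&[0, 600, 4_000, 4_100], 1_800), vec![0, 0, 1, 1]);
}
```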
### Who Needs This
| Audience | Application |
|---|---|
| Product analytics teams | Funnel analysis, retention metrics, session analysis, user journey mapping |
| Data engineers | Replacing external analytics services (Amplitude, Mixpanel) with in-database SQL |
| Security analysts | Detecting temporal attack patterns in log data |
| Clinical researchers | Analyzing sequential treatment events in patient data |
| DevOps/SRE | Sessionizing log streams, identifying incident escalation patterns |
| Academic researchers | Process mining, sequential pattern mining on large datasets |
### Scale Context
These problems are inherently large-scale. A mid-sized SaaS application generates millions of events per day. An e-commerce platform during a sale event can produce hundreds of millions of events per hour. The ability to process one billion `sessionize` events in 1.20 seconds (roughly 830 million events per second) on a single core means these analyses can run as interactive queries rather than batch jobs.
## Implementation Challenges

### DuckDB Aggregate Function Registration
DuckDB's Rust crate does not provide high-level aggregate function
registration. This project uses the raw C API (libduckdb-sys) directly,
implementing five callback functions per aggregate:
```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#f5f5f5', 'primaryTextColor': '#1a1a1a', 'primaryBorderColor': '#333333', 'lineColor': '#333333', 'secondaryColor': '#e8e8e8', 'tertiaryColor': '#e0e0e0', 'textColor': '#1a1a1a', 'noteBkgColor': '#e8e8e8', 'noteTextColor': '#1a1a1a', 'noteBorderColor': '#333333', 'labelTextColor': '#1a1a1a', 'transitionColor': '#333333'}}}%%
stateDiagram-v2
[*] --> state_size: DuckDB queries state byte size
state_size --> init: Allocate state buffer
init --> update: Process input rows
state "update" as update {
[*] --> ProcessRow
ProcessRow --> AccumulateState
AccumulateState --> ProcessRow: Next row
}
update --> combine: Segment tree merge
note right of combine
Called O(n log n) times.
Target state is zero-initialized.
Must propagate ALL config fields.
end note
combine --> finalize: Produce result
finalize --> destroy: Free resources
destroy --> [*]
```
- `state_size` -- Returns the byte size of the aggregate state
- `init` -- Initializes a new state (called by DuckDB's segment tree)
- `update` -- Processes one row of input into the state
- `combine` -- Merges two states (called O(n log n) times by the segment tree)
- `finalize` -- Produces the final result from the accumulated state
Because `duckdb_aggregate_function_set_varargs` does not exist, each function that accepts a variable number of boolean conditions (`retention`, `window_funnel`, `sequence_match`, `sequence_count`, `sequence_match_events`, `sequence_next_node`) must register 31 overloads (2--32 parameters) via a function set.
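In outline, registration loops over arities and adds one fixed-arity aggregate per arity to a single named function set. `add_overload` below is a placeholder for the sequence of `libduckdb-sys` calls that build each overload:

```rust
/// Sketch of the overload fan-out described above (placeholder callback).
fn register_condition_overloads(mut add_overload: impl FnMut(usize)) {
    // One overload per condition count: 2 through 32 BOOLEAN parameters.
    for n_conditions in 2..=32 {
        add_overload(n_conditions);
    }
}
```

With six variadic-style functions at 31 overloads each, 186 overloads are registered at load time.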
### NFA Pattern Engine
The sequence functions use a custom NFA (Nondeterministic Finite Automaton) pattern engine:
```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#ffffff', 'primaryTextColor': '#1a1a1a', 'primaryBorderColor': '#333333', 'lineColor': '#333333', 'secondaryColor': '#f5f5f5', 'tertiaryColor': '#e0e0e0', 'textColor': '#1a1a1a'}}}%%
flowchart LR
SQL["Pattern String<br/>'(?1).*(?t<=3600)(?2)'"] --> PARSE["Recursive Descent<br/>Parser"]
PARSE --> STEPS["CompiledPattern<br/>Condition(1), AnyEvents,<br/>TimeConstraint(<=, 3600),<br/>Condition(2)"]
STEPS --> CLASS{Classify}
CLASS -->|"All adjacent"| FAST1["O(n) Sliding<br/>Window"]
CLASS -->|"Wildcard-separated"| FAST2["O(n) Linear<br/>Scan"]
CLASS -->|"Time constraints<br/>or mixed"| NFA["NFA<br/>Backtracking"]
style SQL fill:#e8e8e8,stroke:#333333,stroke-width:2px,color:#1a1a1a
style PARSE fill:#f5f5f5,stroke:#333333,stroke-width:2px,color:#1a1a1a
style STEPS fill:#e0e0e0,stroke:#333333,stroke-width:2px,color:#1a1a1a
style CLASS fill:#ffffff,stroke:#333333,stroke-width:2px,color:#1a1a1a
style FAST1 fill:#f5f5f5,stroke:#333333,stroke-width:2px,color:#1a1a1a
style FAST2 fill:#f5f5f5,stroke:#333333,stroke-width:2px,color:#1a1a1a
style NFA fill:#d9d9d9,stroke:#333333,stroke-width:2px,color:#1a1a1a
```
- A recursive descent parser that compiles pattern strings (e.g., `(?1).*(?t<=3600)(?2)`) into an intermediate representation of typed steps (see the sketch after this list)
- An NFA executor that evaluates the pattern against a sorted event stream using lazy backtracking with optional time constraints
- Fast-path classifiers that detect common pattern shapes (adjacent conditions, wildcard-separated conditions) and dispatch to specialized O(n) linear scans, avoiding the full NFA for the majority of real-world patterns
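The flowchart's intermediate representation suggests an IR along these lines; the variant and type names are guesses based on the example pattern, not the crate's actual types:

```rust
/// Comparison operator in a time constraint.
enum Cmp { Lt, Le, Gt, Ge, Eq }

/// One typed step of a compiled pattern.
enum Step {
    Condition(u8),                            // (?N): event must satisfy condition N
    AnyEvents,                                // .* : skip zero or more events (lazy)
    TimeConstraint { op: Cmp, seconds: i64 }, // (?t<=3600): elapsed-time bound
}

// '(?1).*(?t<=3600)(?2)' compiles to this step sequence:
fn example_pattern() -> Vec<Step> {
    vec![
        Step::Condition(1),
        Step::AnyEvents,
        Step::TimeConstraint { op: Cmp::Le, seconds: 3600 },
        Step::Condition(2),
    ]
}
```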
### Combine Semantics in DuckDB's Segment Tree
DuckDB's windowing and grouping machinery uses a segment tree that creates
fresh zero-initialized states and combines source states into them. This
creates a subtle correctness requirement: combine must propagate all
configuration fields (window size, mode flags, pattern string, direction/base
parameters), not just the accumulated data. Missing this caused silent
incorrect results that passed all unit tests but failed E2E validation.
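A sketch of the requirement with an invented state type: the fresh target's configuration fields are zeroed, so combine must copy them from the source alongside the accumulated data:

```rust
/// Illustrative window_funnel-style state (field names assumed).
#[derive(Default)]
struct WindowFunnelState {
    window: i64,             // configuration (0 in a fresh segment-tree target)
    strict_order: bool,      // configuration
    events: Vec<(i64, u32)>, // accumulated data
}

impl WindowFunnelState {
    fn combine(&mut self, source: &Self) {
        self.events.extend_from_slice(&source.events);
        // Propagating only `events` would leave `self.window == 0` in a
        // fresh target -- the silent-wrong-results bug E2E tests caught.
        if self.window == 0 {
            self.window = source.window;
            self.strict_order = source.strict_order;
        }
    }
}
```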
## Quality Standards
| Metric | Value |
|---|---|
| Unit tests | 434 |
| Doc-tests | 1 |
| E2E tests | 27 |
| Property-based tests | 26 (proptest) |
| Mutation-guided tests | 51 |
| Mutation kill rate | 88.4% (130/147) |
| Clippy warnings | 0 (pedantic + nursery + cargo) |
| Unsafe code scope | Confined to src/ffi/ (6 files) |
| MSRV | Rust 1.80 |
| Criterion benchmark files | 7 |
| Max benchmark scale | 1 billion elements |
| CI jobs | 13 (check, test, clippy, fmt, doc, MSRV, bench, deny, semver, coverage, cross-platform, extension-build) |
| Documented negative results | 5 (radix sort, branchless, string pool, compiled pattern, first-condition pre-check) |
| ClickHouse parity | Complete (7/7 functions, all modes, 32 conditions) |
## Technology Stack
| Layer | Technology | Purpose |
|---|---|---|
| Language | Rust (stable, MSRV 1.80) | Memory safety, zero-cost abstractions, unsafe confinement |
| Database | DuckDB 1.4.4 | Analytical SQL engine, segment tree windowing |
| FFI | libduckdb-sys (C API) | Raw aggregate function registration |
| Benchmarking | Criterion.rs | Statistical benchmarking with confidence intervals |
| Property testing | proptest | Algebraic property verification |
| Mutation testing | cargo-mutants | Test suite effectiveness measurement |
| Documentation | mdBook | Static site generation for GitHub Pages |
| CI/CD | GitHub Actions | Multi-platform build, test, release, attestation |
| Dependency audit | cargo-deny | License compliance, advisory database checks |
| SemVer validation | cargo-semver-checks | API compatibility verification |