Skip to content

robust-sql/robust

Repository files navigation

Robust

CI DuckDB extension-ci-tools status JOB speedup JOB memory

A DuckDB extension that implements Predicate Transfer — a sideways-information-passing technique that propagates bloom filters, min/max ranges, and IN-lists across the entire join graph of a multi-join query, then pushes those filters down to the storage layer so probe-side scans skip rows that can't survive downstream joins.

Overview

In a multi-join query, the vast majority of rows scanned never survive to the final result. Predicate transfer addresses this by building filters from join keys on the build side of each join, then applying those filters (and pushing them to scans) on the probe side. Filters cascade across the join graph: a filter built from the deepest join can prune scans at the leaves.

The optimizer walks the join graph, detects equality-join equivalence classes, and inserts CREATE_FILTER and PROBE_FILTER operators into the physical plan. A second backward pass broadcasts each filter across its equivalence class, so filters built on one side of a join also apply to every other table that joins on the same key. Queries return identical results — they just touch dramatically fewer rows.

Benchmark Results

Measured on the Join Order Benchmark (113 queries over IMDb), DuckDB at commit 88277463aa (post-v1.5-merge on main), single machine, 8 threads.

Per-query speedup (Robust vs baseline DuckDB)

JOB speedup

Geometric mean: 1.76× faster. Best case: 23× on join-heavy queries (e.g. 07c).

Memory allocated per query (lower is better)

Memory ratio

Geometric mean of baseline_memory / robust_memory: 1.67×. Robust uses less memory on 100/113 queries; large reductions concentrated in queries with cardinality-explosive intermediate joins.

Hash-join output cardinality (matched rows emitted by HJs)

HJ cardinality sum

Sum of operator_cardinality across all HASH_JOIN operators in the plan — i.e. the total number of matched rows each HJ emits to its parent. Robust reduces this metric by 8× on the hardest queries: by filtering probe-side rows out at the scan before they reach the hash table, fewer rows participate in the join, so fewer rows match and propagate downstream — shrinking every intermediate result.

All figures (PDF + PNG) are regenerated by scripts/plot_results.py from docs/metrics.csv.

Building

Prerequisites

git clone --recurse-submodules https://github.com/robust-labs/robust.git

# vcpkg can live anywhere; pick a location once and reuse it for any C++ project  
git clone https://github.com/Microsoft/vcpkg.git
./vcpkg/bootstrap-vcpkg.sh
export VCPKG_TOOLCHAIN_PATH=$(pwd)/vcpkg/scripts/buildsystems/vcpkg.cmake

Build

# release
GEN=ninja make release

# debug (AddressSanitizer enabled)
GEN=ninja make debug

# release + benchmark runner (needed for bench_job.sh)
BUILD_BENCHMARK=1 GEN=ninja make release

Build artifacts

  • ./build/release/duckdb — DuckDB shell
  • ./build/release/extension/robust/robust.duckdb_extension — loadable extension
  • ./build/release/benchmark/benchmark_runner — benchmark runner (only with BUILD_BENCHMARK=1)

Running

The extension is not published to the DuckDB extension repository, so it must be loaded as an unsigned extension:

./build/release/duckdb -unsigned
LOAD 'build/release/extension/robust/robust.duckdb_extension';

-- Robust is designed for multi-way joins; it intentionally stays out of the way
-- when the query has <= 1 join. The example below is a 3-way star schema:
-- a wide fact table joined to two dimensions, with selective filters on the dims.
CREATE TEMP TABLE orders AS
    SELECT i              AS order_id,
           i % 1000       AS customer_id,
           i % 100        AS product_id,
           (i * 13) % 10000 AS amount
    FROM range(1000000) tbl(i);

CREATE TEMP TABLE customers AS
    SELECT i AS customer_id, 'cust_' || i AS name
    FROM range(1000) tbl(i);

CREATE TEMP TABLE products AS
    SELECT i AS product_id, i % 10 AS category
    FROM range(100) tbl(i);

-- 3-way join with filters on each dimension.
-- Robust builds filters (currently bloom filters, min/max ranges, and IN-lists
-- where applicable — the filter set is extensible) from the filtered rows of
-- customers and products, then propagates them across the join graph in two
-- passes: a forward pass pushes filters down to probe-side scans (here, the
-- orders scan), and a backward pass broadcasts them across the rest of the
-- graph. Only rows that survive every applicable filter reach the hash joins.
SELECT count(*), sum(o.amount)
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN products  p ON o.product_id  = p.product_id
WHERE p.category = 3
  AND c.name LIKE 'cust_1%';

-- inspect the rewritten plan: expect CREATE_FILTER nodes above the filtered
-- scans of customers/products, and PROBE_FILTER feeding the orders scan.
EXPLAIN
SELECT count(*), sum(o.amount)
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN products  p ON o.product_id  = p.product_id
WHERE p.category = 3
  AND c.name LIKE 'cust_1%';

For a larger / more realistic workload, see Benchmarks (JOB) below — the JOB suite has 113 queries (3–17-way joins over IMDb) that the extension is tuned against.

Settings

Setting Type Default Controls
robust_heuristic VARCHAR 'join_order' DAG construction heuristic: 'join_order' (DFS-build-first, follows DuckDB's own join order) or 'largest_root' (Prim's MST rooted at largest table; the original paper formulation)
robust_pass_mode VARCHAR 'both' Whether to run both passes or only forward: 'both' or 'forward_only'
robust_flip_roots BOOLEAN true In join_order mode, iteratively flip non-anchor roots to leaves so the largest table ends up as the sole anchor root
robust_filter_type VARCHAR 'all' Restrict filter types pushed to scans: 'all', 'bf_only', 'minmax_only'
robust_dynamic_or_filter_threshold UBIGINT 50 Max distinct build keys to use an IN-list (rather than a bloom filter) for scan pushdown
robust_profiling BOOLEAN false Emit per-operator timing and row-count stats after each query
robust_display_dag BOOLEAN false Print the logical transfer DAG to stdout before plan modification
robust_display_physical_dag BOOLEAN false Print the physical-plan DAG (before filter insertion) to stdout

All settings are registered in src/robust_extension.cpp. See docs/architecture.md for what these controls actually do at the algorithm level.

Benchmarks (JOB)

Setup

# expects jobdata/job.duckdb (load JOB tables via your tool of choice)
ls jobdata/job.duckdb
ls jobdata/queries/1a.sql    # 113 queries

When to use which script

Two complementary tools live in scripts/:

  • test_job.shcorrectness-first dev loop tool. Wraps the duckdb CLI, loads the extension dynamically with LOAD, diffs baseline vs Robust output to catch correctness regressions, and reports per-query wall-clock + geomean speedup. Use this on every PR / dev iteration. No special build flag needed; just a plain release build.
  • bench_job.shauthoritative measurement tool. Wraps DuckDB's in-tree benchmark_runner (build with BUILD_BENCHMARK=1), runs all queries in a single process for lower measurement noise, and explicitly skips the cold first run. Use this for numbers that go into the README, a paper, or PR perf claims. No correctness check — pair with test_job.sh if you've changed the optimizer.

The TPC-H counterpart of bench_job.sh is bench_tpch.sh (only the 9 queries where Robust currently inserts BFs and runs correctly).

Correctness + wall-clock comparison

./scripts/test_job.sh runs every JOB query with and without the extension, diffs results, and reports per-query timing + geomean speedup.

Recommended invocation:

./scripts/test_job.sh --heuristic join_order --timing --runs 5

For each of the 113 JOB queries it runs the query 5 times against the baseline (DuckDB stock optimizer) and 5 times with the Robust extension loaded, takes the minimum wall-clock from each side to suppress timer noise and cold-cache jitter, diffs the two result sets to confirm correctness, and prints per-query speedup plus a geometric mean across all queries. Output is also persisted to job_test_results/summary.txt.

Other useful forms:

./scripts/test_job.sh                          # correctness only, no timing
./scripts/test_job.sh --timing                 # single run per side
./scripts/test_job.sh --query 7c --timing      # one query
./scripts/test_job.sh --timing --limit 10      # first N queries

Summary written to job_test_results/summary.txt.

DuckDB benchmark-runner suites

scripts/bench_job.sh drives the in-tree DuckDB benchmark runner against the imdb (baseline) and imdb_robust* (extension) suites and writes a side-by-side comparison.

# all 113 queries, baseline + Robust, min of repeated runs
./scripts/bench_job.sh

# subset
./scripts/bench_job.sh --pattern '07.*'

# Robust with the largest_root heuristic (vs baseline)
./scripts/bench_job.sh --heuristic largest_root

# Robust forward-only (no backward equivalence-class broadcast)
./scripts/bench_job.sh --forward-only

# re-aggregate already-collected raw results without re-running
./scripts/bench_job.sh --no-run

# run Robust suite first (default is baseline first; flips the order to control thermals)
./scripts/bench_job.sh --robust-first

Output: benchmark_results/{baseline_raw.tsv, robust_raw.tsv, comparison.tsv}.

How the benchmark harness is wired

bench_job.sh and bench_tpch.sh invoke DuckDB's in-tree benchmark_runner against suites that need to live inside the DuckDB submodule (duckdb/benchmark/imdb_robust*, duckdb/benchmark/tpch_*). Those suites — plus two small upstream patches needed to load an unsigned dev extension and to work around a debug-build issue (see wip_docs/features/13-debug-build-verify-op-failure.md) — are kept inside this repo at:

  • bench_suites/ — vendored suites (source-of-truth, tracked here)
  • patches/ — duckdb-submodule patches (tracked here)

make release and make debug automatically run scripts/vendor_duckdb_bench.sh, which copies the suites into duckdb/benchmark/ and applies the patches. Re-running is idempotent. If a patch fails to apply (typically because the DuckDB submodule was bumped and upstream rewrote the patched region), the build aborts with a clear error.

Per-query profiling

scripts/profile_query.sh runs a single query under both baseline and Robust, captures DuckDB's JSON profile, and prints a breakdown by operator class (HASH_JOIN, SEQ_SCAN, CREATE_FILTER, PROBE_FILTER, ...).

# JOB query 7c
./scripts/profile_query.sh 7c

# TPCH query 3
./scripts/profile_query.sh --workload tpch 3

# disable DuckDB's join_filter_pushdown for Robust (isolates Robust's contribution)
./scripts/profile_query.sh --no-jfp robust 7c

# compare both Robust heuristics against baseline
./scripts/profile_query.sh --heuristic all 7c

# inline SQL
./scripts/profile_query.sh --sql "SELECT count(*) FROM t1 JOIN t2 ON t1.id = t2.id"

Full metrics sweep + plotting

scripts/bench_metrics.sh sweeps all JOB queries and extracts six metrics from each profile JSON (memory allocated, rows scanned, cumulative cardinality, peak buffer, sum/max of HASH_JOIN cardinality). scripts/plot_results.py consumes the resulting CSV and emits the figures in docs/figures/.

./scripts/bench_metrics.sh                      # all queries → benchmark_results/metrics.csv
./scripts/bench_metrics.sh --pattern '13.*'     # subset
./scripts/bench_metrics.sh --query 13a          # single query

./scripts/plot_results.py speedup benchmark_results/comparison.tsv --out fig.pdf
./scripts/plot_results.py metric memory benchmark_results/metrics.csv --out memory_ratio.pdf
./scripts/plot_results.py pairs hj_card_sum benchmark_results/metrics.csv \
    --style line --out hj_card_sum_pairs_line.pdf

How predicate transfer works

  1. Build DAG. The optimizer extracts equality joins, builds equivalence classes over join columns (union-find), and constructs a DAG over base tables with filtered tables as roots.
  2. Forward pass (leaves → root). For each edge, the smaller side builds a filter (bloom filter + min/max + optional IN-list when the build side has few distinct values). The filter is applied to the larger side via a PROBE_FILTER operator inserted above the scan.
  3. Backward pass (root → leaves). Each filter is broadcast across its equivalence class. If tables A, B, C all join on the same key and a filter was built from C, it's pushed to A and B as well — even though they never directly joined with C.
  4. Scan pushdown. Built filters are pushed into DuckDB's dynamic_filters infrastructure via BFTableFilter + SelectivityOptionalFilter, so the storage layer can skip rows/segments before they're decompressed.

Bloom filter implementation

The extension uses DuckDB's native BloomFilter (duckdb/planner/filter/bloom_filter.hpp) as the underlying implementation:

  • 12 bits per key
  • InsertHashes uses atomic fetch_or for lock-free parallel building
  • LookupHashes returns a SelectionVector directly (no separate bit-vector pass)

src/include/bloom_filter.hpp defines a thin wrapper, PTBloomFilter, which adds the glue DuckDB's native API doesn't provide:

  • Insert(DataChunk&, cols) and LookupSel(DataChunk&, sel, cols, buf) — hashes are computed from the chunk via VectorOperations::Hash + CombineHash before being passed to the native filter.
  • ReinitializeAndRehash(actual_rows, data, cols) — resizes the filter once the true build-side row count is known.
  • IsEmpty() / finalized_ — lifecycle flags read by PhysicalProbeFilter.

The bloom filter is one of three filter types pushed in a single CREATE_FILTER operation; min/max bounds and IN-lists (when the build side has ≤ 50 distinct keys) are emitted alongside it via DuckDB's ConstantFilter / InFilter infrastructure.

Pinned dependencies

Dependency Pin Notes
duckdb submodule 88277463aa Merge commit "Merge V1.5 -> Main", 2026-02-23. Not a release tag.
extension-ci-tools submodule 32eb753d9b v1.4.4 branch tip
OpenSSL 3.5.3+ via vcpkg dependency of DuckDB build

CI pins are kept in sync with submodule pins in .github/workflows/MainDistributionPipeline.yml.

License

Based on the DuckDB Extension Template (MIT).

About

A DuckDB extension implementing Predicate Transfer to reduce cardinality explosion

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Generated from duckdb/extension-template