
feat: canonical H2O coverage — q6/q8/q9 adapters + engine-only timing + DataFusion memtable fix #4

Open

ser-vasilich wants to merge 6 commits into RayforceDB:master from ser-vasilich:prototype

Conversation

@ser-vasilich (Contributor) commented May 15, 2026

Summary

Six commits bringing canonical H2O (h2oai/db-benchmark) coverage to rayforce-bench: engine-only timing for SQL adapters, rayforce wrappers for q6/q8/q9, fairness fixes across adapters, and dashboard polish.

Headline changes

Engine-only timing across SQL adapters (`20f915a`)

Replace `fetchall()` / IPC materialization with server-side draining or `CREATE TEMPORARY TABLE` patterns so each adapter is timed on engine work only, not on Arrow IPC / Python conversion. Affects DuckDB, chDB, DataFusion, QuestDB, and TimescaleDB; the pattern is sketched below.
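A minimal sketch of the timing pattern for the DuckDB adapter (the table, query, and row counts here are illustrative, not the benchmark's actual harness):

```python
import time
import duckdb

con = duckdb.connect()
con.execute(
    "CREATE TABLE g1 AS "
    "SELECT (range % 100)::VARCHAR AS id1, random() AS v1 FROM range(1000000)"
)
sql = "SELECT id1, sum(v1) AS v1 FROM g1 GROUP BY id1"

# Old timing: fetchall() folds Arrow IPC / Python conversion into the number.
t0 = time.perf_counter()
con.execute(sql).fetchall()
print("client-inclusive:", (time.perf_counter() - t0) * 1e3, "ms")

# Engine-only timing: CREATE TEMPORARY TABLE forces full execution
# server-side; nothing crosses the client boundary inside the timed region.
t0 = time.perf_counter()
con.execute(f"CREATE TEMPORARY TABLE _out AS {sql}")
print("engine-only:", (time.perf_counter() - t0) * 1e3, "ms")
con.execute("DROP TABLE _out")
```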

Rayforce q6 / q8 / q9 adapters (`a50ab48`, `611bcb3`, `626cd34`, `99ae025`)

  • q6 (median + stddev by id4,id5): native `Column.median()` + `Column.std()` via the new engine `OP_MEDIAN` and the existing stddev op
  • q8 (largest 2 v3 by id6): `Column.top(2)` via engine `OP_TOP_N`, then `OP_GROUP_TOPK_ROWFORM` (row-form emit, no LIST intermediate)
  • q9 (pearson² by id2,id4): two-stage adapter, sketched below: `Column.pearson_corr(...)` followed by arithmetic squaring; required because applying `** 2` at the top level would block the DAG hash-agg lowering
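
Illustrative shape of the q9 two-stage adapter. `Column.pearson_corr` is the API added in this PR; the `tbl.group_by(...)` / `g.col(...)` accessors are hypothetical stand-ins, not the real rayforce-bench surface:

```python
def run_groupby_q9(tbl):
    # Stage 1: per-(id2, id4) Pearson correlation, lowered to the engine's
    # DAG hash-agg.  (group_by/col are hypothetical accessors.)
    g = tbl.group_by("id2", "id4")
    r = g.col("v1").pearson_corr(g.col("v2"))
    # Stage 2: square with plain column arithmetic *after* the aggregation;
    # writing `... ** 2` at the top level would block the hash-agg lowering.
    return r * r
```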

Engine-side explode for q8 (raze + indexed gather) keeps the timed query in row form (200k rows), matching DuckDB's `ROW_NUMBER() OVER (PARTITION BY ...)` shape and the SQL adapters' default materialization.

DataFusion memtable fix (`eae3261`)

`register_csv` produced a listing table that re-parsed the CSV on every timed query (the page cache avoided disk I/O, but the parse cost remained). Replaced with `register_record_batches` after a one-shot `collect()`, making DataFusion apples-to-apples with duckdb/chdb/polars/pandas/rayforce, which all hold native columnar storage. q4 154→17 ms, q6 312→148 ms, q8 367→262 ms. A sketch of the fix follows.
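A minimal sketch of the fix with the datafusion Python bindings (the table name and CSV path are placeholders):

```python
import datafusion

ctx = datafusion.SessionContext()

# Before: a listing table that re-parses the CSV on every timed query.
ctx.register_csv("g1_csv", "data/G1_1e7_1e2.csv")

# One-shot materialization, outside the timed region: collect() returns
# pyarrow RecordBatches, which register_record_batches wraps in a MemTable.
batches = ctx.sql("SELECT * FROM g1_csv").collect()
ctx.register_record_batches("g1", [batches])

# Timed queries now scan in-memory columnar batches; no CSV parsing.
df = ctx.sql("SELECT id1, sum(v1) AS v1 FROM g1 GROUP BY id1")
```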

Dashboard / framework polish (multiple)

  • Canonical H2O suite (groupby q1..q10 + canonical-join q1..q5 + sort_single/sort_multi)
  • Bonus suite (3-key joins, full-row sorts) under separate `bench-bonus` target
  • Per-adapter QUERY_STRINGS shown on the compare panel
  • Scaling sweep with operations panel split into groupby/join/sort
  • Histogram split fast/heavy
  • `make check` — cross-adapter result equivalence at all sizes 10..10m (comparison core sketched after this list)
  • Bench snapshot refreshed after `OP_GROUP_TOPK_ROWFORM` (`d354496`)
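
The core of the equivalence check, sketched (the function name and frame handling are hypothetical; the tolerances are the ones `make check` reports):

```python
import numpy as np

def numeric_columns_match(got: np.ndarray, want: np.ndarray,
                          rtol: float = 1e-06, atol: float = 1e-09) -> bool:
    """Compare one adapter's sorted numeric output against the polars
    reference, NaN-safe, within the suite's tolerances."""
    if got.shape != want.shape:
        return False
    return bool(np.allclose(got, want, rtol=rtol, atol=atol, equal_nan=True))
```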

Perf snapshot (10M rows, k=100 cardinality, engine-only timing)

| query | rayforce | duckdb | datafusion | polars  |
|-------|---------:|-------:|-----------:|--------:|
| q1    | 5.5 ms   | 37 ms  | 17 ms      | 30 ms   |
| q4    | 10 ms    | 9 ms   | 19 ms      | 29 ms   |
| q6    | 121 ms   | 186 ms | 148 ms     | 254 ms  |
| q8    | 40 ms    | 157 ms | 258 ms     | 503 ms  |
| q9    | 49 ms    | 76 ms  | 72 ms      | 405 ms  |
| q10   | 164 ms   | 380 ms | 419 ms     | 1961 ms |

Rayforce wins 9 of the 10 groupby queries (q4 is within 1 ms of duckdb).

Related

  • Companion engine branch: rayforce#203 (needed to build; see test plan)

Test plan

  • `make check LOCAL=1` → `pass — 665/665 comparisons matched polars, 0 NYI (rtol=1e-06, atol=1e-09)` across all 7 sizes × all 19 ops × all 6 adapters
  • `make bench LOCAL=1` reproduces the perf numbers above
  • Reviewer: build with companion branches (`RAYFORCE_LOCAL_PATH` pointing at rayforce#203 checkout) and re-run `make check` + `make bench`

ser-vasilich and others added 6 commits May 10, 2026 19:30
…karounds

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
register_csv produces a listing table that re-parses CSV on every
timed query.  register_record_batches with the collected batches
caches the columnar layout in memory.  q4 154→17ms, q6 312→148ms,
q8 367→262ms — DataFusion now apples-to-apples with adapters that
hold native columnar storage.
q8's natural rayforce shape is 100k rows with LIST<F64>[2] cells —
duckdb's ROW_NUMBER() <= 2 SQL emits 200k exploded rows.  Timed bench
was unfair: rayforce skipped the row-materialisation cost that SQL
adapters pay.  Move the explode into the timed engine query
via raze + indexed gather (vectorised, no per-element lambda) so
both sides materialise 200k.  q8 163ms (100k rows) → 215ms (200k
rows) vs duckdb 198ms — ~apples-to-apples now.

Bundles the q9 two-stage adapter form already in the working tree.
run_groupby_q8's fast vectorised explode assumes K=2 everywhere (true
for canonical 10m k100, where every id6 group has ≥2 non-null v3).
Small check sizes (10..1m) hit groups with K=1 cells; the K=2-uniform
formula produces row-count mismatch.  Split: timed path keeps the
fast formula; materialize() reverts to a per-cell Python explode for
correctness across all check sizes.
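
An illustrative numpy sketch of the split; names are hypothetical, and the real timed path runs inside the engine via raze + indexed gather rather than in Python:

```python
import numpy as np

def explode_k2_uniform(ids: np.ndarray, top2: np.ndarray):
    # Timed path: assumes every group holds exactly K=2 values (true for the
    # canonical 10m/k100 data).  ids: (G,) group keys; top2: (G, 2) top values.
    return np.repeat(ids, 2), top2.ravel()

def explode_per_cell(ids, cells):
    # materialize() path: honours ragged groups (K=1 cells at small check
    # sizes) at the cost of a per-cell Python loop.
    out_ids, out_vals = [], []
    for gid, cell in zip(ids, cells):
        out_ids.extend([gid] * len(cell))
        out_vals.extend(cell)
    return np.asarray(out_ids), np.asarray(out_vals)
```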
