Add in-place buffer reuse for arithmetic and bitwise binary expressions#21339
Dandandan wants to merge 4 commits into apache:main
When evaluating arithmetic binary expressions (+, -, *, /, %), reuse the left operand's buffer in-place when its reference count is 1, avoiding a buffer allocation. This benefits expression chains like a + b + c where intermediate results are consumed only once.

Uses PrimitiveArray::unary_mut for array-scalar and into_builder for array-array cases. Only wrapping (infallible) ops use in-place mutation; checked ops fall back to standard Arrow kernels to avoid corrupting buffers on overflow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
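For readers unfamiliar with the refcount check, here is a minimal std-only sketch of the idea in the commit message above. It models an array buffer as `Arc<Vec<i32>>` rather than using Arrow's types; `add_scalar_reuse` is a hypothetical name for illustration and is not a function in this PR or in arrow-rs.

```rust
use std::sync::Arc;

/// Hypothetical helper: adds `scalar` to every element, reusing the input's
/// data buffer when this `Arc` is the only reference to it.
/// Returns the result and a flag saying whether the buffer was reused.
fn add_scalar_reuse(values: Arc<Vec<i32>>, scalar: i32) -> (Arc<Vec<i32>>, bool) {
    match Arc::try_unwrap(values) {
        // Refcount was 1: we own the Vec, so mutate its data buffer in place.
        Ok(mut owned) => {
            for v in owned.iter_mut() {
                // Wrapping add cannot fail, so the buffer is never left
                // half-written by an early error return.
                *v = v.wrapping_add(scalar);
            }
            (Arc::new(owned), true)
        }
        // Someone else still holds the buffer: leave it untouched and
        // allocate a fresh result instead.
        Err(shared) => {
            let out: Vec<i32> = shared.iter().map(|v| v.wrapping_add(scalar)).collect();
            (Arc::new(out), false)
        }
    }
}
```

In a chain such as a + b + c, the intermediate result of a + b would take the first branch because nothing else holds it, while a column still referenced by its input batch would take the second.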
Force-pushed from a1e5de3 to 418ccdd.
When the left operand's buffer cannot be reused (shared reference), try the right operand for in-place mutation. This covers cases like Scalar-Array and Array-Array where the right side has refcount 1. For non-commutative ops (sub, rem), the argument order is preserved correctly: result[i] = op(left[i], right[i]).

Also refactors type dispatch into shared macros. Decimal types are excluded from in-place mutation because the result precision/scale differs from the input.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
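The argument-order point for non-commutative ops can be illustrated with a small std-only sketch. It again models buffers as `Arc<Vec<i64>>`; the `Reused` enum and `sub_reuse` function are hypothetical names for illustration, not the PR's code.

```rust
use std::sync::Arc;

/// Which operand's allocation ended up holding the result (illustrative only).
#[derive(Debug, PartialEq)]
enum Reused {
    Left,
    Right,
    Neither,
}

/// Hypothetical helper: computes left[i] - right[i], writing into whichever
/// operand is uniquely owned, and allocating only if both are shared.
fn sub_reuse(left: Arc<Vec<i64>>, right: Arc<Vec<i64>>) -> (Vec<i64>, Reused) {
    assert_eq!(left.len(), right.len());
    match Arc::try_unwrap(left) {
        // Left is uniquely owned: overwrite it with left - right.
        Ok(mut l) => {
            for (x, y) in l.iter_mut().zip(right.iter()) {
                *x = x.wrapping_sub(*y);
            }
            (l, Reused::Left)
        }
        Err(left) => match Arc::try_unwrap(right) {
            // Right is uniquely owned: the destination changes, but the
            // operation is still op(left[i], right[i]), not the reverse.
            Ok(mut r) => {
                for (y, x) in r.iter_mut().zip(left.iter()) {
                    *y = x.wrapping_sub(*y);
                }
                (r, Reused::Right)
            }
            // Both shared: fall back to allocating a new buffer.
            Err(right) => {
                let out: Vec<i64> = left
                    .iter()
                    .zip(right.iter())
                    .map(|(x, y)| x.wrapping_sub(*y))
                    .collect();
                (out, Reused::Neither)
            }
        },
    }
}
```

The only difference between the two reuse branches is where the result is stored; the operand order passed to the operation stays the same.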
Force-pushed from 418ccdd to 4c6f444.
When evaluating boolean AND/OR expressions, try to reuse the left buffer in-place via Buffer::into_mutable. If the left buffer is shared, try the right buffer (AND/OR are commutative). Falls back to standard and_kleene/or_kleene when both buffers are shared or when nulls are present.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
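Because AND and OR are commutative, either operand can serve as the destination. A rough std-only sketch of that selection, modelling a packed null-free bitmap as `Arc<Vec<u64>>` (an assumption for illustration; `and_reuse` is a hypothetical name and this is not Arrow's `Buffer` API):

```rust
use std::sync::Arc;

/// Hypothetical helper: bitwise AND of two null-free bitmaps, reusing
/// whichever side is uniquely owned as the destination.
fn and_reuse(left: Arc<Vec<u64>>, right: Arc<Vec<u64>>) -> Vec<u64> {
    assert_eq!(left.len(), right.len());
    // Pick a destination buffer: left if unique, else right if unique,
    // else a copy of left so that neither shared input is modified.
    let (mut dst, src) = match Arc::try_unwrap(left) {
        Ok(l) => (l, right),
        Err(left) => match Arc::try_unwrap(right) {
            Ok(r) => (r, left),
            Err(right) => ((*left).clone(), right),
        },
    };
    // AND is commutative, so the same loop works for either destination.
    for (d, s) in dst.iter_mut().zip(src.iter()) {
        *d &= *s;
    }
    dst
}
```

Unlike the subtraction case, no operand-order bookkeeping is needed here, which is why the right-hand fallback is straightforward for boolean ops.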
When evaluating boolean AND/OR expressions without nulls, use BooleanBuffer's BitAndAssign/BitOrAssign which internally attempts Buffer::into_mutable() for in-place mutation. Falls back to and_kleene/or_kleene when nulls are present.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
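As background on why nulls force the fallback: under Kleene (three-valued) logic, whether a result slot is null depends on the values of both operands, so a plain bitwise AND/OR over the value bitmaps is not sufficient. A scalar sketch of those semantics, using `Option<bool>` with `None` standing in for SQL NULL (the function names are hypothetical and are not the Arrow kernels themselves):

```rust
/// Kleene AND on a single pair of nullable booleans (illustrative model).
fn and_kleene_scalar(a: Option<bool>, b: Option<bool>) -> Option<bool> {
    match (a, b) {
        // A definite false decides the result even if the other side is null.
        (Some(false), _) | (_, Some(false)) => Some(false),
        (Some(true), Some(true)) => Some(true),
        // Otherwise at least one side is null and the result is unknown.
        _ => None,
    }
}

/// Kleene OR on a single pair of nullable booleans (illustrative model).
fn or_kleene_scalar(a: Option<bool>, b: Option<bool>) -> Option<bool> {
    match (a, b) {
        // A definite true decides the result even if the other side is null.
        (Some(true), _) | (_, Some(true)) => Some(true),
        (Some(false), Some(false)) => Some(false),
        _ => None,
    }
}
```

For example, false AND NULL is false while true AND NULL is NULL, so the output validity cannot be derived from the input validity bitmaps alone.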
Force-pushed from 01f8fa6 to f70c9e7.
Summary
- When evaluating arithmetic binary expressions (`+`, `-`, `*`, `/`, `%`), reuse the left operand's buffer in-place when its `Arc` reference count is 1, avoiding a buffer allocation
- Uses `PrimitiveArray::unary_mut` for array-scalar and `into_builder` for array-array cases on all integer (i8–u64) and float (f16–f64) types

Test plan
- `datafusion-physical-expr` tests pass
- `datafusion-physical-expr-common` tests pass
- `cargo clippy` clean with `-D warnings`
- … (`SELECT a+b+c+d FROM t`) to measure allocation savings

🤖 Generated with Claude Code