optimizer: extract null-restriction evaluator and add syntactic fast-path with focused tests by kosiew · Pull Request #21289 · apache/datafusion

kosiew · 2026-04-01T05:23:41Z

Which issue does this PR close?

Part of #20002 (perf: push_down_filter is pathologically slow for some plans)

Rationale for this change

This PR extracts and formalizes the null-restriction evaluation logic into a dedicated utility module to improve clarity, performance, and reviewability.

Previously, null-restriction determination relied solely on evaluating physical expressions, which is more expensive and harder to reason about in isolation. Additionally, the logic was embedded in a broader set of changes, making it difficult to review independently.

This change introduces a conservative syntactic fast path that can determine null-restriction behavior for common predicate shapes without executing expressions. This improves optimizer efficiency while maintaining correctness by falling back to the authoritative evaluation path when needed.
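The two-phase strategy can be sketched as below. This is a minimal illustration, not the PR's code: the `FastPath` enum, its variants, and the `restricts_nulls` function are hypothetical names, and the authoritative evaluator is modeled as a closure so it only runs when the fast path is inconclusive.

```rust
/// Outcome of the cheap syntactic check (hypothetical names, for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FastPath {
    /// The predicate filters out rows whose join columns are all NULL.
    Restricting,
    /// The predicate may keep such rows.
    NotRestricting,
    /// The predicate shape is not recognized; no conclusion is drawn.
    Undetermined,
}

/// Two-phase decision: trust the fast path only when it is conclusive,
/// otherwise run the (more expensive) authoritative evaluator.
pub fn restricts_nulls(fast: FastPath, authoritative: impl FnOnce() -> bool) -> bool {
    match fast {
        FastPath::Restricting => true,
        FastPath::NotRestricting => false,
        // Only here is the authoritative closure actually invoked.
        FastPath::Undetermined => authoritative(),
    }
}
```

Passing the authoritative path as a closure keeps the sketch lazy: when the syntactic check is conclusive, the expensive evaluation is never started.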

The PR also explicitly guards against mixed-reference predicates (those referencing columns outside the provided join-column set), ensuring they are treated conservatively as non-restricting to avoid incorrect pruning.
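A rough sketch of such a guard follows, assuming a toy predicate tree. `Pred`, `only_references`, and `mixed_reference_guard` are hypothetical names and do not reflect DataFusion's actual expression types or helper functions.

```rust
use std::collections::HashSet;

/// Toy predicate tree; a hypothetical stand-in for the optimizer's expression type.
#[derive(Debug, Clone)]
pub enum Pred {
    Column(String),
    Literal(i64),
    IsNotNull(Box<Pred>),
    Gt(Box<Pred>, Box<Pred>),
    And(Box<Pred>, Box<Pred>),
}

/// Returns true when every column referenced by `pred` is in `join_cols`.
pub fn only_references(pred: &Pred, join_cols: &HashSet<&str>) -> bool {
    match pred {
        Pred::Column(name) => join_cols.contains(name.as_str()),
        Pred::Literal(_) => true,
        Pred::IsNotNull(inner) => only_references(inner, join_cols),
        Pred::Gt(l, r) | Pred::And(l, r) => {
            only_references(l, join_cols) && only_references(r, join_cols)
        }
    }
}

/// Guard applied before any fast-path decision. A mixed-reference predicate
/// is reported as non-restricting (`Some(false)`); `None` means the predicate
/// passed the guard and evaluation should continue.
pub fn mixed_reference_guard(pred: &Pred, join_cols: &HashSet<&str>) -> Option<bool> {
    if only_references(pred, join_cols) {
        None
    } else {
        Some(false)
    }
}
```

With join columns `{a}`, a predicate such as `a > 1` passes the guard, while `a > 1 AND b IS NOT NULL` is reported as non-restricting because `b` lies outside the set.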

What changes are included in this PR?

- Introduced new module utils/null_restriction.rs containing:
  - A syntactic null-restriction evaluator
  - Conservative pattern matching for common predicate shapes (column refs, IS NULL, IS NOT NULL, comparisons, AND/OR/NOT)
  - Clear semantics via SyntacticNullRestriction enum
- Updated is_restrict_null_predicate in utils.rs to:
  - Add early return for mixed-reference predicates (columns outside join set)
  - Use the syntactic evaluator as a fast path when possible
  - Fall back to the authoritative physical-expression evaluation when needed
- Improved internal documentation explaining:
  - Two-phase evaluation strategy (syntactic + authoritative)
  - Safety guarantees and conservative behavior
- Added focused unit tests:
  - Mixed-reference predicate handling (ensuring non-restricting behavior)
  - Parity between syntactic and authoritative evaluators for supported cases
  - Coverage of boolean logic (AND/OR/NOT) and comparison operators
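To make the idea of a syntactic evaluator concrete, here is a self-contained sketch over a toy expression tree. The names `SyntacticNullRestriction` and `syntactic_null_restriction_check` appear in this PR, but the enum variants, the function signature, the `Expr` type, and the evaluation rules below are assumptions for illustration only. The sketch assumes the predicate has already passed the mixed-reference guard, so every column is a join column and is treated as NULL, and that comparisons are ordinary SQL comparison operators that yield NULL on a NULL operand.

```rust
/// Toy expression tree; a hypothetical stand-in for the optimizer's expression type.
#[derive(Debug, Clone)]
pub enum Expr {
    /// A join column, assumed NULL for the purpose of this check.
    Column(String),
    /// A non-NULL literal.
    Literal(i64),
    IsNull(Box<Expr>),
    IsNotNull(Box<Expr>),
    /// An ordinary binary comparison (=, <, >, ...).
    Compare(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
    Not(Box<Expr>),
    /// Anything the fast path does not model (CASE, function calls, ...).
    Other,
}

/// SQL three-valued truth value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tri {
    True,
    False,
    Null,
}

/// Result of the syntactic check (variant names are assumed, not from the PR).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyntacticNullRestriction {
    Restricting,
    NotRestricting,
    Unknown,
}

fn is_column(e: &Expr) -> bool {
    matches!(e, Expr::Column(_))
}

/// Evaluates `expr` with every column treated as NULL.
/// `None` means the shape is unsupported and the caller must fall back.
fn eval(expr: &Expr) -> Option<Tri> {
    match expr {
        Expr::Column(_) => Some(Tri::Null),
        Expr::IsNull(e) => if is_column(e) { Some(Tri::True) } else { None },
        Expr::IsNotNull(e) => if is_column(e) { Some(Tri::False) } else { None },
        // A comparison with a NULL operand is NULL.
        Expr::Compare(l, r) => if is_column(l) || is_column(r) { Some(Tri::Null) } else { None },
        // Kleene AND: FALSE dominates, even over an undetermined side.
        Expr::And(l, r) => match (eval(l), eval(r)) {
            (Some(Tri::False), _) | (_, Some(Tri::False)) => Some(Tri::False),
            (Some(Tri::True), Some(Tri::True)) => Some(Tri::True),
            (Some(_), Some(_)) => Some(Tri::Null),
            _ => None,
        },
        // Kleene OR: TRUE dominates, even over an undetermined side.
        Expr::Or(l, r) => match (eval(l), eval(r)) {
            (Some(Tri::True), _) | (_, Some(Tri::True)) => Some(Tri::True),
            (Some(Tri::False), Some(Tri::False)) => Some(Tri::False),
            (Some(_), Some(_)) => Some(Tri::Null),
            _ => None,
        },
        Expr::Not(e) => eval(e).map(|v| match v {
            Tri::True => Tri::False,
            Tri::False => Tri::True,
            Tri::Null => Tri::Null,
        }),
        Expr::Literal(_) | Expr::Other => None,
    }
}

/// A predicate restricts NULLs when it evaluates to FALSE or NULL
/// once all join columns are NULL.
pub fn syntactic_null_restriction_check(expr: &Expr) -> SyntacticNullRestriction {
    match eval(expr) {
        Some(Tri::False) | Some(Tri::Null) => SyntacticNullRestriction::Restricting,
        Some(Tri::True) => SyntacticNullRestriction::NotRestricting,
        None => SyntacticNullRestriction::Unknown,
    }
}
```

In this sketch `a > 1` is classified as restricting (the comparison is NULL), `a IS NULL OR a > 1` as not restricting (TRUE OR NULL is TRUE), and `a > 1 AND <unsupported>` as unknown, which is the case where a caller would fall back to the authoritative evaluation.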

Are these changes tested?

Yes.

This PR includes comprehensive unit tests that:

- Validate that the syntactic fast path agrees with the authoritative evaluator for supported predicate shapes
- Ensure mixed-reference predicates are conservatively treated as non-restricting
- Cover key SQL boolean semantics (AND, OR, NOT) and comparison operators
- Verify fallback behavior when the syntactic evaluator cannot determine the result

These tests help prevent regressions and document expected behavior.

Benchmark

Below are the benchmark results, obtained with the modified benchmark from #21029:

                    Criterion Benchmark Summary (Statistically Significant Changes)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Benchmark                                                                         ┃ Mean Change ┃  P-value ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ logical_plan_optimize_hotspot_case_heavy_left_join                                │      -1.74% │ 0.000000 │
│ push_down_filter_control_non_case_left_join_ab/with_push_down_filter/predicates=… │      -2.32% │ 0.010000 │
│ (predicates=10,nesting_depth=1)                                                   │             │          │
│ push_down_filter_hotspot_case_heavy_left_join_ab/with_push_down_filter/predicate… │      -2.99% │ 0.000000 │
│ (predicates=10,nesting_depth=1)                                                   │             │          │
│ push_down_filter_hotspot_case_heavy_left_join_ab/with_push_down_filter/predicate… │      -1.96% │ 0.000000 │
│ (predicates=30,nesting_depth=2)                                                   │             │          │
│ push_down_filter_hotspot_case_heavy_left_join_ab/with_push_down_filter/predicate… │      -1.94% │ 0.000000 │
│ (predicates=60,nesting_depth=3)                                                   │             │          │
└───────────────────────────────────────────────────────────────────────────────────┴─────────────┴──────────┘

Summary: 5 improvements, 0 regressions (p < 0.05)

Are there any user-facing changes?

No.

This change is internal to the optimizer and does not modify public APIs or user-visible behavior. It improves performance and maintainability without altering query results.

LLM-generated code disclosure

This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.

Ensure predicates always pass through the join-column subset guard before any fast-path decision. Strengthen tests with a bare non-join column regression case and modify the parity test to compare the syntactic helper against direct authoritative evaluation. Update module documentation to accurately describe the enum-based return values.

kosiew added 5 commits March 31, 2026 16:41

Add null-restriction fast-path evaluator

d2f4667

Implement syntactic null-restriction checks in a new file. Introduce SyntacticNullRestriction enum and syntactic_null_restriction_check function. Modify utils.rs for early returns and fast-path evaluation to improve performance.

cargo fmt

f930047

amend benchmark

5a0cfa4

checkout main benchmark

a739205

github-actions bot added the optimizer Optimizer rules label Apr 1, 2026

clippy fix - Refactor is_direct_join_col to simplify pattern matching

004c088

kosiew marked this pull request as ready for review April 1, 2026 06:18

kosiew wants to merge 6 commits into apache:main from kosiew:push-down-06-20002
