optimizer: extract null-restriction evaluator and add syntactic fast-path with focused tests #21289

Open: kosiew wants to merge 6 commits into apache:main from kosiew:push-down-06-20002

kosiew commented Apr 1, 2026

Which issue does this PR close?


Rationale for this change

This PR extracts and formalizes the null-restriction evaluation logic into a dedicated utility module to improve clarity, performance, and reviewability.

Previously, null-restriction determination relied solely on evaluating physical expressions, which is more expensive and harder to reason about in isolation. Additionally, the logic was embedded in a broader set of changes, making it difficult to review independently.

This change introduces a conservative syntactic fast path that can determine null-restriction behavior for common predicate shapes without executing expressions. This improves optimizer efficiency while maintaining correctness by falling back to the authoritative evaluation path when needed.

The PR also explicitly guards against mixed-reference predicates (those referencing columns outside the provided join-column set), ensuring they are treated conservatively as non-restricting to avoid incorrect pruning.
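
The control flow described above (guard first, then the syntactic fast path, then the authoritative fallback) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Pred`, `columns`, `fast_path`, and `is_restrict_null_sketch` are hypothetical names over a toy predicate type, and the authoritative evaluator is passed in as a closure rather than built from physical expressions.

```rust
use std::collections::HashSet;

/// Hypothetical toy predicate; NOT DataFusion's `Expr`.
#[derive(Debug)]
enum Pred {
    Column(&'static str),
    IsNull(Box<Pred>),
    IsNotNull(Box<Pred>),
    And(Box<Pred>, Box<Pred>),
    /// Any shape the fast path does not recognise, with the columns it reads.
    Opaque(Vec<&'static str>),
}

/// Collect every column name referenced by the predicate.
fn columns(p: &Pred, out: &mut HashSet<&'static str>) {
    match p {
        Pred::Column(c) => {
            out.insert(*c);
        }
        Pred::IsNull(inner) | Pred::IsNotNull(inner) => columns(inner, out),
        Pred::And(l, r) => {
            columns(l, out);
            columns(r, out);
        }
        Pred::Opaque(cols) => out.extend(cols.iter().copied()),
    }
}

/// Syntactic fast path. `Some(true)` means the predicate is NULL or FALSE
/// when every referenced column is NULL; `None` means "cannot tell".
fn fast_path(p: &Pred) -> Option<bool> {
    match p {
        Pred::Column(_) => Some(true),
        Pred::IsNotNull(inner) if matches!(**inner, Pred::Column(_)) => Some(true),
        Pred::IsNull(inner) if matches!(**inner, Pred::Column(_)) => Some(false),
        // AND is NULL or FALSE as soon as one operand is NULL or FALSE.
        Pred::And(l, r) => match (fast_path(l), fast_path(r)) {
            (Some(a), Some(b)) => Some(a || b),
            _ => None,
        },
        _ => None,
    }
}

/// Two-phase decision with the mixed-reference guard in front.
fn is_restrict_null_sketch(
    pred: &Pred,
    join_cols: &HashSet<&'static str>,
    authoritative: impl Fn(&Pred) -> bool,
) -> bool {
    let mut used = HashSet::new();
    columns(pred, &mut used);
    // Guard: a predicate that reads columns outside the join set is
    // conservatively reported as non-restricting.
    if !used.is_subset(join_cols) {
        return false;
    }
    match fast_path(pred) {
        Some(verdict) => verdict,
        None => authoritative(pred), // fallback for unsupported shapes
    }
}
```

In this sketch, `a AND c` with join columns `{a, b}` is rejected by the guard before the fast path runs, even though the fast path alone would have called it restricting. That ordering is the point of the guard.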


What changes are included in this PR?

  • Introduced new module utils/null_restriction.rs containing:

    • A syntactic null-restriction evaluator
    • Conservative pattern matching for common predicate shapes (column refs, IS NULL, IS NOT NULL, comparisons, AND/OR/NOT)
    • Clear semantics via SyntacticNullRestriction enum
  • Updated is_restrict_null_predicate in utils.rs to:

    • Add early return for mixed-reference predicates (columns outside join set)
    • Use the syntactic evaluator as a fast path when possible
    • Fall back to the authoritative physical-expression evaluation when needed
  • Improved internal documentation explaining:

    • Two-phase evaluation strategy (syntactic + authoritative)
    • Safety guarantees and conservative behavior
  • Added focused unit tests:

    • Mixed-reference predicate handling (ensuring non-restricting behavior)
    • Parity between syntactic and authoritative evaluators for supported cases
    • Coverage of boolean logic (AND/OR/NOT) and comparison operators
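
To illustrate the kind of pattern matching listed above, here is a small self-contained sketch of a syntactic evaluator under SQL three-valued logic. It assumes every referenced column is NULL and computes what the predicate would evaluate to. The `Expr`, `Tri`, and `Verdict` types and the `classify` function are hypothetical stand-ins; in particular, the variants of the PR's `SyntacticNullRestriction` enum may differ from `Verdict`.

```rust
/// Hypothetical toy predicate tree; NOT DataFusion's `Expr`.
#[derive(Debug)]
enum Expr {
    Column(&'static str),
    Literal(i64),
    IsNull(Box<Expr>),
    IsNotNull(Box<Expr>),
    /// Any of =, <>, <, <=, >, >= (each yields NULL on a NULL operand).
    Compare(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
    Not(Box<Expr>),
    /// Shapes this sketch does not model (function calls, CASE, ...).
    Other,
}

/// SQL three-valued boolean.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tri {
    True,
    False,
    Null,
}

/// Illustrative verdict; assumed, not taken from the PR.
#[derive(Debug, PartialEq)]
enum Verdict {
    Restricting,
    NotRestricting,
    Unknown,
}

fn is_column(e: &Expr) -> bool {
    matches!(e, Expr::Column(_))
}

/// Value of `e` when every column is NULL, or `None` if the shape is unsupported.
fn eval_all_null(e: &Expr) -> Option<Tri> {
    match e {
        Expr::Column(_) => Some(Tri::Null),
        Expr::IsNull(x) if is_column(x) => Some(Tri::True),
        Expr::IsNotNull(x) if is_column(x) => Some(Tri::False),
        Expr::Compare(l, r) if is_column(l) || is_column(r) => Some(Tri::Null),
        // Conservative: both operands must be understood before combining.
        Expr::And(l, r) => match (eval_all_null(l)?, eval_all_null(r)?) {
            (Tri::False, _) | (_, Tri::False) => Some(Tri::False),
            (Tri::True, Tri::True) => Some(Tri::True),
            _ => Some(Tri::Null),
        },
        Expr::Or(l, r) => match (eval_all_null(l)?, eval_all_null(r)?) {
            (Tri::True, _) | (_, Tri::True) => Some(Tri::True),
            (Tri::False, Tri::False) => Some(Tri::False),
            _ => Some(Tri::Null),
        },
        Expr::Not(x) => eval_all_null(x).map(|v| match v {
            Tri::True => Tri::False,
            Tri::False => Tri::True,
            Tri::Null => Tri::Null,
        }),
        _ => None,
    }
}

/// NULL or FALSE filters the row out, so both count as restricting.
fn classify(e: &Expr) -> Verdict {
    match eval_all_null(e) {
        Some(Tri::True) => Verdict::NotRestricting,
        Some(Tri::False) | Some(Tri::Null) => Verdict::Restricting,
        None => Verdict::Unknown,
    }
}

// Small constructors to keep call sites readable.
fn col(name: &'static str) -> Box<Expr> {
    Box::new(Expr::Column(name))
}
fn lit(n: i64) -> Box<Expr> {
    Box::new(Expr::Literal(n))
}
```

For example, `classify(&Expr::Compare(col("a"), lit(1)))` yields `Verdict::Restricting`, while any tree containing `Expr::Other` under AND/OR yields `Verdict::Unknown`, which is where a caller would fall back to the authoritative path.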

Are these changes tested?

Yes.

This PR includes comprehensive unit tests that:

  1. Validate that the syntactic fast path agrees with the authoritative evaluator for supported predicate shapes
  2. Ensure mixed-reference predicates are conservatively treated as non-restricting
  3. Cover key SQL boolean semantics (AND, OR, NOT) and comparison operators
  4. Verify fallback behavior when the syntactic evaluator cannot determine the result

These tests help prevent regressions and document expected behavior.
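
The parity idea in items 1 and 4 can be sketched as follows. This is an assumption-laden toy, not the PR's test code: the "authoritative" side here is a tiny untyped interpreter that substitutes NULL for every column, and `Expr`, `Value`, `syntactic`, `authoritative`, and `parity_holds` are all hypothetical names.

```rust
/// Hypothetical toy expression; NOT DataFusion's `Expr`.
enum Expr {
    Column(&'static str),
    Lit(i64),
    Gt(Box<Expr>, Box<Expr>),
    IsNull(Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
    /// Deliberately unsupported by the fast path.
    Coalesce(Box<Expr>, Box<Expr>),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Value {
    Null,
    Bool(bool),
    Int(i64),
}

/// "Authoritative" stand-in: evaluate with every column set to NULL.
fn eval(e: &Expr) -> Value {
    match e {
        Expr::Column(_) => Value::Null,
        Expr::Lit(n) => Value::Int(*n),
        Expr::Gt(l, r) => match (eval(l), eval(r)) {
            (Value::Int(a), Value::Int(b)) => Value::Bool(a > b),
            _ => Value::Null, // a NULL operand makes the comparison NULL
        },
        Expr::IsNull(x) => Value::Bool(eval(x) == Value::Null),
        Expr::Or(l, r) => match (eval(l), eval(r)) {
            (Value::Bool(true), _) | (_, Value::Bool(true)) => Value::Bool(true),
            (Value::Bool(false), Value::Bool(false)) => Value::Bool(false),
            _ => Value::Null,
        },
        Expr::Coalesce(a, b) => match eval(a) {
            Value::Null => eval(b),
            v => v,
        },
    }
}

/// Restricting means the predicate is not TRUE on an all-NULL row.
fn authoritative(e: &Expr) -> bool {
    !matches!(eval(e), Value::Bool(true))
}

fn is_col(e: &Expr) -> bool {
    matches!(e, Expr::Column(_))
}

/// Fast path: `Some(restricting)` for recognised shapes, `None` otherwise.
fn syntactic(e: &Expr) -> Option<bool> {
    match e {
        Expr::Column(_) => Some(true),
        Expr::Gt(l, r) if is_col(l) || is_col(r) => Some(true),
        Expr::IsNull(x) if is_col(x) => Some(false),
        // OR is NULL or FALSE only when both operands are.
        Expr::Or(l, r) => match (syntactic(l), syntactic(r)) {
            (Some(a), Some(b)) => Some(a && b),
            _ => None,
        },
        _ => None,
    }
}

/// Parity: whenever the fast path answers, it must match the authoritative answer.
fn parity_holds(e: &Expr) -> bool {
    match syntactic(e) {
        Some(v) => v == authoritative(e),
        None => true, // abstaining is always safe; the caller falls back
    }
}

fn col(name: &'static str) -> Box<Expr> {
    Box::new(Expr::Column(name))
}
fn lit(n: i64) -> Box<Expr> {
    Box::new(Expr::Lit(n))
}
```

A test in this style would run `parity_holds` over each supported shape and separately assert that an unsupported shape such as `COALESCE(a, 1) > 0` makes `syntactic` return `None`, so the fallback path is exercised.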

Benchmark

Benchmark results, using the modified benchmark from #21029:

                    Criterion Benchmark Summary (Statistically Significant Changes)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Benchmark                                                                         ┃ Mean Change ┃  P-value ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ logical_plan_optimize_hotspot_case_heavy_left_join                                │      -1.74% │ 0.000000 │
│ push_down_filter_control_non_case_left_join_ab/with_push_down_filter/predicates=… │      -2.32% │ 0.010000 │
│ (predicates=10,nesting_depth=1)                                                   │             │          │
│ push_down_filter_hotspot_case_heavy_left_join_ab/with_push_down_filter/predicate… │      -2.99% │ 0.000000 │
│ (predicates=10,nesting_depth=1)                                                   │             │          │
│ push_down_filter_hotspot_case_heavy_left_join_ab/with_push_down_filter/predicate… │      -1.96% │ 0.000000 │
│ (predicates=30,nesting_depth=2)                                                   │             │          │
│ push_down_filter_hotspot_case_heavy_left_join_ab/with_push_down_filter/predicate… │      -1.94% │ 0.000000 │
│ (predicates=60,nesting_depth=3)                                                   │             │          │
└───────────────────────────────────────────────────────────────────────────────────┴─────────────┴──────────┘

Summary: 5 improvements, 0 regressions (p < 0.05)

Are there any user-facing changes?

No.

This change is internal to the optimizer and does not modify public APIs or user-visible behavior. It improves performance and maintainability without altering query results.


LLM-generated code disclosure

This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.

kosiew added 5 commits March 31, 2026 16:41
Implement syntactic null-restriction checks in a new file. Introduce SyntacticNullRestriction enum and syntactic_null_restriction_check function. Modify utils.rs for early returns and fast-path evaluation to improve performance.

Ensure predicates always pass through the join-column subset guard before any fast-path decision. Strengthen tests with a bare non-join column regression case and modify the parity test to compare the syntactic helper against direct authoritative evaluation. Update module documentation to accurately describe the enum-based return values.

github-actions bot added the optimizer (Optimizer rules) label Apr 1, 2026
kosiew marked this pull request as ready for review April 1, 2026 06:18