Skip to content

refactor: thread SubqueryContext explicitly through physical planning#22340

Draft
timsaucer wants to merge 1 commit into
apache:mainfrom
timsaucer:refactor/subquery-context-explicit
Draft

refactor: thread SubqueryContext explicitly through physical planning#22340
timsaucer wants to merge 1 commit into
apache:mainfrom
timsaucer:refactor/subquery-context-explicit

Conversation

@timsaucer
Copy link
Copy Markdown
Member

@timsaucer timsaucer commented May 18, 2026

Which issue does this PR close?

None, but is a precursor to making QueryPlanner take &dyn Session instead of &SessionState so that we expose query planner to FFI properly in work similar to #22151

Rationale for this change

DefaultPhysicalPlanner::create_initial_plan registers scalar-subquery state for expression lowering by cloning the SessionState and mutating the embedded ExecutionProps:

let mut owned = session_state.clone();
owned.execution_props_mut().subquery_indexes = index_map;
owned.execution_props_mut().subquery_results = results.clone();

The lowering side then reads those fields out of ExecutionProps in physical-expr/src/planner.rs (Expr::ScalarSubquery branch). This is a documented hack — the block comment at the write site says it should live in a dedicated planning context, but create_physical_expr only takes &ExecutionProps, so ExecutionProps was the cheapest carrier.

That side channel is also what blocks moving QueryPlanner / PhysicalPlanner to &dyn Session (the goal of #20249 / #22151):

  • Session trait has no clone() — trait objects can't be cloned without extra machinery.
  • Session::execution_props(&self) is immutable on purpose; Session is &self from hot TableProvider paths.

So before any &dyn Session flip is even possible, this side channel needs to go.

What changes are included in this PR?

  • Introduces SubqueryContext in datafusion-expr/src/execution_props.rs — a small carrier bundling the Subquery -> SubqueryIndex map and the shared ScalarSubqueryResults.
  • Plumbs &SubqueryContext through as neccessary.
  • Adds *_with_subquery_context dual entry points alongside every existing public expression-lowering function:
    • create_physical_expr_with_subquery_context (and similar)
  • Drops the unreleased subquery_indexes and subquery_results fields from ExecutionProps.
  • Removes the SessionState.clone() + execution_props_mut() mutation at the write site.

Why dual entry points instead of changing the existing signatures?

The first instinct is to just add a 4th parameter to create_physical_expr:

pub fn create_physical_expr(
    e: &Expr,
    schema: &DFSchema,
    props: &ExecutionProps,
    subquery_ctx: &SubqueryContext, // new
) -> Result<Arc<dyn PhysicalExpr>>

Cleaner end state, but the cost is wrong:

  • create_physical_expr has ~92 production callers in this repo and is also reached by downstream crates (datafusion-comet, ballista, datafusion-python, sqllogictest, custom planners). All of them would have to add the new argument.
  • None of those callers actually need scalar-subquery support. Only one read site cares: physical-expr/src/planner.rs Expr::ScalarSubquery branch. That branch is only reached when the caller has registered uncorrelated subqueries — which only the physical planner ever does.
  • Forcing every caller to pass a context they don't use is API churn with no payoff.

The dual-entry-point shape lets the existing signature keep working for everyone:

pub fn create_physical_expr(
    e: &Expr,
    schema: &DFSchema,
    props: &ExecutionProps,
) -> Result<Arc<dyn PhysicalExpr>> {
    create_physical_expr_with_subquery_context(
        e,
        schema,
        props,
        &SubqueryContext::default(),
    )
}

External callers who never construct a SubqueryContext get the same behavior they had before: Expr::ScalarSubquery lowers to not_impl_err (matching today's behavior when subquery_indexes was empty). The physical planner — the only caller that needs subquery state — uses the _with_subquery_context variant.

The trade-off is one extra public function per lowering entry point in exchange for zero external breakage. With the &dyn Session refactor still ahead of us and likely needing its own API moves, keeping this PR additive is the right shape.

The unreleased status of ExecutionProps.subquery_indexes / subquery_results (added in #21240, after tag 48.0.0-rc2) is what makes the removal half free: they were never in a release, so there's nothing to deprecate.

Are these changes tested?

Existing test coverage exercises the new code path.

Are there any user-facing changes?

Public API additions (additive, non-breaking):

  • New type: datafusion_expr::execution_props::SubqueryContext.
  • New functions in datafusion-physical-expr:
    • create_physical_expr_with_subquery_context
    • create_physical_exprs_with_subquery_context
    • create_physical_sort_expr_with_subquery_context
    • create_physical_sort_exprs_with_subquery_context
  • New functions in datafusion::physical_planner:
    • create_window_expr_with_subquery_context
    • create_window_expr_with_name_and_subquery_context
  • New builder method: LoweredAggregateBuilder::with_subquery_context.

Removals (unreleased — never shipped):

  • ExecutionProps::subquery_indexes field.
  • ExecutionProps::subquery_results field.

Both were added after tag 48.0.0-rc2 (in #21240) and have never been in a published release, so there is no deprecation cycle required.

No behavior changes for any pre-existing public API. Every original function preserves its signature and delegates with SubqueryContext::default().

IF DF54.0 gets cut before this, or it doesn't get ported into it, then we will need to mark this as API change.

Removes the SessionState.clone() + execution_props_mut() trick used by
DefaultPhysicalPlanner to register scalar-subquery state for expression
lowering. That side channel relied on stashing per-plan state in
ExecutionProps, which forced physical planning to hold a mutable
SessionState and blocked moving QueryPlanner / PhysicalPlanner to
&dyn Session.

Introduces SubqueryContext in datafusion-expr and threads it explicitly
through create_initial_plan_inner, task_helper, map_logical_node_to_physical,
and the standalone planning helpers. Adds *_with_subquery_context dual
entry points for create_physical_expr, create_physical_exprs,
create_physical_sort_expr(s), create_window_expr(_with_name), and
LoweredAggregateBuilder.with_subquery_context — original public
signatures are preserved and delegate with SubqueryContext::default(),
so downstream callers are unaffected.

Drops the unreleased subquery_indexes and subquery_results fields from
ExecutionProps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates core Core DataFusion crate labels May 18, 2026
@timsaucer
Copy link
Copy Markdown
Member Author

@neilconway Would you mind taking a look? It's a fair amount of plumbing and I'm open to other suggestions. This is to unblock some follow on work where we want to use &dyn Session instead of &SessionState in the QueryPlanner.

@github-actions
Copy link
Copy Markdown

Thank you for opening this pull request!

Reviewer note: cargo-semver-checks reported the current version number is not SemVer-compatible with the changes in this pull request (compared against the base branch).

Details
     Cloning apache/main
    Building datafusion v53.1.0 (current)
       Built [ 100.516s] (current)
     Parsing datafusion v53.1.0 (current)
      Parsed [   0.037s] (current)
    Building datafusion v53.1.0 (baseline)
       Built [  97.855s] (baseline)
     Parsing datafusion v53.1.0 (baseline)
      Parsed [   0.038s] (baseline)
    Checking datafusion v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.953s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [ 201.254s] datafusion
    Building datafusion-expr v53.1.0 (current)
       Built [  25.645s] (current)
     Parsing datafusion-expr v53.1.0 (current)
      Parsed [   0.077s] (current)
    Building datafusion-expr v53.1.0 (baseline)
       Built [  25.412s] (baseline)
     Parsing datafusion-expr v53.1.0 (baseline)
      Parsed [   0.078s] (baseline)
    Checking datafusion-expr v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   1.768s] 222 checks: 221 pass, 1 fail, 0 warn, 30 skip

--- failure struct_pub_field_missing: pub struct's pub field removed or renamed ---

Description:
A publicly-visible struct has at least one public field that is no longer available under its prior name. It may have been renamed or removed entirely.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.47.0/src/lints/struct_pub_field_missing.ron

Failed in:
  field subquery_indexes of struct ExecutionProps, previously in file /home/runner/work/datafusion/datafusion/target/semver-checks/git-apache_main/08c1122c4d76196d15653b7af215ecfb990ae490/datafusion/expr/src/execution_props.rs:69
  field subquery_results of struct ExecutionProps, previously in file /home/runner/work/datafusion/datafusion/target/semver-checks/git-apache_main/08c1122c4d76196d15653b7af215ecfb990ae490/datafusion/expr/src/execution_props.rs:72

     Summary semver requires new major version: 1 major and 0 minor checks failed
    Finished [  54.271s] datafusion-expr
    Building datafusion-physical-expr v53.1.0 (current)
       Built [  24.004s] (current)
     Parsing datafusion-physical-expr v53.1.0 (current)
      Parsed [   0.048s] (current)
    Building datafusion-physical-expr v53.1.0 (baseline)
       Built [  24.344s] (baseline)
     Parsing datafusion-physical-expr v53.1.0 (baseline)
      Parsed [   0.048s] (baseline)
    Checking datafusion-physical-expr v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.482s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [  50.196s] datafusion-physical-expr

@github-actions github-actions Bot added the auto detected api change Auto detected API change label May 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto detected api change Auto detected API change core Core DataFusion crate logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant