Skip to content

fix: Fix bug with structurally equal correlated subqueries#22313

Open
neilconway wants to merge 1 commit into
apache:mainfrom
neilconway:neilc/fix-duplicate-subquery
Open

fix: Fix bug with structurally equal correlated subqueries#22313
neilconway wants to merge 1 commit into
apache:mainfrom
neilconway:neilc/fix-duplicate-subquery

Conversation

@neilconway
Copy link
Copy Markdown
Contributor

Which issue does this PR close?

Rationale for this change

If a query contained two or more structurally equal correlated subqueries, we previously would produce incorrect query results. This was caused by using Subquery as a hashmap key; if we use the subquery alias instead, we can avoid the unintended key collision. Using the alias (a string) as the hashmap key is also safer and more efficient than hashing on a complex type like Subquery.

What changes are included in this PR?

  • Fix for bug in ScalarSubqueryToJoin rewrite pass
  • SLT test

Are these changes tested?

Yes, with new test added.

Are there any user-facing changes?

No, aside from fixing the previously incorrect query results.

@github-actions github-actions Bot added optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt) labels May 17, 2026
Copy link
Copy Markdown
Contributor

@kosiew kosiew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@neilconway
Thanks for the fix here. I walked through the rewrite flow and the approach makes sense overall. I just had one small suggestion that could make the bookkeeping a bit more explicit and easier to reason about in the future.

#[allow(clippy::allow_attributes, clippy::mutable_key_type)]
// Expr contains Arc with interior mutability but is intentionally used as hash key
let mut subquery_to_expr_map = HashMap::new();
let mut alias_to_expr_map: HashMap<String, Expr> = HashMap::new();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch on the alias mapping issue. One thought: it might be a little simpler and more robust to key the projection rewrite bookkeeping by projection index, or store alias -> projection index, instead of alias -> cloned Expr plus Expr -> rewrite Expr.

The current fix works because the generated aliases are unique, but using indexes would encode the ownership relationship directly and avoid another structural Expr lookup in this already pretty subtle rewrite path.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Duplicate subqueries produce incorrect results

2 participants