SQL API Extensions: Expose planning APIs and make classes public #38951
damccorm wants to merge 3 commits
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request refactors Beam SQL's planning layer to make it more extensible for external orchestrators like Spark Connect. By exposing core planning APIs and allowing for manual logical plan manipulation, it enables more flexible SQL parsing and physical plan conversion workflows. Additionally, it addresses limitations in parameter handling and Calcite configuration, ensuring better compatibility with external SQL dialects.
Code Review
This pull request introduces helper methods to BeamSqlEnv, QueryPlanner, and CalciteQueryPlanner to allow parsing SQL queries into logical RelNode plans and subsequently converting those logical plans into physical BeamRelNode plans. It also exposes several classes and constructors as public and adds logging. The review feedback highlights a critical issue where calling planner.close() in the finally block of parseToRel prematurely invalidates the returned RelNode's state. Additionally, the feedback points out unused and unsafe fields captured from a temporary planner, a regression where query collation is discarded, and a misleading parameter name in QueryPlanner.
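The lifetime issue flagged for parseToRel can be illustrated with a minimal sketch. ToyPlanner and ToyPlan below are hypothetical stand-ins, not Beam or Calcite classes; the sketch only assumes that the returned object keeps a reference to state owned by the planner, which is what the review describes.

```java
public class PlannerLifetimeSketch {
  /** Hypothetical planner that owns state its plans depend on. */
  static final class ToyPlanner implements AutoCloseable {
    private boolean open = true;

    boolean isOpen() {
      return open;
    }

    ToyPlan parse(String query) {
      return new ToyPlan(this, query);
    }

    @Override
    public void close() {
      open = false;
    }
  }

  /** Hypothetical plan that is only usable while its planner is open. */
  static final class ToyPlan {
    private final ToyPlanner owner;
    private final String query;

    ToyPlan(ToyPlanner owner, String query) {
      this.owner = owner;
      this.query = query;
    }

    String describe() {
      if (!owner.isOpen()) {
        throw new IllegalStateException("planner already closed");
      }
      return "plan(" + query + ")";
    }
  }

  /** Problematic shape: the finally block closes the planner before the caller can use the plan. */
  static ToyPlan parseAndClose(String query) {
    ToyPlanner planner = new ToyPlanner();
    try {
      return planner.parse(query);
    } finally {
      planner.close();
    }
  }

  /** One possible alternative: the caller owns the planner and decides when to close it. */
  static ToyPlan parseWith(ToyPlanner planner, String query) {
    return planner.parse(query);
  }

  public static void main(String[] args) {
    try {
      parseAndClose("SELECT 1").describe();
    } catch (IllegalStateException e) {
      System.out.println("closed too early: " + e.getMessage());
    }
    try (ToyPlanner planner = new ToyPlanner()) {
      System.out.println(parseWith(planner, "SELECT 1").describe());
    }
  }
}
```

Whether the real fix keeps the planner open, copies the needed state, or does something else is not stated here; the sketch only shows why closing in finally is unsafe when the result still depends on the planner.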
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
…lose, preserve collation, and rename parameter
/gemini review
Code Review
This pull request enhances the Beam SQL query planning capabilities by exposing methods to parse SQL into logical plans (RelNode) and convert logical plans into physical Beam plans (BeamRelNode) separately. It also adds support for resolving parser conformance from pipeline options. The review feedback highlights critical compilation errors due to undeclared checked exceptions (SqlConversionException) in the newly introduced methods, and suggests defensive checks to prevent potential runtime exceptions during plan optimization.
/gemini review
Code Review
This pull request introduces APIs in BeamSqlEnv and QueryPlanner to parse SQL queries into logical plans (RelNode) and convert logical plans to physical plans (BeamRelNode), while also adding support for resolving SQL parser conformance from pipeline options. The review feedback highlights a compilation error in BeamSqlEnv.parseLogicalPlan due to an unhandled checked exception, points out that CalciteQueryPlanner.convertToBeamRel ignores the QueryParameters argument, and suggests exposing an overloaded convertToBeamRel method in BeamSqlEnv to allow passing query parameters.
public RelNode parseLogicalPlan(String query) throws ParseException {
  return planner.parseToRel(query, QueryParameters.ofNone());
}
The planner.parseToRel method throws SqlConversionException, which is a checked exception. Since parseLogicalPlan does not catch or declare SqlConversionException in its throws clause, this will cause a compilation error.
Please update the method signature to declare throws SqlConversionException.
Suggested change
- public RelNode parseLogicalPlan(String query) throws ParseException {
-   return planner.parseToRel(query, QueryParameters.ofNone());
- }
+ public RelNode parseLogicalPlan(String query) throws ParseException, SqlConversionException {
+   return planner.parseToRel(query, QueryParameters.ofNone());
+ }
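The rule the reviewer relies on can be sketched in isolation. The exception classes below are hypothetical stand-ins, and the sketch assumes, as the comment states, that the conversion exception is checked; if the real class is unchecked, the extra throws entry is documentation rather than a compile requirement.

```java
public class CheckedExceptionSketch {
  /** Hypothetical checked exception standing in for a parse failure. */
  static class ToyParseException extends Exception {
    ToyParseException(String message) {
      super(message);
    }
  }

  /** Hypothetical checked exception standing in for a conversion failure. */
  static class ToyConversionException extends Exception {
    ToyConversionException(String message) {
      super(message);
    }
  }

  /** Inner method that can fail in two distinct, checked ways. */
  static String parseToRel(String query) throws ToyParseException, ToyConversionException {
    if (query.isEmpty()) {
      throw new ToyParseException("empty query");
    }
    if (query.contains("UNSUPPORTED")) {
      throw new ToyConversionException("cannot convert: " + query);
    }
    return "rel(" + query + ")";
  }

  /**
   * Thin wrapper. Because both exceptions are checked here, removing either one from
   * this throws clause would be a compile error unless the wrapper caught it instead.
   */
  static String parseLogicalPlan(String query) throws ToyParseException, ToyConversionException {
    return parseToRel(query);
  }

  public static void main(String[] args) throws Exception {
    System.out.println(parseLogicalPlan("SELECT 1"));
    try {
      parseLogicalPlan("SELECT UNSUPPORTED");
    } catch (ToyConversionException e) {
      System.out.println("conversion failed: " + e.getMessage());
    }
  }
}
```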
@Override
public BeamRelNode convertToBeamRel(RelNode relNode, QueryParameters queryParameters) {
  return convertToBeamRel(relNode, (RelCollation) null);
}
The convertToBeamRel(RelNode, QueryParameters) implementation currently ignores the queryParameters argument. If the logical plan contains dynamic/positional parameters, they will not be bound, leading to runtime failures or incorrect physical plans.
Please update the implementation to bind positional parameters using ParameterBinder if they are provided.
@Override
public BeamRelNode convertToBeamRel(RelNode relNode, QueryParameters queryParameters) {
  if (queryParameters.getKind() == Kind.POSITIONAL) {
    relNode =
        bindParameters(
            relNode,
            new ParameterBinder(relNode.getCluster().getRexBuilder(), queryParameters));
  }
  return convertToBeamRel(relNode, (RelCollation) null);
}

public BeamRelNode convertToBeamRel(RelNode relNode) {
  return planner.convertToBeamRel(relNode, QueryParameters.ofNone());
}
To allow external callers to pass query parameters when converting an externally constructed or parsed RelNode to a BeamRelNode, please expose an overloaded convertToBeamRel method that accepts QueryParameters.
Suggested change
- public BeamRelNode convertToBeamRel(RelNode relNode) {
-   return planner.convertToBeamRel(relNode, QueryParameters.ofNone());
- }
+ public BeamRelNode convertToBeamRel(RelNode relNode) {
+   return convertToBeamRel(relNode, QueryParameters.ofNone());
+ }
+ public BeamRelNode convertToBeamRel(RelNode relNode, QueryParameters queryParameters) {
+   return planner.convertToBeamRel(relNode, queryParameters);
+ }
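The two review points above (bind positional parameters instead of ignoring them, and let the one-argument overload delegate to the two-argument one) can be sketched together on a toy model. Everything below is hypothetical: ToyParameters, Kind, and the token-list "plan" are illustrative stand-ins and do not reflect how QueryParameters or ParameterBinder are implemented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ParameterOverloadSketch {
  enum Kind { NONE, POSITIONAL }

  /** Hypothetical parameter holder: either no parameters or an ordered list of values. */
  static final class ToyParameters {
    final Kind kind;
    final List<Object> values;

    private ToyParameters(Kind kind, List<Object> values) {
      this.kind = kind;
      this.values = values;
    }

    static ToyParameters ofNone() {
      return new ToyParameters(Kind.NONE, Collections.<Object>emptyList());
    }

    static ToyParameters ofPositional(List<?> values) {
      return new ToyParameters(Kind.POSITIONAL, new ArrayList<Object>(values));
    }
  }

  /** Replaces each "?" token with the next positional value, left to right. */
  static List<String> bind(List<String> plan, ToyParameters params) {
    List<String> bound = new ArrayList<>();
    int next = 0;
    for (String token : plan) {
      if (token.equals("?")) {
        if (next >= params.values.size()) {
          throw new IllegalArgumentException("missing value for parameter " + next);
        }
        bound.add(String.valueOf(params.values.get(next++)));
      } else {
        bound.add(token);
      }
    }
    return bound;
  }

  /** Convenience overload: delegates with "no parameters" so there is one code path. */
  static String convert(List<String> plan) {
    return convert(plan, ToyParameters.ofNone());
  }

  /** Full overload: binds positional parameters before producing the physical form. */
  static String convert(List<String> plan, ToyParameters params) {
    List<String> bound = params.kind == Kind.POSITIONAL ? bind(plan, params) : plan;
    return "physical(" + String.join(" ", bound) + ")";
  }

  public static void main(String[] args) {
    List<String> plan = java.util.Arrays.asList("x", ">", "?");
    System.out.println(convert(plan));
    System.out.println(convert(plan, ToyParameters.ofPositional(java.util.Arrays.asList(5))));
  }
}
```

Routing the short overload through the long one keeps binding logic in a single place, so a later change to parameter handling cannot silently diverge between the two entry points.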
Description
This PR is split from #38866. It focuses on exposing Beam SQL's planning and optimization infrastructure as an extensible API.
Previously, Beam SQL's planning stages (via Calcite) were mostly internal and tightly coupled to executing a full SQL string end-to-end. This PR refactors and exposes these planning stages to allow external orchestration of Beam SQL.
Key Changes
- Added parseLogicalPlan(String query) / parseToRel(...) to BeamSqlEnv and QueryPlanner to allow parsing a SQL query string into a Calcite logical plan (RelNode) without immediately optimizing or executing it.
- Added convertToBeamRel(RelNode logicalPlan) to allow taking an externally constructed or manipulated Calcite logical plan (RelNode) and converting it into a Beam physical plan (BeamRelNode / PCollection pipeline).
- Made the BeamCalciteTable constructor public to allow external planners to instantiate it.
- Made the TextTableProvider.RowToCsv class public to allow external integration with text table serialization.
- Added testParseAndConvertHelpers in CalciteQueryPlannerTest.java, which exercises these new APIs end-to-end.
Why this is needed
This is a crucial feature for external query engines or orchestrators (such as Spark Connect or custom SQL platforms). They can now use Beam's SQL parser to get a logical plan, perform their own optimizations or integrations, and then hand it back to Beam to generate the final executable pipeline.
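The parse, rewrite, convert workflow described above can be sketched on a toy model. The code below does not use Beam or Calcite; the plan types, the keyword-based "parser", and the Physical prefix are all hypothetical and only mirror the shape of the two-phase API (parse to a logical plan, let the caller modify it, then convert).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

public class TwoPhasePlanningSketch {
  /** Phase 1 (toy): derive an ordered list of logical operators from the query text. */
  static List<String> parseLogicalPlan(String query) {
    List<String> ops = new ArrayList<>();
    ops.add("Scan");
    if (query.toUpperCase().contains(" WHERE ")) {
      ops.add("Filter");
    }
    ops.add("Project");
    return ops;
  }

  /** Phase 2 (toy): map every logical operator to a physical counterpart. */
  static List<String> convertToPhysical(List<String> logicalPlan) {
    List<String> physical = new ArrayList<>();
    for (String op : logicalPlan) {
      physical.add("Physical" + op);
    }
    return physical;
  }

  /** External orchestration: the caller supplies its own rewrite between the two phases. */
  static List<String> plan(String query, UnaryOperator<List<String>> externalRewrite) {
    List<String> logical = parseLogicalPlan(query);
    List<String> rewritten = externalRewrite.apply(logical);
    return convertToPhysical(rewritten);
  }

  public static void main(String[] args) {
    // Hypothetical external rewrite: append a Limit operator the toy parser never produces.
    UnaryOperator<List<String>> addLimit = ops -> {
      List<String> copy = new ArrayList<>(ops);
      copy.add("Limit");
      return copy;
    };
    System.out.println(plan("SELECT a FROM t WHERE a > 1", addLimit));
  }
}
```

The point of splitting the phases is that the rewrite step lives entirely in the caller: the planner does not need to know which optimizations or integrations an external engine applies in between.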