feat: make BatchPartitioner::partition_iter public #21341
Merged
alamb merged 2 commits into apache:main on Apr 3, 2026
Expose the existing partition_iter method so downstream async consumers can separate CPU-bound partitioning from I/O. Closes apache#21311
Looks good to me -- thank you @hcrosse
Enabled auto merge so we can test out the merge queue
alamb approved these changes on Apr 3, 2026
@blaginin FYI I tried the merge queue on this PR and it seems like it just merged directly (didn't wait for the checks 🤔)
Right! In this list you see two types of actions:



Which issue does this PR close?
Closes apache#21311.

Rationale for this change
BatchPartitioner::partition_iter is already used internally as the core implementation behind the public partition method, and was intentionally factored out to support both sync and async consumption patterns. However, since it's private, downstream crates like Ballista can't use the iterator directly and are forced to run both CPU-bound partitioning and I/O together in a sync closure.

What changes are included in this PR?
Changed partition_iter visibility from private to public.

Are these changes tested?
The existing tests for BatchPartitioner cover partition_iter indirectly through the partition method, which delegates to it. No behavioral change was made.

Are there any user-facing changes?
BatchPartitioner::partition_iter is now part of the public API. This is a purely additive change with no breaking impact.
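
For readers unfamiliar with the pattern, here is a minimal std-only sketch of why an iterator-returning method is easier to consume than a closure-taking one. Everything in it is hypothetical: ToyPartitioner, the Batch alias, and partition_then_send are illustrative stand-ins, not DataFusion's types or signatures, and the modulo bucketing is only a placeholder for real hash partitioning.

```rust
// Toy stand-in for a record batch: just a list of integer keys (assumption).
type Batch = Vec<u64>;

/// Hypothetical, simplified partitioner. NOT DataFusion's `BatchPartitioner`.
struct ToyPartitioner {
    num_partitions: usize,
}

impl ToyPartitioner {
    /// Iterator form: does only the CPU-bound split and yields
    /// (partition index, sub-batch) pairs for non-empty partitions.
    pub fn partition_iter(&self, batch: Batch) -> impl Iterator<Item = (usize, Batch)> {
        let mut buckets: Vec<Batch> = vec![Vec::new(); self.num_partitions];
        for key in batch {
            let idx = (key % self.num_partitions as u64) as usize;
            buckets[idx].push(key);
        }
        buckets.into_iter().enumerate().filter(|(_, b)| !b.is_empty())
    }

    /// Closure form: delegates to `partition_iter`. The callback runs
    /// synchronously inside the loop, so any I/O it performs is interleaved
    /// with the partitioning work.
    pub fn partition<F>(&self, batch: Batch, mut f: F) -> Result<(), String>
    where
        F: FnMut(usize, Batch) -> Result<(), String>,
    {
        for (idx, part) in self.partition_iter(batch) {
            f(idx, part)?;
        }
        Ok(())
    }
}

/// Finish the CPU-bound split first; the caller can then perform I/O
/// (for example, awaiting channel sends in async code) as a separate step.
fn partition_then_send(p: &ToyPartitioner, batch: Batch) -> Vec<(usize, Batch)> {
    p.partition_iter(batch).collect()
}

fn main() {
    let p = ToyPartitioner { num_partitions: 3 };

    // Iterator form: partitioning completes before any "I/O" starts.
    let parts = partition_then_send(&p, vec![1, 2, 3, 4, 5, 6]);
    for (idx, part) in &parts {
        println!("partition {idx}: {part:?}");
    }

    // Closure form: the "I/O" (here, a print) happens inside the callback.
    p.partition(vec![7, 8], |idx, part| {
        println!("callback got partition {idx}: {part:?}");
        Ok(())
    })
    .unwrap();
}
```

With only the closure form available, a downstream consumer has to do its sends inside the callback. With the iterator form, it can drain the partitions on a CPU-oriented thread and hand the results to an async task afterwards.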