Dynamic work scheduling in FileStream #21351

Draft
alamb wants to merge 5 commits into apache:main from alamb:alamb/reschedule_io

@alamb alamb commented Apr 3, 2026

Stacked on

Which issue does this PR close?

Rationale for this change

This sequence of PRs exists to enable dynamic work scheduling in the FileStream, so that when a task finishes its own work it can look for any remaining work to pick up.

What changes are included in this PR?

  1. Add state to FileStream that is shared between sibling streams
  2. Sibling streams put their file work into a shared queue when that work can be reordered
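
The shared-queue idea in the two items above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `FileWork`, `SharedWorkQueue`, and `drain_with_siblings` are hypothetical names, and plain threads stand in for the sibling streams. Each sibling pulls the next piece of file work from the shared queue whenever it finishes its current one, so a sibling that finishes early keeps taking remaining work.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for one unit of file work (for example, one file to scan).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FileWork {
    pub path: String,
}

/// Hypothetical shared state: a queue of pending file work that every
/// sibling holds a handle to. Cloning the handle shares the same queue.
#[derive(Clone, Default)]
pub struct SharedWorkQueue {
    inner: Arc<Mutex<VecDeque<FileWork>>>,
}

impl SharedWorkQueue {
    pub fn new() -> Self {
        Self::default()
    }

    /// A sibling contributes its (reorderable) file work to the shared queue.
    pub fn push_all(&self, work: impl IntoIterator<Item = FileWork>) {
        self.inner.lock().unwrap().extend(work);
    }

    /// Take the next pending piece of work, if any is left.
    pub fn pop(&self) -> Option<FileWork> {
        self.inner.lock().unwrap().pop_front()
    }
}

/// Spawn `num_siblings` workers that drain the queue together and return
/// the paths each worker processed. How the work is split between workers
/// depends on timing; only the union of processed paths is deterministic.
pub fn drain_with_siblings(queue: &SharedWorkQueue, num_siblings: usize) -> Vec<Vec<String>> {
    let handles: Vec<_> = (0..num_siblings)
        .map(|_| {
            let queue = queue.clone();
            thread::spawn(move || {
                let mut done = Vec::new();
                // A finished worker simply asks for more work until none remains.
                while let Some(work) = queue.pop() {
                    done.push(work.path);
                }
                done
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

In this sketch each worker still processes one item at a time, which loosely mirrors the "up to one outstanding IO per partition" property noted for this PR.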

Note that several other things are NOT included in this PR, including:

  1. Limiting concurrent IO (this PR has the same properties as main: up to one outstanding IO per partition)
  2. Issuing multiple IOs from the same partition (i.e., interleaving IO and CPU work)

Are these changes tested?

Yes, by existing functional and benchmark tests, as well as new functional tests.

Are there any user-facing changes?

Yes, faster performance (TODO MEASURE)

@github-actions github-actions bot added the datasource Changes to the datasource crate label Apr 3, 2026