Skip to content

feat(flashblocks): State root computation via PayloadProcessor sparse trie task#2247

Draft
mw2000 wants to merge 1 commit intomainfrom
mw2000/state-root-fast
Draft

feat(flashblocks): State root computation via PayloadProcessor sparse trie task#2247
mw2000 wants to merge 1 commit intomainfrom
mw2000/state-root-fast

Conversation

@mw2000
Copy link
Copy Markdown
Contributor

@mw2000 mw2000 commented Apr 16, 2026

  • Replace inline synchronous state root computation with reth's PayloadProcessor sparse trie task, computing state roots in parallel during block building
  • PayloadProcessor, ChangesetCache, and TreeConfig are stored on OpPayloadBuilder and reused across blocks so the sparse trie stays warm
  • Falls back to synchronous computation on task failure
  • Rewrite state_root benchmark to measure residual wait time (finish → root) with busy-wait simulation of 200ms inter-flashblock delays

@cb-heimdall
Copy link
Copy Markdown
Collaborator

cb-heimdall commented Apr 16, 2026

🟡 Heimdall Review Status

Requirement Status More Info
Reviews 🟡 0/1
Denominator calculation
Show calculation
1 if user is bot 0
1 if user is external 0
2 if repo is sensitive 0
From .codeflow.yml 1
Additional review requirements
Show calculation
Max 0
0
From CODEOWNERS 0
Global minimum 0
Max 1
1
1 if commit is unverified 1
Sum 2

@vercel
Copy link
Copy Markdown

vercel Bot commented Apr 16, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
base Ignored Ignored Preview Apr 22, 2026 10:58pm

Request Review

Comment thread crates/builder/core/src/flashblocks/state_root_task.rs Outdated
Comment thread crates/builder/core/src/flashblocks/state_root_task.rs
Comment thread crates/builder/core/src/flashblocks/state_root_task.rs Outdated
Comment thread crates/builder/core/src/flashblocks/payload.rs Outdated

// PayloadProcessor is reused across blocks for warm sparse trie.
let evm_config = BaseEvmConfig::optimism(ctx.chain_spec());
let tree_config = TreeConfig::default();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TreeConfig::default() — the benchmark uses TreeConfig::default().with_disable_sparse_trie_cache_pruning(true), but production here uses the plain default (which has pruning enabled). If the benchmark is meant to reflect production, one of these should change. If cache pruning is intentionally enabled in production but disabled in benchmarks to keep the trie warm, that's fine but worth documenting the discrepancy.

@mw2000 mw2000 force-pushed the mw2000/state-root-fast branch from bca1ea0 to 8979c8d Compare April 17, 2026 18:20
Comment thread crates/builder/core/src/flashblocks/payload.rs Outdated

// Merge pending transitions into bundle_state before reading it.
// revm's commit() pushes to transition_state, not bundle_state.
state.merge_transitions(BundleRetention::Reverts);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double merge_transitions when called from finalize_payload

finalize_state_root calls state.merge_transitions(BundleRetention::Reverts) here, then the caller finalize_payload passes state to build_block which calls merge_transitions again (payload.rs:1035). The second merge is a no-op on the now-empty transition state, but build_block clones state.transition_state beforehand (payload.rs:1033) to restore it at the end — since the finalization path already merged, this clone captures an empty transition state.

This is safe only because finalize_payload is called at the end of block building and no further flashblocks are built afterward. However, the implicit coupling is fragile — if someone later calls build_block after finalize_state_root in a non-terminal context, state would be silently corrupted. Consider either:

  • Having finalize_state_root not call merge_transitions itself (let the caller manage it), or
  • Adding a comment documenting this coupling.

Comment on lines 69 to 83
StateProviderFactory
+ ChainSpecProvider<ChainSpec = BaseChainSpec>
+ BlockReaderIdExt<Header = Header>
+ DatabaseProviderFactory<
Provider: BlockReader
+ StageCheckpointReader
+ PruneCheckpointReader
+ ChangeSetReader
+ StorageChangeSetReader
+ BlockNumReader
+ StorageSettingsCache,
> + StateReader
+ Clone
+ 'static
{
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Breaking change to ClientBounds public trait

Adding DatabaseProviderFactory<Provider: ...> + StateReader + 'static is a breaking change — any downstream type implementing ClientBounds must now also satisfy these bounds. Worth noting in the PR description / changelog. If these bounds are only needed by spawn_state_root_task (which is pub(crate)), consider introducing a separate StateRootClientBounds trait to keep ClientBounds stable for external consumers.

@mw2000 mw2000 force-pushed the mw2000/state-root-fast branch from 8979c8d to 454d4e8 Compare April 17, 2026 19:12
state_root = %outcome.state_root,
"Background state root task completed"
);
return Ok((outcome.state_root, outcome.trie_updates, hashed_state));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No consistency check between background trie_updates and locally-computed hashed_state

When the background task succeeds, outcome.trie_updates (computed by the sparse trie from per-tx diffs fed through the hook) is returned alongside hashed_state (computed locally from bundle_state on line 178). These two independently-derived views of the state are stored together in BuiltPayloadExecutedBlock and must be consistent for the engine to persist state correctly.

If any state update is ever lost in transit through the hook (e.g., a future refactoring accidentally skips a send_state_update call, or a new code path commits state without notifying the hook), the trie updates and hashed state would silently diverge — no error, no warning, just an inconsistent state persisted by the engine.

Consider adding a debug assertion or logging comparison (e.g., comparing the state root derived from hashed_state against outcome.state_root in debug/test builds) to catch divergence early. Something like:

Suggested change
return Ok((outcome.state_root, outcome.trie_updates, hashed_state));
return Ok((outcome.state_root, outcome.trie_updates, hashed_state));
// TODO: consider adding a debug assertion comparing the
// locally-derived state root against outcome.state_root to
// catch hook/bundle_state divergence.

Comment thread crates/builder/core/src/flashblocks/state_root_task.rs Outdated
Comment thread crates/builder/core/Cargo.toml Outdated
@mw2000 mw2000 force-pushed the mw2000/state-root-fast branch 3 times, most recently from c9c8b75 to 65dd6dd Compare April 22, 2026 22:35
Comment on lines +265 to +279
let (state_hook, handle) = spawn_state_root_task(
&self.state_root_deps,
&self.client,
ctx.parent().hash(),
ctx.parent().state_root,
ctx.block_number(),
)?;
let mut state_root_handle = Some(handle);

// 1. execute the pre steps and seal an early block with that
let sequencer_tx_start_time = Instant::now();
let mut state =
State::builder().with_database(cached_reads.as_db_mut(db)).with_bundle_update().build();

let mut info = execute_pre_steps(&mut state, &ctx)?;
let mut info = execute_pre_steps(&mut state, &ctx, &state_hook)?;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sparse trie task leak on early error returns

If execute_pre_steps fails here (or build_block / payload_tx.send fail a few lines below before the flashblock loop is entered), the function returns an error via ? without calling state_hook.finish(). The sparse trie task spawned at line 265 will block forever waiting for the FinishedStateUpdates signal that finish() sends.

The build_next_flashblock error path at line 530 correctly calls state_hook.finish() before returning, but these earlier paths don't.

Consider wrapping state_hook in a guard that calls finish() on Drop, or adding explicit state_hook.finish() calls to these error paths. Something like:

let mut info = match execute_pre_steps(&mut state, &ctx, &state_hook) {
    Ok(info) => info,
    Err(e) => {
        state_hook.finish();
        return Err(e);
    }
};

Alternatively, a RAII wrapper would make this foolproof:

struct FinishOnDrop<'a>(&'a BuilderStateHook);
impl Drop for FinishOnDrop<'_> {
    fn drop(&mut self) { self.0.finish(); }
}

Introduce an async state root task that runs alongside flashblock
execution using reth's PayloadProcessor, so the final state root
computation overlaps with block building instead of blocking it.
@mw2000 mw2000 force-pushed the mw2000/state-root-fast branch from 65dd6dd to 7d5c6e3 Compare April 22, 2026 22:58
@github-actions
Copy link
Copy Markdown
Contributor

Review Summary — State root via PayloadProcessor sparse trie task

Architecture

The design is sound: PayloadProcessor is reused across blocks (warm sparse trie), the builder drives EVM execution and forwards per-tx state diffs via a shared BuilderStateHook, and the final root is awaited at block seal. The FinishGuard RAII pattern ensures the sparse trie task is signalled on all exit paths including early errors.

Key positives:

  • Pre-execution system calls (EIP-4788, EIP-2935) are correctly routed through the state hook via set_state_hook before apply_pre_execution_changes
  • merge_transitions is called before reading hashed_post_state, ensuring consistency
  • debug_assertions block cross-checks the background state root against synchronous computation
  • Blocking recv() is justified — build_payload runs on spawn_blocking_task
  • Double finish() (from finalize_state_root + FinishGuard drop) is safe since Option::take() is idempotent

Existing inline comments — status

Several inline comments from the prior review run appear to reference concerns that the current code already addresses:

Concern Status
hashed_state computed before merge_transitions Addressed — line 195 merges before line 198 reads
Pre-execution state changes not sent to trie task Addressedset_state_hook at line 997
Blocking recv() on tokio worker Addressed — documented, runs on spawn_blocking_task
Consistency check between trie_updates and hashed_state Addresseddebug_assertions block at lines 215-226
Silent no-op after finish() Addresseddebug_assert! at line 111
Sparse trie task leak on early error returns AddressedFinishGuard at line 273
Double merge_transitions Valid observation — documented in comment at lines 192-194, safe but fragile coupling
Breaking change to ClientBounds Valid — downstream impls must now satisfy additional bounds
TreeConfig discrepancy (bench vs production) Valid — documented in comment at lines 58-60 of service.rs

Remaining observations

  1. No unit tests for state_root_task module — The new module has no #[cfg(test)] mod tests. Given this is a correctness-critical component (state root computation), even basic tests for BuilderStateHook (e.g., send_state_update after finish() triggers debug_assert, FinishGuard calls finish on drop) would improve confidence.

  2. ClientBounds public API expansion — The added DatabaseProviderFactory + StateReader bounds are a breaking change for any external code implementing ClientBounds. If these bounds are only needed by spawn_state_root_task (which is pub(crate)), a separate internal trait alias could keep the public API stable.

  3. Module visibility vs project conventionsstate_root_task is pub(crate) with all types pub(crate). Per project conventions, modules should not be pub(crate) and types within should be pub with re-exports from lib.rs. This is likely an intentional choice for internal-only types, but worth noting for consistency.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants