refactor(prover-client): deferred epoch finalization, checkpoint-ready promise, and reorg safety#22749
Conversation
6919b6f to
f560e42
Compare
…y promise, and reorg safety (A-954) Restructures the orchestrator to support optimistic proving: - startNewEpoch() no longer takes totalNumCheckpoints or finalBlobBatchingChallenges - New finalizeEpochStructure() method called after all checkpoints are processed - Checkpoints-ready promise (callback-based) resolves when all checkpoints complete block-level proving - Two-input gate: checkpoint roots only enqueue when both block merge proofs and epoch structure are ready - removeLastCheckpoint() for L1 reorg safety — closes world state forks and cleans up chonk verifier cache - Checkpoints must be registered sequentially (enforced); block processing remains parallel - EpochProvingJob registers checkpoints in order, then processes blocks via asyncPool - No behavior change: proving still triggers at epoch end Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
f560e42 to
806825c
Compare
…ucture-deferred-finalization-checkpoints
| * Removes the last checkpoint from the epoch. Only valid before `finalizeEpochStructure` has been called. | ||
| * Returns the removed checkpoint, or undefined if there are no checkpoints. | ||
| */ | ||
| public removeLastCheckpoint(): CheckpointProvingState | undefined { |
There was a problem hiding this comment.
In a follow-up PR this has changed to allow removal of a checkpoint at any index.
| * includes the removed checkpoint | ||
| * - Block merge jobs may be enqueued wastefully but will hit the same safe gates | ||
| */ | ||
| public removeLastCheckpoint() { |
There was a problem hiding this comment.
Later PR changes this to remove any checkpoint, not just last.
| // Wait for all checkpoints to complete block-level proving before finalizing the epoch structure. | ||
| // In the current non-optimistic flow this will resolve once all block merge proofs complete. | ||
| // In the future optimistic flow, this will be awaited alongside the epoch-end signal. | ||
| await this.prover.waitForAllCheckpointsReady(); |
There was a problem hiding this comment.
Not sure I understand this part. Why do we want to wait for checkpoints to finish proving here? What we want is to wait for all checkpoints in the epoch to have been mined, so we can finalize the epoch with all the blob data, right?
| // Transition to FULL if all checkpoints are added. | ||
| if (this.checkpoints.filter(c => !!c).length === totalNumCheckpoints) { | ||
| this.provingStateLifecycle = PROVING_STATE_LIFECYCLE.PROVING_STATE_FULL; | ||
| } |
There was a problem hiding this comment.
When is this function expected to be called? The jsdoc for the function states that "Finalizes the epoch structure after all checkpoints have been added", but this check hints that not all checkpoints may be added yet?
| const activeCheckpoints = this.checkpoints.filter(c => !!c) as CheckpointProvingState[]; | ||
| for (let i = 0; i < activeCheckpoints.length; i++) { | ||
| const checkpoint = activeCheckpoints[i]; | ||
|
|
There was a problem hiding this comment.
Heads up this changes semantics. If there's a gap in this.checkpoints (maybe because one checkpoint finalized simulation earlier than a previous one?) then we may compute the outHashHint for a checkpoint using the outHash from a checkpoint that's not the immediately previous one in the epoch.
| /** All callbacks waiting for checkpoints to be block-level ready. */ | ||
| private checkpointsReadyCallbacks: Array<{ resolve: () => void; reject: (reason: string) => void }> = []; |
There was a problem hiding this comment.
Nit: while I like this pattern, we're not using it anywhere else for proving orchestration. Maybe we can drop it and use the same checks we do elsewhere, for consistency's sake? Only if it's not too much effort.
Summary
Restructures the orchestrator to support optimistic proving (A-954):
startNewEpoch()no longer takestotalNumCheckpointsorfinalBlobBatchingChallenges— neither is known at epoch startfinalizeEpochStructure(totalNumCheckpoints, finalBlobBatchingChallenges)method called after all checkpoints are processedremoveLastCheckpoint()for L1 reorg safety (only valid before finalization)Fixes A-954
Test plan
orchestrator_deferred_finalization.test.tscovering: