Skip to content

refactor(chain)!: replace CanonicalIter with sans-io CanonicalizationTask#2038

Open
oleonardolima wants to merge 13 commits intobitcoindevkit:masterfrom
oleonardolima:refactor/canonical-iter-api
Open

refactor(chain)!: replace CanonicalIter with sans-io CanonicalizationTask#2038
oleonardolima wants to merge 13 commits intobitcoindevkit:masterfrom
oleonardolima:refactor/canonical-iter-api

Conversation

@oleonardolima
Copy link
Collaborator

@oleonardolima oleonardolima commented Sep 18, 2025

fixes #1816

Description

It completes a refactoring of the canonicalization API in bdk_chain, migrating from an iterator-based approach CanonicalIter to a sans-IO/task-based pattern CanonicalizationTask. This change improves the separation of concerns between canonicalization logic and I/O operations, making the code more testable and flexible.

Old API:

// Direct coupling between canonicalization logic and `ChainOracle`
let view = tx_graph.canonical_view(&chain, chain_tip, params)?;

New API:

// Step 1: Create a task (pure logic, no I/O)
let task = tx_graph.canonicalization_task(params);

// Step 2: Execute with a chain oracle (handles I/O)
let view = chain.canonicalize(task, Some(chain_tip));

The new flow works as follows:

  1. Task: CanonicalizationTask encapsulates all canonicalization logic without performing any chain queries.
  2. Query: The task generates CanonicalizationRequests for anchor verification as needed, allowing the ChainOracle to batch or optimize these queries.
  3. Resolution: The ChainOracle (e.g., LocalChain) processes requests and returns CanonicalizationResponse's indicating which anchors are the best in chain.
  4. CanonicalView: Once all queries are resolved, the task builds a complete CanonicalView containing all canonical transactions with their chain positions.

This sans-IO pattern enables:

  • Tasks can be tested with mock responses without a real chain
  • Different chain oracles can implement their own optimization strategies
  • Clear separation between business logic and I/O operations

Notes to the reviewers

The changes are splitted in multiple commits, as I think it could help reviewing it. Also it depends on #2029 PR, as it's built on top of it.

Changelog notice

  ### Changed
  - **Breaking:** Replace `TxGraph::canonical_iter()` and `TxGraph::canonical_view()` with `TxGraph::canonicalization_task()`
  - **Breaking:** Remove `CanonicalIter` in favor of `CanonicalizationTask`
  - **Breaking:** Change canonicalization to use a two-step process: create task, then execute with chain oracle
  - Move `CanonicalReason`, `ObservedIn`, and `CanonicalizationParams` from `canonical_iter` module to `canonical_task` module
  - Add `LocalChain::canonicalize()` method to execute canonicalization tasks

  ### Removed
  - **Breaking:** Remove `canonical_iter` module and `CanonicalIter` type
  - **Breaking:** Remove `TxGraph::try_canonical_view()` and `TxGraph::canonical_view()` methods
  - **Breaking:** Remove `CanonicalView::new()` public constructor

  ### Added
  - New sans-IO `CanonicalizationTask` for stateless canonicalization
  - `CanonicalizationRequest` and `CanonicalizationResponse` types for anchor verification

Checklists

All Submissions:

New Features:

  • I've added tests for the new feature
  • I've added docs for the new feature

Bugfixes:

  • This pull request breaks the existing API
  • I've added tests to reproduce the issue which are now passing
  • I'm linking the issue being fixed by this PR

@oleonardolima oleonardolima added this to the Wallet 3.0.0 milestone Sep 18, 2025
@notmandatory notmandatory moved this to In Progress in BDK Chain Sep 18, 2025
@oleonardolima oleonardolima force-pushed the refactor/canonical-iter-api branch 6 times, most recently from d851ba6 to c02636d Compare September 23, 2025 00:54
@oleonardolima oleonardolima added module-blockchain api A breaking API change labels Sep 23, 2025
@oleonardolima oleonardolima force-pushed the refactor/canonical-iter-api branch from c02636d to 78c0538 Compare September 23, 2025 01:08
@oleonardolima oleonardolima force-pushed the refactor/canonical-iter-api branch 3 times, most recently from 677e25a to 9e27ab1 Compare September 29, 2025 01:47
@oleonardolima oleonardolima marked this pull request as ready for review October 2, 2025 06:18
Copy link
Member

@evanlinjin evanlinjin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work.

This is my initial round of reviews.

Are you planning to introduce topological ordering in a separate PR?

Comment on lines +72 to +74
let chain_tip = chain.tip().block_id();
let task = graph.canonicalization_task(chain_tip, Default::default());
let canonical_view = chain.canonicalize(task);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about the following naming:

  • CanonicalizationTask -> CanonicalResolver.
  • TxGraph::canonicalization_task -> TxGraph::resolver.
  • LocalChain::canonicalize -> LocalChain::resolve.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been thinking about this, I agree with the CanonicalResolver, though the TxGraph::resolver and LocalChain::resolve seems a bit off.

What do you think about (?):

  • CanonicalResolver
  • TxGraph::canonical_resolver
  • LocalChain::canonical_resolve

for txid in undo_not_canonical {
self.not_canonical.remove(&txid);
}
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Have the detected_self_double_spend early return instead of having the else branch.

Rationale: Early return is easier to read and results in less nesting.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should've addressed this, let me double check.

@oleonardolima oleonardolima force-pushed the refactor/canonical-iter-api branch from 9e27ab1 to f6c8b02 Compare October 3, 2025 00:33
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ChainRequest<B = BlockId> {
/// The chain tip to use as reference for the query.
pub chain_tip: B,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering whether it will be better to remove this from ChainRequest and have a new method on ChainQuery called something like query_tip(&self) -> BlockId.

Rationale: I'm assuming that we always want this tip to be consistent across the lifetime of the ChainQuery impl.

Counter-argument: We may want to have a ChainQuery impl which queries across conflicting chains - what would this be used for though?

WDYT

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it seems weird that the request would include the chain tip when the chain oracle should be aware of its own tip.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ValuedMammal although the chain oracle knows the tip, we want the tip to stay consistent across requests. We are just deferring the responsibility to the ChainQuery to make the tip consistent (instead of having each ChainRequest be responsible for it).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than pass chain_tip to the CanonicalizationTask constructor, where it is merely copied into every request, I think the chain_tip parameter could be passed to LocalChain::canonicalize.

/// Returns `Some(B)` if at least one of the requested blocks
/// is confirmed in the chain, or `None` if none are confirmed.
/// The generic parameter `B` represents the block identifier type, which defaults to `BlockId`.
pub type ChainResponse<B = BlockId> = Option<B>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better API to have:

pub type ChainResponse<B = BlockId> = Vec<(B, bool)>;

Rationale: We only care about whether a block is in the chain or not. However, with Option<B>, we now need to worry about the order the data is coming in.

Counter-argument: This increasing internal complexity.

Counter-counter-argument: No, this decreases internal complexity - allowing us to represent things in the most simplistic manner internally - whether something exists or not.

    /// We don't have it yet. This is our `ChainRequest`
    pending_blocks: BTreeSet<BlockId>,
    /// We have it!
    blocks: HashMap<BlockId, bool>,

At the anchored-tx/transitive-tx stages, we try to advance with the blocks we have. If we are missing anything, just populate pending_blocks then try again later. Very simple.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it looks like a better API. I'll give it a try and see if I face any other tradeoffs in the code.

@oleonardolima oleonardolima force-pushed the refactor/canonical-iter-api branch 4 times, most recently from a276c37 to b7f8fba Compare October 8, 2025 04:53
@oleonardolima oleonardolima force-pushed the refactor/canonical-iter-api branch 2 times, most recently from b46555b to af396fb Compare January 23, 2026 23:09
@luisschwab luisschwab requested review from luisschwab and removed request for LagginTimes and luisschwab January 24, 2026 17:08
@oleonardolima oleonardolima force-pushed the refactor/canonical-iter-api branch 4 times, most recently from febab93 to 1f956f1 Compare February 3, 2026 17:23

fn next_query(&mut self) -> Option<ChainRequest> {
// Try to advance to the next stage if needed
self.try_advance();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were we going to get rid of this?

Comment on lines +478 to +490
// Check if this transaction was marked transitively and needs its own anchors verified
if reason.is_transitive() {
if let Some(anchors) = self.tx_graph.all_anchors().get(txid) {
// only check anchors we haven't already confirmed
if !self.direct_anchors.contains_key(txid) {
self.unprocessed_transitively_anchored_txs.push_back((
*txid,
tx.clone(),
anchors,
));
}
}
}
Copy link
Member

@evanlinjin evanlinjin Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oleonardolima You mentioned to me that you aren't happy with this, and @LLFourn brought up splitting things into two steps in the past.

Problem Description

Currently, CanonicalizationTask convolutes two concerns in one step:

  1. Canonicalization - determining which transactions are canonical and why (CanonicalReason).
  2. Position resolution and topological ordering - mapping reasons into ChainPosition (confirmed vs unconfirmed) and in the future, ordering transactions topologically.

This makes the logic harder to follow than it needs to be. I'm thinking of splitting into two tasks:

  • CanonicalTask - the core algorithm. Resolves conflicts and outputs CanonicalTxs (transactions + CanonicalReason, no position data).

  • CanonicalViewTask - thin wrapper that takes CanonicalTxs and resolves reasons into ChainPosition (and doing topological ordering in the future), producing the final CanonicalView.

    Both types would be public, so callers that don't need position data can stop at step 1.

Proposed API

  let canonical_txs = chain.resolve(tx_graph.canonical_task(tip, params));
  let view = chain.resolve(canonical_txs.view_task());

Maybe even a helper method on LocalChain:

  impl LocalChain {
      pub fn canonical_view(
          &self,
          tx_graph: &TxGraph<A>,
          tip: BlockId,
          canon_params: CanonicalizationParams,
          view_params: CanonicalViewParams,
      ) -> CanonicalView<A> {
          let canonical_txs = self.resolve(tx_graph.canonical_task(tip, canon_params));
          self.resolve(canonical_txs.view_task(view_params))
      }
  }

Additionally, since CanonicalTxs and CanonicalView will want to share the majority of the methods, we can do something like:

struct Canonical<A, P> { ... }

type CanonicalTxs<A> = Canonical<A, CanonicalReason<A>>;
type CanonicalView<A> = Canonical<A, ChainPosition<A>>;

FullTxOut and CanonicalTx will also need that P generic as well. I would also rename FullTxOut::chain_position to pos.

@evanlinjin
Copy link
Member

evanlinjin commented Feb 13, 2026

Before ACK and merge (@evanlinjin's list)

  • Rename CanonicalizationTask to CanonicalTask.
  • Consider renaming ChainQuery to ChainTask.
  • Remove and inline single-used private methods: try_advance, process_assumed_txs (I would add another variant to CanonicalStage) process_last_seen, process_leftover_txs. 1
  • Remove is_finished (unless if you can provide justification for it's existence). My argument: "No more queries" instantly means "actually finished" - there is no state inbetween to represent.
  • Remove chain_tip from ChainRequest. Add ChainQuery::tip() -> BlockId method. 2
  • Change CanonicalTask to return CanonicalTxs & introduce CanonicalViewTask that returns CanonicalView. 3
  • Remove all generics on ChainTask (ChainQuery) as we will flesh out this trait in follow-up PRs.

Edit:

I got Claude to look into the majority of these. Let me know if these commits are useful: https://github.com/evanlinjin/bdk/tree/refactor/canonical-iter-api-evanlinjin

Footnotes

  1. https://github.com/bitcoindevkit/bdk/pull/2038#discussion_r2658462257

  2. https://github.com/bitcoindevkit/bdk/pull/2038#discussion_r2401496855

  3. https://github.com/bitcoindevkit/bdk/pull/2038#discussion_r2802430801

oleonardolima and others added 7 commits March 6, 2026 15:05
replace `CanonicalIter` with sans-io `CanonicalizationTask`

Introduces new `CanonicalizationTask`, which implements the canonicalization process
through a request/response pattern, that allow us to remove the
`ChainOracle` dependency in the future.

The `CanonicalizationTask` handles direct/transitive anchor determination, also tracks
already confirmed anchors to avoid redundant queries. After all the `CanonicalizationRequest`'s
have been resolved, the `CanonicalizationTask` can be finalized returning the final `CanonicalView`.

It batches all the anchors, which require a chain query, for a given transaction into a single
`CanonicalizationRequest`, instead of having multiple requests for each one.

- Add new `CanonicalizationTask`, relying on
  `Canonicalization{Request|Response}` for chain queries. It
- Replaces the old `CanonicalIter` usage, with new
  `CanonicalizationTask`.

BREAKING CHANGE: It replaces direct `ChainOracle` querying in canonicalization process, with
the new request/response pattern by `CanonicalizationTask`.
The new API introduces a sans-io behavior, separating the
canonicalization logic from `I/O` operations, it should be used as
follows:

1. Create a new `CanonicalizationTask` with a `TxGraph`, by calling:
   `graph.canonicalization_task(params)`
2. Execute the canonicalization process with a chain oracle (e.g
   `LocalChain`, which implements `ChainOracle` trait), by calling:
   `chain.canonicalize(task, chain_tip)`

- Replace `CanonicalView::new()` constructor with internal `CanonicalView::new()` for use by `CanonicalizationTask`
- Remove `TxGraph::try_canonical_view()` and `TxGraph::canonical_view()` methods
- Add `TxGraph::canonicalization_task()` method to create canonicalization tasks
- Add `LocalChain::canonicalize()` method to process tasks and return `CanonicalView`'s
- Update `IndexedTxGraph` to delegate canonicalization to underlying `TxGraph`

BREAKING CHANGE: Remove `CanonicalView::new()` and `TxGraph::canonical_view()` methods in favor of task-based approach
- Adds `CanonicalReason`, `ObservedIn`, and `CanonicalizationParams` to
  `canonical_task.rs` module, instead of using the ones from
  `canonical_iter.rs`.
- Removes the `canonical_iter.rs` file and its module declaration.

BREAKING CHANGE: `CanonicalIter` and all its exports are removed
…icalizationTask`

Introduce a new `ChainQuery` trait in `bdk_core` that provides an
interface for query-based operations against blockchain data. This trait
enables sans-IO patterns for algorithms that need to interact with blockchain
oracles without directly performing I/O.

The `CanonicalizationTask` now implements this trait, making it more composable
and allowing the query pattern to be reused for other blockchain query operations.

- Add `ChainQuery` trait with associated types for Request, Response, Context, and Result
- Implement `ChainQuery` for `CanonicalizationTask` with `BlockId` as context

BREAKING CHANGE: `CanonicalizationTask::finish()` now requires a `BlockId` parameter

Co-Authored-By: Claude <noreply@anthropic.com>
Make `ChainRequest`/`ChainResponse` generic over block identifier types to enable
reuse beyond BlockId. Move `chain_tip` into `ChainRequest` for better encapsulation
and simpler API.

- Make `ChainRequest` and `ChainResponse` generic types with `BlockId` as default
- Add `chain_tip` field to `ChainRequest` to make it self-contained
- Change `ChainQuery` trait to use generic parameter `B` for block identifier type
- Remove `chain_tip` parameter from `LocalChain::canonicalize()` method
- Rename `ChainQuery::Result` to `ChainQuery::Output` for clarity

BREAKING CHANGE:
- `ChainRequest` now has a `chain_tip` field and is generic over block identifier type
- `ChainResponse` is now generic with default type parameter `BlockId`
- `ChainQuery` trait now takes a generic parameter `B = BlockId`
- `LocalChain::canonicalize()` no longer takes a `chain_tip` parameter

Co-authored-by: Claude <noreply@anthropic.com>

refactor(chain): make `LocalChain::canonicalize()` generic over `ChainQuery`

Allow any type implementing `ChainQuery` trait instead of requiring
`CanonicalizationTask` specifically.

Signed-off-by: Leonardo Lima <oleonardolima@users.noreply.github.com>
- Unify both `unprocessed_anchored_txs` and `pending_anchored_txs` in a
  single `unprocessed_anchored_txs` queue.
- Changes the `unprocessed_anchored_txs from `Iterator` to `VecDeque`.
- Removes the `pending_anchored_txs` field and it's usage.
- Collects all `anchored_txs` upfront instead of lazy iteration.
- Add new `CanonicalStage` enum for tracking the different
  canonicalization phases/stages.
- Add new `try_advance()` method for stage progression.
- Add new `is_transitive()` helper to `CanonicalReason`.
- Change internal `confirmed_anchors` to `direct_anchors` for better
  clarity.
- Update the `resolve_query()` to handle staged-based processing.

Co-authored-by: Claude <noreply@anthropic.com>
@oleonardolima oleonardolima force-pushed the refactor/canonical-iter-api branch from 3906047 to f6b2565 Compare March 6, 2026 18:12
evanlinjin and others added 6 commits March 6, 2026 15:32
Inline all stage-processing logic into `next_query()`, removing the
separate `try_advance()` method, `process_*_txs()` helpers, and
`is_finished()` from the `ChainQuery` trait. Add `AssumedTxs` as an
explicit first stage and `CanonicalStage::advance()` for centralized
stage transitions. Document the `ChainQuery` protocol contract.
…`Canonical<A, P>`

Separate concerns by splitting `CanonicalizationTask` into two phases:

1. `CanonicalTask` determines which transactions are canonical and why
   (`CanonicalReason`), outputting `CanonicalTxs<A>`.
2. `CanonicalViewTask` resolves reasons into `ChainPosition`s (confirmed
   vs unconfirmed), outputting `CanonicalView<A>`.

Make `Canonical<A, P>`, `CanonicalTx<P>`, and `FullTxOut<P>` generic over
the position type so the same structs serve both phases. Add
`LocalChain::canonical_view()` convenience method for the common two-step
pipeline.

Renames: `CanonicalizationTask` -> `CanonicalTask`,
`CanonicalizationParams` -> `CanonicalParams`,
`canonicalization_task()` -> `canonical_task()`,
`FullTxOut::chain_position` -> `FullTxOut::pos`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…Query::tip()`

The chain tip is constant for the lifetime of a query, so it belongs on
the trait rather than being redundantly copied into every request.
`ChainRequest` is now a type alias for `Vec<B>`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ChainResponse`

These types only ever used `BlockId`, so the generic parameter added
unnecessary complexity. All three are now hardcoded to `BlockId`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Assumed transactions bypass the `AnchoredTxs` stage and are marked
canonical immediately with `CanonicalReason::Assumed`. Previously,
`view_task()` only queued anchor checks for transitive txs, so directly
assumed txs (`Assumed { descendant: None }`) were never checked and
always resolved to `Unconfirmed` even when they had confirmed anchors.

Queue all `Assumed` txs for anchor checks in `view_task()` and look up
`direct_anchors` for both `Assumed` variants in `finish()`.

Fixes bitcoindevkit#2088

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…onical_view_task.rs`

Move shared types (`CanonicalTx`, `Canonical`, `CanonicalView`, `CanonicalTxs`)
and convenience methods into `canonical.rs`. Keep only the phase-2 task
(`CanonicalViewTask`) in `canonical_view_task.rs`. Also rename `FullTxOut` to
`CanonicalTxOut` and move it to `canonical.rs`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api A breaking API change module-blockchain

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

Remove ChainOracle trait by inverting dependency

3 participants