refactor(chain)!: replace CanonicalIter with sans-io CanonicalizationTask#2038
refactor(chain)!: replace CanonicalIter with sans-io CanonicalizationTask#2038oleonardolima wants to merge 13 commits intobitcoindevkit:masterfrom
CanonicalIter with sans-io CanonicalizationTask#2038Conversation
d851ba6 to
c02636d
Compare
c02636d to
78c0538
Compare
677e25a to
9e27ab1
Compare
evanlinjin
left a comment
There was a problem hiding this comment.
Great work.
This is my initial round of reviews.
Are you planning to introduce topological ordering in a separate PR?
| let chain_tip = chain.tip().block_id(); | ||
| let task = graph.canonicalization_task(chain_tip, Default::default()); | ||
| let canonical_view = chain.canonicalize(task); |
There was a problem hiding this comment.
What do you think about the following naming:
CanonicalizationTask->CanonicalResolver.TxGraph::canonicalization_task->TxGraph::resolver.LocalChain::canonicalize->LocalChain::resolve.
There was a problem hiding this comment.
I've been thinking about this, I agree with the CanonicalResolver, though the TxGraph::resolver and LocalChain::resolve seems a bit off.
What do you think about (?):
- CanonicalResolver
- TxGraph::canonical_resolver
- LocalChain::canonical_resolve
crates/chain/src/canonical_task.rs
Outdated
| for txid in undo_not_canonical { | ||
| self.not_canonical.remove(&txid); | ||
| } | ||
| } else { |
There was a problem hiding this comment.
Nit: Have the detected_self_double_spend early return instead of having the else branch.
Rationale: Early return is easier to read and results in less nesting.
There was a problem hiding this comment.
I should've addressed this, let me double check.
9e27ab1 to
f6c8b02
Compare
crates/core/src/chain_query.rs
Outdated
| #[derive(Debug, Clone, PartialEq, Eq)] | ||
| pub struct ChainRequest<B = BlockId> { | ||
| /// The chain tip to use as reference for the query. | ||
| pub chain_tip: B, |
There was a problem hiding this comment.
I'm wondering whether it will be better to remove this from ChainRequest and have a new method on ChainQuery called something like query_tip(&self) -> BlockId.
Rationale: I'm assuming that we always want this tip to be consistent across the lifetime of the ChainQuery impl.
Counter-argument: We may want to have a ChainQuery impl which queries across conflicting chains - what would this be used for though?
WDYT
There was a problem hiding this comment.
Yes it seems weird that the request would include the chain tip when the chain oracle should be aware of its own tip.
There was a problem hiding this comment.
@ValuedMammal although the chain oracle knows the tip, we want the tip to stay consistent across requests. We are just deferring the responsibility to the ChainQuery to make the tip consistent (instead of having each ChainRequest be responsible for it).
There was a problem hiding this comment.
Rather than pass chain_tip to the CanonicalizationTask constructor, where it is merely copied into every request, I think the chain_tip parameter could be passed to LocalChain::canonicalize.
crates/core/src/chain_query.rs
Outdated
| /// Returns `Some(B)` if at least one of the requested blocks | ||
| /// is confirmed in the chain, or `None` if none are confirmed. | ||
| /// The generic parameter `B` represents the block identifier type, which defaults to `BlockId`. | ||
| pub type ChainResponse<B = BlockId> = Option<B>; |
There was a problem hiding this comment.
I think it's better API to have:
pub type ChainResponse<B = BlockId> = Vec<(B, bool)>;Rationale: We only care about whether a block is in the chain or not. However, with Option<B>, we now need to worry about the order the data is coming in.
Counter-argument: This increasing internal complexity.
Counter-counter-argument: No, this decreases internal complexity - allowing us to represent things in the most simplistic manner internally - whether something exists or not.
/// We don't have it yet. This is our `ChainRequest`
pending_blocks: BTreeSet<BlockId>,
/// We have it!
blocks: HashMap<BlockId, bool>,At the anchored-tx/transitive-tx stages, we try to advance with the blocks we have. If we are missing anything, just populate pending_blocks then try again later. Very simple.
There was a problem hiding this comment.
Indeed, it looks like a better API. I'll give it a try and see if I face any other tradeoffs in the code.
a276c37 to
b7f8fba
Compare
b46555b to
af396fb
Compare
febab93 to
1f956f1
Compare
crates/chain/src/canonical_task.rs
Outdated
|
|
||
| fn next_query(&mut self) -> Option<ChainRequest> { | ||
| // Try to advance to the next stage if needed | ||
| self.try_advance(); |
There was a problem hiding this comment.
Were we going to get rid of this?
crates/chain/src/canonical_task.rs
Outdated
| // Check if this transaction was marked transitively and needs its own anchors verified | ||
| if reason.is_transitive() { | ||
| if let Some(anchors) = self.tx_graph.all_anchors().get(txid) { | ||
| // only check anchors we haven't already confirmed | ||
| if !self.direct_anchors.contains_key(txid) { | ||
| self.unprocessed_transitively_anchored_txs.push_back(( | ||
| *txid, | ||
| tx.clone(), | ||
| anchors, | ||
| )); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
@oleonardolima You mentioned to me that you aren't happy with this, and @LLFourn brought up splitting things into two steps in the past.
Problem Description
Currently, CanonicalizationTask convolutes two concerns in one step:
- Canonicalization - determining which transactions are canonical and why (
CanonicalReason). - Position resolution and topological ordering - mapping reasons into ChainPosition (confirmed vs unconfirmed) and in the future, ordering transactions topologically.
This makes the logic harder to follow than it needs to be. I'm thinking of splitting into two tasks:
-
CanonicalTask- the core algorithm. Resolves conflicts and outputsCanonicalTxs(transactions +CanonicalReason, no position data). -
CanonicalViewTask- thin wrapper that takesCanonicalTxsand resolves reasons intoChainPosition(and doing topological ordering in the future), producing the finalCanonicalView.Both types would be public, so callers that don't need position data can stop at step 1.
Proposed API
let canonical_txs = chain.resolve(tx_graph.canonical_task(tip, params));
let view = chain.resolve(canonical_txs.view_task());Maybe even a helper method on LocalChain:
impl LocalChain {
pub fn canonical_view(
&self,
tx_graph: &TxGraph<A>,
tip: BlockId,
canon_params: CanonicalizationParams,
view_params: CanonicalViewParams,
) -> CanonicalView<A> {
let canonical_txs = self.resolve(tx_graph.canonical_task(tip, canon_params));
self.resolve(canonical_txs.view_task(view_params))
}
}Additionally, since CanonicalTxs and CanonicalView will want to share the majority of the methods, we can do something like:
struct Canonical<A, P> { ... }
type CanonicalTxs<A> = Canonical<A, CanonicalReason<A>>;
type CanonicalView<A> = Canonical<A, ChainPosition<A>>;FullTxOut and CanonicalTx will also need that P generic as well. I would also rename FullTxOut::chain_position to pos.
Before ACK and merge (@evanlinjin's list)
Edit: I got Claude to look into the majority of these. Let me know if these commits are useful: https://github.com/evanlinjin/bdk/tree/refactor/canonical-iter-api-evanlinjin Footnotes |
1f956f1 to
3906047
Compare
replace `CanonicalIter` with sans-io `CanonicalizationTask`
Introduces new `CanonicalizationTask`, which implements the canonicalization process
through a request/response pattern, that allow us to remove the
`ChainOracle` dependency in the future.
The `CanonicalizationTask` handles direct/transitive anchor determination, also tracks
already confirmed anchors to avoid redundant queries. After all the `CanonicalizationRequest`'s
have been resolved, the `CanonicalizationTask` can be finalized returning the final `CanonicalView`.
It batches all the anchors, which require a chain query, for a given transaction into a single
`CanonicalizationRequest`, instead of having multiple requests for each one.
- Add new `CanonicalizationTask`, relying on
`Canonicalization{Request|Response}` for chain queries. It
- Replaces the old `CanonicalIter` usage, with new
`CanonicalizationTask`.
BREAKING CHANGE: It replaces direct `ChainOracle` querying in canonicalization process, with
the new request/response pattern by `CanonicalizationTask`.
The new API introduces a sans-io behavior, separating the canonicalization logic from `I/O` operations, it should be used as follows: 1. Create a new `CanonicalizationTask` with a `TxGraph`, by calling: `graph.canonicalization_task(params)` 2. Execute the canonicalization process with a chain oracle (e.g `LocalChain`, which implements `ChainOracle` trait), by calling: `chain.canonicalize(task, chain_tip)` - Replace `CanonicalView::new()` constructor with internal `CanonicalView::new()` for use by `CanonicalizationTask` - Remove `TxGraph::try_canonical_view()` and `TxGraph::canonical_view()` methods - Add `TxGraph::canonicalization_task()` method to create canonicalization tasks - Add `LocalChain::canonicalize()` method to process tasks and return `CanonicalView`'s - Update `IndexedTxGraph` to delegate canonicalization to underlying `TxGraph` BREAKING CHANGE: Remove `CanonicalView::new()` and `TxGraph::canonical_view()` methods in favor of task-based approach
- Adds `CanonicalReason`, `ObservedIn`, and `CanonicalizationParams` to `canonical_task.rs` module, instead of using the ones from `canonical_iter.rs`. - Removes the `canonical_iter.rs` file and its module declaration. BREAKING CHANGE: `CanonicalIter` and all its exports are removed
…icalizationTask` Introduce a new `ChainQuery` trait in `bdk_core` that provides an interface for query-based operations against blockchain data. This trait enables sans-IO patterns for algorithms that need to interact with blockchain oracles without directly performing I/O. The `CanonicalizationTask` now implements this trait, making it more composable and allowing the query pattern to be reused for other blockchain query operations. - Add `ChainQuery` trait with associated types for Request, Response, Context, and Result - Implement `ChainQuery` for `CanonicalizationTask` with `BlockId` as context BREAKING CHANGE: `CanonicalizationTask::finish()` now requires a `BlockId` parameter Co-Authored-By: Claude <noreply@anthropic.com>
Make `ChainRequest`/`ChainResponse` generic over block identifier types to enable reuse beyond BlockId. Move `chain_tip` into `ChainRequest` for better encapsulation and simpler API. - Make `ChainRequest` and `ChainResponse` generic types with `BlockId` as default - Add `chain_tip` field to `ChainRequest` to make it self-contained - Change `ChainQuery` trait to use generic parameter `B` for block identifier type - Remove `chain_tip` parameter from `LocalChain::canonicalize()` method - Rename `ChainQuery::Result` to `ChainQuery::Output` for clarity BREAKING CHANGE: - `ChainRequest` now has a `chain_tip` field and is generic over block identifier type - `ChainResponse` is now generic with default type parameter `BlockId` - `ChainQuery` trait now takes a generic parameter `B = BlockId` - `LocalChain::canonicalize()` no longer takes a `chain_tip` parameter Co-authored-by: Claude <noreply@anthropic.com> refactor(chain): make `LocalChain::canonicalize()` generic over `ChainQuery` Allow any type implementing `ChainQuery` trait instead of requiring `CanonicalizationTask` specifically. Signed-off-by: Leonardo Lima <oleonardolima@users.noreply.github.com>
- Unify both `unprocessed_anchored_txs` and `pending_anchored_txs` in a single `unprocessed_anchored_txs` queue. - Changes the `unprocessed_anchored_txs from `Iterator` to `VecDeque`. - Removes the `pending_anchored_txs` field and it's usage. - Collects all `anchored_txs` upfront instead of lazy iteration.
- Add new `CanonicalStage` enum for tracking the different canonicalization phases/stages. - Add new `try_advance()` method for stage progression. - Add new `is_transitive()` helper to `CanonicalReason`. - Change internal `confirmed_anchors` to `direct_anchors` for better clarity. - Update the `resolve_query()` to handle staged-based processing. Co-authored-by: Claude <noreply@anthropic.com>
3906047 to
f6b2565
Compare
Inline all stage-processing logic into `next_query()`, removing the separate `try_advance()` method, `process_*_txs()` helpers, and `is_finished()` from the `ChainQuery` trait. Add `AssumedTxs` as an explicit first stage and `CanonicalStage::advance()` for centralized stage transitions. Document the `ChainQuery` protocol contract.
…`Canonical<A, P>` Separate concerns by splitting `CanonicalizationTask` into two phases: 1. `CanonicalTask` determines which transactions are canonical and why (`CanonicalReason`), outputting `CanonicalTxs<A>`. 2. `CanonicalViewTask` resolves reasons into `ChainPosition`s (confirmed vs unconfirmed), outputting `CanonicalView<A>`. Make `Canonical<A, P>`, `CanonicalTx<P>`, and `FullTxOut<P>` generic over the position type so the same structs serve both phases. Add `LocalChain::canonical_view()` convenience method for the common two-step pipeline. Renames: `CanonicalizationTask` -> `CanonicalTask`, `CanonicalizationParams` -> `CanonicalParams`, `canonicalization_task()` -> `canonical_task()`, `FullTxOut::chain_position` -> `FullTxOut::pos`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…Query::tip()` The chain tip is constant for the lifetime of a query, so it belongs on the trait rather than being redundantly copied into every request. `ChainRequest` is now a type alias for `Vec<B>`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ChainResponse` These types only ever used `BlockId`, so the generic parameter added unnecessary complexity. All three are now hardcoded to `BlockId`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Assumed transactions bypass the `AnchoredTxs` stage and are marked
canonical immediately with `CanonicalReason::Assumed`. Previously,
`view_task()` only queued anchor checks for transitive txs, so directly
assumed txs (`Assumed { descendant: None }`) were never checked and
always resolved to `Unconfirmed` even when they had confirmed anchors.
Queue all `Assumed` txs for anchor checks in `view_task()` and look up
`direct_anchors` for both `Assumed` variants in `finish()`.
Fixes bitcoindevkit#2088
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…onical_view_task.rs` Move shared types (`CanonicalTx`, `Canonical`, `CanonicalView`, `CanonicalTxs`) and convenience methods into `canonical.rs`. Keep only the phase-2 task (`CanonicalViewTask`) in `canonical_view_task.rs`. Also rename `FullTxOut` to `CanonicalTxOut` and move it to `canonical.rs`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fixes #1816
Description
It completes a refactoring of the canonicalization API in
bdk_chain, migrating from an iterator-based approachCanonicalIterto a sans-IO/task-based patternCanonicalizationTask. This change improves the separation of concerns between canonicalization logic and I/O operations, making the code more testable and flexible.Old API:
New API:
The new flow works as follows:
CanonicalizationTaskencapsulates all canonicalization logic without performing any chain queries.CanonicalizationRequestsfor anchor verification as needed, allowing theChainOracleto batch or optimize these queries.ChainOracle(e.g., LocalChain) processes requests and returnsCanonicalizationResponse's indicating which anchors are the best in chain.CanonicalViewcontaining all canonical transactions with their chain positions.This sans-IO pattern enables:
Notes to the reviewers
The changes are splitted in multiple commits, as I think it could help reviewing it. Also it depends on #2029 PR, as it's built on top of it.
Changelog notice
Checklists
All Submissions:
New Features:
Bugfixes: