
Feat/mixed probing strategy #10

Open
dzdidi wants to merge 20 commits into fix/no-route-error from feat/mixed-probing-strategy

Conversation


@dzdidi dzdidi commented Apr 22, 2026

Based on the last Bitkit call:

  • mixed probing strategy
  • significant refactoring

@dzdidi dzdidi requested a review from Copilot April 22, 2026 10:17

Copilot AI left a comment


Pull request overview

Adds a mixed probing strategy (peer-list + random-graph) and refactors probing/event handling into dedicated modules to support the new behavior.

Changes:

  • Introduces src/probing.rs with a unified probing loop supporting both peer-list probing and random-graph probing modes.
  • Extracts LDK event handling into src/events.rs, wiring probe success/failure events into a shared ProbeTracker.
  • Extends probing configuration (random_min_amount_msat, random_nodes_per_interval) and updates examples/docs accordingly.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.

File Description
src/probing.rs New probing module implementing mixed probing strategy and probe tracking.
src/events.rs New event handler module; forwards probe events to ProbeTracker and contains prior event logic.
src/main.rs Wires in events + probing modules; updates event handler and probing loop startup.
src/config.rs Adds new probing config fields + defaults, and maps them into CLI config.
src/cli.rs Extends ProbingConfig to carry new random-probing settings.
README.md Documents random-graph probing settings and updates example config snippet.
config.example.toml Adds random probing fields to the example probing config.
prober_config.json.example Adds random probing fields to the JSON example config.


Comment thread README.md
Comment thread config.example.toml
Comment thread src/probing.rs Outdated
Comment on lines +78 to +91
fn prepare_probe(
    channel_manager: &ChannelManager, graph: &NetworkGraph, logger: &FilesystemLogger,
    scorer: &RwLock<ProbabilisticScorer<Arc<NetworkGraph>, Arc<FilesystemLogger>>>,
    pub_key_hex: &str, probe_amount: u64,
) -> Option<PaymentHash> {
    if probe_amount == 0 {
        return None;
    }
    let pub_key_bytes = hex_utils::to_vec(pub_key_hex)?;
    if let Ok(pk) = PublicKey::from_slice(&pub_key_bytes) {
        return send_probe(channel_manager, pk, graph, logger, probe_amount, scorer);
    }
    None
}

Copilot AI Apr 22, 2026


prepare_probe expects probe_config.peers entries to be raw hex pubkeys (it passes the string directly to hex_utils::to_vec). However, the examples/docs use pubkey@host:port-style strings, which will fail hex parsing and get reported upstream as NO_ROUTE. Consider either parsing out the pubkey prefix (split at @) or changing the config/docs to clearly require hex-only pubkeys, and log an explicit "invalid pubkey" warning when parsing fails.
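A minimal sketch of the extraction the reviewer suggests, assuming peers are configured as `pubkey@host:port` strings (the helper name `peer_pubkey_hex` is hypothetical, not part of the PR):

```rust
/// Extract the hex pubkey from a `pubkey@host:port` peer string.
/// Falls back to the full string when no '@' is present, so raw
/// hex-only entries keep working unchanged.
fn peer_pubkey_hex(peer: &str) -> &str {
    peer.split('@').next().unwrap_or(peer)
}

fn main() {
    assert_eq!(peer_pubkey_hex("02abc@ln.example.com:9735"), "02abc");
    assert_eq!(peer_pubkey_hex("02abc"), "02abc");
    println!("ok");
}
```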

Comment thread src/probing.rs Outdated
Comment on lines +202 to +209
tokio::spawn(async move {
    let ProbingDeps { channel_manager, network_graph, logger, scorer, tracker } = deps;
    let mut interval = tokio::time::interval(Duration::from_secs(probe_config.interval_sec));
    let our_node_id = NodeId::from_pubkey(&channel_manager.get_our_node_id());

    loop {
        interval.tick().await;


Copilot AI Apr 22, 2026


tokio::time::interval defaults to MissedTickBehavior::Burst. If probing a full peer list + random graph takes longer than interval_sec, the next ticks will fire immediately and can create back-to-back probe bursts. Consider setting interval.set_missed_tick_behavior(MissedTickBehavior::Skip/Delay) (similar to the rapid gossip sync loop) to keep probing cadence stable under load.

Comment thread src/probing.rs Outdated
Comment on lines +296 to +328
if random_probing_enabled {
    let mut random_candidates = Vec::new();
    {
        let graph_read_only = network_graph.read_only();
        for (node_id, _node_info) in graph_read_only.nodes().unordered_iter() {
            if *node_id == our_node_id {
                continue;
            }
            if let Ok(pubkey) = node_id.as_pubkey() {
                random_candidates.push(pubkey);
            }
        }
    }

    if random_candidates.is_empty() {
        lightning::log_warn!(
            &*logger,
            "Random probe skipped: no graph nodes available"
        );
        continue;
    }

    let random_recipients = {
        let mut rng = thread_rng();
        rng.shuffle(&mut random_candidates);

        let requested_count =
            probe_config.random_nodes_per_interval.try_into().unwrap_or(usize::MAX);
        let random_probe_count =
            std::cmp::min(requested_count, random_candidates.len());

        random_candidates.into_iter().take(random_probe_count).collect::<Vec<_>>()
    };

Copilot AI Apr 22, 2026


Random-graph probing currently rebuilds a full random_candidates list of all graph nodes and shuffles it every interval. On a large network graph this is O(N) work and allocates a potentially large Vec each tick even when random_nodes_per_interval is small. Consider sampling random_nodes_per_interval nodes without shuffling the full list (e.g., reservoir sampling / rand sampling helpers) or caching candidates between intervals.
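As a sketch of the sampling approach the reviewer suggests, here is Algorithm-R reservoir sampling with the RNG abstracted behind a closure so the example stays dependency-free (the real code would pass something backed by the `rand` crate); all names here are illustrative, not from the PR:

```rust
/// Algorithm-R reservoir sampling: picks `k` items uniformly from an
/// iterator of unknown length in O(n) time and O(k) memory, without
/// collecting or shuffling the full candidate list.
fn reservoir_sample<T, I, R>(candidates: I, k: usize, mut rand_below: R) -> Vec<T>
where
    I: IntoIterator<Item = T>,
    R: FnMut(usize) -> usize, // returns a value in 0..n; stand-in for an RNG
{
    let mut reservoir: Vec<T> = Vec::new();
    for (seen, candidate) in candidates.into_iter().enumerate() {
        if reservoir.len() < k {
            reservoir.push(candidate);
        } else {
            // Replace an existing slot with probability k / (seen + 1).
            let j = rand_below(seen + 1);
            if j < k {
                reservoir[j] = candidate;
            }
        }
    }
    reservoir
}

fn main() {
    // Deterministic stand-in "RNG" (a tiny LCG) just for the example.
    let mut state = 1u64;
    let sample = reservoir_sample(0..1000usize, 5, move |n| {
        state = state.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
        (state >> 33) as usize % n
    });
    assert_eq!(sample.len(), 5);
    assert!(sample.iter().all(|&x| x < 1000));
    println!("{:?}", sample);
}
```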

Comment thread src/probing.rs Outdated
Comment on lines +133 to +143
async fn await_probe_result(
    tracker: &Arc<Mutex<ProbeTracker>>, payment_hash: PaymentHash, timeout: Duration,
) -> ProbeWaitResult {
    let receiver = tracker.lock().unwrap().register_probe(payment_hash);
    match tokio::time::timeout(timeout, receiver).await {
        Ok(Ok(ProbeOutcome::Success)) => ProbeWaitResult::Success,
        Ok(Ok(ProbeOutcome::Failed)) => ProbeWaitResult::Failed,
        Ok(Err(_)) => ProbeWaitResult::Dropped,
        Err(_) => {
            tracker.lock().unwrap().remove_pending(&payment_hash);
            ProbeWaitResult::Timeout

Copilot AI Apr 22, 2026


There is a race where Event::ProbeSuccessful/ProbeFailed can be handled before await_probe_result calls register_probe for the returned payment_hash. In that case complete_* finds no sender and drops the outcome, leading to an artificial timeout. To avoid lost outcomes, consider buffering completed probe outcomes in ProbeTracker when no sender is registered yet, and having register_probe immediately resolve if an outcome already exists.

Suggested change

Before:

async fn await_probe_result(
    tracker: &Arc<Mutex<ProbeTracker>>, payment_hash: PaymentHash, timeout: Duration,
) -> ProbeWaitResult {
    let receiver = tracker.lock().unwrap().register_probe(payment_hash);
    match tokio::time::timeout(timeout, receiver).await {
        Ok(Ok(ProbeOutcome::Success)) => ProbeWaitResult::Success,
        Ok(Ok(ProbeOutcome::Failed)) => ProbeWaitResult::Failed,
        Ok(Err(_)) => ProbeWaitResult::Dropped,
        Err(_) => {
            tracker.lock().unwrap().remove_pending(&payment_hash);
            ProbeWaitResult::Timeout

After:

fn map_probe_outcome(outcome: ProbeOutcome) -> ProbeWaitResult {
    match outcome {
        ProbeOutcome::Success => ProbeWaitResult::Success,
        ProbeOutcome::Failed => ProbeWaitResult::Failed,
    }
}

async fn await_probe_result(
    tracker: &Arc<Mutex<ProbeTracker>>, payment_hash: PaymentHash, timeout: Duration,
) -> ProbeWaitResult {
    match tracker.lock().unwrap().register_probe(payment_hash) {
        ProbeRegistration::Completed(outcome) => map_probe_outcome(outcome),
        ProbeRegistration::Pending(receiver) => match tokio::time::timeout(timeout, receiver).await {
            Ok(Ok(outcome)) => map_probe_outcome(outcome),
            Ok(Err(_)) => ProbeWaitResult::Dropped,
            Err(_) => {
                tracker.lock().unwrap().remove_pending(&payment_hash);
                ProbeWaitResult::Timeout
            },

Comment thread src/main.rs Outdated
event,
)
.await;
events::handle_ldk_events(event_context.as_ref().clone(), event).await;

Copilot AI Apr 22, 2026


event_context.as_ref().clone() clones the whole EventContext (many Arcs) for every event. Consider changing handle_ldk_events to take an Arc<EventContext> (or &EventContext) so the event handler can pass Arc::clone(&event_context) and avoid repeated struct cloning.

Suggested change

Before:

events::handle_ldk_events(event_context.as_ref().clone(), event).await;

After:

events::handle_ldk_events(Arc::clone(&event_context), event).await;


Copilot AI left a comment


Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.



Comment thread src/probing.rs Outdated
Comment thread src/probing.rs Outdated
Comment on lines +72 to +76
if let Some(sender) = self.pending_probes.remove(payment_hash) {
    let _ = sender.send(outcome);
} else {
    self.completed_probes.insert(payment_hash.clone(), outcome);
}

Copilot AI Apr 22, 2026


complete_probe buffers outcomes in completed_probes whenever there is no pending sender. This also happens after timeouts (since await_probe_result calls remove_pending), so a probe that completes after the timeout will be inserted into completed_probes and never cleaned up (no subsequent register_probe occurs), causing an unbounded memory leak over time. Consider tracking timed-out hashes and dropping late completions (or pruning completed_probes by size/age).
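One way to bound the buffer, sketched with hypothetical names (the PR's actual `ProbeTracker` fields may differ): prune entries by age and cap the map size on every insert, so late completions after a timeout cannot accumulate without bound.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::time::{Duration, Instant};

/// Hypothetical bounded buffer for late probe outcomes. Entries older than
/// `max_age` are dropped on every insert, and a hard `max_len` cap evicts
/// the oldest entry first (tracked via an insertion sequence number).
struct CompletedProbes<K, V> {
    entries: HashMap<K, (V, Instant, u64)>,
    next_seq: u64,
    max_age: Duration,
    max_len: usize,
}

impl<K: Hash + Eq + Clone, V> CompletedProbes<K, V> {
    fn new(max_age: Duration, max_len: usize) -> Self {
        Self { entries: HashMap::new(), next_seq: 0, max_age, max_len }
    }

    fn insert(&mut self, key: K, value: V) {
        let now = Instant::now();
        // Prune anything past its age limit.
        self.entries.retain(|_, (_, at, _)| now.duration_since(*at) < self.max_age);
        // Evict the oldest entries until there is room for the new one.
        while self.entries.len() >= self.max_len {
            let oldest = self
                .entries
                .iter()
                .min_by_key(|(_, (_, _, seq))| *seq)
                .map(|(k, _)| k.clone());
            match oldest {
                Some(k) => { self.entries.remove(&k); }
                None => break,
            }
        }
        self.entries.insert(key, (value, now, self.next_seq));
        self.next_seq += 1;
    }

    /// Consume a buffered outcome, as `register_probe` would.
    fn take(&mut self, key: &K) -> Option<V> {
        self.entries.remove(key).map(|(v, _, _)| v)
    }
}

fn main() {
    let mut buf = CompletedProbes::new(Duration::from_secs(60), 2);
    buf.insert("a", 1);
    buf.insert("b", 2);
    buf.insert("c", 3); // cap of 2: evicts the oldest entry ("a")
    assert_eq!(buf.take(&"a"), None);
    assert_eq!(buf.take(&"b"), Some(2));
    assert_eq!(buf.take(&"c"), Some(3));
    println!("ok");
}
```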

Comment thread src/probing.rs Outdated
Comment thread README.md
Comment thread src/probing.rs Outdated
@dzdidi dzdidi force-pushed the feat/mixed-probing-strategy branch from 4d36f8f to e3f3f6b Compare April 23, 2026 12:43
@dzdidi dzdidi changed the base branch from main to tmp/file-rename April 23, 2026 12:43
@dzdidi dzdidi requested a review from Copilot April 23, 2026 12:49

Copilot AI left a comment


Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.



Comment thread src/probing.rs Outdated
Comment on lines +264 to +271
let peer_short = truncate_pubkey(peer);
'amounts: for &amount in &sorted_amounts {
    let payment_hash = prepare_probe(
        &channel_manager,
        &network_graph,
        &logger,
        &scorer,
        peer,

Copilot AI Apr 23, 2026


probe_config.peers entries in the docs/examples are in the form pubkey@host:port, but this loop passes the full string into prepare_probe, which hex-decodes it as a pubkey. That will fail for any entry containing @... and cause peer-list probing to always hit the NO_ROUTE branch.

Extract the pubkey component (e.g., peer.split('@').next()) before calling prepare_probe (and ideally before truncate_pubkey so logs show the pubkey).

Suggested change

Before:

let peer_short = truncate_pubkey(peer);
'amounts: for &amount in &sorted_amounts {
    let payment_hash = prepare_probe(
        &channel_manager,
        &network_graph,
        &logger,
        &scorer,
        peer,

After:

let peer_pubkey = peer.split('@').next().unwrap_or(peer);
let peer_short = truncate_pubkey(peer_pubkey);
'amounts: for &amount in &sorted_amounts {
    let payment_hash = prepare_probe(
        &channel_manager,
        &network_graph,
        &logger,
        &scorer,
        peer_pubkey,

Comment thread src/probing.rs Outdated
Comment on lines +154 to +156
let scorer = scorer.read().unwrap();
let inflight_scorer = ScorerAccountingForInFlightHtlcs::new(&scorer, &in_flight_htlcs);
let score_params: ProbabilisticScoringFeeParameters = Default::default();

Copilot AI Apr 23, 2026


send_probe uses ProbabilisticScoringFeeParameters::default(), but the main router is configured with a non-default base_penalty_msat: 500_000 (see src/main.rs). This makes probe routing behavior diverge from normal payment routing and can produce misleading probe results.

Consider reusing the same ProbabilisticScoringFeeParameters as the router (e.g., pass it in via ProbingDeps or centralize it as a shared constant).


Copilot AI left a comment


Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.



Comment thread src/probing.rs Outdated
Comment on lines +369 to +380
for (peer_idx, peer) in probe_config.peers.iter().enumerate() {
    let peer_short = truncate_pubkey(peer);
    'amounts: for &amount in &sorted_amounts {
        let payment_hash = prepare_probe(
            &channel_manager,
            &network_graph,
            &logger,
            &scorer,
            &scoring_fee_params,
            peer,
            amount,
        );

Copilot AI Apr 23, 2026


prepare_probe expects pub_key_hex to be a raw hex-encoded pubkey (it hex-decodes the full string). However probing.peers is documented as pubkey@host:port (see config.example.toml/README), and the loop passes the full peer string into prepare_probe, which will fail hex parsing and cause every peer-list probe to be treated as NO_ROUTE. Consider splitting on '@' (or otherwise extracting the pubkey portion) before calling prepare_probe, and use the same extracted pubkey for logging/truncation.

Comment thread src/probing.rs Outdated
Comment on lines +455 to +485
let mut random_candidates = Vec::new();
{
    let graph_read_only = network_graph.read_only();
    for (node_id, _node_info) in graph_read_only.nodes().unordered_iter() {
        if *node_id == our_node_id {
            continue;
        }
        if let Ok(pubkey) = node_id.as_pubkey() {
            random_candidates.push(pubkey);
        }
    }
}

if random_candidates.is_empty() {
    lightning::log_warn!(
        &*logger,
        "Random probe skipped: no graph nodes available"
    );
    continue;
}

let random_recipients = {
    let mut rng = thread_rng();
    rng.shuffle(&mut random_candidates);

    let requested_count =
        probe_config.random_nodes_per_interval.try_into().unwrap_or(usize::MAX);
    let random_probe_count =
        std::cmp::min(requested_count, random_candidates.len());

    random_candidates.into_iter().take(random_probe_count).collect::<Vec<_>>()

Copilot AI Apr 23, 2026


Random-graph probing currently rebuilds a Vec of all graph node pubkeys every interval and then shuffles the entire list before taking random_nodes_per_interval. On large graphs this is O(N) memory + time each interval and may become a noticeable CPU spike. Consider sampling k random nodes without collecting/shuffling the whole set (e.g., reservoir sampling while iterating nodes, or sampling indices/using a partial shuffle), and/or caching the candidate list and refreshing it less frequently.

Suggested change

Before:

let mut random_candidates = Vec::new();
{
    let graph_read_only = network_graph.read_only();
    for (node_id, _node_info) in graph_read_only.nodes().unordered_iter() {
        if *node_id == our_node_id {
            continue;
        }
        if let Ok(pubkey) = node_id.as_pubkey() {
            random_candidates.push(pubkey);
        }
    }
}
if random_candidates.is_empty() {
    lightning::log_warn!(
        &*logger,
        "Random probe skipped: no graph nodes available"
    );
    continue;
}
let random_recipients = {
    let mut rng = thread_rng();
    rng.shuffle(&mut random_candidates);
    let requested_count =
        probe_config.random_nodes_per_interval.try_into().unwrap_or(usize::MAX);
    let random_probe_count =
        std::cmp::min(requested_count, random_candidates.len());
    random_candidates.into_iter().take(random_probe_count).collect::<Vec<_>>()

After:

let requested_count =
    probe_config.random_nodes_per_interval.try_into().unwrap_or(usize::MAX);
let random_recipients = {
    let mut rng = rand::thread_rng();
    let mut seen_candidates = 0usize;
    let mut reservoir = Vec::with_capacity(requested_count);
    let graph_read_only = network_graph.read_only();
    for (node_id, _node_info) in graph_read_only.nodes().unordered_iter() {
        if *node_id == our_node_id {
            continue;
        }
        if let Ok(pubkey) = node_id.as_pubkey() {
            seen_candidates += 1;
            if reservoir.len() < requested_count {
                reservoir.push(pubkey);
            } else if requested_count != 0 {
                let replace_index =
                    rand::Rng::gen_range(&mut rng, 0..seen_candidates);
                if replace_index < requested_count {
                    reservoir[replace_index] = pubkey;
                }
            }
        }
    }
    if seen_candidates == 0 {
        lightning::log_warn!(
            &*logger,
            "Random probe skipped: no graph nodes available"
        );
        continue;
    }
    reservoir


Copilot AI left a comment


Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.



Comment thread src/events/mod.rs
Comment on lines +27 to +33
pub(crate) struct EventContext {
    pub(crate) channel_manager: Arc<ChannelManager>,
    pub(crate) bitcoind_client: Arc<BitcoindClient>,
    pub(crate) network_graph: Arc<NetworkGraph>,
    pub(crate) keys_manager: Arc<crate::KeysManager>,
    pub(crate) bump_tx_event_handler: Arc<BumpTxEventHandler>,
    pub(crate) peer_manager: Arc<PeerManager>,

Copilot AI Apr 23, 2026


EventContext.keys_manager is typed as Arc<crate::KeysManager>, but the crate does not appear to define/export a KeysManager type alias (it’s only imported in main.rs). This will fail to compile. Use Arc<lightning::sign::KeysManager> (and import KeysManager here), or re-export KeysManager from the crate root (e.g., pub(crate) use lightning::sign::KeysManager;) and keep the current path consistent across modules.

Comment thread src/probing.rs Outdated
Comment on lines +547 to +561
let requested_count =
    probe_config.random_nodes_per_interval.try_into().unwrap_or(usize::MAX);
let random_recipients = {
    let graph_read_only = network_graph.read_only();
    let mut rng = thread_rng();
    reservoir_sample(
        graph_read_only.nodes().unordered_iter().filter_map(|(node_id, _node_info)| {
            if *node_id == our_node_id {
                return None;
            }
            node_id.as_pubkey().ok()
        }),
        requested_count,
        &mut rng,
    )

Copilot AI Apr 23, 2026


random_nodes_per_interval is converted with try_into().unwrap_or(usize::MAX). If the value doesn’t fit in usize (or is simply very large), this can drive reservoir_sample to attempt allocating a Vec with an enormous capacity, risking an OOM and process crash from a config value. Use a checked conversion and either cap the value to a safe upper bound (and/or the graph node count) or treat overflow/too-large values as a configuration error.

Suggested change

Before:

let requested_count =
    probe_config.random_nodes_per_interval.try_into().unwrap_or(usize::MAX);
let random_recipients = {
    let graph_read_only = network_graph.read_only();
    let mut rng = thread_rng();
    reservoir_sample(
        graph_read_only.nodes().unordered_iter().filter_map(|(node_id, _node_info)| {
            if *node_id == our_node_id {
                return None;
            }
            node_id.as_pubkey().ok()
        }),
        requested_count,
        &mut rng,
    )

After:

let random_recipients = {
    let graph_read_only = network_graph.read_only();
    let eligible_node_count = graph_read_only
        .nodes()
        .unordered_iter()
        .filter(|(node_id, _node_info)| **node_id != our_node_id)
        .count();
    if eligible_node_count == 0 {
        Vec::new()
    } else {
        let requested_count = match usize::try_from(
            probe_config.random_nodes_per_interval,
        ) {
            Ok(count) => count.min(eligible_node_count),
            Err(_) => {
                lightning::log_warn!(
                    &*logger,
                    "random_nodes_per_interval exceeds supported range; capping to {} eligible graph nodes",
                    eligible_node_count
                );
                eligible_node_count
            }
        };
        if requested_count == 0 {
            Vec::new()
        } else {
            let mut rng = thread_rng();
            reservoir_sample(
                graph_read_only.nodes().unordered_iter().filter_map(
                    |(node_id, _node_info)| {
                        if *node_id == our_node_id {
                            return None;
                        }
                        node_id.as_pubkey().ok()
                    },
                ),
                requested_count,
                &mut rng,
            )
        }
    }


Copilot AI left a comment


Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.



Comment thread src/probing.rs Outdated
return Vec::new();
}

let mut reservoir = Vec::with_capacity(sample_size);

Copilot AI Apr 23, 2026


reservoir_sample pre-allocates Vec::with_capacity(sample_size). When sample_size comes from config (random_nodes_per_interval), a large value can trigger a huge allocation attempt even if the candidate population is small, potentially panicking/aborting the node. Avoid pre-allocating to sample_size directly (e.g., cap the capacity to a safe max or reserve incrementally based on observed population).

Suggested change

Before:

let mut reservoir = Vec::with_capacity(sample_size);

After:

let mut reservoir = Vec::new();

dzdidi added 10 commits April 24, 2026 13:22
Signed-off-by: dzdidi <dzdidi@protonmail.com>
@dzdidi dzdidi force-pushed the feat/mixed-probing-strategy branch from a3eb9d1 to 54948c8 Compare April 24, 2026 10:27
@dzdidi dzdidi changed the base branch from tmp/file-rename to fix/no-route-error April 24, 2026 10:28
dzdidi added 6 commits April 24, 2026 13:52
Signed-off-by: dzdidi <dzdidi@protonmail.com>

Copilot AI left a comment


Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.



Comment thread src/probing/runner.rs
Comment thread src/probing/runner.rs
Comment thread src/main.rs
Comment thread config.example.toml
dzdidi added 3 commits April 24, 2026 14:47
Changed
- Added typed error support with thiserror in Cargo.toml and updated lockfile (Cargo.toml, Cargo.lock).
- Replaced startup Result<_, ()> with typed errors in src/args.rs, including structured config/create-dir failures.
- Decoupled config from CLI by introducing shared runtime DTOs in src/runtime_config.rs and switching src/config.rs + src/probing/runner.rs to use them.
- Extracted payment/state domain types from entrypoint into src/state.rs, and re-exported from src/main.rs so existing module usage stays stable.
- Replaced app-owned Result<(), ()> in CLI peer/channel helpers with CliError (src/cli.rs), and updated call sites to print actionable errors.
- Hardened multiple runtime panic points in startup/shutdown + event flow:
- safer handling for key-seed read/write, chain sync/load failures, resolver/socket conversion, background processor shutdown (src/main.rs)
- safer handling for missing preimages, tx decode/deserialize failures, persistence write failures, and output sweeper tracking errors (src/events.rs)
- Added explicit fallback diagnostics when disk state cannot be opened/deserialized (src/disk.rs).
- Tightened validation rules for config invariants (src/config.rs) and added new tests.
- Fixed odd-length hex acceptance bug and added tests (src/hex_utils.rs).
- Added parse-peer tests in CLI (src/cli.rs).
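The odd-length hex acceptance fix mentioned above could look like this sketch (hypothetical helper; the actual implementation in src/hex_utils.rs may differ):

```rust
/// Strict hex decoder: rejects odd-length and non-hex input instead of
/// silently truncating the final nibble.
fn to_vec(hex: &str) -> Option<Vec<u8>> {
    if hex.len() % 2 != 0 || !hex.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    (0..hex.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&hex[i..i + 2], 16).ok())
        .collect()
}

fn main() {
    assert_eq!(to_vec("0a1b"), Some(vec![0x0a, 0x1b]));
    assert_eq!(to_vec("0a1"), None); // odd length rejected rather than truncated
    assert_eq!(to_vec("zz"), None); // non-hex rejected
    println!("ok");
}
```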

Key structural outcomes
- config no longer depends on cli (src/config.rs now depends on src/runtime_config.rs).
- Payment model moved out of main into a dedicated state module (src/state.rs).
- App-level “unit error” signatures are removed from args/CLI paths.

Signed-off-by: dzdidi <dzdidi@protonmail.com>
Signed-off-by: dzdidi <dzdidi@protonmail.com>
- Removed remaining runtime unwrap/expect from event handling paths in src/events.rs.
- Replaced poisoned-lock panics with guarded handling + user-visible errors for:
- inbound/outbound payment state
- probe tracker state
- Replaced direct prompt flush panics with non-fatal flush handling throughout src/events.rs.
- Removed get_mut(...).unwrap() in payment-failure handling and used safe conditional update.
- Updated startup chain-info fetch in src/main.rs to handle get_blockchain_info failures gracefully.
- Verified src/main.rs has no runtime unwrap/expect left (only one commented example line remains).

Signed-off-by: dzdidi <dzdidi@protonmail.com>

Copilot AI left a comment


Pull request overview

Copilot reviewed 19 out of 20 changed files in this pull request and generated 3 comments.



Comment thread src/events/mod.rs
Comment on lines +234 to +252
for (id, payment) in outbound.payments.iter_mut() {
    if *id == payment_id {
        payment.preimage = Some(payment_preimage);
        payment.status = HTLCStatus::Succeeded;
        println!(
            "\nEVENT: successfully sent payment of {} millisatoshis{} from \
            payment hash {} with preimage {}",
            payment.amt_msat,
            if let Some(fee) = fee_paid_msat {
                format!(" (fee {} msat)", fee)
            } else {
                "".to_string()
            },
            payment_hash,
            payment_preimage
        );
        print!("> ");
        let _ = std::io::stdout().flush();
    }

Copilot AI Apr 24, 2026


In Event::PaymentSent, iterating over all outbound payments to find a single payment_id is unnecessary and continues scanning even after a match is found. Using outbound.payments.get_mut(&payment_id) (and returning early/breaking after updating) would avoid O(n) work and makes it clearer that only one entry should be updated/logged.

Suggested change

Before:

for (id, payment) in outbound.payments.iter_mut() {
    if *id == payment_id {
        payment.preimage = Some(payment_preimage);
        payment.status = HTLCStatus::Succeeded;
        println!(
            "\nEVENT: successfully sent payment of {} millisatoshis{} from \
            payment hash {} with preimage {}",
            payment.amt_msat,
            if let Some(fee) = fee_paid_msat {
                format!(" (fee {} msat)", fee)
            } else {
                "".to_string()
            },
            payment_hash,
            payment_preimage
        );
        print!("> ");
        let _ = std::io::stdout().flush();
    }

After:

if let Some(payment) = outbound.payments.get_mut(&payment_id) {
    payment.preimage = Some(payment_preimage);
    payment.status = HTLCStatus::Succeeded;
    println!(
        "\nEVENT: successfully sent payment of {} millisatoshis{} from \
        payment hash {} with preimage {}",
        payment.amt_msat,
        if let Some(fee) = fee_paid_msat {
            format!(" (fee {} msat)", fee)
        } else {
            "".to_string()
        },
        payment_hash,
        payment_preimage
    );
    print!("> ");
    let _ = std::io::stdout().flush();

Comment thread src/probing/runner.rs
Comment on lines +124 to +137
let requested_count =
    probe_config.random_nodes_per_interval.try_into().unwrap_or(usize::MAX);
let random_recipients = {
    let graph_read_only = network_graph.read_only();
    let mut rng = thread_rng();
    reservoir_sample(
        graph_read_only.nodes().unordered_iter().filter_map(|(node_id, _)| {
            if *node_id == our_node_id {
                return None;
            }
            node_id.as_pubkey().ok()
        }),
        requested_count,
        &mut rng,

Copilot AI Apr 24, 2026


requested_count falls back to usize::MAX on conversion failure, which can lead to attempting extremely large allocations/iterations when sampling graph nodes (especially on 32-bit targets). Consider clamping random_nodes_per_interval to a sane upper bound (and/or the current graph node count) and returning a config validation error when it exceeds that bound instead of using usize::MAX.

Comment thread src/probing/runner.rs
Comment on lines +346 to +353
let mut reservoir = Vec::with_capacity(sample_size);
let mut seen = 0usize;

for candidate in candidates {
    seen += 1;
    if reservoir.len() < sample_size {
        reservoir.push(candidate);
        continue;

Copilot AI Apr 24, 2026


reservoir_sample pre-allocates with Vec::with_capacity(sample_size). If sample_size is user-controlled (via config) and large, this can cause very large allocations or OOM. Consider bounding the capacity (e.g., min(sample_size, max_cap)), and/or building the reservoir without pre-allocating to the full requested size when it exceeds the candidate population.

Signed-off-by: dzdidi <dzdidi@protonmail.com>

Copilot AI left a comment


Pull request overview

Copilot reviewed 36 out of 37 changed files in this pull request and generated 2 comments.



Comment thread src/events/mod.rs
Comment on lines +34 to +39
pub(crate) channel_manager: Arc<ChannelManager>,
pub(crate) bitcoind_client: Arc<BitcoindClient>,
pub(crate) network_graph: Arc<NetworkGraph>,
pub(crate) keys_manager: Arc<crate::KeysManager>,
pub(crate) bump_tx_event_handler: Arc<BumpTxEventHandler>,
pub(crate) peer_manager: Arc<PeerManager>,

Copilot AI Apr 24, 2026


EventContext stores keys_manager as Arc<crate::KeysManager>, but the crate doesn't define/re-export a KeysManager type (it's lightning::sign::KeysManager in main.rs). This looks like a compile error; change the field type/import to use lightning::sign::KeysManager (or re-export it from the crate if that’s the intent).

Comment thread src/probing/runner.rs
Comment on lines +128 to +131
let requested_count =
probe_config.random_nodes_per_interval.try_into().unwrap_or(usize::MAX);
let random_recipients = {
let graph_read_only = network_graph.read_only();

Copilot AI Apr 24, 2026


requested_count falls back to usize::MAX when random_nodes_per_interval doesn’t fit in usize. That value is then used as Vec::with_capacity(sample_size) inside reservoir_sample, which can attempt an enormous allocation and crash/abort. Clamp requested_count to a sane upper bound (e.g. graph node count) and avoid using usize::MAX as a sentinel capacity.

Suggested change

Before:

let requested_count =
    probe_config.random_nodes_per_interval.try_into().unwrap_or(usize::MAX);
let random_recipients = {
    let graph_read_only = network_graph.read_only();

After:

let random_recipients = {
    let graph_read_only = network_graph.read_only();
    let available_node_count = graph_read_only
        .nodes()
        .unordered_iter()
        .filter(|(node_id, _)| **node_id != our_node_id)
        .count();
    let requested_count = probe_config
        .random_nodes_per_interval
        .try_into()
        .unwrap_or(available_node_count)
        .min(available_node_count);

