Skip to content

feat: relay server missing features#354

Open
mskrzypkows wants to merge 7 commits intomainfrom
claude/issue-129-20260313-1533
Open

feat: relay server missing features#354
mskrzypkows wants to merge 7 commits intomainfrom
claude/issue-129-20260313-1533

Conversation

@mskrzypkows
Copy link
Copy Markdown
Contributor

@mskrzypkows mskrzypkows commented Apr 24, 2026

github-actions Bot and others added 6 commits March 13, 2026 15:51
- Start Prometheus monitoring server on configured monitoring_addr
- Handle relay swarm events: ReservationReqAccepted/Denied, ReservationTimedOut,
  CircuitReqAccepted/Denied, CircuitClosed — with structured logging
- Track connection metrics (connection_total, active_connections) via RELAY_METRICS
  on SwarmEvent::ConnectionEstablished and ConnectionClosed
- Log version, git hash, and build time on startup using pluto_core::version
- Make RelayMetrics fields public; add PeerWithPeerClusterLabels::new() constructor
- Remove #[allow(missing_docs)] from p2p.rs; add module doc comment
- Add vise-exporter to regular (non-dev) dependencies
- Update serve_addr_metrics test: remove #[ignore], test monitoring endpoint,
  check for prometheus output

Closes #129

Co-authored-by: Bohdan Ohorodnii <varex83@users.noreply.github.com>
- Dereference LazyLock for VERSION in tracing macro: %*VERSION
- Use inc_by(1)/dec_by(1) instead of inc()/dec() on vise::Gauge
- Add .. to ReservationReqDenied and CircuitReqDenied patterns for missing status field
- Merge pluto_p2p imports and reorder metrics imports alphabetically
- Reformat ListenerClosed, ReservationReqAccepted, ReservationReqDenied match arms per rustfmt

Co-authored-by: Bohdan Ohorodnii <varex83@users.noreply.github.com>
Manually merges changes from main branch commit e41e6d9:
- Add crates/p2p/src/bootnode.rs and crates/p2p/src/relay.rs
- Add crates/p2p/examples/bootnode.rs
- Update p2p peer.rs: add Default for MutablePeer, addr_infos_from_p2p_addrs
- Update p2p gater.rs: remove Arc wrapper from relay peers
- Update p2p utils.rs: add multi_addrs_via_relay helper
- Update p2p lib.rs: export bootnode and relay modules
- Update p2p Cargo.toml: add backon, reqwest, url deps
- Fix crypto share indexing: use 1-indexed keys throughout
- Remove MathError enum, replace with direct DivisionByZero variant
- Fix cluster test_cluster.rs: use 1-indexed share lookup
- Fix app obolapi: add FailedToConvertShareIndexToU8/MathOverflow errors
- Add app/src/utils.rs with archive helpers
- Add core/src/parasigdb/memory.rs
- Update Cargo.toml and Cargo.lock with flate2, tar deps
- Update relay_server example: simplify TCP addr, add docblock

Co-authored-by: Bohdan Ohorodnii <varex83@users.noreply.github.com>
Manual merge of 4e68ef9 into this branch (shallow clone prevents git merge).

- Add crates/core/src/signeddata.rs and signeddata testdata
- Add pub mod signeddata to crates/core/src/lib.rs
- Update crates/core/Cargo.toml with signeddata deps
- Update crates/crypto/Cargo.toml: remove pluto-core dep (avoids cycle)
- Update crates/crypto/src/tblsconv.rs: remove core conversions (avoids cycle)
- Add/update crates/eth2api/ spec types for signeddata
- Add/update crates/eth2util/src/types.rs
- Update Cargo.lock

Co-authored-by: Lautaro Emanuel <emlautarom1@users.noreply.github.com>
@mskrzypkows
Copy link
Copy Markdown
Contributor Author

@claude review

@claude
Copy link
Copy Markdown

claude Bot commented Apr 24, 2026

Claude finished @mskrzypkows's task in 4m 39s —— View job


PR Review

  • Read changed files and gather context
  • Review crates/core/src/parasigdb/memory.rs (new file)
  • Review crates/eth2api/src/spec/ssz_types.rs (new file)
  • Review crates/relay-server/src/p2p.rs
  • Review crates/relay-server/src/web.rs
  • Review crates/relay-server/src/metrics.rs
  • Review crates/cli/src/commands/relay.rs
  • Review crates/p2p/src/utils.rs
  • Review crates/relay-server/Cargo.toml
  • Post final review

Summary

This PR adds a Prometheus monitoring server for the relay, refactors swarm event handling, makes relay metrics accessible, adds SSZ type wrappers for the eth2 API, and adds a new parasigdb/memory.rs file. The relay-server improvements are solid overall, but there are some critical structural issues with the parasigdb module addition.


Critical Issues

crates/core/src/parasigdb/memory.rs — Orphaned file, will not compile

The new file lives in crates/core/src/parasigdb/memory.rs (parasigdb with an extra a), but the existing module is parsigdb (no a). Several problems:

  1. Not declared as a module anywhere. There is no mod.rs in the parasigdb/ directory, no mod parasigdb in any lib.rs, and no pub mod memory anywhere that references this file. The file is unreachable dead code.

  2. References a non-existent submodule. Line 8 imports:

    use crate::parasigdb::metrics::PARASIG_DB_METRICS;

    But crates/core/src/parasigdb/ only contains memory.rs — there is no metrics.rs. The existing metrics.rs lives in parsigdb/ (no a) and exports PARSIG_DB_METRICS (not PARASIG_DB_METRICS).

  3. References a non-existent test file. At the bottom:

    #[path = "memory_internal_test.rs"]
    mod memory_internal_test;

    memory_internal_test.rs exists in crates/core/src/parsigdb/, not in parasigdb/.

  4. MemDBError drops SignedDataError. The existing parsigdb/memory.rs has SignedDataError(#[from] SignedDataError) in MemDBError and calls sig.signed_data.message_root().map_err(...). The new file calls sig.signed_data.message_root() without handling a potential error — if message_root() can return an Err, this will fail to compile or silently drop errors depending on the API.

This file appears to be an in-progress replacement that was committed prematurely. It needs either a proper parasigdb/mod.rs + metrics.rs + integration into core/lib.rs, or to be removed until ready. Fix this →


Moderate Issues

crates/relay-server/src/web.rs:251-252 — Magic dummy port numbers in ENR handler

(Some((ip, tcp_p)), None) => (ip, tcp_p, 9999), // Dummy UDP port
(None, Some((ip, udp_p))) => (ip, 9999, udp_p), // Dummy TCP port

Port 9999 is used as a dummy when only one transport is available, with no explanation and no validation. The ENR will be published with a port that doesn't actually accept connections. Consider returning an error instead (consistent with the (None, None) case below), or documenting why a dummy value is acceptable here. Fix this →

crates/core/src/parasigdb/memory.rs:291unwrap() after is_none() check (less idiomatic)

if deadliner_rx.is_none() {
    warn!("Deadliner channel is not available");
    return;
}
let mut deadliner_rx = deadliner_rx.unwrap();

The existing parsigdb/memory.rs correctly uses let Some(mut deadliner_rx) = self.deadliner.c() else { ... }. The new file reverts to a less idiomatic pattern that would trigger a Clippy warning. Fix this →

crates/relay-server/src/p2p.rsnetwork_sent_bytes_total / network_received_bytes_total never populated

The RelayMetrics struct now has network_sent_bytes_total and network_received_bytes_total fields (public), but handle_swarm_event only updates connection_total and active_connections. The byte-count metrics are never incremented and will always read zero. Either wire them up from SwarmEvent::OutboundProbeOk/similar events, or remove them from the struct until they are implemented. Fix this →


Minor Issues

crates/core/src/parasigdb/memory.rs:343-345keys_by_duty accumulates duplicate keys

inner
    .keys_by_duty
    .entry(k.duty.clone())
    .or_insert_with(Vec::new)
    .push(k.clone());   // always pushes, even if k is already present

Every new signature for a given (duty, pubkey) pushes the same key into the keys_by_duty Vec again. With N validators signing the same duty+pubkey, the key appears N times, so trim() calls entries.remove(&key) N times (all but the first are no-ops). This wastes memory and is shared with the existing parsigdb/memory.rs. A HashSet<Key> or a pre-existence check would be cleaner.

crates/relay-server/src/p2p.rs:185ListenerClosed has no log entry

NewListenAddr and ExpiredListenAddr both log at debug!, but ListenerClosed produces no log. This makes it hard to diagnose listener lifecycle in production. Fix this →

crates/eth2api/src/spec/ssz_types.rs — Missing From<SszVector<T, SIZE>> for Vec<T>

SszList has both From<Vec<T>> and From<SszList<T,MAX>> for Vec<T>, but SszVector only has From<Vec<T>> with no inverse conversion. For consistency, add:

impl<T, const SIZE: usize> From<SszVector<T, SIZE>> for Vec<T> {
    fn from(value: SszVector<T, SIZE>) -> Self { value.0 }
}

crates/cli/src/commands/relay.rs:402-415serve_addr_enr_ext_host test may be timing-dependent

The test asserts enr.ip().unwrap().is_loopback(), relying on the external host resolver not having run yet when the request is made. Since www.google.com resolves quickly and the resolver interval could tick before the test request, this could be flaky in slow CI or if the resolver's first tick runs immediately. Consider using a hostname guaranteed not to resolve (e.g. invalid.invalid) or controlling the resolver interval in tests.


Positive Aspects

  • Clean AddrUpdate enum in p2p.rs decouples address management from the event loop — good pattern.
  • Monitoring server is properly integrated with graceful shutdown and matches the ENR server pattern.
  • Version/git-hash logging on startup is a valuable operational improvement.
  • PeerWithPeerClusterLabels::new() constructor is a clean addition.
  • vise-exporter promoted from dev-deps to deps — correct since it's now used in production code.
  • SSZ type wrappers (SszList, SszVector, BitList, BitVector) are well-implemented with correct TreeHash semantics and good test coverage.
  • Good integration tests in relay.rs covering the relay lifecycle, ENR, multiaddrs, external IP/host, and metrics endpoints.

@mskrzypkows mskrzypkows changed the title Claude/issue 129 20260313 1533 feat: relay server missing features Apr 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant