Fix #1910: Bridge process leak: every turn spawns new bridge.cjs, 4+ processes accumulated #1922
Open
Memtensor-AI wants to merge 1 commit into
…one (#1910)

Every Hermes turn could spawn a fresh `bridge.cjs --agent=hermes --no-viewer` subprocess without reaping the previous one, accumulating 4+ processes (RSS up to ~340 MB each) per session. Linked symptom on the Hermes side: NousResearch/hermes-agent#20939.

Three layered fixes ensure at most one live stdio bridge per agent:

1. `bridge_client.py`: module-level `_ACTIVE_CLIENTS` map keyed by `(agent, no_viewer)`. `MemosBridgeClient.__init__` synchronously closes the previous holder and registers itself; `close()` only evicts the slot if it is still the current owner. Handles Hermes re-instantiating the provider per turn.
2. `__init__.py`: `MemTensorProvider.initialize()` is now idempotent: it closes any pre-existing `self._bridge` before spawning a new one. Handles plugin reload calling `initialize()` twice on the same instance.
3. `bridge.cts`: headless `--no-viewer` bridges now use a dedicated `bridge-stdio.pid` file to reap stale predecessors at startup, separate from the existing `bridge.pid` used by the viewer daemon. Defence in depth that survives across Python processes; takes effect after `dist/` rebuild.

Adds 4 pytest cases covering same-agent reap, distinct-agent isolation, stale-close non-eviction, and provider idempotency. Full unit suite (30 cases) remains green.
✅ Automated Test Results: PASSED
All tests passed (35/35 executed, 35 skipped).
memos_local_plugin/smoke: 0/0, memos_local_plugin/contract: 35 passed, 35 skipped.
Duration: 5s
Branch: |
Description
Fixes #1910: a `bridge.cjs` process leak in the `@memtensor/memos-local-plugin` Hermes adapter, where every conversation turn spawned a new `bridge.cjs --agent=hermes --no-viewer` subprocess without reaping the previous one, accumulating 4+ processes per session (RSS up to ~340 MB each). Linked symptom on the Hermes side: NousResearch/hermes-agent#20939.

Applied three layered defenses so at most one live stdio bridge exists per agent:

(1) `bridge_client.py` adds a module-level `_ACTIVE_CLIENTS` singleton tracker keyed by `(agent, no_viewer)`. `MemosBridgeClient.__init__` synchronously closes any displaced predecessor; `close()` only evicts the slot when it is still the current owner, so late closes on stale clients never knock out their replacement.

(2) `MemTensorProvider.initialize()` is now idempotent: it closes any pre-existing `self._bridge` before respawning, eliminating the orphan-leak path when the host re-enters `initialize` on the same instance.

(3) `bridge.cts` extends the existing PID-file singleton (#1765) to cover `--no-viewer` mode via a dedicated `bridge-stdio.pid` file, separate from the viewer-port owner's `bridge.pid`, so the two paths never collide. Defense in depth across Python processes; activates after a `dist/` rebuild.

Verification: 4 new pytest cases (same-agent reap, distinct-agent isolation, stale-close non-eviction, provider idempotency); the full unit suite, `python3 -m unittest test_bridge_client test_hermes_provider_pipeline`, runs 46 tests, all of which pass. `ruff check` and `ruff format` are clean across the touched files.

Related Issue (Required): Fixes #1910
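The PID-file defense in (3) lives in `bridge.cts`; the general technique is sketched below in Python for illustration only, not as a translation of that file. The function names `read_pid` and `claim_pid_file`, the injectable `kill` parameter, and the choice of `SIGTERM` are all assumptions made for this sketch.

```python
import os
import signal
from pathlib import Path
from typing import Callable, Optional


def read_pid(pid_file: Path) -> Optional[int]:
    """Return the PID recorded in pid_file, or None if missing or malformed."""
    try:
        return int(pid_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None


def claim_pid_file(
    pid_file: Path,
    own_pid: Optional[int] = None,
    kill: Callable[[int, int], None] = os.kill,
) -> Optional[int]:
    """Signal a recorded predecessor, then record own_pid.

    Returns the PID that was signalled, or None if there was nothing to reap.
    """
    own_pid = os.getpid() if own_pid is None else own_pid
    stale = read_pid(pid_file)
    reaped = None
    if stale is not None and stale != own_pid:
        try:
            kill(stale, signal.SIGTERM)
            reaped = stale
        except (ProcessLookupError, PermissionError):
            pass  # predecessor already exited, or is not ours to signal
    # Overwrite the file so the next starter sees this process as the owner.
    pid_file.write_text(str(own_pid))
    return reaped
```

Using a separate file per mode (here it would be something like `claim_pid_file(Path("bridge-stdio.pid"))` for headless bridges) keeps a headless start from signalling the viewer daemon recorded in `bridge.pid`. A production version would likely also need to guard against PID reuse, for example by checking that the recorded PID still belongs to a bridge process before signalling it; the sketch omits that check.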
Type of change
How Has This Been Tested?
Automated tests are pending.
Checklist
@MatthewZhuang, @CarltonXiang, @syzsunshine219 please review this PR.
Reviewer Checklist