fix(sessions): ancestor-chain ownership — Cursor session close / audit / worklog were dead #153
Merged
…ssion close was dead
QA finding (extension 0.1.6 full functional pass, 2026-06-11):
axme_begin_close returned "No active AXME session found" on every Cursor
extension install, which silently killed session close, the audit
pipeline and worklog updates for the whole channel.
Root cause: hooks record ownerPpid = getClaudeCodePid() (their
grandparent above the sh wrapper). Under Claude Code that PID equals the
MCP server's PARENT — one claude process spawns both, so the strict
`ownerPpid === process.ppid` equality worked. Cursor adds a layer:
hooks hang off the cursor-server main process while the MCP server is a
child of the EXTENSION HOST:
cursor-server(A) ─┬─ sh → hook              ownerPpid = A
                  └─ exthost(B) → server    process.ppid = B ≠ A
The stale-adoption fallback never fired either — A is alive.
Fix: getOwnAncestorPids(maxDepth=4) walks the server's ancestor chain
(Linux: /proc, microseconds; macOS: ps per level; Windows: whole chain
in ONE powershell CIM call) and ownership checks now test membership in
that set. chain[0] is process.ppid, so Claude Code behavior is bit-for-
bit unchanged; Cursor matches at chain[1]. Applied to all three sites:
getOwnedSessionIdForLogging, cleanupAndExit, auditOrphansInBackground.
Stale-adoption fallback untouched.
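
The Linux leg of the ancestor walk described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `parsePpidFromStat`, `readParentPidLinux`, `ancestorChain` and `isOwnedBy` are hypothetical names, the parent reader is injected so the walk can be exercised without a real process tree, and stopping at pid 1 is an assumption of the sketch.

```typescript
import { readFileSync } from "node:fs";

// Hypothetical helper: extract the parent PID from the text of /proc/<pid>/stat.
// The format is "pid (comm) state ppid ..."; comm may itself contain spaces or
// parentheses, so we split after the LAST closing parenthesis.
function parsePpidFromStat(stat: string): number | null {
  const close = stat.lastIndexOf(")");
  if (close === -1) return null;
  const fields = stat.slice(close + 1).trim().split(/\s+/);
  const ppid = Number(fields[1]); // fields[0] is the state letter
  return Number.isInteger(ppid) && ppid > 0 ? ppid : null;
}

// Hypothetical Linux reader; returns null when /proc is unavailable.
function readParentPidLinux(pid: number): number | null {
  try {
    return parsePpidFromStat(readFileSync(`/proc/${pid}/stat`, "utf8"));
  } catch {
    return null;
  }
}

// Walk upward from the first parent, collecting at most maxDepth ancestors.
// chain[0] is always firstParent, mirroring "chain[0] is process.ppid".
function ancestorChain(
  firstParent: number,
  readParent: (pid: number) => number | null,
  maxDepth = 4,
): number[] {
  const chain: number[] = [firstParent];
  let current = firstParent;
  while (chain.length < maxDepth) {
    const next = readParent(current);
    // Assumption of this sketch: stop at init (pid 1), on read failure, or on a cycle.
    if (next === null || next <= 1 || chain.includes(next)) break;
    chain.push(next);
    current = next;
  }
  return chain;
}

// Ownership becomes set membership instead of strict equality with chain[0].
function isOwnedBy(ownerPpid: number, chain: readonly number[]): boolean {
  return chain.includes(ownerPpid);
}
```

On Linux, a call such as `ancestorChain(process.ppid, readParentPidLinux)` would build the chain for the current process. If every read fails, the result is the single-element chain, which is equivalent to the old strict check.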
Verification:
- 613/613 tests (5 new in test/session-ownership.test.ts), tsc, build.
- E2E against the built dist/server.js with a real bash interposer
reproducing the Cursor topology: mapping owned by the server's
grandparent -> begin_close returns the checklist; control with an
ALIVE unrelated owner pid -> still "No active AXME session found"
(matching is selective); dead owner pid -> stale-adoption still
fires (VS Code reload recovery preserved).
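
The three E2E outcomes above amount to a three-way decision, which might be modeled as in the sketch below. The names `Ownership`, `isPidAlive` and `classifyMapping` are hypothetical and do not come from this PR. The liveness probe uses `process.kill(pid, 0)`, a standard Node.js way to test whether a PID exists; the actual stale-adoption logic in the server may differ.

```typescript
type Ownership = "owned" | "adopt-stale" | "not-owned";

// Probe whether a PID exists. Signal 0 performs the permission and existence
// checks without delivering a signal.
function isPidAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but we may not signal it, so it is still alive.
    return (err as { code?: string }).code === "EPERM";
  }
}

// Hypothetical classifier for a session mapping recorded by a hook.
// chain is the server's ancestor chain, with chain[0] === process.ppid.
function classifyMapping(
  ownerPpid: number,
  chain: readonly number[],
  alive: (pid: number) => boolean = isPidAlive,
): Ownership {
  if (chain.includes(ownerPpid)) return "owned"; // ancestor match
  return alive(ownerPpid) ? "not-owned" : "adopt-stale"; // dead owner can be adopted
}
```

Injecting `alive` keeps the classifier testable without spawning or killing real processes.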
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The CI matrix caught a real defect in the ancestor-chain fix: the lazy
`require("node:child_process")` inside readParentPidPosix and
getOwnAncestorPidsWindows throws ReferenceError under ESM (the package
is type:module), gets swallowed by the try/catch, and silently degrades
the chain to [process.ppid] — i.e. the Cursor session-ownership fix
would have been a no-op on macOS and Windows. Linux was unaffected
(/proc path, no exec), which is why the local 613/613 run was green
while the macOS CI leg failed the new ">=2 ancestors" assertion
(chain came back as a single element).
Replaced with a top-level static import; node:child_process is
side-effect-free and cheap.
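
For illustration, a `ps`-based parent lookup with the static import in place could look roughly like this. It is a sketch under assumptions rather than the code from this PR: `parsePsPpid` and `readParentPidViaPs` are hypothetical names, and the `ps -o ppid= -p <pid>` invocation is the standard POSIX form, which may not match the real call exactly.

```typescript
// Static top-level import: resolves under ESM ("type": "module"), where a lazy
// require("node:child_process") would throw ReferenceError because `require`
// is not defined in ES modules.
import { execFileSync } from "node:child_process";

// Hypothetical helper: parse the output of `ps -o ppid= -p <pid>`,
// which is the bare parent PID surrounded by whitespace.
function parsePsPpid(output: string): number | null {
  const text = output.trim();
  if (!/^\d+$/.test(text)) return null;
  const ppid = Number(text);
  return ppid > 0 ? ppid : null;
}

// Hypothetical POSIX reader: one `ps` call per level of the chain.
function readParentPidViaPs(pid: number): number | null {
  try {
    const out = execFileSync("ps", ["-o", "ppid=", "-p", String(pid)], {
      encoding: "utf8",
    });
    return parsePsPpid(out);
  } catch {
    // A failed exec degrades to "unknown parent", but a missing `require`
    // can no longer be the hidden cause of that failure.
    return null;
  }
}
```

The try/catch still swallows exec failures by design. The difference is that a module-resolution error now surfaces at load time instead of silently shortening the chain.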
Note: the parallel `ensureAxmeSessionForClaude` E2E in audit-dedup
flaked once during the full local run — 3x green in isolation and a
full-suite rerun is 0-fail; pre-existing flake (also seen during the
v0.6.0 release prep), unrelated to this change.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Why
The full functional QA pass on extension 0.1.6 (today) found its one real bug, and it is a launch blocker for the Cursor channel:
`axme_begin_close` returns "No active AXME session found" on every Cursor extension install, which silently kills session close, the audit pipeline, and worklog updates for the whole channel. (Almost certainly also the root cause behind the earlier remote-machine report of being unable to close a session.)
Root cause
Hooks record `ownerPpid = getClaudeCodePid()`: their grandparent, one step above the `sh` wrapper. The MCP server matched with strict `ownerPpid === process.ppid`. Under Cursor there is one extra process layer between the two, so strict equality never matches. The stale-adoption fallback doesn't fire either: `cursor-server` is alive. QA captured the live process tree: hooks' owner 1382370 (cursor-server) vs server's ppid 1383835 (exthost).
Fix
`getOwnAncestorPids(maxDepth=4)` walks the server's ancestor chain and ownership checks test membership in that set:
- Linux: `/proc/<pid>/stat` walk (microseconds; reuses the parser `getClaudeCodePid` already had)
- macOS: `ps -o ppid=` per level
- Windows: whole chain in one powershell CIM call
- On failure: `[process.ppid]` = the old strict behavior
`chain[0]` is `process.ppid`, so Claude Code behavior is unchanged; Cursor matches at `chain[1]`. Applied to all three ownership sites (`getOwnedSessionIdForLogging`, `cleanupAndExit`, `auditOrphansInBackground`). The stale-adoption fallback (VS Code reload recovery) is untouched.
Verification
- 613/613 tests (5 new in test/session-ownership.test.ts: chain[0]==ppid, depth, uniqueness, Claude-Code membership invariant), tsc clean, build clean
- E2E against the built dist/server.js with a real bash interposer reproducing the Cursor topology:
  - mapping owned by the server's grandparent → `begin_close` returns the checklist ✅
  - alive unrelated owner pid → "No active AXME session found" ✅ (matching is selective, not always-true)
Risk
PID-reuse window widens from one pid to ≤4 ancestor pids. It is bounded by: same workspace storage, mapping file must exist, and the matched pid must be an ancestor of a live MCP server. Negligible vs. the channel being completely broken.
🤖 Generated with Claude Code