add session resume supervisor for faster codex rotation#139
add session resume supervisor for faster codex rotation#139
Conversation
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 Walkthroughwalkthroughadds an interactive session supervisor for the codex cli, a backend toggle and plugin flag, abort-signal support for quota probing, storage transaction path threading and tie-break fixes, routing helpers, and many tests plus integration into the codex wrapper. (≤50 words) changes
sequence diagram(s)sequenceDiagram
participant cli as cli invocation
participant sup as session supervisor
participant lock as storage lock
participant mgr as account manager
participant probe as quota probe
participant child as codex child process
participant sess as session store
participant monitor as monitor loop
cli->>sup: runCodexSupervisorIfEnabled(rawArgs)
alt supervisor enabled & interactive
sup->>lock: acquire supervisor lock
lock-->>sup: lock acquired
sup->>mgr: select launchable account
mgr->>probe: fetchCodexQuotaSnapshot(signal)
probe-->>mgr: quota snapshot
mgr-->>sup: chosen account
sup->>child: spawn codex with resume args
child-->>sess: append session .jsonl
sup->>sess: scan session metadata / bind session
loop monitor
monitor->>probe: probe current account quota
probe-->>monitor: quota status
alt quota near exhaustion
monitor->>child: request restart (SIGINT→SIGTERM→SIGKILL)
child-->>monitor: exit
monitor->>mgr: prepare/select next account (prewarm)
monitor->>child: spawn resumed child
end
end
child-->>sup: exit code
sup->>lock: release lock
lock-->>sup: released
sup-->>cli: return exit code
else supervisor disabled or non-interactive
sup-->>cli: null (fallback to direct forward)
end
estimated code review effort🎯 5 (critical) | ⏱️ ~120 minutes flagged issues (explicit):
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
✨ Simplify code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
lib/storage.ts (3)
2324-2337:⚠️ Potential issue | 🟡 Minorstorage path comparison correctly prevents stale snapshot reuse after rotation, but missing regression test for path-mismatch scenario.
the dual check at lib/storage.ts:2331-2333 handles the supervisor rotation scenario correctly:
transactionState?.active— transaction still in progresstransactionState.storagePath === activeStoragePath— storage path hasn't changed since transaction startif
setStoragePath()is called during a supervisor-driven account rotation, subsequentexportAccountscalls correctly fall back to a fresh read instead of returning data from the old path.however: no regression test exists for this scenario. test/storage.test.ts has basic exportAccounts tests (file exists, no accounts) but none that verify correct behavior when storage path changes mid-transaction. add a test that:
- starts a transaction via
withAccountStorageTransaction- calls
setStoragePath()to a different path- calls
exportAccountsfrom within the handler or after- verify it reads from the new path, not the stale snapshot
the deadlock concern about
withAccountStorageTransaction((current) => Promise.resolve(current))fallback is valid only if exportAccounts is called from within an existing transaction handler. the current codebase doesn't do this, but lib/storage.ts lacks re-entrancy guards onwithStorageLock(lines 174-181), which could allow this pattern to deadlock if introduced. this risk should be documented or guarded explicitly.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/storage.ts` around lines 2324 - 2337, Add a regression test that verifies exportAccounts does not return a stale snapshot when the storage path changes mid-transaction: write a test that calls withAccountStorageTransaction (to create an active transaction via withAccountStorageTransaction), then calls setStoragePath(...) to a different path, and then calls exportAccounts (either inside the transaction handler or immediately after) and asserts the returned data is read from the new path; ensure the test exercises the code path that uses transactionSnapshotContext, getStoragePath, and the fallback to withAccountStorageTransaction so the conditional transactionState?.active && transactionState.storagePath === activeStoragePath is validated. Also add a short note or additional test to cover potential deadlock re-entrancy by exercising exportAccounts while inside a withStorageLock-held handler (using withStorageLock) to document/guard the risk.
2331-2337:⚠️ Potential issue | 🟡 Minorpotential deadlock if storagePath changes during active transaction before exportAccounts is called.
exportAccounts safely avoids deadlock when called from inside
withAccountStorageTransactionvia a storagePath check: iftransactionState.active && transactionState.storagePath === activeStoragePath, it reuses the snapshot without re-acquiring the mutex. however, if supervisor rotation triggerssetStoragePath()to a new path while inside a transaction callback, and thenexportAccountsis called from that handler (as demonstrated in test/named-backup-export.test.ts:347-362), the fallback towithAccountStorageTransaction()would deadlock trying to re-acquirestorageMutex.current usage is safe because
setStoragePath()is only called at command start, never mid-execution. but the code is fragile: if future auth rotation logic changesstoragePathduring an active transaction, callers would hang.either:
- document explicitly that
setStoragePath()must never be called during activewithAccountStorageTransactioncallbacks- add a
withoutLockvariant for internal reads when already holding the mutex (lib/storage.ts)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/storage.ts` around lines 2331 - 2337, The current exportAccounts path can deadlock if setStoragePath changes during an active transaction because it falls back to withAccountStorageTransaction and re-acquires storageMutex; add an internal "withoutLock" read path to avoid re-locking when already in a transaction: implement a new helper (e.g., readAccountStorageWithoutLock or withAccountStorageSnapshotNoLock) that returns the current transactionState.snapshot or the current storage snapshot when not in a transaction without acquiring storageMutex, update exportAccounts to use that helper when transactionState.active is true (or when storagePath equality check indicates the caller already holds the snapshot), and mark the helper internal/private and document that it must not acquire the mutex so callers inside withAccountStorageTransaction use it instead of withAccountStorageTransaction to avoid re-entrancy deadlocks; keep withAccountStorageTransaction unchanged for external callers.
1909-1935:⚠️ Potential issue | 🟠 Majormissing regression test for try/finally cleanup fix.
the try/finally at
lib/storage.ts:1927-1933correctly ensuresstate.active = falseeven if handler throws. this fixes the stale snapshot bug inexportAccountsatlib/storage.ts:2331which reuses snapshot whentransactionState?.activeis true.however, per coding guidelines for
lib/**, this change must cite affected regression tests (vitest). no test exists covering: handler throws →state.activestill cleared → subsequentexportAccountsdoes fresh read. add this totest/storage.test.ts.also check
withAccountAndFlaggedStorageTransactionatlib/storage.ts:1989— it has the same pattern and likely needs the same test coverage.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/storage.ts` around lines 1909 - 1935, Add a vitest regression test in test/storage.test.ts that verifies when withAccountStorageTransaction's handler throws, the finally block clears state.active so subsequent calls to exportAccounts perform a fresh load (i.e., do not reuse a stale snapshot); specifically, simulate a handler that throws inside withAccountStorageTransaction and then call exportAccounts to confirm it reads updated data, and add the same test scenario for withAccountAndFlaggedStorageTransaction to ensure both transaction implementations (transactionSnapshotContext, state.active) are correctly cleared after exceptions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/config.ts`:
- Line 159: The scheduler's default (DEFAULT_REMAINING_PERCENT_THRESHOLD) is out
of sync with the config value (preemptiveQuotaRemainingPercent5h) so update
lib/preemptive-quota-scheduler.ts to read the configured value via
getPreemptiveQuotaRemainingPercent5h() (or call the config accessor you have)
instead of hardcoding DEFAULT_REMAINING_PERCENT_THRESHOLD, and initialize
primaryRemainingPercentThreshold and secondaryRemainingPercentThreshold from
that accessor; alternatively, if you prefer constants, change
DEFAULT_REMAINING_PERCENT_THRESHOLD to 10 to match lib/config.ts:159 and
add/adjust tests for preemptive quota rotation to cover both code paths that set
thresholds so regression is caught.
In `@scripts/codex-supervisor.js`:
- Around line 413-419: The deduplication key currently includes
item.refreshToken which is unstable; change the key construction used with
seen.has to exclude refreshToken and use only stable identifiers (item.index,
item.accountId, item.email) so that accounts with refreshed tokens aren't
treated as distinct; update the code that builds the key variable and any
related checks (seen.has, seen.add) to reference the new key composed from those
stable fields only.
In `@scripts/codex.js`:
- Around line 529-540: The supervisor path intentionally calls
runCodexSupervisorIfEnabled followed by autoSyncManagerActiveSelectionIfEnabled
so auto-sync runs unconditionally even when the supervisor returns early; add a
regression test that enables the supervisor and exercises the early-return case
(where runCodexSupervisorIfEnabled returns a non-null supervisedExitCode) and
assert that autoSyncManagerActiveSelectionIfEnabled (and underlying
codex-manager autoSyncActiveAccountToCodex / queuedRefresh behavior) was invoked
before the function returns, and that forwardToRealCodex is not called in that
path; use mocks/stubs for runCodexSupervisorIfEnabled,
autoSyncManagerActiveSelectionIfEnabled, and forwardToRealCodex to verify the
call ordering and side effects.
In `@test/codex-bin-wrapper.test.ts`:
- Around line 596-610: Add a regression test that covers the interactive command
path: create a fixture (use createWrapperFixture and
writeSupervisorRuntimeFixture), create the fake codex binary with
createFakeCodexBin, then call runWrapper with no args (simulating an interactive
`codex` invocation) and CODEX_AUTH_CLI_SESSION_SUPERVISOR=1; assert that the
supervisor behavior differs from the non-interactive path (per
scripts/codex-supervisor.js) — e.g., verify the wrapper invokes the
resume/interactive supervisor wrapper instead of forwarding with the file auth
store injection used in the existing test (check stdout/stderr or exit status
accordingly).
In `@test/codex-supervisor.test.ts`:
- Around line 1-6: Add a regression test that exercises concurrent lock
acquisition for withSupervisorStorageLock: create a temporary storage dir,
import the test API (__testOnly as supervisorTestApi) and spawn two async
callers that race to call supervisorTestApi.withSupervisorStorageLock; assert
only one caller holds the lock at a time (e.g., the second waits until the first
releases) and that a stale lock file is detected/cleaned by simulating an old
lock file (set mtime/contents) before calling; ensure the test cleans up temp
files and uses deterministic timing/Promises to avoid flakiness.
- Around line 91-111: The test currently uses platform "win32" which skips the
SIGINT step; update or add a test that calls
supervisorTestApi?.requestChildRestart with a non-win32 platform (e.g., "linux")
and the same AbortController flow, then assert the child.kill mock was called
with "SIGINT" as well as "SIGTERM" and "SIGKILL" in the expected sequence;
reference the FakeChild class, requestChildRestart call, controller.abort(), and
the child.kill expectations so the test covers the non-win32 SIGINT → SIGTERM →
SIGKILL sequence.
In `@test/plugin-config.test.ts`:
- Around line 991-1001: Add a test case to verify the config-only override for
getCodexCliSessionSupervisor: ensure CODEX_AUTH_CLI_SESSION_SUPERVISOR is unset,
call getCodexCliSessionSupervisor({ codexCliSessionSupervisor: true }) and
expect true, and then clean up by ensuring the env var remains unset; this
mirrors the pattern used for getFastSession and complements the existing default
and env-override tests.
In `@test/quota-probe.test.ts`:
- Around line 173-205: Add a test that verifies fetchCodexQuotaSnapshot
immediately rejects when passed a signal that's already aborted: create an
AbortController, call controller.abort() before invoking
fetchCodexQuotaSnapshot, stub global fetch (vi.stubGlobal("fetch", ...)) to a
mock, then assert the promise rejects with an abort error and that fetch was
never called; this covers the early-return branch in fetchCodexQuotaSnapshot
(the pre-aborted signal check in lib/quota-probe.ts).
---
Outside diff comments:
In `@lib/storage.ts`:
- Around line 2324-2337: Add a regression test that verifies exportAccounts does
not return a stale snapshot when the storage path changes mid-transaction: write
a test that calls withAccountStorageTransaction (to create an active transaction
via withAccountStorageTransaction), then calls setStoragePath(...) to a
different path, and then calls exportAccounts (either inside the transaction
handler or immediately after) and asserts the returned data is read from the new
path; ensure the test exercises the code path that uses
transactionSnapshotContext, getStoragePath, and the fallback to
withAccountStorageTransaction so the conditional transactionState?.active &&
transactionState.storagePath === activeStoragePath is validated. Also add a
short note or additional test to cover potential deadlock re-entrancy by
exercising exportAccounts while inside a withStorageLock-held handler (using
withStorageLock) to document/guard the risk.
- Around line 2331-2337: The current exportAccounts path can deadlock if
setStoragePath changes during an active transaction because it falls back to
withAccountStorageTransaction and re-acquires storageMutex; add an internal
"withoutLock" read path to avoid re-locking when already in a transaction:
implement a new helper (e.g., readAccountStorageWithoutLock or
withAccountStorageSnapshotNoLock) that returns the current
transactionState.snapshot or the current storage snapshot when not in a
transaction without acquiring storageMutex, update exportAccounts to use that
helper when transactionState.active is true (or when storagePath equality check
indicates the caller already holds the snapshot), and mark the helper
internal/private and document that it must not acquire the mutex so callers
inside withAccountStorageTransaction use it instead of
withAccountStorageTransaction to avoid re-entrancy deadlocks; keep
withAccountStorageTransaction unchanged for external callers.
- Around line 1909-1935: Add a vitest regression test in test/storage.test.ts
that verifies when withAccountStorageTransaction's handler throws, the finally
block clears state.active so subsequent calls to exportAccounts perform a fresh
load (i.e., do not reuse a stale snapshot); specifically, simulate a handler
that throws inside withAccountStorageTransaction and then call exportAccounts to
confirm it reads updated data, and add the same test scenario for
withAccountAndFlaggedStorageTransaction to ensure both transaction
implementations (transactionSnapshotContext, state.active) are correctly cleared
after exceptions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 888f1f7f-29bf-4a9d-8cd2-035939b924a3
📒 Files selected for processing (13)
lib/codex-manager/settings-hub.tslib/config.tslib/quota-probe.tslib/schemas.tslib/storage.tslib/ui/copy.tsscripts/codex-supervisor.jsscripts/codex.jstest/codex-bin-wrapper.test.tstest/codex-supervisor.test.tstest/plugin-config.test.tstest/quota-probe.test.tstest/settings-hub-utils.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (2)
lib/**
⚙️ CodeRabbit configuration file
focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.
Files:
lib/ui/copy.tslib/schemas.tslib/codex-manager/settings-hub.tslib/storage.tslib/config.tslib/quota-probe.ts
test/**
⚙️ CodeRabbit configuration file
tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.
Files:
test/settings-hub-utils.test.tstest/codex-bin-wrapper.test.tstest/codex-supervisor.test.tstest/quota-probe.test.tstest/plugin-config.test.ts
🔇 Additional comments (37)
lib/storage.ts (2)
168-172: type extension for storagePath tracking looks correct.the added
storagePath: stringfield enables downstream checks to detect when the storage path has changed since the transaction started. this prevents stale snapshot reuse during supervisor-driven account rotation wheresetStoragePath()may be called.
1937-1993: consistent try/finally pattern applied here as well.the same state cleanup fix is correctly applied to the dual-storage transaction wrapper. the rollback logic inside persist() can throw
AggregateErrorat line 1969, and the try/finally ensuresstate.activeis still cleared in that case.lib/quota-probe.ts (4)
308-315: abort signal support looks clean.the interface extension and helper are straightforward.
createAbortErrorfollows the standard pattern for abort errors withname = "AbortError".
341-343: early abort check before each model probe is correct.checking
options.signal?.abortedbefore starting a probe iteration prevents unnecessary network calls when cancellation is requested between retries.
368-387: abort signal wiring to per-request controller is well-structured.the sequence handles the race correctly: check if already aborted before adding listener, then cleanup in finally block. the
{ once: true }onaddEventListenerprevents listener leaks.
426-435: abort error propagation preserves original error context.rethrowing the original error when
signal.abortedis true (line 427) keeps the stack trace intact. the final check after exhausting probes (lines 433-435) ensures abort is surfaced even if all models failed for other reasons first.lib/ui/copy.ts (1)
75-75: ui label for session supervisor toggle looks good.matches the setting key
codexCliSessionSupervisorand is consistent with other experimental setting labels in this file.lib/schemas.ts (1)
47-47: schema field for supervisor toggle is correctly defined.optional boolean matches the pattern of other feature flags in the schema.
lib/codex-manager/settings-hub.ts (5)
187-199: toggle key addition to union type is correct.
codexCliSessionSupervisoris added alongside other backend toggle keys. note that it's intentionally not added to anyBACKEND_CATEGORY_OPTIONSsince it's experimental-only.
275-284: experimental settings action type extended correctly.
toggle-session-supervisorfollows the naming pattern of other experimental actions liketoggle-refresh-guardian.
292-296: backend toggle option definition is complete.includes key, label, and description matching the pattern of other toggle options.
2540-2544: menu item wiring uses correct fallback chain.
draft.codexCliSessionSupervisor ?? BACKEND_DEFAULTS.codexCliSessionSupervisor ?? falseresolves correctly sinceBACKEND_DEFAULTS.codexCliSessionSupervisoris explicitlyfalseperlib/config.ts:148.
2601-2611: toggle handler inverts value correctly.the negation logic
!(draft.codexCliSessionSupervisor ?? BACKEND_DEFAULTS.codexCliSessionSupervisor ?? false)matches the display logic, ensuring consistent behavior.test/settings-hub-utils.test.ts (3)
137-154: async cleanup with retry is good for windows filesystem quirks.the retry logic handles
ENOTEMPTY,EPERM,EBUSYwith linear backoff. this addresses windows file locking issues during test cleanup.
205-209: separate async afterEach for directory cleanup is correct.vitest runs multiple afterEach hooks in registration order. separating async cleanup from synchronous env restoration is cleaner.
712-723: test for session supervisor toggle is deterministic and complete.queues toggle action followed by save, then asserts the config value flipped from
falsetotrue. follows the pattern of other experimental settings tests in this file.test/codex-bin-wrapper.test.ts (3)
60-63: fixture now includes supervisor script.copying
codex-supervisor.jsto the fixture ensures the wrapper can import it during tests.
67-110: supervisor runtime fixture stubs are minimal and correct.the stubs provide just enough functionality for the supervisor to run:
config.jsreturns supervisor enabled with quota settingsaccounts.jsprovides a mockAccountManagerwith required methodsquota-probe.jsreturns null (no quota data needed for this test path)note:
lib/codex-bin-wrapper.test.ts:93returns a mock account withemail: "healthy@example.com"which is acceptable for test fixtures.
410-413: windows decoy test updated with supervisor script.mirrors the change in
createWrapperFixture()to keep fixtures consistent.test/plugin-config.test.ts (2)
141-144: lgtm - default config assertion updated correctly.the new default values
preemptiveQuotaRemainingPercent5h: 10andcodexCliSessionSupervisor: falseare consistently applied across all config assertion blocks in this file.
974-974: lgtm - default threshold change from 5 to 10.the assertion at
test/plugin-config.test.ts:974correctly reflects the new default of10forpreemptiveQuotaRemainingPercent5h, matching the change inlib/config.ts:159.lib/config.ts (2)
861-878: lgtm - new accessor follows established pattern.the
getCodexCliSessionSupervisoraccessor correctly usesresolveBooleanSettingwith env varCODEX_AUTH_CLI_SESSION_SUPERVISOR, config keycodexCliSessionSupervisor, and defaultfalse. the jsdoc accurately describes the behavior.
148-148: lgtm - default config entry added.
codexCliSessionSupervisor: falseis correctly added toDEFAULT_PLUGIN_CONFIG, matching the accessor's default and the test assertions.test/codex-supervisor.test.ts (3)
10-27: lgtm - retry-based cleanup handles windows edge cases.the
removeDirectoryWithRetryhelper correctly handlesENOTEMPTY,EPERM, andEBUSYerrors with exponential backoff, which addresses windows filesystem locking behavior.
43-67: lgtm - boundary test validates scan limit correctly.the test creates exactly 199 preamble lines plus the session_meta line, placing it at position 200 (the limit). this correctly validates the
SESSION_META_SCAN_LINE_LIMIT = 200boundary inscripts/codex-supervisor.js:20.
69-89: lgtm - over-limit test validates scan cutoff.placing session_meta at line 201 (after 200 preamble lines) correctly tests that the scanner stops at the limit and returns
null.scripts/codex-supervisor.js (11)
8-30: lgtm - timing constants are reasonable.the reduced timings (poll=750ms, idle=400ms, probe=2-4s) support faster rotation detection. the platform-specific
DEFAULT_SIGNAL_TIMEOUT_MS(125ms on win32 vs 500ms) correctly accounts for windows signal handling differences.
33-53: lgtm - sleep correctly handles abort signals.the sleep implementation properly cleans up the abort listener and returns
falseon abort,trueon completion. the guard against double-settle is correct.
378-381: potential TOCTOU race in stale lock cleanup.at
scripts/codex-supervisor.js:378-380, between checkingisSupervisorLockStaleand callingsafeUnlink, another process could acquire the lock. this is a benign race (the new lock holder will simply retry), but worth noting. the current implementation is acceptable given the TTL fallback.
498-539: lgtm - quota evaluation logic is correct.the function correctly prioritizes 429 rate-limit detection, then checks preemptive quota thresholds when enabled. the remaining percentage calculation
100 - usedPercentis correct.
773-815: lgtm - session meta extraction is well-bounded.the scanner correctly limits to 200 non-empty lines, handles malformed JSON, and cleans up readline/stream resources in the finally block.
890-914: child.kill return value not checked.
child.kill("SIGTERM")returnsfalseif the process doesn't exist, but this is not checked. while the subsequentexitCode !== nullcheck handles this case, the code could be clearer. this is minor since the behavior is correct.
1010-1011: manager reassignment may lose state.at
scripts/codex-supervisor.js:1010, ifcurrentState.managerisnull(which shouldn't happen givenwithLockedManageralways returns the manager), the outermanageris retained. however, ifwithLockedManagerreturns a newer manager with updated state, and then fails on a subsequent call returning the stale outer manager, state could be inconsistent. the current implementation looks safe sincewithLockedManagerreloads from disk each time.
184-186: lgtm - no token or pii leakage in logs.
relaunchNoticeonly logs session IDs, wait times, and rotation reasons. no access tokens, refresh tokens, or email addresses are included in log output.
750-752: lgtm - symlink handling prevents traversal issues.skipping symbolic links at
scripts/codex-supervisor.js:750-752prevents potential infinite loops or symlink-based path traversal attacks.
765-771: lgtm - cwd normalization handles windows case-insensitivity.
normalizeCwdcorrectly lowercases paths on win32 and normalizes path separators, enabling reliable cwd matching across platforms.
1183-1245: lgtm - entry point has proper signal cleanup.the
runCodexSupervisorIfEnabledfunction correctly registers and unregisters SIGINT/SIGTERM handlers, returnsnullwhen supervisor is disabled (allowing fallback to non-supervised path), and handles abort errors gracefully.
There was a problem hiding this comment.
Actionable comments posted: 10
♻️ Duplicate comments (1)
test/codex-supervisor.test.ts (1)
334-374:⚠️ Potential issue | 🟠 Majoradd the actual two-caller lock race here.
this only covers stale-lock cleanup. it still never runs two
withLockedManagercallers against the same storage path, so a regression in the mutual-exclusion path inscripts/codex-supervisor.js:322-395would still pass. add a deterministic race where the first holder blocks on a promise and the second caller can only enter after release.as per coding guidelines,
test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/codex-supervisor.test.ts` around lines 334 - 374, Add a deterministic two-caller race to this test: create a deferred Promise (resolve/reject handlers) and use supervisorTestApi.withLockedManager with two concurrent callers against the same runtime/storage path—have the first caller (using FakeManager and createFakeRuntime) acquire the lock and await the deferred Promise to simulate a held lock, then start the second caller which should block until you resolve the deferred Promise; after resolving, assert the second caller runs only after the first releases (e.g., order of results or timestamps) and keep the existing stale-lock/EPERM unlink behavior; reference withLockedManager, supervisorTestApi, FakeManager, createFakeRuntime and the unlinkSpy mock to place the new concurrent-caller logic in this test while keeping it deterministic via the deferred Promise and vitest async assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/preemptive-quota-scheduler.ts`:
- Line 27: Split the single DEFAULT_REMAINING_PERCENT_THRESHOLD into two
distinct constants (e.g., DEFAULT_REMAINING_PERCENT_THRESHOLD_PRIMARY and
DEFAULT_REMAINING_PERCENT_THRESHOLD_SECONDARY) and update the
PreemptiveQuotaScheduler constructor so it seeds the primary and secondary
windows with their respective defaults instead of the single shared value; also
ensure the config default named remainingPercentThresholdSecondary remains 5 and
adjust code that falls back to use the new secondary constant. Add a Vitest
regression adjacent to the existing preemptive-quota-scheduler tests (near
test/preemptive-quota-scheduler.test.ts) that verifies omitting
remainingPercentThresholdSecondary keeps the 7d fallback at 5 and does not
change behavior when PRIMARY is 10, and mention the changed symbols
DEFAULT_REMAINING_PERCENT_THRESHOLD_PRIMARY,
DEFAULT_REMAINING_PERCENT_THRESHOLD_SECONDARY, PreemptiveQuotaScheduler
constructor, and remainingPercentThresholdSecondary in your commit message.
In `@lib/storage.ts`:
- Around line 1916-1933: The transaction binds to a mutable global storage path,
but loadAccountsInternal and saveAccountsUnlocked inside the transaction still
call helpers that read getStoragePath() dynamically, allowing cross-path writes;
fix by capturing storagePath and passing it into loadAccountsInternal and into
the persist closure so both load and persist use the captured path (or
alternatively make persist immediately throw if getStoragePath() !==
state.storagePath), i.e., update the call sites around
transactionSnapshotContext.run (including the loadAccountsInternal invocation
and the persist closure that calls saveAccountsUnlocked) to accept/forward the
captured storagePath and ensure exportAccounts() and the mismatch check see
consistent paths; update/annotate tests in test/storage.test.ts (959-1097) to
cover the cross-path write case per lib/** guidelines.
In `@scripts/benchmark-session-supervisor.mjs`:
- Around line 202-267: The benchmark loop reuses probe snapshots so later
iterations see cache hits and report artificially low durations; fix by
disabling or clearing the snapshot cache before each iteration — e.g. set
process.env.CODEX_AUTH_CLI_SESSION_SNAPSHOT_CACHE_TTL_MS = "0" (or call the
test-only cache-clear helper on the imported api if available) at the top of the
for loop that uses buildRuntime / api (the loop that measures serialDurations,
overlapDurations, prewarmedDurations), ensuring the environment var or
cache-clear happens before each buildRuntime/api call so every iteration
measures a fresh probe.
In `@scripts/codex-supervisor.js`:
- Around line 1334-1358: The escalation from SIGTERM to SIGKILL in
requestChildRestart is redundant on Windows because SIGTERM/SIGKILL both
forcibly terminate; update requestChildRestart to either skip the final SIGKILL
step when platform === "win32" (i.e., only attempt SIGINT/SIGTERM there) or add
a clear comment in the function explaining the Windows behavior and why the
SIGKILL branch is intentionally left (to avoid unnecessary waits); make the
change around the kill calls (child.kill("SIGTERM") / child.kill("SIGKILL")) so
reviewers can see the platform-specific behavior documented or the SIGKILL call
gated by platform.
- Around line 78-94: The abortablePromise function currently leaves the abort
listener attached when the input promise resolves first; update abortablePromise
to always remove the onAbort listener when the race settles: keep the existing
onAbort handler and add a cleanup step that calls
signal.removeEventListener("abort", onAbort) when the original promise resolves
or rejects (e.g., attach a finally/then/catch on the input promise before
returning the race), ensuring the listener is removed on normal completion as
well as when the abort fires; reference the abortablePromise function and the
onAbort listener to locate where to add the cleanup.
- Around line 796-867: probeAccountSnapshot has a race where two callers can
both miss snapshotProbeCache.pending between the read of pendingEntry and the
later set of fetchPromise; to fix, allocate and insert a placeholder pending
entry into snapshotProbeCache for the cacheKey before starting the actual async
fetch so subsequent callers will see it and await it: create a deferred/promise
placeholder, set snapshotProbeCache.set(cacheKey, { snapshot: current?.snapshot
?? null, expiresAt: current?.expiresAt ?? 0, pending: placeholder }), then start
the async work that resolves/rejects that placeholder and in its finally replace
the cache entry with snapshot/expiresAt (clearing pending) as done in the
current finally block; update references to fetchPromise/pendingEntry
accordingly so all callers await the same pending promise.
In `@test/codex-bin-wrapper.test.ts`:
- Around line 613-627: The test currently only asserts the generic forwarded
output and can pass even when the supervisor path isn't used; modify the test
(the "uses the supervisor wrapper for non-interactive commands when enabled"
case that calls createWrapperFixture, writeSupervisorRuntimeFixture,
createFakeCodexBin and runWrapper) to assert a supervisor-only
marker/interaction—for example assert result.stdout contains a distinct token
emitted only by the supervisor path (e.g. "SUPERVISOR:FORWARDED" or similar) and
update createFakeCodexBin to emit that marker when
CODEX_AUTH_CLI_SESSION_SUPERVISOR is set so the test fails if scripts/codex.js
doesn't actually use the supervisor branch. Ensure the added assertion is
deterministic and uses the existing runWrapper/ createFakeCodexBin/
writeSupervisorRuntimeFixture helpers.
In `@test/codex-supervisor.test.ts`:
- Around line 475-529: Replace the brittle wall-clock assertions comparing
serialElapsedMs/overlapElapsedMs with explicit synchronization: remove
performance.now measurements and the expect(..toBeLessThan(..)) checks and
instead instrument the test flow (using promise latches/Deferreds or vitest fake
timers) to drive and observe the probe/restart phases deterministically; for
example, add latches inside your FakeManager/createFakeRuntime or FakeChild to
signal "probe started", "probe blocked", "restart requested" and "probe
finished", then coordinate supervisorTestApi?.requestChildRestart and
supervisorTestApi?.ensureLaunchableAccount so the test awaits those latches and
asserts ordering/concurrency (that the overlap path completes its
ensureLaunchableAccount before the serial path unblocks) using promise resolves
rather than elapsed ms. Ensure both occurrences (around
requestChildRestart/ensureLaunchableAccount pairs) are updated similarly.
In `@test/quota-probe.test.ts`:
- Around line 174-205: The test currently relies on a microtask hop (await
Promise.resolve()) which is brittle; change the fetch mock and test for
fetchCodexQuotaSnapshot to use an explicit "fetch started" latch: have the mock
expose or resolve a Promise as soon as it wires init.signal.addEventListener
(i.e., when the handler for "abort" is registered), await that latch in the test
before calling controller.abort(), then assert the pending promise rejects; this
ensures deterministic ordering regardless of internal awaits in
fetchCodexQuotaSnapshot and removes the fragile microtask race.
In `@test/storage.test.ts`:
- Around line 997-1044: The test only covers the rejection path; add a companion
test that verifies withAccountStorageTransaction reloads fresh storage after a
successful (non-throwing) transaction to catch a post-success AsyncLocalStorage
leak: create a test that uses saveAccounts to seed an initial account, call
withAccountStorageTransaction with a handler that calls persist(...) and returns
normally (no throw), then call saveAccounts again with a different account,
export via exportAccounts(exportPath) and assert the
exported.accounts[0].accountId is the post-success value; use the same helpers
(saveAccounts, withAccountStorageTransaction, exportAccounts, exportPath) and
vitest async patterns to keep the test deterministic.
---
Duplicate comments:
In `@test/codex-supervisor.test.ts`:
- Around line 334-374: Add a deterministic two-caller race to this test: create
a deferred Promise (resolve/reject handlers) and use
supervisorTestApi.withLockedManager with two concurrent callers against the same
runtime/storage path—have the first caller (using FakeManager and
createFakeRuntime) acquire the lock and await the deferred Promise to simulate a
held lock, then start the second caller which should block until you resolve the
deferred Promise; after resolving, assert the second caller runs only after the
first releases (e.g., order of results or timestamps) and keep the existing
stale-lock/EPERM unlink behavior; reference withLockedManager,
supervisorTestApi, FakeManager, createFakeRuntime and the unlinkSpy mock to
place the new concurrent-caller logic in this test while keeping it
deterministic via the deferred Promise and vitest async assertions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3118c672-3ac1-4af7-a929-61a9e9811578
📒 Files selected for processing (11)
lib/preemptive-quota-scheduler.tslib/storage.tspackage.jsonscripts/benchmark-session-supervisor.mjsscripts/codex-supervisor.jstest/codex-bin-wrapper.test.tstest/codex-supervisor.test.tstest/plugin-config.test.tstest/preemptive-quota-scheduler.test.tstest/quota-probe.test.tstest/storage.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (2)
lib/**
⚙️ CodeRabbit configuration file
focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.
Files:
lib/preemptive-quota-scheduler.tslib/storage.ts
test/**
⚙️ CodeRabbit configuration file
tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.
Files:
test/preemptive-quota-scheduler.test.tstest/quota-probe.test.tstest/storage.test.tstest/codex-bin-wrapper.test.tstest/plugin-config.test.tstest/codex-supervisor.test.ts
🔇 Additional comments (10)
scripts/codex-supervisor.js (10)
504-508: lgtm: dedup key fix applied.the dedup key now correctly excludes
refreshTokenand uses only stable identifiers (index,accountId,
1367-1381: lgtm: main supervision loop is well-bounded and properly cleans up.the loop is bounded by
MAX_SESSION_RESTARTS, abort signals are checked at each iteration, and the monitor controller is properly aborted before awaiting the monitor promise. good defensive design.
1747-1774: lgtm: test-only exports properly gated.the
__testOnlyexport is correctly guarded byNODE_ENV === "test", allowing comprehensive unit testing of internal functions without exposing them in production.
413-486: lgtm: file-based lock implementation is robust.the lock acquisition loop at line 435 is properly bounded by abort signal checks and deadline enforcement. stale lock detection at lines 469-472 handles edge cases gracefully. the exclusive
wxflag at line 441 provides atomic lock creation.
1175-1202: lgtm: safe directory traversal implementation.
listJsonlFilescorrectly skips symbolic links at line 1189-1190, preventing potential symlink attacks. the iterative approach avoids stack overflow on deeply nested directories.
869-895: lgtm: account marking with proper fallbacks.
markAccountUnavailablecorrectly clears the probe cache at line 870 before marking, ensuring fresh data on reselection. the fallback frommarkRateLimitedWithReasontomarkAccountCoolingDownhandles different manager API versions gracefully.
1212-1254: lgtm: safe session metadata extraction with proper cleanup.
extractSessionMetacorrectly cleans up the stream and readline interface in the finally block (lines 1248-1251). the scan limit at line 1227 prevents excessive I/O on large session files.
1-40: regression tests already cover the key supervisor scenarios.test/codex-supervisor.test.ts includes coverage for all major risks: line 334 handles stale lock cleanup with windows unlink failures (EPERM/EBUSY retries), line 446 validates snapshot cache ttl expiration, line 399 tests near-limit account skipping, and lines 505+ exercise overlapped handoff timing. line 614 also verifies probe batching doesn't add serial pause for degraded accounts, and line 660 confirms failed probe degradation. solid foundation.
39-39: cache isolation is already in place. test/codex-supervisor.test.ts:230 clearssnapshotProbeCachein theafterEachhook, preventing cross-test pollution. no action needed.
1706-1708: no issue found.runCodexSupervisorIfEnabledis async at line 1683, so returningforwardToRealCodex(codexBin, initialArgs)at line 1707 is correct—it returns a Promise that resolves to an exit code. the caller atscripts/codex.js:529properly awaits it, sosupervisedExitCodereceives the resolved number, not the Promise object. this is standard async/await pattern.> Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
lib/storage.ts (2)
1767-1776:⚠️ Potential issue | 🟠 Majorlegacy migration still escapes the captured storage binding.
lib/storage.ts:1767-1776capturesstoragePathfor the main load, butmigrateLegacyProjectStorageIfNeeded()atlib/storage.ts:1004-1102still re-readsgetStoragePathState()aftercleanupStaleRotatingBackupArtifacts(path)has already awaited, andloadAccounts()atlib/storage.ts:1603-1605still passes rawsaveAccounts. if the storage path flips whileloadAccountsInternal()is in flight, migration can load/delete files from one project and persist them into another. thread the captured storage state/path into the migration helper too, and add a vitest path-drift regression aroundtest/storage.test.ts:2427-2894.As per coding guidelines,
lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/storage.ts` around lines 1767 - 1776, The migration helper currently re-reads storage state causing path drift: update migrateLegacyProjectStorageIfNeeded to accept the captured storagePath/state (from loadAccountsInternal) instead of calling getStoragePathState() internally, thread that new parameter through loadAccountsInternal and its caller loadAccounts (which currently passes saveAccounts) and ensure cleanupStaleRotatingBackupArtifacts(path) and getIntentionalResetMarkerPath(path) use the same captured path; update all call sites to pass the captured path/state and adjust types/signatures accordingly, and add a vitest regression (in test/storage.test.ts around the 2427-2894 range) that simulates a storagePath flip while loadAccountsInternal is in-flight to assert migration uses the original captured path and does not move files between projects.
2124-2145:⚠️ Potential issue | 🔴 Criticalcombined transactions still route flagged writes through the mutable global path.
lib/storage.ts:2124-2145binds account saves to the capturedstoragePath, butsaveFlaggedAccountsUnlocked()still resolves its target fromgetFlaggedAccountsPath()atlib/storage.ts:2401-2405. ifsetStoragePathDirect()drifts insidewithAccountAndFlaggedStorageTransaction(), the account file stays on project a while the flagged file is written into project b, so this helper no longer gives you an atomic paired update. capture the flagged path up front as well, and add a vitest drift regression next totest/storage.test.ts:1153-1256.possible direction
async function saveFlaggedAccountsUnlocked( - storage: FlaggedAccountStorageV1, + storage: FlaggedAccountStorageV1, + storagePath = getFlaggedAccountsPath(), ): Promise<void> { - const path = getFlaggedAccountsPath(); + const path = storagePath; const markerPath = getIntentionalResetMarkerPath(path); // ... }return withStorageLock(async () => { const storagePath = getStoragePath(); + const flaggedStoragePath = getFlaggedAccountsPath(); const state = { snapshot: await loadAccountsInternal( (storage) => saveAccountsUnlocked(storage, storagePath), storagePath, ), @@ - await saveFlaggedAccountsUnlocked(flaggedStorage); + await saveFlaggedAccountsUnlocked(flaggedStorage, flaggedStoragePath);As per coding guidelines,
lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/storage.ts` around lines 2124 - 2145, The combined-transaction helper binds account file writes to the captured storagePath but still lets saveFlaggedAccountsUnlocked() resolve its path via getFlaggedAccountsPath() at write time, allowing setStoragePathDirect() to drift the flagged file to a different project and breaking atomicity; fix by capturing the flagged path up-front inside withAccountAndFlaggedStorageTransaction (capture result of getFlaggedAccountsPath() alongside storagePath when calling loadAccountsInternal/saveAccountsUnlocked and pass that captured flaggedPath into saveFlaggedAccountsUnlocked instead of letting it resolve internally), update functions saveFlaggedAccountsUnlocked/getFlaggedAccountsPath usages to accept an explicit path parameter, and add a vitest regression next to test/storage.test.ts:1153-1256 that exercises setStoragePathDirect() drift during the persist() code path to ensure paired writes remain atomic.
♻️ Duplicate comments (3)
test/codex-bin-wrapper.test.ts (1)
637-672:⚠️ Potential issue | 🟠 Majorassert a supervisor-only side effect in these two cases.
these assertions are still satisfied by the normal wrapper path in
scripts/codex.js:528-542: one forwarded invocation and one auto-sync. that meanstest/codex-bin-wrapper.test.ts:637-672can pass even ifrunCodexSupervisorIfEnabled()is bypassed. reuse thesupervisor\nmarker check fromtest/codex-bin-wrapper.test.ts:618-635, or add another supervisor-only signal here.As per coding guidelines,
test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/codex-bin-wrapper.test.ts` around lines 637 - 672, Tests assert a supervisor-only side effect but currently pass via the normal wrapper path; update the two cases in test/codex-bin-wrapper.test.ts (the "still auto-syncs once..." and "supports interactive..." tests) to require a supervisor-only marker so they fail if runCodexSupervisorIfEnabled() is bypassed—either reuse the "supervisor\n" marker check used earlier in the file or add a new supervisor-only signal written only by the supervisor wrapper, then assert that marker (e.g., readFileSync(markerPath) or a new supervisor marker file) is present in addition to the existing forwarded/auto-sync checks to ensure the supervisor path actually executed.test/codex-supervisor.test.ts (2)
870-914:⚠️ Potential issue | 🟠 Majordrop wall-clock performance assertions; assert overlap/batching via explicit probe ordering.
lines 899-914 (
test/codex-supervisor.test.ts:899-914) assertelapsedMs < 170usingperformance.now(). this is brittle on noisy runners and does not prove batching semantics directly.proposed deterministic pattern
- const startedAt = performance.now(); - const result = await supervisorTestApi?.ensureLaunchableAccount( + const d1 = createDeferred<void>(); + const d2 = createDeferred<void>(); + const d3 = createDeferred<void>(); + const started: string[] = []; + const resultPromise = supervisorTestApi?.ensureLaunchableAccount( runtime, {}, undefined, { probeTimeoutMs: 250 }, ); - const elapsedMs = performance.now() - startedAt; + + // assert concurrent probe startup/ordering via latches or onFetchStart hooks + // then release all degraded gates together + d1.resolve(); d2.resolve(); d3.resolve(); + const result = await resultPromise; @@ - expect(elapsedMs).toBeLessThan(170);As per coding guidelines,
test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/codex-supervisor.test.ts` around lines 870 - 914, The test is brittle because it asserts wall-clock timing (performance.now elapsedMs < 170); instead remove the elapsedMs/performance.now assertions and prove batching deterministically by instrumenting the probe path (e.g., FakeManager or the probe handler used by createFakeRuntime) to record probe start/finish order or sequence IDs for each account, then assert that the probes for "degraded-1", "degraded-2", and "degraded-3" were started concurrently/batched (their start sequence IDs are the same batch or their start timestamps fall within one logical batch window) and that "healthy" was selected as the result; update the test that calls supervisorTestApi?.ensureLaunchableAccount (and uses quotaProbeDelayMs/probeTimeoutMs) to assert on those recorded probe order/sequence markers rather than elapsedMs.
529-584:⚠️ Potential issue | 🟠 Majorreplace real sleep-based lock coordination with fake-timer or latch-driven sequencing.
line 574 (
test/codex-supervisor.test.ts:574) usessleep(40)as a correctness gate. this is scheduler-sensitive and can pass/fail for timing reasons instead of lock correctness.proposed deterministic fix
it("serializes concurrent callers behind the supervisor storage lock", async () => { + vi.useFakeTimers(); process.env.CODEX_AUTH_CLI_SESSION_LOCK_WAIT_MS = "1000"; process.env.CODEX_AUTH_CLI_SESSION_LOCK_POLL_MS = "10"; @@ - await supervisorTestApi?.sleep(40); + await vi.advanceTimersByTimeAsync(40); expect(secondHasLock).toBe(false); releaseFirst?.(); + await vi.runAllTimersAsync();As per coding guidelines,
test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/codex-supervisor.test.ts` around lines 529 - 584, The test uses a real sleep (supervisorTestApi.sleep(40)) to gate correctness; replace that timing-based check with a deterministic latch check: after scheduling the second with supervisorTestApi.withLockedManager, assert the secondEntered promise is still pending (e.g. await Promise.race([secondEntered, Promise.resolve("not-entered")]) and expect the result to be "not-entered") instead of sleeping, then releaseFirst and await secondEntered; locate the code around supervisorTestApi.withLockedManager, the firstEntered/secondEntered promises and releaseFirst latch to implement this deterministic pending-check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/storage.ts`:
- Around line 816-822: The comparator that chooses latest valid snapshot
currently falls back to descending index, preferring older backups on equal
mtime; change the tie-breaker in the sorting used to compute the latest snapshot
(the map/filter/sort chain in lib/storage.ts that produces the [0]?.snapshot
result) to use ascending index (i.e., prefer lower index so primary/.bak take
precedence) and add a vitest that mirrors the existing storage tests (next to
test/storage.test.ts:3170-3370) which creates multiple snapshots with identical
mtimeMs and asserts that the chosen latestValidPath prefers the earlier-priority
entries (primary, wal, .bak, .bak.1, .bak.2) to lock in correct behavior. Ensure
the test runs deterministically on Windows/coarse-mtime filesystems.
In `@lib/ui/copy.ts`:
- Line 101: The experimental help text is now misnumbered because the new toggle
experimentalSessionSupervisor was inserted before the existing items; update the
static help string (near the entry that lists "1 sync | 2 backup | 3 guard") in
lib/ui/copy.ts so the numbering reflects the new ordering (e.g., insert the new
item's number or renumber subsequent items), or replace the hardcoded sequence
with a generated/templated sequence if preferred; locate the help string and the
experimentalSessionSupervisor key to adjust the copy consistently.
In `@scripts/codex-supervisor.js`:
- Around line 1618-1637: The prewarm task started by
maybeStartPreparedResumeSelection is using the outer supervisor signal and can
continue touching storage after runInteractiveSupervision returns; ensure the
prewarm work is tied to the child/session lifetime by creating and passing a
session-specific cancellation signal (or AbortController) into
maybeStartPreparedResumeSelection (use the same session-bound signal that
runInteractiveSupervision watches), and/or explicitly cancel
preparedResumeSelectionPromise when runInteractiveSupervision is about to return
with no restartDecision; update references to preparedResumeSelectionStarted and
preparedResumeSelectionPromise to respect and propagate that session signal so
the prewarm task is aborted when the session exits without rotating.
- Around line 1592-1601: The catch branch that handles error?.name ===
"QuotaProbeUnavailableError" currently continues immediately and causes a hot
loop; change it to await a backoff delay before retrying and add a
consecutive-failure guard: introduce a consecutiveProbeFailures counter
(increment on QuotaProbeUnavailableError, reset to 0 on successful probe), use
an exponential or capped backoff (e.g., await sleep(Math.min(base *
2**consecutiveProbeFailures, max))) instead of immediate continue, and when
consecutiveProbeFailures exceeds a safe threshold log an error and either
escalate (throw or break) or apply a longer cooldown; keep checks for
monitorController.signal.aborted and AbortError as-is. Ensure the sleep
implementation is used the same way as the normal poll sleep that occurs later
in this loop so you don't bypass the intended I/O pacing.
- Around line 1873-1894: The code returns the promise from
runCodexSupervisorWithRuntime without awaiting, so the finally block removes
SIGINT/SIGTERM handlers (abort) before the supervision finishes; change the
return to await the call to runCodexSupervisorWithRuntime({ codexBin, rawArgs,
buildForwardArgs, forwardToRealCodex, runtime, signal: controller.signal }) so
the try block waits for the supervision to settle and the finally uninstalls
handlers only after completion; ensure loadSupervisorRuntime() and
controller.signal usage remain unchanged and keep the AbortError early-return
behavior.
In `@scripts/codex.js`:
- Around line 529-540: The supervisor exit should not always fall through to
autoSync; change the wrapper around
runCodexSupervisorIfEnabled()/autoSyncManagerActiveSelectionIfEnabled() so that
if runCodexSupervisorIfEnabled returns the abort sentinel (e.g.
supervisedExitCode === 130) you immediately return without calling
autoSyncManagerActiveSelectionIfEnabled(), and for paths where the supervisor
forwarded to the real binary via the forwardToRealCodex callback move or invoke
the pre-sync logic inside that callback instead of running it unconditionally
afterwards; update references to runCodexSupervisorIfEnabled,
autoSyncManagerActiveSelectionIfEnabled, forwardToRealCodex and
supervisedExitCode accordingly and add two vitest regressions in
test/codex-bin-wrapper.test.ts asserting (a) the supervisor early-return/forward
path does not trigger autoSync and (b) the abort (130) path exits before any
sync is performed.
In `@test/codex-bin-wrapper.test.ts`:
- Around line 67-115: The tests leak a real-user lock because
writeSupervisorRuntimeFixture does not stub dist/lib/storage.js; update
writeSupervisorRuntimeFixture to create a dist/lib/storage.js that exports
getStoragePath() (returning a path under the fixtureRoot) so the supervisor uses
a fixture-local lock, or alternatively set CODEX_HOME/HOME to fixtureRoot for
the spawned worker processes; modify the fixture writer to produce the
storage.js stub (export function getStoragePath() { ... }) or ensure environment
variables are forced to fixtureRoot before tests that spawn
scripts/codex-supervisor.js so locks remain hermetic.
---
Outside diff comments:
In `@lib/storage.ts`:
- Around line 1767-1776: The migration helper currently re-reads storage state
causing path drift: update migrateLegacyProjectStorageIfNeeded to accept the
captured storagePath/state (from loadAccountsInternal) instead of calling
getStoragePathState() internally, thread that new parameter through
loadAccountsInternal and its caller loadAccounts (which currently passes
saveAccounts) and ensure cleanupStaleRotatingBackupArtifacts(path) and
getIntentionalResetMarkerPath(path) use the same captured path; update all call
sites to pass the captured path/state and adjust types/signatures accordingly,
and add a vitest regression (in test/storage.test.ts around the 2427-2894 range)
that simulates a storagePath flip while loadAccountsInternal is in-flight to
assert migration uses the original captured path and does not move files between
projects.
- Around line 2124-2145: The combined-transaction helper binds account file
writes to the captured storagePath but still lets saveFlaggedAccountsUnlocked()
resolve its path via getFlaggedAccountsPath() at write time, allowing
setStoragePathDirect() to drift the flagged file to a different project and
breaking atomicity; fix by capturing the flagged path up-front inside
withAccountAndFlaggedStorageTransaction (capture result of
getFlaggedAccountsPath() alongside storagePath when calling
loadAccountsInternal/saveAccountsUnlocked and pass that captured flaggedPath
into saveFlaggedAccountsUnlocked instead of letting it resolve internally),
update functions saveFlaggedAccountsUnlocked/getFlaggedAccountsPath usages to
accept an explicit path parameter, and add a vitest regression next to
test/storage.test.ts:1153-1256 that exercises setStoragePathDirect() drift
during the persist() code path to ensure paired writes remain atomic.
---
Duplicate comments:
In `@test/codex-bin-wrapper.test.ts`:
- Around line 637-672: Tests assert a supervisor-only side effect but currently
pass via the normal wrapper path; update the two cases in
test/codex-bin-wrapper.test.ts (the "still auto-syncs once..." and "supports
interactive..." tests) to require a supervisor-only marker so they fail if
runCodexSupervisorIfEnabled() is bypassed—either reuse the "supervisor\n" marker
check used earlier in the file or add a new supervisor-only signal written only
by the supervisor wrapper, then assert that marker (e.g.,
readFileSync(markerPath) or a new supervisor marker file) is present in addition
to the existing forwarded/auto-sync checks to ensure the supervisor path
actually executed.
In `@test/codex-supervisor.test.ts`:
- Around line 870-914: The test is brittle because it asserts wall-clock timing
(performance.now elapsedMs < 170); instead remove the elapsedMs/performance.now
assertions and prove batching deterministically by instrumenting the probe path
(e.g., FakeManager or the probe handler used by createFakeRuntime) to record
probe start/finish order or sequence IDs for each account, then assert that the
probes for "degraded-1", "degraded-2", and "degraded-3" were started
concurrently/batched (their start sequence IDs are the same batch or their start
timestamps fall within one logical batch window) and that "healthy" was selected
as the result; update the test that calls
supervisorTestApi?.ensureLaunchableAccount (and uses
quotaProbeDelayMs/probeTimeoutMs) to assert on those recorded probe
order/sequence markers rather than elapsedMs.
- Around line 529-584: The test uses a real sleep (supervisorTestApi.sleep(40))
to gate correctness; replace that timing-based check with a deterministic latch
check: after scheduling the second with supervisorTestApi.withLockedManager,
assert the secondEntered promise is still pending (e.g. await
Promise.race([secondEntered, Promise.resolve("not-entered")]) and expect the
result to be "not-entered") instead of sleeping, then releaseFirst and await
secondEntered; locate the code around supervisorTestApi.withLockedManager, the
firstEntered/secondEntered promises and releaseFirst latch to implement this
deterministic pending-check.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8ca62a89-c0a1-482b-a271-33f360279fc9
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (16)
lib/codex-manager/settings-hub.tslib/config.tslib/preemptive-quota-scheduler.tslib/schemas.tslib/storage.tslib/ui/copy.tspackage.jsonscripts/benchmark-session-supervisor.mjsscripts/codex-supervisor.jsscripts/codex.jstest/codex-bin-wrapper.test.tstest/codex-supervisor.test.tstest/preemptive-quota-scheduler.test.tstest/quota-probe.test.tstest/settings-hub-utils.test.tstest/storage.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (2)
lib/**
⚙️ CodeRabbit configuration file
focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.
Files:
lib/schemas.tslib/preemptive-quota-scheduler.tslib/ui/copy.tslib/storage.tslib/codex-manager/settings-hub.tslib/config.ts
test/**
⚙️ CodeRabbit configuration file
tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.
Files:
test/quota-probe.test.tstest/settings-hub-utils.test.tstest/codex-bin-wrapper.test.tstest/preemptive-quota-scheduler.test.tstest/storage.test.tstest/codex-supervisor.test.ts
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
lib/storage.ts (1)
2132-2145:⚠️ Potential issue | 🟠 Majorcapture the flagged path in the combined transaction too.
lib/storage.ts:2138-2140pins account writes to the openingstoragePath, but flagged writes still go throughsaveFlaggedAccountsUnlocked(), which resolvesgetFlaggedAccountsPath()at call time. if a caller flipssetStoragePathDirect()insidewithAccountAndFlaggedStorageTransaction(), the account file is written under the original root and the flagged file under the new root, so the transaction is no longer atomic. capture the flagged path up front and pass it through the flagged save call, then add a vitest besidetest/storage.test.ts:1154-1257for this drift case.As per coding guidelines,
lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/storage.ts` around lines 2132 - 2145, Persist currently writes accounts to the captured storagePath but calls saveFlaggedAccountsUnlocked() which resolves getFlaggedAccountsPath() at call-time, allowing setStoragePathDirect()/withAccountAndFlaggedStorageTransaction() to flip roots and break atomicity; fix by capturing the flagged path at the start of persist (call getFlaggedAccountsPath() once into flaggedPath) and pass that flaggedPath into a new or overloaded saveFlaggedAccountsUnlocked(flaggedStorage, flaggedPath) so both account and flagged writes use the same root for both commit and rollback; update withAccountAndFlaggedStorageTransaction(), persist, and any helpers to accept the explicit flaggedPath, add a vitest beside test/storage.test.ts covering the drift case (flip storage path between account and flagged save) and ensure the new save/queue logic handles EBUSY/429 retry semantics as per lib/** IO/concurrency guidelines and reference the affected tests in the change.
♻️ Duplicate comments (1)
lib/storage.ts (1)
1767-1776:⚠️ Potential issue | 🟠 Majorpin legacy migration to the captured transaction root.
lib/storage.ts:1774-1776still callsmigrateLegacyProjectStorageIfNeeded()with no path context. that helper readsgetStoragePathState()internally, so a concurrentsetStoragePath()can load and delete legacy files from one root while the boundpersistMigrationwrites into another. thread the captured storage path state into the migration helper too, and add a vitest neartest/storage.test.ts:1047-1106that reproduces path drift during transaction startup.As per coding guidelines,
lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/storage.ts` around lines 1767 - 1776, The migration call in loadAccountsInternal is unsafe because migrateLegacyProjectStorageIfNeeded() reads getStoragePathState() itself; change loadAccountsInternal to pass the captured storage path/state into migrateLegacyProjectStorageIfNeeded (update the helper signature to accept the path or storage state) and use that value instead of letting the helper call getStoragePathState(), so concurrent setStoragePath() cannot cause path drift; update references to migrateLegacyProjectStorageIfNeeded and ensure persistMigration is invoked against the provided path/state (functions: loadAccountsInternal, migrateLegacyProjectStorageIfNeeded, getStoragePathState, setStoragePath, persistMigration) and add a vitest near test/storage.test.ts:1047-1106 that reproduces concurrent path drift during transaction startup to assert the migration used the captured path; also run/update related tests to handle Windows IO/EAGAIN/EBUSY concurrency behaviors as noted.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/codex-supervisor.js`:
- Around line 1348-1363: The scan is unbounded when sessionId is set but
rolloutPathHint is missing because the files filter uses (sessionId ? true :
entry.mtimeMs >= sinceMs - 2_000), causing
listJsonlFiles/readSessionBindingEntry to iterate every JSONL; add a cached
mapping from sessionId→rolloutPath so subsequent supervisor iterations can
restrict the file list: introduce a module-level Map (e.g.,
sessionIdToRolloutPath) and, when matchSessionBindingEntry finds a binding with
a rollout path, store sessionId→rolloutPath; then, in the files
construction/filtering (the code building files using
listJsonlFiles/readSessionBindingEntry and variables sessionEntries, sinceMs,
rolloutPathHint, sessionId), short-circuit to only read files under the cached
rolloutPath (or persist the mapping in the binding state) when rolloutPathHint
is null to avoid full directory scans.
In `@scripts/codex.js`:
- Around line 68-72: The isSupervisorInteractiveCommand function currently only
inspects rawArgs[0]; modify it to iterate rawArgs from start and skip leading
option tokens (those starting with "-" or "--"), correctly handling flags that
consume a value (e.g., "--config", "-c" and any other known value-consuming
flags) by skipping their next token, then check the first non-option token and
return true if it equals "resume" or "fork"; update the same logic used around
the post-supervisor sync path (the block at ~545-551) so it uses this new
scanner instead of rawArgs[0], and add a vitest in
test/codex-bin-wrapper.test.ts (adjacent to lines 690-707) that verifies a
command like "codex -c foo=bar resume session-123" (and a long‑option variant)
is treated as interactive.
In `@test/codex-supervisor.test.ts`:
- Around line 958-1002: The test "batches probe work so multiple degraded
accounts do not add serial pause" currently uses a fragile wall-clock assertion
(elapsedMs < 170); change the test to avoid real-time timing by either enabling
vi.useFakeTimers()/fake timers around the call to
supervisorTestApi?.ensureLaunchableAccount and advancing timers
deterministically based on quotaProbeDelayMs and probeTimeoutMs, or remove the
elapsedMs assertion and instead assert parallelism by instrumenting FakeManager
(e.g., a concurrent-fetch counter inside FakeManager.fetch or createFakeRuntime)
to verify all degraded accounts were probed concurrently and that
manager.activeIndex ends up at 3; update the test to rely on these deterministic
checks rather than real scheduler timing.
---
Outside diff comments:
In `@lib/storage.ts`:
- Around line 2132-2145: Persist currently writes accounts to the captured
storagePath but calls saveFlaggedAccountsUnlocked() which resolves
getFlaggedAccountsPath() at call-time, allowing
setStoragePathDirect()/withAccountAndFlaggedStorageTransaction() to flip roots
and break atomicity; fix by capturing the flagged path at the start of persist
(call getFlaggedAccountsPath() once into flaggedPath) and pass that flaggedPath
into a new or overloaded saveFlaggedAccountsUnlocked(flaggedStorage,
flaggedPath) so both account and flagged writes use the same root for both
commit and rollback; update withAccountAndFlaggedStorageTransaction(), persist,
and any helpers to accept the explicit flaggedPath, add a vitest beside
test/storage.test.ts covering the drift case (flip storage path between account
and flagged save) and ensure the new save/queue logic handles EBUSY/429 retry
semantics as per lib/** IO/concurrency guidelines and reference the affected
tests in the change.
---
Duplicate comments:
In `@lib/storage.ts`:
- Around line 1767-1776: The migration call in loadAccountsInternal is unsafe
because migrateLegacyProjectStorageIfNeeded() reads getStoragePathState()
itself; change loadAccountsInternal to pass the captured storage path/state into
migrateLegacyProjectStorageIfNeeded (update the helper signature to accept the
path or storage state) and use that value instead of letting the helper call
getStoragePathState(), so concurrent setStoragePath() cannot cause path drift;
update references to migrateLegacyProjectStorageIfNeeded and ensure
persistMigration is invoked against the provided path/state (functions:
loadAccountsInternal, migrateLegacyProjectStorageIfNeeded, getStoragePathState,
setStoragePath, persistMigration) and add a vitest near
test/storage.test.ts:1047-1106 that reproduces concurrent path drift during
transaction startup to assert the migration used the captured path; also
run/update related tests to handle Windows IO/EAGAIN/EBUSY concurrency behaviors
as noted.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 080af616-f227-4b8d-80be-b179df052d4c
📒 Files selected for processing (9)
lib/config.tslib/storage.tslib/ui/copy.tspackage.jsonscripts/codex-supervisor.jsscripts/codex.jstest/codex-bin-wrapper.test.tstest/codex-supervisor.test.tstest/storage.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (2)
lib/**
⚙️ CodeRabbit configuration file
focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.
Files:
lib/ui/copy.tslib/config.tslib/storage.ts
test/**
⚙️ CodeRabbit configuration file
tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.
Files:
test/storage.test.tstest/codex-supervisor.test.tstest/codex-bin-wrapper.test.ts
🔇 Additional comments (20)
scripts/codex-supervisor.js (10)
1-7: lgtm on imports and module structure.clean separation of node builtins. the dynamic import pattern at
scripts/codex-supervisor.js:169-183handles missing modules gracefully for environments where the dist build isn't present.
41-100: abort handling and cleanup look solid now.the
sleepfunction at lines 41-61 properly removes the abort listener infinish().abortablePromiseat lines 78-100 now callscleanup()infinallyon the original promise, fixing the memory leak flagged in previous reviews.
328-353:safeUnlinkretry logic handles windows antivirus edge cases well.the EPERM/EBUSY retry with exponential backoff at
scripts/codex-supervisor.js:341-350mirrors common windows filesystem contention patterns. the max attempts (4) and base delay (25ms) seem reasonable for transient locks.
419-492: supervisor lock implementation looks robust.TTL-based stale detection at lines 391-417, retry on EEXIST/EPERM/EBUSY at lines 471-478, and proper cleanup in finally at line 464. good foundation for cross-process coordination.
510-520: dedup key no longer includes refreshToken.previous review flagged that including
refreshTokenin the dedup key could cause duplicates during token refresh. now using only stable identifiers (index, accountId, email) at lines 510-514.
802-872: race window in probe cache is closed.the placeholder promise is now set at lines 826-837 before starting async work, preventing duplicate fetches under concurrency.
pendingResolver/pendingRejecterpropagate results to waiters correctly.
1424-1449: signal escalation now documents the windows behavior.line 1446 explains that SIGTERM is already forceful on windows but SIGKILL is kept as the unix fallback. the comment addresses the previous review feedback about redundant waits.
1548-1684: monitor loop paces probe outages correctly.lines 1605-1613 now sleep for
pollMsafterQuotaProbeUnavailableErrorbefore continuing, fixing the hot-loop concern from the previous review. the sleep also respects the monitor abort signal.
1631-1651: prewarm is cancelled when session exits without rotating.
preparedResumeSelectionControlleris created per-launch at line 1511 and connected to the outer signal at lines 1512-1516. line 1638 passespreparedResumeSelectionController.signalto the prewarm. on clean exit without restart, line 1764 aborts the controller and line 1766 awaits the promise to ensure cleanup.
1879-1912: signal handlers now stay installed for the full supervision duration.line 1895 uses
return await runCodexSupervisorWithRuntime(...)so thefinallyblock at lines 1908-1911 only removes SIGINT/SIGTERM handlers after supervision settles. this fixes the critical issue from the previous review.test/codex-supervisor.test.ts (10)
1-7: imports and test setup look clean.vitest runner, node:events for fake child, perf_hooks for timing assertions. the
__testOnlyimport pattern keeps the test surface explicit.
24-41: retry logic for temp dir cleanup handles windows filesystem contention.ENOTEMPTY/EPERM/EBUSY retries at
test/codex-supervisor.test.ts:25-40with exponential backoff prevent flaky test teardown on windows when antivirus or indexing holds handles. as per coding guidelines, tests should handle windows filesystem behavior.
49-55: deferred helper enables deterministic async coordination.the
createDeferred<T>()pattern at lines 49-55 is used throughout for promise latches, avoiding wall-clock timing dependencies in concurrency tests.
267-313: boundary tests for 200-line scan limit are solid.test at lines 267-291 verifies metadata is found on the 200th line; test at lines 293-313 verifies it's missed on line 201. good edge case coverage.
394-433: signal escalation tests cover both windows and non-windows paths.test at lines 394-414 verifies win32 skips SIGINT and goes straight to SIGTERM→SIGKILL. test at lines 416-433 verifies linux sends SIGINT before escalating. addresses the missing non-win32 coverage from previous review.
529-584: concurrent lock serialization test is deterministic.uses promise latches (
resolveFirstEntered,releaseFirst,secondEntered) at lines 536-544 instead of wall-clock sleeps. asserts ordering viaorderarray at line 583. this addresses the previous review request for concurrent lock regression testing.
656-742: test verifies prewarm abort signal fires on clean exit.at
test/codex-supervisor.test.ts:741, the test assertspreparedProbeSignal?.abortedistrueafter a clean child exit without rotation. this confirms the fix for the cancelled-prewarm concern raised in the past review.
773-812: in-flight probe deduplication test uses latches correctly.
nearLimitGateandprobeStarteddeferreds at lines 777-778 coordinate the test without wall-clock timing. asserts thatcallsarray shows only one fetch despite two concurrent callers.
1114-1166: pacing test validates probe unavailable backoff.fake timers at line 1118, poll interval set to 20ms at line 1119, advances 100ms at line 1163. the test verifies the supervisor completes without hot-looping. good regression for the previously flagged issue.
1057-1112: monitor failure test has 10s timeout which is long.the
{ timeout: 10_000 }at line 1059 suggests the test might occasionally be slow. if the test consistently finishes quickly, consider lowering this or adding a note. otherwise, lgtm on the assertion that child exit code is preserved when the monitor throws.
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/codex-supervisor.js`:
- Around line 465-537: The lock currently writes a fixed expiresAt in
withSupervisorStorageLock (written via handle.writeFile) but never refreshes it
while fn() runs, so implement a heartbeat that periodically updates the lock's
expiresAt (or re-writes the lock file contents with a new acquiredAt/expiresAt)
for the duration of the critical section, or alternatively perform an ownership
check before removing the lock in isSupervisorLockStale/safeUnlink to ensure the
process still owns it; run the heartbeat in a background timer started before
invoking fn() and cleared in the finally that unlinks the lock so the lease
never appears stale while fn() is executing.
- Around line 1740-1769: The monitorFailure is only logged but the supervisor
still returns the child's exit code (variable result), allowing a successful
exit code despite monitor faults; after detecting monitorFailure (the block
using monitorFailure, monitorController, monitorActive, monitorPromise and
relaunchNotice) override the returned result to indicate failure instead of
leaving the child's exitCode intact — e.g. set result.exitCode to a non-zero
value and attach the error (or set a result.error) so the supervised run fails
when monitorFailure exists and signal?.aborted is false; update the handling
where result is constructed (the Promise that listens on child 'error' and
'exit') so callers consuming result or normalizeExitCode see the
supervisor-failure state.
- Around line 147-166: The bypass logic in isSupervisorAccountGateBypassCommand
currently returns false if no primary command is found and also scans all tokens
(allowing "codex exec -- --version" to bypass); change it to only inspect
top-level flags that appear before the primary command (and before any "--"
separator) or, if no primary exists, only the initial tokens (leading flags) for
"--help", "-h", or "--version". Use findPrimaryCodexCommand to locate the
primary command and its index (or treat index as rawArgs.length if missing),
stop scanning when you hit "--" or at the primary index, and only check tokens
in that prefix; keep the explicit primaryCommand name checks for
"auth","help","version"; add regression tests for both "codex --help"/"codex
--version" and "codex exec -- --version" shapes.
- Around line 1072-1085: The catch block that collapses probe failures into an
account-level "probe-error" evaluation must not mark accounts unhealthy when the
quota probe itself is down; detect QuotaProbeUnavailableError (or whatever
sentinel the probe throws) in the catch inside the quota probe invocation and
treat it like the monitor loop: do not return an evaluation that triggers
markAccountUnavailable/rotate, but instead surface the probe-unavailable
condition to the supervisor retry/backoff logic (either by rethrowing the
QuotaProbeUnavailableError or returning a non-mutating evaluation such as
rotate: false and a distinct reason like "probe-unavailable"); update the catch
around the probe (the try/catch that currently returns { evaluation: { rotate:
true, reason: "probe-error" } }) to special-case QuotaProbeUnavailableError, and
add a vitest regression that simulates a transient probe outage ensuring
markAccountUnavailable is not called.
- Around line 328-330: The current buildResumeArgs(buildForwardArgs)
implementation discards the caller's original CLI flags by returning only
["resume", sessionId]; instead, preserve and reuse the original argv and splice
in the new session id at the command boundary provided by codex-routing.js, then
pass that reconstructed argv into buildForwardArgs so caller-supplied flags are
kept for resumed children. Locate buildResumeArgs and the rotated relaunch site
that calls it, retrieve the original argv used for the child launch, use the
command boundary logic from codex-routing.js (the command boundary helper in
that module) to replace the session token with the new sessionId (and append any
required restart marker), and return buildForwardArgs(reconstructedArgv).
In `@test/storage.test.ts`:
- Around line 1164-1267: The tests that exercise
withAccountAndFlaggedStorageTransaction need to also verify flagged-storage was
refreshed: after each combined transaction case (both the throwing and
successful handlers) call loadFlaggedAccounts (or saveFlaggedAccounts/readback)
and assert the flagged accounts/state reflects the post-transaction write (or
was reloaded) rather than stale data; update the two specs that currently only
use saveAccounts/exportAccounts to include assertions using loadFlaggedAccounts
and/or saveFlaggedAccounts to deterministically prove flagged storage refresh
behavior for withAccountAndFlaggedStorageTransaction.
- Around line 964-1006: Add a deterministic vitest regression that proves the
mismatch guard in lib/storage.ts does not treat windows-equivalent paths as
different: write a test using withAccountStorageTransaction,
setStoragePathDirect and exportAccounts that starts a transaction on one path
(testStoragePath), then within the transaction call setStoragePathDirect with a
windows-equivalent alias of the same path (e.g., different case and backslashes
vs slashes or using path.win32.join/normalize/resolve to produce the alias),
then call exportAccounts and assert it does NOT throw and that export file
exists; ensure the test uses deterministic temp paths (testWorkDir/exportPath)
and mirrors the existing fail-fast test pattern so it verifies the guard uses
path normalization/canonicalization rather than raw string inequality.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 4f79ba2a-e9bb-4921-9d31-213760265479
📒 Files selected for processing (8)
lib/config.tslib/preemptive-quota-scheduler.tsscripts/codex-routing.jsscripts/codex-supervisor.jsscripts/codex.jstest/codex-bin-wrapper.test.tstest/codex-supervisor.test.tstest/storage.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (2)
lib/**
⚙️ CodeRabbit configuration file
focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.
Files:
lib/preemptive-quota-scheduler.tslib/config.ts
test/**
⚙️ CodeRabbit configuration file
tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.
Files:
test/storage.test.tstest/codex-bin-wrapper.test.tstest/codex-supervisor.test.ts
🪛 ast-grep (0.41.1)
test/storage.test.ts
[warning] 998-1002: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(
storage path mismatch: transaction path is ${escapeRegExp( testStoragePath, )}, active path is ${escapeRegExp(secondaryStoragePath)},
)
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
🔇 Additional comments (5)
test/storage.test.ts (1)
3383-3415: nice deterministic tie-break regression for equal mtime snapshots.
test/storage.test.ts:3383-3415cleanly validates the intended tie-break behavior implemented inlib/storage.ts:812-823. this is stable and deterministic.lib/preemptive-quota-scheduler.ts (1)
27-29: good split between the 5h and 7d defaults.lib/preemptive-quota-scheduler.ts:27-29 and lib/preemptive-quota-scheduler.ts:153-154 now keep the aggressive 5h threshold separate from the 7d fallback, which removes the earlier config drift.
Also applies to: 153-154
scripts/codex.js (1)
535-557: good sync placement around supervisor forwarding.scripts/codex.js:536-557 runs startup sync inside the real forwarding callback and gates the post-supervisor sync on
supervisorDidForward. that keeps the abort fast-path intact while avoiding the earlier double-sync ordering problem.test/codex-bin-wrapper.test.ts (1)
116-124: good fixture-local storage path for spawned supervisor tests.test/codex-bin-wrapper.test.ts:116-124 stubs
getStoragePath()under the fixture root, so the supervisor lock file stays hermetic instead of touching the real user home during wrapper tests.test/codex-supervisor.test.ts (1)
481-630: good windows/concurrency lock coverage.test/codex-supervisor.test.ts:481-630 adds stale-lock cleanup, transient
eperm/ebusy, and concurrent-caller coverage against the supervisor lock. that materially improves the regression net around the new cross-process coordination path.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
lib/storage.ts (2)
1949-1961:⚠️ Potential issue | 🟠 Majorpin
.gitignorewrites to the captured project root too.
lib/storage.ts:1961now saves to the captured file path, butensureGitignore()still prefers the ambientcurrentProjectRootatlib/storage.ts:298-309. if a transaction starts on project a and then callssetStoragePath(projectB)beforepersist(), the account file goes back to project a while.gitignoreis updated in project b. pass a captured project-root hint throughsaveAccountsUnlocked()/ensureGitignore()and add a vitest that drifts between two git roots beforepersist().possible fix
-async function ensureGitignore(storagePath: string): Promise<void> { - const state = getStoragePathState(); - if (!state.currentStoragePath) return; +async function ensureGitignore( + storagePath: string, + projectRootHint?: string | null, +): Promise<void> { const configDir = dirname(storagePath); const inferredProjectRoot = dirname(configDir); - const candidateRoots = [state.currentProjectRoot, inferredProjectRoot].filter( + const candidateRoots = [projectRootHint, inferredProjectRoot].filter( (root): root is string => typeof root === "string" && root.length > 0, );async function saveAccountsUnlocked( storage: AccountStorageV3, storagePath = getStoragePath(), + projectRootHint?: string | null, ): Promise<void> { const path = storagePath; ... - await ensureGitignore(path); + await ensureGitignore(path, projectRootHint);return withStorageLock(async () => { const storagePath = getStoragePath(); + const projectRootHint = getStoragePathState().currentProjectRoot; const state = { snapshot: await loadAccountsInternal( - (storage) => saveAccountsUnlocked(storage, storagePath), + (storage) => + saveAccountsUnlocked(storage, storagePath, projectRootHint), storagePath, ), }; ... - await saveAccountsUnlocked(storage, storagePath); + await saveAccountsUnlocked(storage, storagePath, projectRootHint);as per coding guidelines,
lib/**: focus on auth rotation, windows filesystem io, and concurrency. verify every change cites affected tests (vitest) and that new queues handle ebusy/429 scenarios.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/storage.ts` around lines 1949 - 1961, saveAccountsUnlocked writes account files to the captured storage path but calls ensureGitignore() which still uses the ambient currentProjectRoot; this causes .gitignore to be written to a different project when setStoragePath changes before persist. Fix by threading the resolved project-root/path into ensureGitignore (add an optional projectRoot or targetDir parameter) and update saveAccountsUnlocked to call ensureGitignore(path) instead of the ambient version; update any callers (persist, setStoragePath flows) to propagate the captured path and adjust getStoragePath usage in saveAccountsUnlocked to use the passed-in storagePath consistently. Add a vitest that switches between two temp git roots before persist to assert both the account file and .gitignore are written to the same captured root, and ensure tests cover Windows IO/EBUSY/429 retry behavior per lib/** guidelines.
2136-2155:⚠️ Potential issue | 🔴 Criticalpin flagged storage to the transaction root too.
lib/storage.ts:2154still callssaveFlaggedAccountsUnlocked(flaggedStorage), andlib/storage.ts:2418-2419resolves that path from the ambientgetStoragePath(). if a caller switches projects insidewithAccountAndFlaggedStorageTransaction(), account storage stays pinned to the capturedstoragePathbut flagged storage moves to the new project root. that splits what should be one atomic auth-rotation update. capture the flagged path alongsidestoragePathand add a vitest intest/storage.test.tsthat drifts the path beforepersist()and asserts both files stay under the original transaction root.possible fix
async function saveFlaggedAccountsUnlocked( storage: FlaggedAccountStorageV1, + storagePath = getFlaggedAccountsPath(), ): Promise<void> { - const path = getFlaggedAccountsPath(); + const path = storagePath; const markerPath = getIntentionalResetMarkerPath(path); ... }return withStorageLock(async () => { const storagePath = getStoragePath(); + const flaggedStoragePath = join( + dirname(storagePath), + FLAGGED_ACCOUNTS_FILE_NAME, + ); const state = { snapshot: await loadAccountsInternal( (storage) => saveAccountsUnlocked(storage, storagePath), storagePath, ), active: true, storagePath, }; ... const persist = async ( accountStorage: AccountStorageV3, flaggedStorage: FlaggedAccountStorageV1, ): Promise<void> => { ... await saveAccountsUnlocked(nextAccounts, storagePath); try { - await saveFlaggedAccountsUnlocked(flaggedStorage); + await saveFlaggedAccountsUnlocked(flaggedStorage, flaggedStoragePath); state.snapshot = nextAccounts;as per coding guidelines,
lib/**: focus on auth rotation, windows filesystem io, and concurrency. verify every change cites affected tests (vitest) and that new queues handle ebusy/429 scenarios.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/storage.ts` around lines 2136 - 2155, Capture the flagged-storage path at the start of the transaction and use that pinned path when persisting flagged accounts so both account and flagged files stay under the same transaction root: in withAccountAndFlaggedStorageTransaction()/persist, alongside the already-captured storagePath (from getStoragePath() used with loadAccountsInternal and saveAccountsUnlocked), capture the flaggedStoragePath (call the same resolver used elsewhere for flagged files) and replace the ambient saveFlaggedAccountsUnlocked(flaggedStorage) call with a pinned-save that writes to flaggedStoragePath; update any helper signatures (persist, loadAccountsInternal usage) to accept the pinned flaggedStoragePath and ensure cloneAccountStorageForPersistence/saveAccountsUnlocked usage is unchanged. Add a vitest in test/storage.test.ts that calls withAccountAndFlaggedStorageTransaction(), mutates the global/project storage path between transaction start and calling persist(), and asserts both account and flagged files were written under the original transaction root (also include retries/handling for Windows EBUSY and transient 429 behaviors per IO guidelines).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/codex-supervisor.js`:
- Around line 471-515: When the lock heartbeat loses the lease (in
createSupervisorLockHeartbeat) we must fail fast instead of only debug-logging:
when readSupervisorLockPayload returns falsy, when payload.ownerId !== ownerId,
or when writeSupervisorLockPayload throws, set stopped = true,
clearInterval(timer) and reject/promise-reject the pendingRefresh with a new
Error describing the condition so callers observing the heartbeat promise can
fail immediately; ensure the returned stop() still awaits pendingRefresh so the
rejection propagates, and include the lockPath and ownerId in the Error message
to aid diagnosis.
- Around line 2066-2092: The child process wait in the interactive path (the
child.once("exit") wait in the block around runCodexSupervisorWithRuntime / the
interactive monitor) must be raced against the outer controller.signal: replace
the plain child.once("exit") Promise with abortablePromise(child.once("exit"),
controller.signal) (or equivalent) so the wrapper stops waiting when the outer
signal aborts; on abort, ensure you call child.kill() (and on platforms where
needed use the proper signal handling fallback) and also call
monitorController.abort() to stop any monitor loops. Update
runCodexSupervisorWithRuntime/interactive monitor logic to accept the outer
controller.signal and to clean up by killing the child and aborting the monitor
controller when that signal fires. Add a regression test at
test/codex-supervisor.test.ts that spawns a never-exiting child, sends SIGTERM
to the wrapper, and asserts the wrapper exits and the child is killed (account
for Windows signal differences in the test).
- Around line 403-419: The getSupervisorStoragePath function currently swallows
exceptions and empty/whitespace results from runtime.getStoragePath(), causing
the supervisor to lock a default file while AccountManager (which calls
lib/storage.ts's getStoragePath) uses a different path; change
getSupervisorStoragePath to propagate failures instead of falling back: if
runtime.getStoragePath() throws, rethrow that error (or wrap and throw a
descriptive Error referencing getSupervisorStoragePath/runtime.getStoragePath)
and treat empty/whitespace return values as an error to be thrown rather than
returning the default path; update any call sites that expect a fallback to
handle the thrown error, and add a regression test in
test/codex-supervisor.test.ts that covers runtime.getStoragePath() throwing and
returning empty/whitespace to ensure the supervisor does not acquire the default
lock when storage-path resolution fails (reference functions
getSupervisorStoragePath, AccountManager.loadFromDisk, and lib/storage.ts
getStoragePath).
In `@test/codex-supervisor.test.ts`:
- Around line 926-959: The current test path (using
supervisorTestApi.prepareResumeSelection → markCurrentAccountForRestart →
commitPreparedSelection) keeps the same account object/token throughout and
misses the token-refresh/stale-object regression; add a deterministic vitest
test that calls prepareResumeSelection, then simulates a token refresh by
replacing the stored account token (the object returned by
manager.getCurrentAccountForFamily() or the supervisor's account store) before
calling commitPreparedSelection, then call requestChildRestart(new FakeChild(),
"win32") and commitPreparedSelection; assert the logical account is resolved
exactly once (e.g., check the same accountId "healthy" and that calls array
length is 1), and use vitest deterministic controls (fake timers/promises or
synchronous mocks) so the refresh happens strictly between prepare and commit.
---
Outside diff comments:
In `@lib/storage.ts`:
- Around line 1949-1961: saveAccountsUnlocked writes account files to the
captured storage path but calls ensureGitignore() which still uses the ambient
currentProjectRoot; this causes .gitignore to be written to a different project
when setStoragePath changes before persist. Fix by threading the resolved
project-root/path into ensureGitignore (add an optional projectRoot or targetDir
parameter) and update saveAccountsUnlocked to call ensureGitignore(path) instead
of the ambient version; update any callers (persist, setStoragePath flows) to
propagate the captured path and adjust getStoragePath usage in
saveAccountsUnlocked to use the passed-in storagePath consistently. Add a vitest
that switches between two temp git roots before persist to assert both the
account file and .gitignore are written to the same captured root, and ensure
tests cover Windows IO/EBUSY/429 retry behavior per lib/** guidelines.
- Around line 2136-2155: Capture the flagged-storage path at the start of the
transaction and use that pinned path when persisting flagged accounts so both
account and flagged files stay under the same transaction root: in
withAccountAndFlaggedStorageTransaction()/persist, alongside the
already-captured storagePath (from getStoragePath() used with
loadAccountsInternal and saveAccountsUnlocked), capture the flaggedStoragePath
(call the same resolver used elsewhere for flagged files) and replace the
ambient saveFlaggedAccountsUnlocked(flaggedStorage) call with a pinned-save that
writes to flaggedStoragePath; update any helper signatures (persist,
loadAccountsInternal usage) to accept the pinned flaggedStoragePath and ensure
cloneAccountStorageForPersistence/saveAccountsUnlocked usage is unchanged. Add a
vitest in test/storage.test.ts that calls
withAccountAndFlaggedStorageTransaction(), mutates the global/project storage
path between transaction start and calling persist(), and asserts both account
and flagged files were written under the original transaction root (also include
retries/handling for Windows EBUSY and transient 429 behaviors per IO
guidelines).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 62207bdd-c6c7-45cd-912e-52509921507a
📒 Files selected for processing (9)
lib/config.tslib/preemptive-quota-scheduler.tslib/storage.tsscripts/codex-routing.jsscripts/codex-supervisor.jsscripts/codex.jstest/codex-bin-wrapper.test.tstest/codex-supervisor.test.tstest/storage.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (2)
lib/**
⚙️ CodeRabbit configuration file
focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.
Files:
lib/preemptive-quota-scheduler.tslib/config.tslib/storage.ts
test/**
⚙️ CodeRabbit configuration file
tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.
Files:
test/storage.test.tstest/codex-bin-wrapper.test.tstest/codex-supervisor.test.ts
🪛 ast-grep (0.41.1)
test/storage.test.ts
[warning] 998-1002: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(
storage path mismatch: transaction path is ${escapeRegExp( testStoragePath, )}, active path is ${escapeRegExp(secondaryStoragePath)},
)
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
🔇 Additional comments (1)
scripts/codex-routing.js (1)
15-16: verify this scanner skips every value-taking codex option.
scripts/codex-routing.js:15-16andscripts/codex-routing.js:55-66still hard-code only-c/--configas consuming a value. if the upstream codex cli accepts any other top-level value-taking flag, that value becomes the "command" here andresume/forkwill be missed. please validate the allowlist against current codex help and add a vitest for one supported non-config flag before relying on this parser.Also applies to: 55-66
|
@coderabbitai review |
|
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
|
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
lib/storage.ts (1)
2146-2159:⚠️ Potential issue | 🟠 Majorcapture the flagged storage path at transaction start.
lib/storage.ts:2154still resolves the flagged file throughgetFlaggedAccountsPath(). if a handler callssetStoragePathDirect()beforepersist(), accounts are written back to the captured root, but flagged accounts follow the new root. that splits one combined transaction across two storage trees and makes rollback partial. please thread a captured flagged path throughsaveFlaggedAccountsUnlocked()and add the same drift regression forwithAccountAndFlaggedStorageTransaction()next totest/storage.test.ts:1095-1345.possible fix
async function saveFlaggedAccountsUnlocked( storage: FlaggedAccountStorageV1, + flaggedStoragePath = getFlaggedAccountsPath(), ): Promise<void> { - const path = getFlaggedAccountsPath(); + const path = flaggedStoragePath; // ... } return withStorageLock(async () => { const storagePath = getStoragePath(); + const flaggedStoragePath = join( + dirname(storagePath), + FLAGGED_ACCOUNTS_FILE_NAME, + ); // ... - await saveFlaggedAccountsUnlocked(flaggedStorage); + await saveFlaggedAccountsUnlocked(flaggedStorage, flaggedStoragePath);As per coding guidelines,
lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/storage.ts` around lines 2146 - 2159, In persist (and the surrounding transaction flow), capture the flagged accounts file path at the start (instead of calling getFlaggedAccountsPath() later) and pass that captured path into saveFlaggedAccountsUnlocked so flagged writes/rollbacks use the same root as the accounts write; update saveFlaggedAccountsUnlocked signature/usages to accept a flaggedPath parameter and use it instead of resolving getFlaggedAccountsPath(), and add a regression test alongside withAccountAndFlaggedStorageTransaction tests (test/storage.test.ts:1095-1345) that calls setStoragePathDirect() mid-transaction to assert both accounts and flagged accounts are written/rolled-back to the originally captured paths.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/codex-manager/settings-hub.ts`:
- Around line 2578-2582: Adding the new supervisor row shifts numeric hotkeys;
update the numeric-to-action mapping so the new first item (value: { type:
"toggle-session-supervisor" }, label built with formatDashboardSettingState and
UI_COPY.settings.experimentalSessionSupervisor) is bound to key "1" and all
subsequent shortcuts increment by one. Locate the hotkey dispatch table in
settings-hub (the block that currently dispatches 1/2/3 around the previous
305-313 area) and renumber the entries to account for the inserted supervisor
row, update the help copy in lib/ui/copy.ts to reflect the new numbering, and
update the failing test test/settings-hub-utils.test.ts to assert the revised
key-to-action mapping (ensure the supervisor toggle is tested for key "1" and
other actions shifted accordingly).
In `@lib/preemptive-quota-scheduler.ts`:
- Around line 27-29: The comment above
DEFAULT_PREEMPTIVE_QUOTA_REMAINING_PERCENT_5H and
DEFAULT_SECONDARY_REMAINING_PERCENT_THRESHOLD is inaccurate because it claims
the 7d window is less aggressive while both constants are currently set to 5;
update the comment to state that these are two separate thresholds (5h and
secondary/7d) which are currently equal rather than implying a more/less
aggressive relationship, and ensure the comment references the two symbols
DEFAULT_PREEMPTIVE_QUOTA_REMAINING_PERCENT_5H and
DEFAULT_SECONDARY_REMAINING_PERCENT_THRESHOLD so future changes remain clear.
In `@package.json`:
- Line 54: The smoke npm script "test:session-supervisor:smoke" currently runs
several supervisor/wrapper tests but omits the high-risk storage regressions
located in test/storage.test.ts (the path-binding and async-local test cases
covering regressions in lib/storage.ts around the path-binding and async-local
sections); update the script to either add those specific storage tests to the
command (include test/storage.test.ts or a new trimmed smoke file containing
just the path-binding and async-local cases) or extract the relevant test blocks
from test/storage.test.ts into a new smaller file (e.g.,
test/storage-smoke.test.ts) and add that filename to the
"test:session-supervisor:smoke" entry so the new storage regressions run in the
fast validation gate.
In `@test/settings-hub-utils.test.ts`:
- Around line 163-179: The removeDirectoryWithRetry helper currently only
retries ENOTEMPTY, EPERM, and EBUSY; update the retryableCodes Set inside
removeDirectoryWithRetry to also include "EACCES" and "EAGAIN" so transient
Windows deletion failures are retried (match behavior in
lib/codex-manager/settings-hub.ts); ensure the rest of the retry logic (attempt
count, backoff, final throw) remains unchanged.
---
Outside diff comments:
In `@lib/storage.ts`:
- Around line 2146-2159: In persist (and the surrounding transaction flow),
capture the flagged accounts file path at the start (instead of calling
getFlaggedAccountsPath() later) and pass that captured path into
saveFlaggedAccountsUnlocked so flagged writes/rollbacks use the same root as the
accounts write; update saveFlaggedAccountsUnlocked signature/usages to accept a
flaggedPath parameter and use it instead of resolving getFlaggedAccountsPath(),
and add a regression test alongside withAccountAndFlaggedStorageTransaction
tests (test/storage.test.ts:1095-1345) that calls setStoragePathDirect()
mid-transaction to assert both accounts and flagged accounts are
written/rolled-back to the originally captured paths.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3312a3d7-dbb0-4f5c-a76a-26910b3d9d96
📒 Files selected for processing (18)
lib/codex-manager/settings-hub.tslib/config.tslib/preemptive-quota-scheduler.tslib/quota-probe.tslib/schemas.tslib/storage.tslib/ui/copy.tspackage.jsonscripts/codex-routing.jsscripts/codex-supervisor.jsscripts/codex.jstest/codex-bin-wrapper.test.tstest/codex-supervisor.test.tstest/plugin-config.test.tstest/preemptive-quota-scheduler.test.tstest/quota-probe.test.tstest/settings-hub-utils.test.tstest/storage.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (2)
lib/**
⚙️ CodeRabbit configuration file
focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.
Files:
lib/schemas.tslib/quota-probe.tslib/codex-manager/settings-hub.tslib/preemptive-quota-scheduler.tslib/ui/copy.tslib/storage.tslib/config.ts
test/**
⚙️ CodeRabbit configuration file
tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.
Files:
test/preemptive-quota-scheduler.test.tstest/plugin-config.test.tstest/settings-hub-utils.test.tstest/quota-probe.test.tstest/storage.test.tstest/codex-bin-wrapper.test.tstest/codex-supervisor.test.ts
🪛 ast-grep (0.41.1)
test/storage.test.ts
[warning] 998-1002: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(
storage path mismatch: transaction path is ${escapeRegExp( testStoragePath, )}, active path is ${escapeRegExp(secondaryStoragePath)},
)
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
🔇 Additional comments (33)
test/codex-bin-wrapper.test.ts (4)
68-125: fixture setup is now hermetic with storage.js stub.the
writeSupervisorRuntimeFixtureattest/codex-bin-wrapper.test.ts:116-124properly stubsgetStoragePath()to return a fixture-local path. this addresses the previous concern about supervisor locks leaking to real user paths on windows.the stub modules for config, accounts, quota-probe, and storage are minimal but sufficient for the wrapper routing tests.
632-649: supervisor marker assertion validates the supervisor path is actually taken.the test at
test/codex-bin-wrapper.test.ts:648asserts the marker file contains"supervisor\n", which is only written by the stubAccountManager.loadFromDisk()inwriteSupervisorRuntimeFixture. this ensures the test fails ifscripts/codex.jsstops honoring the supervisor flag.
669-691: exit code 130 sentinel behavior is correctly tested.
test/codex-bin-wrapper.test.ts:688-690validates that:
- exit code is 130
- no
FORWARDED:output (supervisor handled it)- no auto-sync marker file exists
this covers the abort-sentinel path in
scripts/codex.js.
712-741: double-sync prevention test looks good.
test/codex-bin-wrapper.test.ts:740asserts exactly one"sync\n"line in the marker file after a supervisor-forwarded interactive command. this validates the auto-sync deduplication logic.scripts/codex-supervisor.js (14)
90-112: abortablePromise cleanup is correct.
scripts/codex-supervisor.js:103attaches.finally(cleanup)to the resolved promise, ensuring the abort listener is removed on normal completion. this addresses the previous memory leak concern.
47-67: sleep implementation handles abort and cleanup correctly.the
finishcallback atscripts/codex-supervisor.js:55-61ensures both the timer and event listener are cleaned up regardless of how the sleep ends.
410-436: storage path resolution now fails fast on errors.
scripts/codex-supervisor.js:416-427throws descriptive errors whenruntime.getStoragePath()throws or returns empty, preventing the supervisor from locking a different file thanAccountManageruses. this addresses the previous cross-lock concern.
487-566: heartbeat fail-fast on lost lease is implemented.
scripts/codex-supervisor.js:517-534detects when the lock file disappears or changes owner and callsstopWithError()which rejects the heartbeat promise. the caller inwithSupervisorStorageLockat line 649-654 catches this and aborts the linked controller, propagating failure to the critical section.
1042-1054: probe cache now reserves the slot before async work.
scripts/codex-supervisor.js:1042-1053creates and stores apendingPromisein the cache before starting the actual fetch. this prevents duplicate fetches when concurrent callers race for the same account.
1847-1856: monitor loop backs off on probe unavailability.
scripts/codex-supervisor.js:1848-1855sleeps forpollMsbefore continuing when the quota probe throwsQuotaProbeUnavailableError. this prevents the hot-loop that would hammer storage and the probe endpoint.
1928-1956: child wait is properly raced against outer signal.
scripts/codex-supervisor.js:1928-1944wraps the child exit promise inabortablePromise(...)using the outersignal. when the outer signal aborts, the catch block at lines 1945-1956 kills the child and returns exit code 130.
2138-2171: signal handlers stay installed during supervision.
scripts/codex-supervisor.js:2154usesawaitbeforerunCodexSupervisorWithRuntime, ensuring the finally block at lines 2168-2169 only removes SIGINT/SIGTERM handlers after supervision completes. this addresses the previous concern about premature handler removal.
726-731: dedup key excludes refreshToken.
scripts/codex-supervisor.js:726-730builds the dedup key fromindex,accountId, and
330-338: resume args now preserve caller cli options.
scripts/codex-supervisor.js:331-337usessplitCodexCommandArgsto extract leading config args (e.g.,-c profile="dev") and trailing args, then rebuilds the resume command while preserving them. this addresses the previous concern about dropped flags.
1691-1692: windows signal behavior documented.
scripts/codex-supervisor.js:1691adds a comment explaining that SIGTERM is already forceful on windows, clarifying why the SIGKILL step is retained as a unix fallback.
1961-1966: monitor failure overrides success exit code.
scripts/codex-supervisor.js:1965returns1when the child exited with0but the monitor failed. this ensures the supervisor doesn't report success when quota monitoring crashed mid-session.
157-168: bypass detection restricted to top-level flags.
scripts/codex-supervisor.js:158useshasTopLevelHelpOrVersionFlagfromcodex-routing.jswhich only scans tokens before the primary command or--separator. this preventscodex exec -- --versionfrom bypassing account gating.
1316-1323: probe-unavailable accounts are not cooled down.
scripts/codex-supervisor.js:1316-1318collects accounts withreason: "probe-unavailable"into a separate list without callingmarkAccountUnavailable. only accounts with actual probe errors (line 1321) are marked. this prevents transient probe outages from cooling down every account.test/codex-supervisor.test.ts (15)
1-54: test setup is clean with proper env var capture/restore.
test/codex-supervisor.test.ts:19-21captures original env values and249-260restores them in afterEach. thecreateDeferredhelper at lines 48-54 enables deterministic synchronization.
56-150: FakeManager covers the supervisor's manager interface.
test/codex-supervisor.test.ts:56-150implements all methods the supervisor calls:getAccountsSnapshot,getCurrentOrNextForFamilyHybrid,markRateLimitedWithReason, etc. thecoolingDownUntiltracking enables quota exhaustion tests.
152-247: createFakeRuntime enables fine-grained control over quota probing.
test/codex-supervisor.test.ts:165-167provides hooks foronFetch,onFetchStart, andwaitForFetchByAccountIdthat allow tests to coordinate and assert on probe ordering without wall-clock timing.
575-630: lock serialization test uses deterministic coordination.
test/codex-supervisor.test.ts:585-630uses deferred promises instead of sleep/timing to verify that the second caller waits until the first releases. theorderarray at line 629 provides a clear assertion on serialization.
481-573: windows lock retry behavior is covered.
test/codex-supervisor.test.ts:481-521and523-573test that the supervisor retries after transientEPERM/EBUSYerrors on unlink and open. this covers the windows filesystem edge cases required by the coding guidelines.
967-1043: token refresh race regression is covered.
test/codex-supervisor.test.ts:1024-1025mutates the stored account'srefreshTokenandaccessbetweenprepareResumeSelectionandcommitPreparedSelection. the assertion at lines 1034-1042 verifies the same logical account is resolved with the refreshed credentials.
1485-1543: heartbeat fail-fast test exercises lost lease detection.
test/codex-supervisor.test.ts:1524deletes the lock file mid-section, and line 1537-1538 asserts the critical section rejects with the expected "lost lease" error. the second section at line 1540 confirms lock recovery.
1629-1701: probe outage pacing test bounds fetch attempts.
test/codex-supervisor.test.ts:1691-1693asserts exactly 1 fetch after startup, then waits 20ms and confirms still 1 fetch. this verifies the backoff sleep is taken before the second probe attempt, addressing the previous concern about only checking exit code.
860-907: overlap test uses promise coordination instead of timing.
test/codex-supervisor.test.ts:869-876usesnearLimitGateandprobeStarteddeferreds to verify that selection probing starts before restart finishes. noperformance.now()or elapsed time assertions.
1155-1223: batch parallelism test uses explicit coordination.
test/codex-supervisor.test.ts:1207-1214pollsstartedAccounts.sizeusingsetImmediateto confirm all 3 degraded probes started in parallel before any resolved. no wall-clockelapsedMsassertions.
1402-1439: storage path error propagation is tested.
test/codex-supervisor.test.ts:1402-1439covers both throwing and empty-return cases forruntime.getStoragePath(). lines 1431-1437 verify the error propagates and the default lock file is not created.
1545-1600: monitor failure exit code test validates the override.
test/codex-supervisor.test.ts:1592asserts the result is1even though the child exited with0, and line 1593-1594 verifies the expected stderr output about monitor failure.
1331-1377: signal propagation test verifies child is killed on abort.
test/codex-supervisor.test.ts:1373-1375asserts the supervisor exits with130and the child'skillmethod was called when the outer signal aborts. theHangingChildclass at lines 1337-1348 only exits when explicitly killed.
1281-1302: nested--versionbypass regression is covered.
test/codex-supervisor.test.ts:1287-1301verifies thatcodex exec -- --versiondoes NOT bypass account gating. the test expectsensureLaunchableAccount reachedto be thrown, confirming the supervisor attempts account selection.
1-1729: comprehensive supervisor test suite.the test file covers:
- lock serialization (
test/codex-supervisor.test.ts:575-630)- windows filesystem retries (
test/codex-supervisor.test.ts:481-573)- token refresh race (
test/codex-supervisor.test.ts:967-1043)- heartbeat fail-fast (
test/codex-supervisor.test.ts:1485-1543)- probe outage pacing (
test/codex-supervisor.test.ts:1629-1701)- signal propagation (
test/codex-supervisor.test.ts:1331-1377)all tests use deterministic coordination via deferred promises or explicit mocking instead of wall-clock timing.
| { | ||
| label: `${formatDashboardSettingState(draft.codexCliSessionSupervisor ?? BACKEND_DEFAULTS.codexCliSessionSupervisor ?? false)} ${UI_COPY.settings.experimentalSessionSupervisor}`, | ||
| value: { type: "toggle-session-supervisor" }, | ||
| color: "yellow", | ||
| }, |
There was a problem hiding this comment.
renumber the experimental hotkeys with this new first item.
adding the supervisor row at lib/codex-manager/settings-hub.ts:2578-2582 shifts the numeric shortcuts, but lib/codex-manager/settings-hub.ts:305-313 still dispatches 1/2/3 as sync/backup/guard. with the updated help in lib/ui/copy.ts:97-101, pressing 1 now runs sync instead of toggling the supervisor, and test/settings-hub-utils.test.ts:754-791 still locks in the stale mapping.
possible fix
function mapExperimentalMenuHotkey(
raw: string,
): ExperimentalSettingsAction | undefined {
- if (raw === "1") return { type: "sync" };
- if (raw === "2") return { type: "backup" };
- if (raw === "3") return { type: "toggle-refresh-guardian" };
+ if (raw === "1") return { type: "toggle-session-supervisor" };
+ if (raw === "2") return { type: "sync" };
+ if (raw === "3") return { type: "backup" };
+ if (raw === "4") return { type: "toggle-refresh-guardian" };
if (raw === "[" || raw === "-") return { type: "decrease-refresh-interval" };
if (raw === "]" || raw === "+") return { type: "increase-refresh-interval" };As per coding guidelines, lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| { | |
| label: `${formatDashboardSettingState(draft.codexCliSessionSupervisor ?? BACKEND_DEFAULTS.codexCliSessionSupervisor ?? false)} ${UI_COPY.settings.experimentalSessionSupervisor}`, | |
| value: { type: "toggle-session-supervisor" }, | |
| color: "yellow", | |
| }, | |
| function mapExperimentalMenuHotkey( | |
| raw: string, | |
| ): ExperimentalSettingsAction | undefined { | |
| if (raw === "1") return { type: "toggle-session-supervisor" }; | |
| if (raw === "2") return { type: "sync" }; | |
| if (raw === "3") return { type: "backup" }; | |
| if (raw === "4") return { type: "toggle-refresh-guardian" }; | |
| if (raw === "[" || raw === "-") return { type: "decrease-refresh-interval" }; | |
| if (raw === "]" || raw === "+") return { type: "increase-refresh-interval" }; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/codex-manager/settings-hub.ts` around lines 2578 - 2582, Adding the new
supervisor row shifts numeric hotkeys; update the numeric-to-action mapping so
the new first item (value: { type: "toggle-session-supervisor" }, label built
with formatDashboardSettingState and
UI_COPY.settings.experimentalSessionSupervisor) is bound to key "1" and all
subsequent shortcuts increment by one. Locate the hotkey dispatch table in
settings-hub (the block that currently dispatches 1/2/3 around the previous
305-313 area) and renumber the entries to account for the inserted supervisor
row, update the help copy in lib/ui/copy.ts to reflect the new numbering, and
update the failing test test/settings-hub-utils.test.ts to assert the revised
key-to-action mapping (ensure the supervisor toggle is tested for key "1" and
other actions shifted accordingly).
|
@coderabbitai review |
|
✅ Actions performedFull review triggered. |
|
Superseded by the split cleanup PRs:
Closing this bloated branch so review can continue on the narrowed replacements. |
|
Superseded by #147. I rebuilt the session-supervisor work from current |
Summary
codex resumeinstead of forcing a cold restartcodexCliSessionSupervisorWhat Changed
scripts/codex-supervisor.jsand wired it throughscripts/codex.jsfor interactive-session supervisionscripts/codex-routing.jscodexCliSessionSupervisorconfig/schema/settings togglelib/storage.ts, including path-safety checks and backup snapshot selection fixesValidation
npm run lintnpm run typechecknpm testnpm test -- test/documentation.test.tsnpm run buildnpm test -- --run test/codex-supervisor.test.ts test/codex-bin-wrapper.test.ts test/storage.test.tsDocs and Governance Checklist
docs/getting-started.mdupdated (if onboarding flow changed)docs/features.mdupdated (if capability surface changed)docs/reference/*pages updated (if commands/settings/paths changed)docs/upgrade.mdupdated (if migration behavior changed)SECURITY.mdandCONTRIBUTING.mdreviewed for alignmentRisk and Rollback
git-plan/05-fast-supervised-resumeor revert PR add session resume supervisor for faster codex rotation #139 before enabling the feature flag anywhereAdditional Notes
71f5b3fand all live review threads were resolved before rerunning CodeRabbitnote: greptile review for oc-chatgpt-multi-auth. cite files like
lib/foo.ts:123. confirm regression tests + windows concurrency/token redaction coverage.Greptile Summary
this PR lands the experimental session-resume supervisor — a new
codex.jswrapper that monitors a running codex process, detects quota exhaustion via periodic probe, pre-warms a successor account, and relaunches withcodex resume <sessionId>so the user's session survives account rotation. the feature is gated behindcodexCliSessionSupervisor: falseby default and exposed through the experimental settings menu. alongside the supervisor, the PR hardenslib/storage.tstransaction handling (storage path pinned at transaction open,state.active=falsein finally, hard error on path mismatch during export) and fixes thelatestValidSnapshotstable sort.key changes
scripts/codex-supervisor.js(2205 lines): full supervisor with file-lock heartbeat, probe-snapshot cache, session JSONL binding, pre-warm selection, restart loop (max 16 restarts), Windows-safesafeUnlinkwith EPERM/EBUSY retryscripts/codex-routing.js: clean arg-parser helpers (findPrimaryCodexCommand,splitCodexCommandArgs,hasTopLevelHelpOrVersionFlag) used by the supervisor to route commandsscripts/codex.js: supervisor wired into main path;autoSyncManagerActiveSelectionIfEnabledcorrectly scoped to avoid double-sync on supervised exitslib/storage.ts:storagePathsnapshotted at transaction start and threaded through to all save/load calls;exportAccountsnow throws on path mismatch instead of silently loading from a different path;latestValidSnapshotstable sort by original index breaksmtimeMsties deterministicallylib/quota-probe.ts:AbortSignalsupport added tofetchCodexQuotaSnapshot— per-model abort check, proper listener cleanup infinallylib/config.ts/lib/schemas.ts/lib/preemptive-quota-scheduler.ts:codexCliSessionSupervisortoggle plumbed through config, schema, and defaults;DEFAULT_PREEMPTIVE_QUOTA_REMAINING_PERCENT_5Hexported for shared useissues found
getSessionBindingEntryPassessecond pass still reads every JSONL file whensessionIdis known but the rollout path is not yet cached — Windows antivirus risk during initial session capture (mitigated by the listing cache, but second-pass dedup would eliminate redundant file opens)snapshotProbeCachecan hold TTL-valid entries from a rotated-away account if the cache window overlaps the pre-warm/rotation boundary; cleared correctly inmarkAccountUnavailablebut not flushed between outer restart iterationsQuotaProbeUnavailableError(full network outage),ensureLaunchableAccountspins through all 32 attempts with no feedback and no early exit — up to ~80s of silent blocking; no vitest coverage for this pathnpm testandnpm run buildare unchecked in the PR checklist; thelatestValidSnapshottiebreaker and the all-unavailable probe path lack targeted test coverageConfidence Score: 3/5
Important Files Changed
Sequence Diagram
sequenceDiagram participant U as User (CLI) participant CJ as codex.js participant SUP as codex-supervisor.js participant RT as Runtime (dist/) participant AM as AccountManager participant LOCK as .supervisor.lock participant CH as Child Codex Process participant FS as ~/.codex/sessions/ U->>CJ: codex [args] CJ->>SUP: runCodexSupervisorIfEnabled() SUP->>RT: loadSupervisorRuntime() RT-->>SUP: runtime (config, AccountManager, probe) SUP->>RT: getCodexCliSessionSupervisor(pluginConfig) alt supervisor disabled SUP-->>CJ: null CJ->>CH: forwardToRealCodex() else supervisor enabled SUP->>LOCK: withSupervisorStorageLock() LOCK-->>SUP: lease + heartbeat SUP->>AM: loadFromDisk() AM-->>SUP: manager + currentAccount LOCK-->>SUP: release lock SUP->>RT: ensureLaunchableAccount() RT->>AM: getProbeCandidateBatch() RT->>RT: probeAccountSnapshot() × batch RT->>LOCK: withLockedManager (mark unavailable / persist) RT-->>SUP: {ok, account, manager} SUP->>CH: spawnRealCodex(codexBin, args) loop monitor poll (300ms) SUP->>FS: waitForSessionBinding / listJsonlFiles FS-->>SUP: sessionId + rolloutPath SUP->>LOCK: withLockedManager → loadFromDisk LOCK-->>SUP: freshManager SUP->>RT: probeAccountSnapshot(currentAccount) RT-->>SUP: quotaSnapshot alt quota near exhaustion (prewarm) SUP->>SUP: maybeStartPreparedResumeSelection() SUP-->>SUP: preparedResumeSelectionPromise (background) end alt quota exhausted (rotate) SUP->>CH: requestChildRestart (SIGINT→SIGTERM→SIGKILL) CH-->>SUP: exit end end CH-->>SUP: exit(code) SUP->>LOCK: markCurrentAccountForRestart SUP->>SUP: commitPreparedSelection / ensureLaunchableAccount SUP->>CJ: buildResumeArgs(sessionId, args) Note over SUP,CH: restart loop (max 16 times) SUP-->>CJ: final exitCode CJ->>CJ: autoSyncManagerActiveSelectionIfEnabled() endComments Outside Diff (8)
lib/preemptive-quota-scheduler.ts, line 27-29 (link)default 5h threshold silently doubled for all users
DEFAULT_PREEMPTIVE_QUOTA_REMAINING_PERCENT_5His raised from5to10, and all three callsites that previously hard-coded5now pull this constant:PreemptiveQuotaSchedulerconstructor,DEFAULT_PLUGIN_CONFIG.preemptiveQuotaRemainingPercent5h, and the fallback ingetPreemptiveQuotaRemainingPercent5h. this means any user who has never explicitly setpreemptiveQuotaRemainingPercent5hwill now start preemptive rotation at 10% remaining instead of 5% — a 2× change in aggressiveness that is not gated behindcodexCliSessionSupervisorand will affect every existing deployment on upgrade.the
docs/features.mdchecklist entry is unchecked; if this is intentional, it should be documented and the upgrade guide should call it out explicitly. if the higher threshold is only needed by the supervisor's prewarm logic, consider keeping the scheduler default at5and only using the10%constant inside the supervisor'scomputeQuotaPressurecall, so the two behaviors remain independently tunable.scripts/codex-supervisor.js, line 695-700 (link)fn()outlives the lock after heartbeat rejectionwhen
heartbeat.promiserejects first (e.g. because Windows antivirus held the lock file long enough for the write inwriteSupervisorLockPayloadto fail, settingfailedError),Promise.racerejects and thefinallyblock deletes the lock. butfn()is still running in the background — there is no way to cancel it because it receives no signal.if
fn()ismarkCurrentAccountForRestartor the commit step insideensureLaunchableAccount(both of which callmanager.saveToDisk()after doing mutations), a second supervisor instance can acquire the lock and also callsaveToDisk()concurrently. the two writes race against the same JSON file and one will silently clobber the other.AGENTS.md requires concurrency issues to be explicitly called out. the minimum fix is to propagate the
signalintofnso it can abort in-flight storage writes when the heartbeat loses the lease:and thread
lockSignalthrough to everysaveToDisk()path inside the mutator.scripts/codex-supervisor.js, line 501-511 (link)two-pass path re-scans the same recent files
sortedEntriesis a superset ofrecentEntries— every file in the first pass also appears in the second. the calling loop infindSessionBindingrunsmatchSessionBindingEntry(which opens aReadStream+readlineinterface) for every entry in each pass, so files from the first pass get opened a second time immediately afterward.on Windows, re-opening the same file handle within the same tick/event loop tick can hit antivirus EBUSY locks, and AGENTS.md calls this exact risk out for rapid file-open patterns. with thousands of session files this creates visible lag even on unix.
fix: filter overlap from the second pass so recently-scanned entries are not re-read:
no existing vitest case catches this because the test at line 424 only validates the array structure (lengths) of the returned passes, not that the two arrays are disjoint.
scripts/codex-supervisor.js, line 2276-2315 (link)child SIGINT-exit triggers restart probe when supervisor signal is not aborted
result.exitCodeis 130 andresult.signalis"SIGINT"when the child was killed by SIGINT (e.g.kill -INT <child_pid>from a script, or the child re-raises SIGINT). in that case the supervisor's ownsignalmay not be aborted (theprocess.once("SIGINT", abort)only fires if the supervisor process itself receives the signal from the terminal process group). the restart block runs, acquires the storage lock, and probes the current account — treating an operator-killed child as a quota-exhaustion event.adding a
result.signalguard closes this gap:no vitest coverage for the case where
result.signal === "SIGINT"butsignal.aborted === false.scripts/codex-supervisor.js, line 2434-2467 (link)unexpected storage errors propagate as uncaught exceptions instead of a non-zero exit code
the top-level catch only intercepts
AbortErrorand re-throws everything else. ifwithLockedManager(inside the monitor loop or the post-exit restart probe) throws a non-abort error — e.g. storage path resolution fails,AccountManager.loadFromDisk()throwsEACCES, or a disk-full condition hits the lock-file write — the exception propagates throughrunCodexSupervisorWithRuntime, exits thetryblock here with thethrow errorbranch, and crashes the node process as an unhandled rejection rather than returning exit code 1 and printing a diagnostic.codex.js'smain()caller has no outertry/catcharoundrunCodexSupervisorIfEnabled, so the process terminates with a stack trace visible to the user.consider wrapping with a fallback:
no vitest coverage for this path.
scripts/codex-supervisor.js, line 1261-1263 (link)TOCTOU race: concurrent lock acquisition survives stale-lock cleanup
between
isSupervisorLockStalereturningtrueandsafeUnlinkremoving the file, a concurrent supervisor can:fs.open(lockPath, "wx")→ fails (old file still present)isSupervisorLockStale→ reads the same stale payload, also returnstrueownerIdandexpiresAt(viafs.open("wx"))safeUnlink→ENOENT→ returnstrue(treated as success by the code at line 965–967)fs.open("wx")→ succeeds — now both A and B believe they hold the lockthe fix is to re-read and verify the payload
ownerIdhasn't changed before unlinking, e.g. read the lock again afterisSupervisorLockStaleand only unlink if theownerIdstill matches the stale process (or the file is still expired). this is a concurrency issue explicitly called out in our AGENTS.md guidance.scripts/codex-supervisor.js, line 824-826 (link)Fallback ignores env-var override — supervisor silently disabled when dist is absent
the
getCodexCliSessionSupervisorfallback closure on line 826 checks onlypluginConfig.codexCliSessionSupervisor === true. the real implementation inlib/config.tscallsresolveBooleanSetting(...)which also reads the corresponding environment variable. ifdist/lib/config.jsis absent or stale (e.g. first install beforenpm run build), the fallback ignores the env-var entirely. a user who sets the env-var in their shell to opt in gets the supervisor silently skipped with no diagnostic output.the same gap exists in the
getRetryAllAccountsRateLimitedandgetPreemptiveQuotaEnabledfallbacks. at minimum, log a warning when falling back so users can diagnose stale builds, or replicate the env-var check inside each fallback lambda usingparseBooleanEnv.lib/storage.ts, line 827-838 (link)latestValidSnapshotfix may silently returnundefinedfor callers expecting a snapshotthe old implementation returned
BackupSnapshotMetadata | undefinedbut the new one chains[0]?.snapshot. this is safe when the filtered list is empty (both returnundefined). however, every call-site that previously assumed the return was always a fullBackupSnapshotMetadataobject when truthy will continue to work — but the type now explicitly goes through an intermediate{snapshot, index}object and the?.on line ending is the only guard.the real risk: if a future caller adds
const latest = latestValidSnapshot(snapshots); if (latest) doSomethingWith(latest.valid)—latestis now the rawBackupSnapshotMetadata(unwrapped via?.snapshot) so.validstill works. this is fine. but the PR checklist showsnpm testwas not run — so no test has validated the stable-sort tiebreaker behaviour end-to-end. thegetBackupMetadataexport added totest/storage.test.tswould allow a targeted unit test for the tie scenario (two valid snapshots with identicalmtimeMs). missing vitest coverage for this specific path.Prompt To Fix All With AI
Last reviewed commit: "fix remaining review..."