fix(sessions): durably write-ahead the start prompt so it can't be lost by pauldambra · Pull Request #2666 · PostHog/code

pauldambra · 2026-06-14T13:28:54Z

Problem

Starting a local task run can lose the user's prompt. The prompt only exists in renderer memory until the session has fully initialized and session/prompt has been delivered (a separate step after agent.start). If session init exceeds the 30s SESSION_VALIDATION_TIMEOUT_MS — common in large monorepos, where loading skills/commands blows the budget — or the app is reloaded/quit/crashes during the auto-retry window, the prompt is gone and the run is left stranded in queued.

This is the root cause behind "I lost my prompt while starting a task" reports in the monorepo: session init repeatedly times out and the only copy of the prompt is in-memory state tied to a session that never came up.

Why: the prompt is the one thing in the start-a-task flow a user can't cheaply reproduce, yet nothing durable owned it before delivery.

Changes

Persist the prompt as a durable write-ahead record before any session/agent work, and clear it only once it has actually been delivered.

New PendingPromptStore contract in @posthog/core and an electronStorage-backed zustand store in @posthog/ui (keyed by taskId, mirrors the existing sessionConfigStore).
createNewLocalSession now: writes the prompt ahead before createTaskRun/agent.start; records the taskRunId once the run exists; recovers a saved prompt when a later connect arrives without one; clears the record after sendPrompt resolves.
A persisted record means "this prompt is owed delivery" — the hook for a future server-side reconciler to re-drive orphaned queued runs.

Out of scope (follow-ups): bumping/adapting the 30s init timeout, and the server-side reconciler + startup sweep that extends the guarantee to the case where the device never returns.

How did you test this?

pnpm --filter @posthog/core typecheck and pnpm --filter @posthog/ui typecheck — clean.
biome check on all touched files — clean.
New unit tests for the store (pendingPromptStore.test.ts).
Full suites pass: @posthog/core 1504 tests, @posthog/ui 726 tests (including both modified host suites).

Automatic notifications

Publish to changelog?
Alert Sales and Marketing teams?

Created with PostHog Code from a Slack thread

Starting a local task run can lose the user's prompt: it only exists in memory until the session has fully initialized and `session/prompt` has been delivered. If `agent.start` exceeds the 30s `SESSION_VALIDATION_TIMEOUT_MS` (common in large monorepos) or the app is reloaded/quit/crashes during the retry window, the prompt is gone and the run is stranded in `queued`. Persist the prompt as a durable write-ahead record BEFORE any session/agent work, and clear it only once it has actually been delivered: - New `PendingPromptStore` contract in core and an `electronStorage`-backed zustand store in ui (keyed by taskId, mirrors sessionConfigStore). - `createNewLocalSession` writes the prompt ahead before `createTaskRun` / `agent.start`, records the taskRunId once the run exists, recovers a saved prompt when a later connect arrives without one, and clears the record after `sendPrompt` resolves. A persisted record means "this prompt is owed delivery", which is also the hook for a future server-side reconciler to re-drive orphaned runs. Generated-By: PostHog Code Task-Id: 22d61751-8d18-4d56-a79e-4741045320c1

github-actions · 2026-06-14T13:29:20Z

React Doctor found no issues in the changed files. 🎉

_{Reviewed by React Doctor for commit 9ec4d33.}

greptile-apps · 2026-06-14T13:33:05Z

Prompt To Fix All With AI

Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
packages/core/src/sessions/sessionService.ts:1085-1130
**Duplicate save object and drifting `createdAt`**

All nine fields in the first `pendingPrompts.save` call are repeated verbatim in the second — the only difference is `taskRunId`. Because both calls evaluate `recovered?.createdAt ?? Date.now()` independently, the second call's `createdAt` is captured *after* the `await client.createTaskRun(taskId)` round-trip, so the value that actually lands in persistent storage is later than "when the prompt was first written ahead". If this timestamp is ever used for staleness checks or a future reconciler, it will undercount age. Building the base record once before both saves fixes the duplication and pins the timestamp correctly.

### Issue 2 of 2
packages/ui/src/features/sessions/sessionServiceHost.test.ts:189-201
**Write-ahead behaviour is not asserted in the host integration tests**

The `pendingPromptStore` mock is wired in but its methods are never checked in any `connectToTask` test. The new critical-path behaviours — `save` called before `createTaskRun`, `save` called again with `taskRunId` after the run exists, `remove` called only after `sendPrompt` resolves, and the recovery path where `get` returns a saved record and re-drives the prompt — are all exercised by the production code under test but invisible to the suite. A future regression here would pass every existing test.

_{Reviews (1): Last reviewed commit: "fix(sessions): durably write-ahead the s..." | Re-trigger Greptile}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1e9cc0082f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

pauldambra

Note

🤖 Automated comment by QA Swarm — not written by a human

QA Swarm multi-perspective review (paul-reviewer, xp-reviewer, security-audit). qa-team was unavailable and skipped. See inline comments + the summary comment.

pauldambra · 2026-06-14T13:37:30Z

Note

🤖 Automated comment by QA Swarm — not written by a human

Multi-perspective review: paul-reviewer, xp-reviewer, security-audit (qa-team unavailable, skipped)

Verdict: 💬 APPROVE WITH NITS

Solid, well-motivated change that slots cleanly into the existing sessionConfigStore/adapterStore shape. No correctness or security blockers — all findings are non-blocking improvements. Both voice reviewers independently said "ship as you see fit".

Key findings

🟡 MEDIUM — duplicated write-ahead save block (convergent: paul + xp): the PendingPromptRecord is assembled twice, differing only by taskRunId. Build it once and re-save with { ...pending, taskRunId }.
🟡 MEDIUM — recovery is silent: no log/track when a prompt is actually recovered, so we can't tell the safety net is catching anything (or whether the 30s-timeout theory holds in the wild).
🟡 MEDIUM — recovery path untested: get() is mocked to undefined everywhere, so createNewLocalSession's recovery branch never executes in a test. Add one where get() returns a record and assert deliver + remove.
🟢 LOW — "impossible to lose" oversells it: persist is async fire-and-forget and can be dropped pre-hydration; soften the wording.
🟢 LOW — YAGNI: listPrompts() / taskRunId have no consumer until the future reconciler.
⚪ NIT: removePrompt guard diverges from the sibling removeConfigOptions.

Convergence

The duplicated save block was flagged independently by paul and xp → highest-confidence finding.

Reviewer summaries

Reviewer	Assessment
👤 paul	approve with comments — add a recovery log line + an integration test, soften "impossible", dedupe the save block
📐 xp	sound write-ahead design; main concern is the say-it-once duplication in an already-long method; minor YAGNI on `list()`/`taskRunId`
🛡 security-audit	no findings — key is a server-generated task UUID (no cross-tenant adoption), persisted via the same encrypted `electronStorage` the existing config store already uses

Automated by QA Swarm — not a human review

…overy, soften wording From qa-swarm review of this PR: - Build the PendingPromptRecord once and re-save with `{ ...pending, taskRunId }` instead of assembling the 9-field payload twice (convergent paul + xp). - Log when a written-ahead prompt is actually recovered, so the safety net is observable in the wild. - Soften the "impossible to lose" wording in the doc + comment: persistence is async/best-effort, so it's "very unlikely", not guaranteed. Deferred to a follow-up: an integration test exercising the recovery branch of createNewLocalSession (currently the store mock returns undefined everywhere). Generated-By: PostHog Code Task-Id: 22d61751-8d18-4d56-a79e-4741045320c1

pauldambra · 2026-06-14T13:39:32Z

Note

🤖 Automated triage by PR Shepherd — not written by a human

Actioned the qa-swarm findings in bc252db:

✅ Dedupe the write-ahead save (convergent paul + xp) — record built once, re-saved with { ...pending, taskRunId }.
✅ Recovery is observable — log.info("Recovered a written-ahead prompt…", { taskId, ageMs }) when the safety net catches a prompt.
✅ Softened "impossible to lose" → "very unlikely" in both the doc comment and the contract, noting persistence is async/best-effort.

Deferred (author's call):

⏭ Integration test for the recovery branch — worth pinning, but it means making the store mock return a record in the 950-line host harness; left for a focused follow-up rather than auto-applied.
⏭ YAGNI on listPrompts()/taskRunId and the removePrompt guard divergence — both are intentional groundwork for the server-side reconciler; keeping as-is.

Tests stay green: core 162 (sessions), ui host+store suites 122.

…elivering From Codex review of this PR: sendPrompt can resolve without actually delivering — "queued" (session busy) or "rate_limited" (usage-limit error swallowed by sendLocalPrompt) — but we cleared the write-ahead record unconditionally after it resolved, reintroducing a narrow loss window. Only clear the record on confirmed delivery. Generated-By: PostHog Code Task-Id: 22d61751-8d18-4d56-a79e-4741045320c1

pauldambra · 2026-06-14T13:42:53Z

Note

🤖 Automated triage by PR Shepherd — not written by a human

Triaged the Codex + Greptile findings (they converged with qa-swarm on the duplicate save, already fixed):

✅ Codex P2 — clear only on real delivery (ebc75e6): sendPrompt can resolve without delivering ("queued" when the session is busy, "rate_limited" when sendLocalPrompt swallows the usage-limit error). We now only remove() the write-ahead record when stopReason is neither — otherwise the prompt is kept. Good catch; this was a real (narrow) loss window I'd reintroduced.
✅ Greptile P2 — duplicate save / drifting createdAt (bc252db): record built once, so Date.now() is evaluated once too.
⏭ Codex P1 — recover owed prompts on the resume-existing route (deferred): correct and important — routeLocalConnect returns resume-existing whenever a run id + log URL exist, so a run stranded by a timed-out first attempt routes to reconnectToLocalSession, which doesn't consult pendingPrompts. The client-side recovery in this PR only covers the create-new route. Full coverage of the resume path belongs with the server-side reconciler / startup-sweep follow-up already called out in the PR description (it needs "resumed run that never received its prompt" detection + tests). Tracking it there rather than widening this PR.
⏭ Greptile P2 / paul — integration test for the recovery branch (deferred): same as noted above; follow-up.

Net: two real fixes applied, the deeper resume-path recovery consciously deferred to the reconciler follow-up. Re-pushed at ebc75e6.

stamphog

Two unresolved bot inline comments raise concrete behavioral gaps that remain in the current diff: (1) on a rate-limit response, sendPrompt resolves without throwing but the prompt was never delivered — the unconditional pendingPrompts.remove() afterwards silently discards the write-ahead record the PR exists to preserve; (2) recovery logic only runs in createNewLocalSession, so if the app restarts after createTaskRun succeeds but before sendPrompt delivers (the task then has a latest_run.id), doConnect takes the reconnectToLocalSession path and the pending-prompt store is never consulted. Neither issue is a regression, but both undercut the stated safety guarantee and warrant human review before merge.

stamphog

The PR adds write-ahead durability for prompts, but the recovery logic is only wired into createNewLocalSession. If the app restarts after createTaskRun succeeds but before sendPrompt delivers, the task already has a run ID and doConnect takes the resume-existing route into reconnectToLocalSession — which never checks pendingPrompts. The pending record sits in storage unrecovered. Bot reviewers raised this as P1 and it visibly remains unaddressed in the current diff. This is the crash scenario the PR most needs to handle.

…th tests Resolves the remaining review findings on this PR: - Codex P1: a run stranded before its prompt was delivered routes to the resume-existing path on the next connect, not createNewLocalSession. Deliver the owed write-ahead prompt there too, once the resumed log is confirmed to carry no prompt yet. Extracted a shared deliverWrittenAheadPrompt() helper so the deliver-then-clear-on-confirmed-delivery logic lives once. - xp YAGNI: dropped the speculative list()/taskRunId surface (and the second save that only existed to populate taskRunId) — the reconciler that would consume them isn't in this PR. This also removes the duplicated save entirely. - Tests: cover the recovery branch end-to-end against the real store — delivery on the create-new route, delivery on the resume-existing route, and no delivery when the resumed run already has a prompt event. removePrompt's guard divergence from sessionConfigStore is intentional (the guard is the safer side), left as-is. Generated-By: PostHog Code Task-Id: 22d61751-8d18-4d56-a79e-4741045320c1

pauldambra · 2026-06-14T14:00:41Z

Note

🤖 Automated triage by PR Shepherd — not written by a human

Closed out the remaining review threads in 9ec4d33:

✅ Codex P1 — recover owed prompts on the resume-existing route: a run stranded before delivery routes to reconnectToLocalSession on the next connect, so it now delivers the owed write-ahead prompt there too — guarded on the resumed log carrying no prompt event yet. Pulled the deliver-then-clear-on-confirmed-delivery logic into a shared deliverWrittenAheadPrompt() so both routes share it.
✅ xp YAGNI — listPrompts() / taskRunId: dropped the speculative surface (and the second save that only existed to populate taskRunId) since the reconciler that would consume them isn't in this PR. Bonus: this removes the duplicated save entirely rather than just deduping it.
✅ recovery now under test: three e2e tests against the real store — delivery on create-new, delivery on resume-existing, and no delivery when the resumed run already has a prompt event.
⚪ removePrompt guard divergence (NIT): intentional — the guard is the safer side; left as-is.

All review threads are now actioned or resolved-by-design. Local: core 162 + ui sessions 224 green; core + ui typecheck clean.

github-actions

All P1/P2 bot concerns are addressed in the current diff: recovery on the reconnect route is present, deliverWrittenAheadPrompt correctly retains the write-ahead record on rate-limit and queued outcomes, and the key recovery paths are covered by new integration tests. Architecture follows the platform-interface-in-core / implementation-in-ui / wired-at-composition-seam pattern correctly.

pauldambra marked this pull request as ready for review June 14, 2026 13:29

greptile-apps Bot reviewed Jun 14, 2026

View reviewed changes

Comment thread packages/core/src/sessions/sessionService.ts Outdated

Comment thread packages/ui/src/features/sessions/sessionServiceHost.test.ts

chatgpt-codex-connector Bot reviewed Jun 14, 2026

View reviewed changes

Comment thread packages/core/src/sessions/sessionService.ts

Comment thread packages/core/src/sessions/sessionService.ts Outdated

pauldambra commented Jun 14, 2026

View reviewed changes

pauldambra added the Stamphog This will request an autostamp by stamphog on small changes label Jun 14, 2026 — with PostHog

stamphog Bot reviewed Jun 14, 2026

View reviewed changes

stamphog Bot removed the Stamphog This will request an autostamp by stamphog on small changes label Jun 14, 2026

stamphog Bot reviewed Jun 14, 2026

View reviewed changes

pauldambra added the Stamphog This will request an autostamp by stamphog on small changes label Jun 14, 2026 — with PostHog

github-actions Bot approved these changes Jun 14, 2026

View reviewed changes

pauldambra requested a review from a team June 14, 2026 14:08

Conversation

pauldambra commented Jun 14, 2026

Problem

Changes

How did you test this?

Automatic notifications

Uh oh!

github-actions Bot commented Jun 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

greptile-apps Bot commented Jun 14, 2026

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

pauldambra left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

pauldambra commented Jun 14, 2026

Verdict: 💬 APPROVE WITH NITS

Key findings

Convergence

Reviewer summaries

Uh oh!

pauldambra commented Jun 14, 2026

Uh oh!

pauldambra commented Jun 14, 2026

Uh oh!

stamphog Bot left a comment

Choose a reason for hiding this comment

Uh oh!

stamphog Bot left a comment

Choose a reason for hiding this comment

Uh oh!

pauldambra commented Jun 14, 2026

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions Bot commented Jun 14, 2026 •

edited

Loading