fix(backend): preserve snapshot events in loadEvents head+tail read for large JSONL files #1662
…or large JSONL files

The tail-read optimization (ffe4a21) only reads the last 2MB of event log files, truncating early snapshot/lifecycle events (MESSAGES_SNAPSHOT, RUN_STARTED, etc.) that the frontend needs for full conversation replay.

This adds a head+tail read strategy: scan the file head for snapshot and lifecycle events, scan the tail for streaming events, and merge them with type+timestamp deduplication.

Fixes chat history disappearing on navigation for sessions with large event logs.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
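The merge step described in the commit message can be sketched roughly as follows. This is an illustration rather than the PR's code: the names `event`, `dedupKey`, and `mergeHeadTail` are hypothetical, and the sketch assumes that only tail events duplicating a head event are dropped, so repeated streaming events inside the tail survive.

```go
package main

import "fmt"

// event mirrors a decoded JSONL line (hypothetical alias for this sketch).
type event = map[string]interface{}

// dedupKey builds a "type|timestamp" key. fmt.Sprint is used so that string
// and numeric timestamps both yield a usable key (an assumption of this
// sketch, not necessarily what the PR does).
func dedupKey(ev event) string {
	return fmt.Sprint(ev["type"]) + "|" + fmt.Sprint(ev["timestamp"])
}

// mergeHeadTail returns the head events followed by every tail event whose
// key was not already contributed by the head scan.
func mergeHeadTail(head, tail []event) []event {
	seen := make(map[string]struct{}, len(head))
	merged := make([]event, 0, len(head)+len(tail))
	for _, ev := range head {
		seen[dedupKey(ev)] = struct{}{}
		merged = append(merged, ev)
	}
	for _, ev := range tail {
		if _, dup := seen[dedupKey(ev)]; dup {
			continue // already replayed from the head scan
		}
		merged = append(merged, ev)
	}
	return merged
}
```

Keying on type and timestamp alone treats two distinct events with the same type and timestamp as one, which is why this sketch limits deduplication to head-versus-tail collisions.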
✅ Deploy Preview for cheerful-kitten-f556a0 canceled.
📝 Walkthrough
Event log replay for large
Changes
Event Log Head+Tail Loading
Sequence Diagram
sequenceDiagram
participant loadEvents
participant scanHeadSnapshotEvents
participant fastExtractType
participant eventFile as agui-events.jsonl
loadEvents->>eventFile: check file size
alt file > replayMaxTailBytes
loadEvents->>scanHeadSnapshotEvents: scan head for snapshots
scanHeadSnapshotEvents->>eventFile: read from start
scanHeadSnapshotEvents->>fastExtractType: extract type field
fastExtractType-->>scanHeadSnapshotEvents: type value
scanHeadSnapshotEvents-->>loadEvents: head events
loadEvents->>eventFile: seek to tail offset
loadEvents->>eventFile: scan tail events
loadEvents->>loadEvents: merge by type|timestamp key
loadEvents-->>loadEvents: deduplicated events
else file <= replayMaxTailBytes
loadEvents->>eventFile: read full file
end
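The diagram shows the head scan asking `fastExtractType` for each line's type before deciding whether to keep it. The PR's implementation is not shown here, so the following is only a guess at the general technique: pull the `type` value out of a raw JSONL line with byte searches instead of a full JSON decode. The name `extractTypeField` is hypothetical, and the sketch assumes compact JSON (no space after the colon) with the top-level `type` key appearing before any nested one.

```go
package main

import "bytes"

// typeMarker is the byte pattern that precedes the type value in compact JSON.
var typeMarker = []byte(`"type":"`)

// extractTypeField returns the value of the first "type":"..." pair in line,
// or "" when the pattern is absent. It does not validate the JSON.
func extractTypeField(line []byte) string {
	i := bytes.Index(line, typeMarker)
	if i < 0 {
		return ""
	}
	rest := line[i+len(typeMarker):]
	j := bytes.IndexByte(rest, '"')
	if j < 0 {
		return "" // unterminated value, treat as unknown
	}
	return string(rest[:j])
}
```

A scan built on this would decode only the lines whose type is one of the snapshot or lifecycle types (for example MESSAGES_SNAPSHOT or RUN_STARTED) and skip the rest without unmarshalling them.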
🚥 Pre-merge checks | ✅ 7 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (7 passed)
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
components/backend/websocket/agui_store.go (1)
273-338: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Keep the head scan out of the tail window.
For files in (replayMaxTailBytes, 2*replayMaxTailBytes), these two reads overlap. Appending `headEvents` before `tailEvents` then reorders overlap-range events and can replay lifecycle/snapshot events ahead of earlier stream events from the same file region. Clamp the head scan to `tailOffset` (or otherwise preserve file order) before merging.
Suggested fix
    -	headEvents := scanHeadSnapshotEvents(f)
    -
     	tailOffset := fileSize - replayMaxTailBytes
    +	headScanLimit := tailOffset
    +	if headScanLimit > replayMaxTailBytes {
    +		headScanLimit = replayMaxTailBytes
    +	}
    +	headEvents := scanHeadSnapshotEvents(f, headScanLimit)
     	if _, err := f.Seek(tailOffset, 0); err != nil {

    -func scanHeadSnapshotEvents(f *os.File) []map[string]interface{} {
    +func scanHeadSnapshotEvents(f *os.File, maxBytes int64) []map[string]interface{} {
    +	if maxBytes <= 0 {
    +		return nil
    +	}
     	if _, err := f.Seek(0, 0); err != nil {
     		return nil
     	}
    @@
    -	for bytesRead < replayMaxTailBytes {
    +	for bytesRead < maxBytes {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@components/backend/websocket/agui_store.go` around lines 273 - 338, The head scan can overlap the tail window causing event reordering; clamp the head scan to the start of the tail window by ensuring scanHeadSnapshotEvents only reads up to tailOffset (or by changing its signature to accept an io.Reader/limit) so it doesn't include bytes at or after tailOffset, then merge headEvents and tailEvents as before; modify the call to scanHeadSnapshotEvents and/or its implementation (referencing scanHeadSnapshotEvents and the tailOffset variable) to use an io.LimitedReader or seek+read-to-boundary so headEvents never overlaps the tail region.
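The clamp suggested in this comment, combined with the `io.LimitedReader` idea from the prompt, might look like the sketch below. `headScanLimit` and `readHeadLines` are hypothetical helpers written for illustration, and `maxTailBytes` stands in for `replayMaxTailBytes`; the real `scanHeadSnapshotEvents` may be structured differently.

```go
package main

import (
	"bufio"
	"io"
)

// headScanLimit returns how many bytes the head scan may read so that it
// stops at the start of the tail window: min(tailOffset, maxTailBytes).
func headScanLimit(fileSize, maxTailBytes int64) int64 {
	tailOffset := fileSize - maxTailBytes
	if tailOffset <= 0 {
		return 0 // the tail window already covers the whole file
	}
	if tailOffset > maxTailBytes {
		return maxTailBytes
	}
	return tailOffset
}

// readHeadLines returns the complete lines found in the first limit bytes
// of r. A line cut off at the boundary is dropped rather than returned
// partially; this sketch does not try to recover it.
func readHeadLines(r io.Reader, limit int64) []string {
	if limit <= 0 {
		return nil
	}
	br := bufio.NewReader(io.LimitReader(r, limit))
	var lines []string
	for {
		line, err := br.ReadString('\n')
		if err != nil {
			return lines // EOF at the limit, possibly mid-line
		}
		lines = append(lines, line[:len(line)-1])
	}
}
```

With a 2MB tail window and a 3MB file, `headScanLimit` yields 1MB, so the head scan ends exactly where the tail read begins. How a line that straddles that boundary is handled depends on how the tail read treats its first partial line, which is not visible in this thread.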
🧹 Nitpick comments (1)
components/backend/websocket/agui_store_test.go (1)
497-503: ⚡ Quick win
Add one overlap-range case and one numeric-timestamp case here.
These fixtures only write string timestamps and aim for a file well above 4MB, so the suite never covers the two brittle paths in this merge logic: numeric `timestamp` values from `types.BaseEvent` and overlapping head/tail windows for files just over 2MB. A subtest for each would pin both regressions.
Also applies to: 536-664
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@components/backend/websocket/agui_store_test.go` around lines 497 - 503, The test fixtures only produce string timestamps and miss two brittle merge paths; update the padding loop that builds events (look for paddingCount, paddingContent and the map with keys "type", "messageId", "delta", "timestamp") to add two additional cases: (1) an extra subtest that produces overlapping head/tail windows by inserting events with timestamps that create an overlap around the ~2MB boundary (so the file is just over 2MB and triggers the overlap-merge logic), and (2) a subtest that writes at least one event using a numeric timestamp value (the numeric form used by types.BaseEvent) instead of a string to exercise the numeric-timestamp merge path; ensure these events use the same event shape (types.EventTypeTextMessageContent, unique messageId like "msg-pad-overlap-#", and delta content) so existing merge assertions still run.
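The numeric-timestamp path matters because `encoding/json` decodes JSON numbers into `float64` when the target is `map[string]interface{}`, so a key builder that only type-asserts a string would see every numeric timestamp as empty. A minimal sketch of a key builder that covers both forms follows; `timestampKey` and `eventKey` are hypothetical names, not functions from the PR.

```go
package main

import (
	"encoding/json"
	"strconv"
)

// timestampKey normalizes the timestamp forms a decoded event may carry.
func timestampKey(v interface{}) string {
	switch ts := v.(type) {
	case string:
		return ts
	case float64: // default type for JSON numbers decoded into interface{}
		return strconv.FormatFloat(ts, 'f', -1, 64)
	case json.Number: // produced only when a Decoder uses UseNumber
		return ts.String()
	default:
		return ""
	}
}

// eventKey decodes one JSONL line and returns its "type|timestamp" key.
func eventKey(line []byte) (string, error) {
	var ev map[string]interface{}
	if err := json.Unmarshal(line, &ev); err != nil {
		return "", err
	}
	typ, _ := ev["type"].(string)
	return typ + "|" + timestampKey(ev["timestamp"]), nil
}
```

A subtest along these lines could feed the same event once with a string timestamp and once with a numeric one and check that the two produce distinct, non-empty keys.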
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 3c42f091-cf69-4c4c-b293-1e200d213eb2
📒 Files selected for processing (2)
components/backend/websocket/agui_store.go
components/backend/websocket/agui_store_test.go