feat(ui): smooth + cheap agent streaming (adaptive reveal + block-split markdown)#2716
feat(ui): smooth + cheap agent streaming (adaptive reveal + block-split markdown)#2716KarloAldrete wants to merge 3 commits into
Conversation
Each streamed token re-parsed the entire accumulated markdown of the active agent message (react-markdown + remark-gfm) — O(n) per token, O(n^2) per message. Past ~8k chars a single token costs ~19ms of parsing, over the 16.6ms frame budget, so long answers stutter and saturate the main thread. StreamingMarkdown splits the active message into top-level blocks (fenced code kept intact). Completed blocks keep a stable string so the memoized MarkdownRenderer skips them and only the growing tail re-parses; an open code fence renders as plain text until it closes, so syntax highlighting runs once rather than per token. On a 3k-char answer this is ~10x less markdown CPU (878ms -> 88ms total; 3.5ms -> 0.35ms per token) and the per-token cost stays flat instead of growing with message length. Completed messages still render via a single full MarkdownRenderer parse, so output is byte-identical — no visual change, only less work per token. Complements PostHog#2685 (smooth token reveal): that PR eases when text appears, this one keeps rendering it cheap so the smooth reveal stays at 60fps even on long messages. Generated-By: PostHog Code Task-Id: 8e9f327d-84b1-4608-9f48-c3038dbf87ca
Prompt To Fix All With AIFix the following 3 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 3
packages/ui/src/features/editor/components/StreamingMarkdown.tsx:16-22
**`parseOpenFence` targets the first fence, not the unterminated one**
`/.exec(block)` finds the first fence marker in the block. If the tail block contains a completed fence followed by an open fence with no blank line between them — e.g. `` ```ts\ncode\n```\ntext\n```ts\npartial `` (no blank lines, so `splitMarkdownBlocks` keeps it as one block) — `before` becomes `""` and `code` absorbs everything from the first fence opener onward, including the intermediate closing fence and second opening marker. The result is garbled plain-text output during streaming (it self-corrects once the fence closes and `MarkdownRenderer` takes over, but the in-progress display is wrong).
A fix is to scan all lines with the same fence-tracking logic as `hasOpenCodeFence`, record the last fence-open line index, then slice `before` and `code` relative to that line.
### Issue 2 of 3
packages/ui/src/features/editor/components/splitMarkdownBlocks.ts:59-74
**Fence-tracking logic duplicated between the two exported functions**
`hasOpenCodeFence` re-implements the identical per-line regex match + `inFence`/`fenceChar` state machine that lives inside `splitMarkdownBlocks`. Extracting a shared helper (e.g. `parseFenceState(lines)` that returns `{ inFence, fenceChar }`) would satisfy the codebase's OnceAndOnlyOnce rule and make future spec-compliance fixes (e.g. requiring the closing fence to be at least as long as the opening fence) apply to both functions automatically.
### Issue 3 of 3
packages/ui/src/features/editor/components/splitMarkdownBlocks.test.ts:5-44
**Tests prefer parameterised style over manual loops and multi-assertion cases**
The "never drops text" case iterates over samples with a `for` loop, and the `hasOpenCodeFence` suite packs four independent assertions into one `it`. Vitest's `it.each` / `test.each` makes each sample a separate, named test entry — failures name the exact input and expected output, which saves time when something breaks. Converting both suites to parameterised form would align with the project's stated preference.
Reviews (1): Last reviewed commit: "perf(markdown): re-parse only the tail b..." | Re-trigger Greptile |
| const m = /(^|\n) {0,3}(`{3,}|~{3,})/.exec(block); | ||
| if (!m) return { before: "", code: block }; | ||
| const fenceLineStart = m.index + (block[m.index] === "\n" ? 1 : 0); | ||
| const before = block.slice(0, fenceLineStart); | ||
| const afterMarker = block.indexOf("\n", fenceLineStart); | ||
| const code = afterMarker === -1 ? "" : block.slice(afterMarker + 1); | ||
| return { before, code }; |
There was a problem hiding this comment.
parseOpenFence targets the first fence, not the unterminated one
/.exec(block) finds the first fence marker in the block. If the tail block contains a completed fence followed by an open fence with no blank line between them — e.g. ```ts\ncode\n```\ntext\n```ts\npartial (no blank lines, so splitMarkdownBlocks keeps it as one block) — before becomes "" and code absorbs everything from the first fence opener onward, including the intermediate closing fence and second opening marker. The result is garbled plain-text output during streaming (it self-corrects once the fence closes and MarkdownRenderer takes over, but the in-progress display is wrong).
A fix is to scan all lines with the same fence-tracking logic as hasOpenCodeFence, record the last fence-open line index, then slice before and code relative to that line.
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/ui/src/features/editor/components/StreamingMarkdown.tsx
Line: 16-22
Comment:
**`parseOpenFence` targets the first fence, not the unterminated one**
`/.exec(block)` finds the first fence marker in the block. If the tail block contains a completed fence followed by an open fence with no blank line between them — e.g. `` ```ts\ncode\n```\ntext\n```ts\npartial `` (no blank lines, so `splitMarkdownBlocks` keeps it as one block) — `before` becomes `""` and `code` absorbs everything from the first fence opener onward, including the intermediate closing fence and second opening marker. The result is garbled plain-text output during streaming (it self-corrects once the fence closes and `MarkdownRenderer` takes over, but the in-progress display is wrong).
A fix is to scan all lines with the same fence-tracking logic as `hasOpenCodeFence`, record the last fence-open line index, then slice `before` and `code` relative to that line.
How can I resolve this? If you propose a fix, please make it concise.| export function hasOpenCodeFence(src: string): boolean { | ||
| let inFence = false; | ||
| let fenceChar = ""; | ||
| for (const line of src.split("\n")) { | ||
| const trimmed = line.replace(/^ {0,3}/, ""); | ||
| const fence = /^(`{3,}|~{3,})/.exec(trimmed); | ||
| if (!fence) continue; | ||
| if (!inFence) { | ||
| inFence = true; | ||
| fenceChar = fence[1][0]; | ||
| } else if (trimmed[0] === fenceChar) { | ||
| inFence = false; | ||
| } | ||
| } | ||
| return inFence; | ||
| } |
There was a problem hiding this comment.
Fence-tracking logic duplicated between the two exported functions
hasOpenCodeFence re-implements the identical per-line regex match + inFence/fenceChar state machine that lives inside splitMarkdownBlocks. Extracting a shared helper (e.g. parseFenceState(lines) that returns { inFence, fenceChar }) would satisfy the codebase's OnceAndOnlyOnce rule and make future spec-compliance fixes (e.g. requiring the closing fence to be at least as long as the opening fence) apply to both functions automatically.
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/ui/src/features/editor/components/splitMarkdownBlocks.ts
Line: 59-74
Comment:
**Fence-tracking logic duplicated between the two exported functions**
`hasOpenCodeFence` re-implements the identical per-line regex match + `inFence`/`fenceChar` state machine that lives inside `splitMarkdownBlocks`. Extracting a shared helper (e.g. `parseFenceState(lines)` that returns `{ inFence, fenceChar }`) would satisfy the codebase's OnceAndOnlyOnce rule and make future spec-compliance fixes (e.g. requiring the closing fence to be at least as long as the opening fence) apply to both functions automatically.
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| it("never drops text — joining the blocks reproduces the input", () => { | ||
| const samples = [ | ||
| "", | ||
| "single line", | ||
| "para one\n\npara two\n\npara three", | ||
| "# Heading\n\nText with **bold**.\n\n- a\n- b\n", | ||
| "Intro\n\n```ts\nconst x = 1;\nconst y = 2;\n```\n\nOutro", | ||
| "trailing blanks\n\n\n\n", | ||
| ]; | ||
| for (const s of samples) { | ||
| expect(splitMarkdownBlocks(s).join("")).toBe(s); | ||
| } | ||
| }); | ||
|
|
||
| it("splits paragraphs at blank lines", () => { | ||
| expect(splitMarkdownBlocks("a\n\nb\n\nc")).toEqual(["a\n\n", "b\n\n", "c"]); | ||
| }); | ||
|
|
||
| it("keeps a fenced code block (with blank lines inside) as one block", () => { | ||
| const md = "```\nline1\n\nline2\n```\n\nafter"; | ||
| const blocks = splitMarkdownBlocks(md); | ||
| expect(blocks[0]).toBe("```\nline1\n\nline2\n```\n\n"); | ||
| expect(blocks[1]).toBe("after"); | ||
| }); | ||
|
|
||
| it("does not split inside an unterminated fence (the tail stays whole)", () => { | ||
| const md = "intro\n\n```ts\nconst a = 1;\n\nconst b = 2;"; | ||
| const blocks = splitMarkdownBlocks(md); | ||
| expect(blocks[blocks.length - 1]).toContain("const b = 2;"); | ||
| expect(blocks.join("")).toBe(md); | ||
| }); | ||
| }); | ||
|
|
||
| describe("hasOpenCodeFence", () => { | ||
| it("is true while a fence is open and false once it closes", () => { | ||
| expect(hasOpenCodeFence("```ts\nconst a = 1;")).toBe(true); | ||
| expect(hasOpenCodeFence("```ts\nconst a = 1;\n```")).toBe(false); | ||
| expect(hasOpenCodeFence("no code here")).toBe(false); | ||
| expect(hasOpenCodeFence("text\n\n```\npartial")).toBe(true); | ||
| }); |
There was a problem hiding this comment.
Tests prefer parameterised style over manual loops and multi-assertion cases
The "never drops text" case iterates over samples with a for loop, and the hasOpenCodeFence suite packs four independent assertions into one it. Vitest's it.each / test.each makes each sample a separate, named test entry — failures name the exact input and expected output, which saves time when something breaks. Converting both suites to parameterised form would align with the project's stated preference.
Context Used: Do not attempt to comment on incorrect alphabetica... (source)
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/ui/src/features/editor/components/splitMarkdownBlocks.test.ts
Line: 5-44
Comment:
**Tests prefer parameterised style over manual loops and multi-assertion cases**
The "never drops text" case iterates over samples with a `for` loop, and the `hasOpenCodeFence` suite packs four independent assertions into one `it`. Vitest's `it.each` / `test.each` makes each sample a separate, named test entry — failures name the exact input and expected output, which saves time when something breaks. Converting both suites to parameterised form would align with the project's stated preference.
**Context Used:** Do not attempt to comment on incorrect alphabetica... ([source](https://app.greptile.com/review/custom-context?memory=instruction-0))
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
…fence, it.each - Extract a single stepFence state machine shared by splitMarkdownBlocks, hasOpenCodeFence and parseOpenFence (OnceAndOnlyOnce — no duplicated fence-tracking logic). - parseOpenFence now targets the LAST unterminated fence, so a completed fence earlier in the same block stays in `before` and renders normally instead of being swallowed as plain text mid-stream. - Parameterize the splitter tests with it.each and add parseOpenFence cases, including the completed-then-open fence edge case. Generated-By: PostHog Code Task-Id: 8e9f327d-84b1-4608-9f48-c3038dbf87ca
|
Thanks for the review! Addressed all three in 3f6bc14:
|
Smooth the reveal of streamed tokens on top of the block-split renderer. useSmoothedText eases the displayed prefix toward the accumulated text via requestAnimationFrame with a step proportional to the backlog, so it tracks a fast token stream (catching up in ~6 frames) instead of a fixed chars/sec that lags behind and then snaps. It reveals on word boundaries so partial markdown tokens never flash, and shows completed (non-streaming) messages in full immediately. Pairs with the block-split renderer: only the growing tail re-parses each frame, so the reveal holds ~60fps even on long messages — which a fixed-rate reveal over a full markdown re-parse can't. Generated-By: PostHog Code Task-Id: 8e9f327d-84b1-4608-9f48-c3038dbf87ca
Problem
Streamed agent replies have two issues:
AgentMessage→MarkdownRendererre-parses the entire accumulated markdown (react-markdown + remark-gfm) on every token: O(n) per token, O(n²) per message. Past ~8k chars a single token costs ~19 ms of parsing, over the 16.6 ms frame budget, so long answers stutter and saturate the main thread.Addresses #2517 (smooth token streaming).
Changes
Two pieces that reinforce each other.
1. Block-split rendering (
StreamingMarkdown+splitMarkdownBlocks) — the active message is split into top-level blocks (fenced code kept intact). Completed blocks keep a stable string, so the memoizedMarkdownRendererskips them and only the growing tail re-parses. An open code fence renders as plain text until it closes, so syntax highlighting runs once, not per token.Measured on a 3,000-char answer streamed in ~250 tokens (jsdom +
react-dom/server, default components — a lower bound):~10× less, and the per-token cost stays flat instead of growing with length.
2. Adaptive reveal (
useSmoothedText) — eases the displayed prefix toward the accumulated text viarequestAnimationFrame, with a step proportional to the backlog so it tracks the token rate (catching up in ~6 frames) rather than a fixed chars/sec. Reveals on word boundaries so partial markdown tokens never flash. Completed messages show in full immediately.The two are complementary: the adaptive reveal makes streaming look smooth, and the block-split keeps each frame cheap so it stays ~60 fps even on long messages — which a reveal over a full per-frame re-parse can't hold. Completed messages render via a single full
MarkdownRendererparse, so the final output is byte-identical.Relationship to #2685
#2685 also implements smooth token streaming (a fixed ~120 chars/sec reveal). This PR overlaps on that goal but takes a different approach — an adaptive reveal rate (no lag-then-snap on fast streams) plus the block-split renderer so the reveal holds up on long messages. Flagging the overlap so it's not a surprise — happy to coordinate or fold these together however the team prefers.
How did you test this?
splitMarkdownBlocks.test.ts(round-trip / no text dropped, paragraph + fenced-block splitting,parseOpenFenceincl. the completed-then-open-fence edge case) anduseSmoothedText.test.ts(immediate on mount, gradual reveal, convergence, prefix-only).pnpm --filter @posthog/ui test→ 763 pass.pnpm --filter @posthog/ui typecheckpasses; pre-commitpnpm typecheckpasses.biome checkclean on changed files;node scripts/check-host-boundaries.mjsreports no new violations.Automatic notifications
Created with PostHog Code