Summary
Make Session Replay data in sentry-cli highly actionable for coding agents by adding first-class replay segment fetching, local caching, normalized event extraction, and inspection commands.
The CLI should not try to be the agent or own an ask command as the primary interface. Instead, it should become the replay data plane: a reliable way for agents to fetch replay segments, cache them, inspect DOM/rrweb/custom events, search timelines, and pull evidence windows around user actions. Agents can then compose those tools, plus a bundled skill that explains the replay data model, to answer questions such as:
- "When the user clicked X, what happened next?"
- "Where did the user spend the most time?"
- "What hangups caused the user to struggle?"
- "Were there failed requests, console errors, dead clicks, rage clicks, or DOM changes around this action?"
Current State
The CLI currently has basic replay support:
- sentry replay list queries replay metadata.
- sentry replay view fetches replay detail, related issues/traces, and a very small activity preview from recording segments.
- sentry explore --dataset replays exposes replay index fields.
Relevant CLI files:
src/commands/replay/list.ts
src/commands/replay/view.ts
src/lib/api/replays.ts
src/lib/formatters/replay.ts
src/lib/replay-search.ts
src/types/replay.ts
The main limitation is that replay segments are treated as display garnish. replay view currently extracts only a handful of activity events from raw segments, capped to a tiny preview. There is no replay-specific local cache, no normalized event stream, no segment index, and no way for agents to inspect all DOM/rrweb/custom events.
There is also a likely correctness gap: the Sentry recording-segments endpoint is paginated, while the CLI currently downloads it as a single request. The frontend fetches segment pages with per_page=100 until count_segments is exhausted. The CLI should mirror that behavior so long replays are fully available.
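The paging behavior described above can be sketched as follows. This is a minimal illustration, not the CLI's current code: planSegmentPages, fetchAllSegments, and the FetchSegmentPage callback are hypothetical names, and the callback stands in for whatever request the CLI makes against the recording-segments endpoint.

```typescript
interface PageRequest {
  offset: number;
  perPage: number;
}

// Compute the page requests needed to cover `countSegments` segments.
function planSegmentPages(countSegments: number, perPage = 100): PageRequest[] {
  const pages: PageRequest[] = [];
  for (let offset = 0; offset < countSegments; offset += perPage) {
    pages.push({ offset, perPage });
  }
  return pages;
}

// Hypothetical fetcher signature; the real CLI would call the
// recording-segments endpoint here.
type FetchSegmentPage<T> = (page: PageRequest) => Promise<T[]>;

async function fetchAllSegments<T>(
  fetchPage: FetchSegmentPage<T>,
  countSegments: number,
): Promise<{ segments: T[]; complete: boolean }> {
  const segments: T[] = [];
  for (const page of planSegmentPages(countSegments)) {
    segments.push(...(await fetchPage(page)));
  }
  // Report partial data explicitly instead of silently truncating.
  return { segments, complete: segments.length >= countSegments };
}
```

Returning a complete flag lets replay view say when it is working from partial data rather than presenting a truncated replay as whole.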
Relevant Sentry/rrweb Model
Sentry replay data has several useful layers:
- Replay metadata from org/project replay endpoints: duration, urls, counts, errors, traces, clicks, user/browser/os/sdk/device, tags, etc.
- Recording segments from the project-scoped recording-segments endpoint. These are compressed/packed storage blobs returned as rrweb/custom JSON when downloaded.
- rrweb events: full snapshots, incremental DOM mutations, mouse interactions, inputs, scrolls, viewport changes, media interactions, console logs, and more.
- Sentry custom replay frames: breadcrumbs, performance spans, options, video, web vitals, network breadcrumbs, console breadcrumbs, mobile events, etc.
- Related data: error events, feedback, trace ids, logs, click selector endpoints, and existing Seer replay summary APIs.
Useful Sentry references:
getsentry/sentry/src/sentry/replays/endpoints/project_replay_recording_segment_index.py
getsentry/sentry/src/sentry/replays/endpoints/project_replay_recording_segment_details.py
getsentry/sentry/src/sentry/replays/usecases/reader.py
getsentry/sentry/src/sentry/replays/usecases/pack.py
getsentry/sentry/src/sentry/replays/post_process.py
getsentry/sentry/src/sentry/replays/usecases/ingest/event_parser.py
getsentry/sentry/static/app/utils/replays/hooks/useReplayData.tsx
getsentry/sentry/static/app/utils/replays/hydrateFrames.tsx
getsentry/sentry/static/app/utils/replays/replayReader.tsx
@sentry-internal/rrweb-types, especially EventType, IncrementalSource, MouseInteractions, mutation/input/scroll/viewport payloads.
Proposal
Add a replay evidence system to the CLI with three parts:
- A local replay bundle cache.
- A normalized replay event model.
- Agent-friendly inspection commands and generated/bundled skill docs.
1. Replay Bundle Cache
Add a replay-specific cache under something like:
~/.sentry/cache/replays/{identity}/{org}/{project}/{replayId}/
Suggested contents:
metadata.json
segments/{segmentId}.json.gz
index/events.jsonl
index/navigation.json
index/interactions.json
index/network.json
index/problems.json
index/dom-summary.json
The raw segment payloads should live on disk, not in SQLite. SQLite can track manifests/cache lookup if useful, but segment blobs can be large and should be stored as private files.
Security/privacy requirements:
- Cache directory should be 0700; files should be 0600.
- Provide a way to bypass or clear replay cache.
- Clear replay cache on auth/logout flows if appropriate.
- Treat replay data as sensitive; do not attempt to unmask data that rrweb/Sentry masked.
- Be explicit in outputs when text/DOM data is unavailable because it was masked or not captured.
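A minimal sketch of the permission requirement, assuming a POSIX filesystem and Node's built-in fs module. writePrivateFile is a hypothetical helper name, not an existing CLI function.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Write one cache file so that only the current user can read it.
// `bundleDir` is the per-replay directory; `relativePath` is a path
// inside it, such as "metadata.json".
function writePrivateFile(
  bundleDir: string,
  relativePath: string,
  data: string | Uint8Array,
): string {
  const target = path.join(bundleDir, relativePath);
  // Newly created directories get 0700; existing ones are left untouched.
  fs.mkdirSync(path.dirname(target), { recursive: true, mode: 0o700 });
  fs.writeFileSync(target, data, { mode: 0o600 });
  // The mode option only applies when the file is created, so tighten
  // permissions explicitly in case the file already existed.
  fs.chmodSync(target, 0o600);
  return target;
}
```

Directories that already exist with looser permissions are not fixed by this sketch; a real implementation would probably want to check and repair them when the cache is opened.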
Caching behavior:
- Finished replays can generally be treated as immutable, subject to retention/privacy constraints.
- Live or recently active replays should be refreshable because segment count may grow.
- Avoid duplicating huge segment payloads in the generic HTTP response cache once replay-specific caching exists.
2. Normalized Replay Event Model
Introduce a normalized event schema that agents can rely on, regardless of whether the source was rrweb, a Sentry custom frame, a breadcrumb, a perf span, or related event data.
Example event:
{
  "replayId": "abc123",
  "segmentId": 12,
  "frameIndex": 184,
  "offsetMs": 83421,
  "timestamp": "2026-05-03T18:42:11.421Z",
  "kind": "click",
  "category": "interaction",
  "label": "button.checkout",
  "url": "/checkout",
  "selector": "button[data-test-id=checkout]",
  "nodeId": 982,
  "rawType": "IncrementalSnapshot",
  "rawSource": "MouseInteraction"
}
Initial event kinds should include:
navigation
click
tap
input
focus
blur
scroll
viewport
mutation
dom-snapshot
breadcrumb
network
console
error
span
web-vital
memory
video
mobile
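One way the schema and kind list could be expressed as types. The field names come from the example event; which fields are optional is an assumption, and REPLAY_EVENT_KINDS, NormalizedReplayEvent, and isReplayEventKind are illustrative names only.

```typescript
// The initial event kinds, kept as a runtime list so flags such as
// --kind can be validated against it.
const REPLAY_EVENT_KINDS = [
  "navigation", "click", "tap", "input", "focus", "blur", "scroll",
  "viewport", "mutation", "dom-snapshot", "breadcrumb", "network",
  "console", "error", "span", "web-vital", "memory", "video", "mobile",
] as const;

type ReplayEventKind = (typeof REPLAY_EVENT_KINDS)[number];

interface NormalizedReplayEvent {
  // Stable evidence pointer.
  replayId: string;
  segmentId: number;
  frameIndex: number;
  offsetMs: number; // milliseconds since replay start
  timestamp: string; // ISO 8601
  kind: ReplayEventKind;
  category: string;
  // Assumption: the remaining fields depend on the source frame.
  label?: string;
  url?: string;
  selector?: string;
  nodeId?: number;
  rawType?: string;
  rawSource?: string;
}

// Type guard for user-supplied kind names.
function isReplayEventKind(value: string): value is ReplayEventKind {
  return (REPLAY_EVENT_KINDS as readonly string[]).includes(value);
}
```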
Important implementation detail: centralize timestamp normalization. rrweb event timestamps are milliseconds, while breadcrumb/performance payload fields may use seconds. Sentry has existing frontend/backend logic for this; the CLI should port the relevant normalization and test it with fixtures.
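A sketch of what a single normalization point might look like. The magnitude threshold is a heuristic chosen for illustration, not Sentry's existing logic, which should be ported and used in place of it; toEpochMs and toOffsetMs are hypothetical names.

```typescript
// Assumption: values below this threshold are seconds, values at or above
// it are milliseconds. 1e11 ms falls in early 1973, and 1e11 s is beyond
// the year 5000, so real replay timestamps do not collide.
const SECONDS_THRESHOLD = 1e11;

// Convert a timestamp in seconds or milliseconds to epoch milliseconds.
function toEpochMs(value: number): number {
  return value < SECONDS_THRESHOLD ? Math.round(value * 1000) : Math.round(value);
}

// Offset of an event from the start of the replay, in milliseconds.
function toOffsetMs(value: number, replayStartMs: number): number {
  return toEpochMs(value) - replayStartMs;
}
```

Routing every frame type through one function like this makes the seconds-versus-milliseconds decision testable with fixtures in one place.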
3. Agent-Friendly Commands
sentry replay fetch <replay>
Fetch replay metadata, all recording segment pages, and optionally related errors/traces/logs. Build/update the local replay bundle and indexes.
Useful flags:
--force             Refresh cached data
--no-cache          Fetch but do not persist
--segments <range>  Fetch all or a subset for debugging
--include <list>    metadata,segments,errors,traces,logs,clicks
--json              Emit manifest and cache paths
sentry replay events <replay>
Primary agent primitive. Emit normalized replay events.
Useful flags:
--kind click,network,console,error,mutation
--from 01:20
--to 01:45
--contains checkout
--selector button.checkout
--url /checkout
--limit 200
--json
--jsonl
--raw Include raw frame payload pointer or payload snippet
This command should make DOM/rrweb activity inspectable without requiring agents to know the raw segment layout.
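The flag semantics above could reduce to a filter like this sketch. It assumes --from and --to take mm:ss offsets from replay start and that --contains is a case-insensitive substring match over label, url, and selector; neither is settled by this proposal, and all names are illustrative.

```typescript
interface EventLike {
  kind: string;
  offsetMs: number;
  label?: string;
  url?: string;
  selector?: string;
}

interface EventFilter {
  kinds?: string[];
  from?: string;
  to?: string;
  contains?: string;
  limit?: number;
}

// Parse "ss", "mm:ss" or "hh:mm:ss" into milliseconds from replay start.
function parseOffset(text: string): number {
  const parts = text.split(":").map(Number);
  const invalid =
    text.trim() === "" ||
    parts.length > 3 ||
    parts.some((n) => !Number.isFinite(n) || n < 0);
  if (invalid) throw new Error(`Invalid offset: ${text}`);
  return parts.reduce((total, part) => total * 60 + part, 0) * 1000;
}

function filterEvents<T extends EventLike>(events: T[], filter: EventFilter): T[] {
  const fromMs = filter.from === undefined ? -Infinity : parseOffset(filter.from);
  const toMs = filter.to === undefined ? Infinity : parseOffset(filter.to);
  const needle = filter.contains?.toLowerCase();
  const matches = events.filter((event) => {
    if (filter.kinds && !filter.kinds.includes(event.kind)) return false;
    if (event.offsetMs < fromMs || event.offsetMs > toMs) return false;
    if (needle) {
      // Missing fields join as empty strings, so they never match.
      const haystack = [event.label, event.url, event.selector].join(" ").toLowerCase();
      if (!haystack.includes(needle)) return false;
    }
    return true;
  });
  return matches.slice(0, filter.limit ?? matches.length);
}
```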
sentry replay window <replay>
Return an evidence slice around a timestamp or event match.
Examples:
sentry replay window org/project/abc123 --at 01:23 --before 10s --after 30s
sentry replay window org/project/abc123 --contains checkout --before 5s --after 20s
Output should group nearby activity by category: interaction, navigation, network, console, errors, DOM mutations, spans, web vitals.
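A sketch of the slicing and grouping step, assuming events are already normalized and --before/--after accept values such as 500ms, 10s, or 2m. parseDuration and evidenceWindow are hypothetical helpers.

```typescript
interface WindowEvent {
  category: string;
  offsetMs: number;
}

// Parse durations such as "500ms", "10s" or "2m" into milliseconds.
function parseDuration(text: string): number {
  const match = /^(\d+(?:\.\d+)?)(ms|s|m)$/.exec(text.trim());
  if (!match) throw new Error(`Invalid duration: ${text}`);
  const unitMs = { ms: 1, s: 1000, m: 60000 }[match[2] as "ms" | "s" | "m"];
  return Number(match[1]) * unitMs;
}

// Collect events within [atMs - before, atMs + after], grouped by category.
function evidenceWindow<T extends WindowEvent>(
  events: T[],
  atMs: number,
  before: string,
  after: string,
): Record<string, T[]> {
  const start = atMs - parseDuration(before);
  const end = atMs + parseDuration(after);
  const groups: Record<string, T[]> = {};
  for (const event of events) {
    if (event.offsetMs < start || event.offsetMs > end) continue;
    const bucket = groups[event.category] ?? [];
    bucket.push(event);
    groups[event.category] = bucket;
  }
  return groups;
}
```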
sentry replay search <replay> <query>
Fuzzy search over normalized event fields:
- selectors
- visible/unmasked text
- URLs
- network URLs
- breadcrumb messages
- console messages
- error titles/messages
- span descriptions
Return matching events with stable pointers: replay id, segment id, frame index, timestamp, offset.
sentry replay dom <replay>
Inspect DOM-related replay data.
Initial version can be event-based rather than full reconstruction:
sentry replay dom org/project/abc123 --at 01:23
sentry replay dom org/project/abc123 --from 01:20 --to 01:30 --kind mutation,input,scroll
Later versions can add best-effort DOM reconstruction from full snapshots + incremental mutations. A browser-backed rrweb player should be optional, not required for the core CLI flow.
sentry replay stats <replay>
Deterministic summary useful for orientation:
- total duration
- route/screen time
- active vs idle time
- most clicked selectors
- slowest network calls
- failed requests
- console error count
- rage/dead click count
- largest DOM mutation bursts
- poor web vital events
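Two of these statistics, sketched over normalized events. The 30-second idle gap is an arbitrary illustrative default, and topClickedSelectors and activeVsIdle are hypothetical names.

```typescript
interface StatEvent {
  kind: string;
  offsetMs: number;
  selector?: string;
}

// Count clicks per selector and return the most clicked first.
function topClickedSelectors(events: StatEvent[], limit = 5): Array<[string, number]> {
  const counts = new Map<string, number>();
  for (const event of events) {
    if (event.kind !== "click" || !event.selector) continue;
    counts.set(event.selector, (counts.get(event.selector) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit);
}

// Split the replay into active and idle time: a gap between consecutive
// events longer than `idleGapMs` counts as idle, anything shorter as active.
function activeVsIdle(
  events: StatEvent[],
  idleGapMs = 30000,
): { activeMs: number; idleMs: number } {
  const offsets = events.map((event) => event.offsetMs).sort((a, b) => a - b);
  let activeMs = 0;
  let idleMs = 0;
  for (let i = 1; i < offsets.length; i++) {
    const gap = offsets[i] - offsets[i - 1];
    if (gap > idleGapMs) idleMs += gap;
    else activeMs += gap;
  }
  return { activeMs, idleMs };
}
```

Because both functions depend only on the event list, the same replay always yields the same numbers, which is the point of a deterministic summary.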
sentry replay struggles <replay>
Deterministic friction analysis. Rank likely struggle windows using signals such as:
- dead clicks
- rage clicks
- repeated clicks on same element
- clicks followed by no navigation/network/DOM change
- failed fetch/xhr/resource requests
- slow network requests
- console errors
- hydration errors
- poor LCP/CLS
- large mutation bursts
- long idle periods immediately after interaction
- repeated input/focus without success
- back-and-forth navigation
Each finding should include evidence pointers and a recommended follow-up command.
Example output shape:
{
  "finding": "Repeated checkout clicks did not produce navigation",
  "severity": "medium",
  "window": {"fromOffsetMs": 81200, "toOffsetMs": 94600},
  "evidence": [
    {"kind": "click", "offsetMs": 83421, "segmentId": 12, "frameIndex": 184},
    {"kind": "click", "offsetMs": 87820, "segmentId": 12, "frameIndex": 211},
    {"kind": "network", "offsetMs": 88110, "status": 500, "url": "/api/checkout"}
  ],
  "nextCommand": "sentry replay window org/project/abc123 --at 01:23 --before 10s --after 30s"
}
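As an illustration of one signal, repeated clicks on the same element can be detected without any model. The thresholds here (three clicks, each within five seconds of the previous one) are placeholders, not tuned values, and findRepeatedClicks is a hypothetical name.

```typescript
interface ClickLike {
  kind: string;
  selector?: string;
  offsetMs: number;
  segmentId: number;
  frameIndex: number;
}

interface RepeatedClickFinding {
  finding: string;
  window: { fromOffsetMs: number; toOffsetMs: number };
  evidence: ClickLike[];
}

// Find bursts of at least `minClicks` clicks on one selector where each
// click follows the previous one within `withinMs`.
function findRepeatedClicks(
  events: ClickLike[],
  minClicks = 3,
  withinMs = 5000,
): RepeatedClickFinding[] {
  const bySelector = new Map<string, ClickLike[]>();
  for (const event of events) {
    if (event.kind !== "click" || !event.selector) continue;
    const list = bySelector.get(event.selector) ?? [];
    list.push(event);
    bySelector.set(event.selector, list);
  }

  const findings: RepeatedClickFinding[] = [];
  bySelector.forEach((clicks, selector) => {
    clicks.sort((a, b) => a.offsetMs - b.offsetMs);
    let start = 0;
    while (start < clicks.length) {
      // Extend the burst while each click closely follows the previous one.
      let end = start;
      while (
        end + 1 < clicks.length &&
        clicks[end + 1].offsetMs - clicks[end].offsetMs <= withinMs
      ) {
        end++;
      }
      if (end - start + 1 >= minClicks) {
        findings.push({
          finding: `Repeated clicks on ${selector}`,
          window: { fromOffsetMs: clicks[start].offsetMs, toOffsetMs: clicks[end].offsetMs },
          evidence: clicks.slice(start, end + 1),
        });
      }
      start = end + 1;
    }
  });
  return findings;
}
```

Each finding carries the segmentId and frameIndex of every click, so an agent can follow up with sentry replay window on the reported offsets.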
Bundled Skill / Agent Documentation
Add generated or maintained skill docs that teach agents how to use these replay commands.
The skill should explain:
- Start with sentry replay fetch.
- Use sentry replay stats for orientation.
- Use sentry replay struggles to find likely friction.
- Use sentry replay search to locate user actions.
- Use sentry replay window around relevant timestamps.
- Use sentry replay events --kind ... for detailed evidence.
- Use sentry replay dom for DOM/mutation/input/scroll inspection.
- Cite segmentId, frameIndex, and offsets in conclusions.
- Treat masked text and absent DOM data as uncertainty.
- Do not invent user-visible text or DOM state when it was not captured.
This keeps the reasoning layer outside the CLI while making the CLI highly usable by agents.
Implementation Plan
Phase 1: Correct segment fetching
- Add paginated segment download support with per_page=100.
- Use count_segments from replay detail when available.
- Add tests for multi-page segment responses.
- Preserve current replay view behavior, but ensure it uses complete segments or explicitly reports partial data.
Phase 2: Add replay cache and fetch command
- Add replay bundle storage helpers.
- Add sentry replay fetch.
- Store replay metadata and raw segment pages/segments.
- Add cache manifest/versioning.
- Add privacy-safe permissions.
- Decide how replay cache interacts with generic response cache.
Phase 3: Normalize events
- Define typed rrweb/custom event constants and minimal schemas.
- Flatten segment frames into sorted normalized events.
- Port relevant hydration behavior from Sentry frontend where appropriate.
- Add fixtures for rrweb snapshots, mutations, inputs, clicks, network breadcrumbs, console breadcrumbs, perf spans, and web vitals.
Phase 4: Add inspection commands
- Add replay events.
- Add replay window.
- Add replay search.
- Add replay stats.
- Add replay struggles.
- Keep JSON/JSONL output stable and evidence-first.
Phase 5: DOM inspection
- Start with DOM event summaries: snapshots, mutations, inputs, scrolls, viewport changes, node ids/selectors where available.
- Later add best-effort DOM reconstruction from rrweb snapshots and mutations.
- Avoid making a browser/rrweb replayer mandatory for core CLI use.
Phase 6: Agent skill/docs
- Add or update Sentry CLI skill documentation for replay investigation workflows.
- Ensure generated command docs include examples for agent workflows.
- Include sample workflows for common questions.
Open Questions
- What should the default replay cache TTL be, especially for sensitive replay payloads?
- Should replay cache be cleared automatically on logout, or should it be scoped only by identity fingerprint?
- Should generic response-cache skip replay segment payloads once replay-specific cache exists?
- How much related data should replay fetch include by default: errors, traces, logs, click selectors?
- Should replay events --raw include payload snippets or only stable pointers by default?
- Do we want JSONL as a first-class output mode for very large event streams?
- How much DOM reconstruction should be implemented in the CLI before delegating to browser-backed tooling?
Acceptance Criteria
- Agents can download and cache a full replay locally.
- Long replays with multiple segment pages are handled correctly.
- Agents can list normalized replay events by kind/time/search query.
- Agents can pull a compact evidence window around an action or timestamp.
- Agents can identify likely user struggle windows without needing an LLM.
- Outputs include stable evidence pointers: replay id, segment id, frame index, timestamp/offset.
- Masked or unavailable DOM/text data is represented honestly.
- The CLI remains the data/inspection layer; agent reasoning is documented in bundled skill guidance.