Make session replay data agent-actionable #907

@dcramer

Summary

Make Session Replay data in sentry-cli highly actionable for coding agents by adding first-class replay segment fetching, local caching, normalized event extraction, and inspection commands.

The CLI should not try to be the agent or own an ask command as the primary interface. Instead, it should become the replay data plane: a reliable way for agents to fetch replay segments, cache them, inspect DOM/rrweb/custom events, search timelines, and pull evidence windows around user actions. Agents can then compose those tools, plus a bundled skill that explains the replay data model, to answer questions such as:

  • "When the user clicked X, what happened next?"
  • "Where did the user spend the most time?"
  • "What hangups caused the user to struggle?"
  • "Were there failed requests, console errors, dead clicks, rage clicks, or DOM changes around this action?"

Current State

The CLI currently has basic replay support:

  • sentry replay list queries replay metadata.
  • sentry replay view fetches replay detail, related issues/traces, and a very small activity preview from recording segments.
  • sentry explore --dataset replays exposes replay index fields.

Relevant CLI files:

  • src/commands/replay/list.ts
  • src/commands/replay/view.ts
  • src/lib/api/replays.ts
  • src/lib/formatters/replay.ts
  • src/lib/replay-search.ts
  • src/types/replay.ts

The main limitation is that replay segments are treated as display garnish. replay view currently extracts only a handful of activity events from raw segments, capped to a tiny preview. There is no replay-specific local cache, no normalized event stream, no segment index, and no way for agents to inspect all DOM/rrweb/custom events.

There is also a likely correctness gap: the Sentry recording-segments endpoint is paginated, but the CLI currently fetches it with a single request. The Sentry frontend fetches segment pages with per_page=100 until count_segments is exhausted. The CLI should mirror that behavior so that long replays are fully available.
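
A minimal sketch of the page loop described above. The fetchPage callback and the offset-based paging are assumptions for illustration; the real endpoint may use a cursor header instead, and the real helper would live in src/lib/api/replays.ts.

```typescript
// One raw frame is an arbitrary rrweb/custom JSON object; a segment is a list of frames.
type RawFrame = Record<string, unknown>;

// Hypothetical page fetcher, injected so the loop can run without network access.
// A real one would call the recording-segments endpoint with per_page and an offset/cursor.
type FetchSegmentPage = (offset: number, perPage: number) => Promise<RawFrame[][]>;

// Offsets of every page needed to cover `countSegments` segments.
function pageOffsets(countSegments: number, perPage = 100): number[] {
  if (perPage <= 0) throw new RangeError("perPage must be positive");
  const offsets: number[] = [];
  for (let offset = 0; offset < countSegments; offset += perPage) {
    offsets.push(offset);
  }
  return offsets;
}

// Fetch pages in order until count_segments is covered or the server runs dry.
async function fetchAllSegments(
  fetchPage: FetchSegmentPage,
  countSegments: number,
  perPage = 100,
): Promise<RawFrame[][]> {
  const segments: RawFrame[][] = [];
  for (const offset of pageOffsets(countSegments, perPage)) {
    const page = await fetchPage(offset, perPage);
    segments.push(...page);
    // A short page means there is nothing more, even if count_segments said otherwise.
    if (page.length < perPage) break;
  }
  return segments;
}
```

If fewer segments come back than count_segments promised, the caller can report partial data instead of silently truncating.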

Relevant Sentry/rrweb Model

Sentry replay data has several useful layers:

  1. Replay metadata from org/project replay endpoints: duration, urls, counts, errors, traces, clicks, user/browser/os/sdk/device, tags, etc.
  2. Recording segments from the project-scoped recording-segments endpoint. These are compressed/packed storage blobs returned as rrweb/custom JSON when downloaded.
  3. rrweb events: full snapshots, incremental DOM mutations, mouse interactions, inputs, scrolls, viewport changes, media interactions, console logs, and more.
  4. Sentry custom replay frames: breadcrumbs, performance spans, options, video, web vitals, network breadcrumbs, console breadcrumbs, mobile events, etc.
  5. Related data: error events, feedback, trace ids, logs, click selector endpoints, and existing Seer replay summary APIs.

Useful Sentry references:

  • getsentry/sentry/src/sentry/replays/endpoints/project_replay_recording_segment_index.py
  • getsentry/sentry/src/sentry/replays/endpoints/project_replay_recording_segment_details.py
  • getsentry/sentry/src/sentry/replays/usecases/reader.py
  • getsentry/sentry/src/sentry/replays/usecases/pack.py
  • getsentry/sentry/src/sentry/replays/post_process.py
  • getsentry/sentry/src/sentry/replays/usecases/ingest/event_parser.py
  • getsentry/sentry/static/app/utils/replays/hooks/useReplayData.tsx
  • getsentry/sentry/static/app/utils/replays/hydrateFrames.tsx
  • getsentry/sentry/static/app/utils/replays/replayReader.tsx
  • @sentry-internal/rrweb-types, especially EventType, IncrementalSource, MouseInteractions, mutation/input/scroll/viewport payloads.

Proposal

Add a replay evidence system to the CLI with three parts:

  1. A local replay bundle cache.
  2. A normalized replay event model.
  3. Agent-friendly inspection commands and generated/bundled skill docs.

1. Replay Bundle Cache

Add a replay-specific cache under something like:

~/.sentry/cache/replays/{identity}/{org}/{project}/{replayId}/

Suggested contents:

metadata.json
segments/{segmentId}.json.gz
index/events.jsonl
index/navigation.json
index/interactions.json
index/network.json
index/problems.json
index/dom-summary.json

The raw segment payloads should live on disk, not in SQLite. SQLite can track manifests/cache lookup if useful, but segment blobs can be large and should be stored as private files.

Security/privacy requirements:

  • Cache directory should be 0700; files should be 0600.
  • Provide a way to bypass or clear replay cache.
  • Clear replay cache on auth/logout flows if appropriate.
  • Treat replay data as sensitive; do not attempt to unmask data that rrweb/Sentry masked.
  • Be explicit in outputs when text/DOM data is unavailable because it was masked or not captured.
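
The permission and clearing requirements could be sketched with Node's fs module as below. The helper names (writePrivateFile, permissionBits, clearReplayCache) are hypothetical, and the mode bits only have this meaning on POSIX systems; Windows would need a different approach.

```typescript
import { chmodSync, mkdirSync, rmSync, statSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Create (or tighten) a private directory and write a private file into it.
function writePrivateFile(dir: string, name: string, data: string): string {
  mkdirSync(dir, { recursive: true, mode: 0o700 });
  chmodSync(dir, 0o700); // also tightens a leaf directory that already existed
  const file = join(dir, name);
  writeFileSync(file, data, { mode: 0o600 });
  chmodSync(file, 0o600); // `mode` above only applies when the file is newly created
  return file;
}

// Permission bits only (for example 0o600), without the file-type bits.
function permissionBits(path: string): number {
  return statSync(path).mode & 0o777;
}

// Remove a replay bundle (or the whole cache root) from disk.
function clearReplayCache(root: string): void {
  rmSync(root, { recursive: true, force: true });
}
```

The explicit chmod calls are there because the mode option is subject to the process umask and is ignored for files that already exist.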

Caching behavior:

  • Finished replays can generally be treated as immutable, subject to retention/privacy constraints.
  • Live or recently active replays should be refreshable because segment count may grow.
  • Avoid duplicating huge segment payloads in the generic HTTP response cache once replay-specific caching exists.
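
One way to express the refresh rules above as a pure function. The manifest fields and the 15-minute settle period are assumptions, not an existing format; the real TTL is one of the open questions below.

```typescript
// Hypothetical manifest fields; the real cache manifest format is still to be designed.
interface ReplayCacheManifest {
  finishedAt: string | null; // ISO timestamp, null while the replay is still live
  countSegments: number; // segment count reported by replay detail at fetch time
  cachedSegments: number; // segments actually stored on disk
  fetchedAt: string; // ISO timestamp of the last fetch
}

function needsRefresh(
  manifest: ReplayCacheManifest,
  now: Date,
  options: { force?: boolean; settleMs?: number } = {},
): boolean {
  const { force = false, settleMs = 15 * 60 * 1000 } = options;
  if (force) return true; // --force always refetches
  if (manifest.finishedAt === null) return true; // live replay, segment count may grow
  if (manifest.cachedSegments < manifest.countSegments) return true; // incomplete bundle
  const settledAt = Date.parse(manifest.finishedAt) + settleMs;
  if (now.getTime() < settledAt) return true; // finished too recently to trust
  // The cached copy is only trustworthy if it was taken after the replay settled.
  return Date.parse(manifest.fetchedAt) < settledAt;
}
```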

2. Normalized Replay Event Model

Introduce a normalized event schema that agents can rely on, regardless of whether the source was rrweb, a Sentry custom frame, a breadcrumb, a perf span, or related event data.

Example event:

{
  "replayId": "abc123",
  "segmentId": 12,
  "frameIndex": 184,
  "offsetMs": 83421,
  "timestamp": "2026-05-03T18:42:11.421Z",
  "kind": "click",
  "category": "interaction",
  "label": "button.checkout",
  "url": "/checkout",
  "selector": "button[data-test-id=checkout]",
  "nodeId": 982,
  "rawType": "IncrementalSnapshot",
  "rawSource": "MouseInteraction"
}

Initial event kinds should include:

  • navigation
  • click
  • tap
  • input
  • focus
  • blur
  • scroll
  • viewport
  • mutation
  • dom-snapshot
  • breadcrumb
  • network
  • console
  • error
  • span
  • web-vital
  • memory
  • video
  • mobile
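
The example event and the kind list could translate into types along these lines. This is a sketch: which fields are optional is an assumption, and the names NormalizedReplayEvent, isReplayEventKind, and evidencePointer are hypothetical.

```typescript
// The initial event kinds listed above, as a closed union.
const REPLAY_EVENT_KINDS = [
  "navigation", "click", "tap", "input", "focus", "blur", "scroll",
  "viewport", "mutation", "dom-snapshot", "breadcrumb", "network",
  "console", "error", "span", "web-vital", "memory", "video", "mobile",
] as const;

type ReplayEventKind = (typeof REPLAY_EVENT_KINDS)[number];

interface NormalizedReplayEvent {
  replayId: string;
  segmentId: number;
  frameIndex: number;
  offsetMs: number; // milliseconds since replay start
  timestamp: string; // ISO 8601
  kind: ReplayEventKind;
  category: string;
  label?: string;
  url?: string;
  selector?: string;
  nodeId?: number;
  rawType?: string; // e.g. "IncrementalSnapshot"
  rawSource?: string; // e.g. "MouseInteraction"
}

// Validate user input such as `--kind click,network` before filtering.
function isReplayEventKind(value: string): value is ReplayEventKind {
  return (REPLAY_EVENT_KINDS as readonly string[]).includes(value);
}

// Stable pointer an agent can cite: replay id, segment id, frame index.
function evidencePointer(event: NormalizedReplayEvent): string {
  return `${event.replayId}:${event.segmentId}:${event.frameIndex}`;
}
```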

Important implementation detail: centralize timestamp normalization. rrweb event timestamps are in milliseconds, while breadcrumb/performance payload fields may use seconds. Sentry has existing frontend/backend logic for this; the CLI should port the relevant normalization and test it with fixtures.
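
A minimal sketch of a single normalization helper. The magnitude-based cutoff is a heuristic chosen for illustration, not Sentry's actual logic; a port of the frontend code would more likely decide per field which unit applies.

```typescript
// Assumption: any value below this cutoff is in seconds.
// 1e11 milliseconds is March 1973, while 1e11 seconds is thousands of years away,
// so realistic replay timestamps fall clearly on one side or the other.
const SECONDS_CUTOFF = 1e11;

// Normalize a seconds-or-milliseconds epoch value to integer milliseconds.
function toEpochMs(value: number): number {
  return value < SECONDS_CUTOFF ? Math.round(value * 1000) : Math.round(value);
}

// Offset of an event from replay start, whatever unit each side arrived in.
function offsetFromStartMs(eventTimestamp: number, replayStartTimestamp: number): number {
  return toEpochMs(eventTimestamp) - toEpochMs(replayStartTimestamp);
}

// ISO 8601 string for the normalized `timestamp` field.
function toIsoTimestamp(value: number): string {
  return new Date(toEpochMs(value)).toISOString();
}
```

With these helpers, a breadcrumb at 1777833731.421 (seconds) and an rrweb frame at 1777833731421 (milliseconds) normalize to the same instant.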

3. Agent-Friendly Commands

sentry replay fetch <replay>

Fetch replay metadata, all recording segment pages, and optionally related errors/traces/logs. Build/update the local replay bundle and indexes.

Useful flags:

--force              Refresh cached data
--no-cache           Fetch but do not persist
--segments <range>   Fetch all or a subset for debugging
--include <list>     metadata,segments,errors,traces,logs,clicks
--json               Emit manifest and cache paths

sentry replay events <replay>

Primary agent primitive. Emit normalized replay events.

Useful flags:

--kind click,network,console,error,mutation
--from 01:20
--to 01:45
--contains checkout
--selector button.checkout
--url /checkout
--limit 200
--json
--jsonl
--raw                Include raw frame payload pointer or payload snippet

This command should make DOM/rrweb activity inspectable without requiring agents to know the raw segment layout.
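
A sketch of how --kind, --from, --to, and --limit might be applied to an already normalized event list. It assumes --from/--to accept ss, mm:ss, or hh:mm:ss offsets from replay start; the function names are hypothetical.

```typescript
interface TimelineEvent {
  kind: string;
  offsetMs: number;
}

// Parse "90", "01:20", or "1:02:03" into milliseconds from replay start.
function parseClockOffset(text: string): number {
  if (!/^\d+(?::\d{1,2}){0,2}$/.test(text)) {
    throw new RangeError(`Invalid time offset: ${text}`);
  }
  const seconds = text
    .split(":")
    .map((part) => Number(part))
    .reduce((total, part) => total * 60 + part, 0);
  return seconds * 1000;
}

interface EventFilter {
  kinds?: string[];
  fromMs?: number;
  toMs?: number;
  limit?: number;
}

// Keep events matching the kind list and the inclusive time range, up to `limit`.
function filterEvents<T extends TimelineEvent>(events: T[], filter: EventFilter): T[] {
  const { kinds, fromMs = 0, toMs = Infinity, limit = Infinity } = filter;
  return events
    .filter((event) => kinds === undefined || kinds.includes(event.kind))
    .filter((event) => event.offsetMs >= fromMs && event.offsetMs <= toMs)
    .slice(0, limit);
}
```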

sentry replay window <replay>

Return an evidence slice around a timestamp or event match.

Examples:

sentry replay window org/project/abc123 --at 01:23 --before 10s --after 30s
sentry replay window org/project/abc123 --contains checkout --before 5s --after 20s

Output should group nearby activity by category: interaction, navigation, network, console, errors, DOM mutations, spans, web vitals.
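
The slicing and grouping could look roughly like this. The duration syntax (ms, s, m suffixes) is inferred from the examples above, and evidenceWindow is a hypothetical name.

```typescript
interface WindowEvent {
  kind: string;
  category: string;
  offsetMs: number;
}

// Parse durations such as "10s", "500ms", or "2m" into milliseconds.
function parseDuration(text: string): number {
  const match = /^(\d+(?:\.\d+)?)(ms|s|m)$/.exec(text);
  if (!match) throw new RangeError(`Invalid duration: ${text}`);
  const value = Number(match[1]);
  const unit = match[2];
  const factor = unit === "ms" ? 1 : unit === "s" ? 1000 : 60_000;
  return value * factor;
}

// Collect events inside [at - before, at + after] and group them by category.
function evidenceWindow<T extends WindowEvent>(
  events: T[],
  atMs: number,
  before: string,
  after: string,
): Record<string, T[]> {
  const fromMs = atMs - parseDuration(before);
  const toMs = atMs + parseDuration(after);
  const groups: Record<string, T[]> = {};
  for (const event of events) {
    if (event.offsetMs < fromMs || event.offsetMs > toMs) continue;
    const bucket = groups[event.category] ?? [];
    bucket.push(event);
    groups[event.category] = bucket;
  }
  return groups;
}
```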

sentry replay search <replay> <query>

Fuzzy search over normalized event fields:

  • selectors
  • visible/unmasked text
  • URLs
  • network URLs
  • breadcrumb messages
  • console messages
  • error titles/messages
  • span descriptions

Return matching events with stable pointers: replay id, segment id, frame index, timestamp, offset.
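
As a starting point, search could be plain case-insensitive term matching over the text fields, with real fuzzy ranking added later. The sketch below does only that simpler matching; the field names beyond those in the example event (such as message) are assumptions.

```typescript
interface SearchableEvent {
  replayId: string;
  segmentId: number;
  frameIndex: number;
  offsetMs: number;
  timestamp: string;
  selector?: string;
  label?: string;
  url?: string;
  message?: string; // breadcrumb, console, or error text when captured and unmasked
}

// Return events whose text fields contain every whitespace-separated query term.
function searchEvents<T extends SearchableEvent>(events: T[], query: string): T[] {
  const terms = query
    .toLowerCase()
    .split(/\s+/)
    .filter((term) => term.length > 0);
  if (terms.length === 0) return [];
  return events.filter((event) => {
    const haystack = [event.selector, event.label, event.url, event.message]
      .filter((value): value is string => typeof value === "string")
      .join("\n")
      .toLowerCase();
    return terms.every((term) => haystack.includes(term));
  });
}
```

Because each result keeps its pointer fields, it can be passed directly to a follow-up replay window call.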

sentry replay dom <replay>

Inspect DOM-related replay data.

The initial version can be event-based rather than a full reconstruction:

sentry replay dom org/project/abc123 --at 01:23
sentry replay dom org/project/abc123 --from 01:20 --to 01:30 --kind mutation,input,scroll

Later versions can add best-effort DOM reconstruction from full snapshots + incremental mutations. A browser-backed rrweb player should be optional, not required for the core CLI flow.

sentry replay stats <replay>

Deterministic summary useful for orientation:

  • total duration
  • route/screen time
  • active vs idle time
  • most clicked selectors
  • slowest network calls
  • failed requests
  • console error count
  • rage/dead click count
  • largest DOM mutation bursts
  • poor web vital events
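
A few of these numbers can be computed in a single pass over normalized events. In this sketch the 5-second idle gap and the status >= 400 rule for failed requests are assumptions, as is the replayStats name.

```typescript
interface StatEvent {
  kind: string;
  offsetMs: number;
  selector?: string;
  status?: number; // HTTP status for network events, when captured
}

interface ReplayStats {
  durationMs: number;
  activeMs: number;
  idleMs: number;
  failedRequests: number;
  mostClicked: { selector: string; count: number }[];
}

function replayStats(events: StatEvent[], durationMs: number, idleGapMs = 5000): ReplayStats {
  const clicks = new Map<string, number>();
  let failedRequests = 0;
  let idleMs = 0;
  let previousOffset = 0;
  const sorted = events.slice().sort((a, b) => a.offsetMs - b.offsetMs);
  for (const event of sorted) {
    if (event.kind === "click" && event.selector !== undefined) {
      clicks.set(event.selector, (clicks.get(event.selector) ?? 0) + 1);
    }
    if (event.kind === "network" && (event.status ?? 0) >= 400) failedRequests += 1;
    // Any gap between consecutive events longer than the threshold counts as idle.
    const gap = event.offsetMs - previousOffset;
    if (gap > idleGapMs) idleMs += gap;
    previousOffset = event.offsetMs;
  }
  const tail = durationMs - previousOffset;
  if (tail > idleGapMs) idleMs += tail;
  const mostClicked = Array.from(clicks, ([selector, count]) => ({ selector, count }))
    .sort((a, b) => b.count - a.count);
  return { durationMs, activeMs: durationMs - idleMs, idleMs, failedRequests, mostClicked };
}
```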

sentry replay struggles <replay>

Deterministic friction analysis. Rank likely struggle windows using signals such as:

  • dead clicks
  • rage clicks
  • repeated clicks on same element
  • clicks followed by no navigation/network/DOM change
  • failed fetch/xhr/resource requests
  • slow network requests
  • console errors
  • hydration errors
  • poor LCP/CLS
  • large mutation bursts
  • long idle periods immediately after interaction
  • repeated input/focus without success
  • back-and-forth navigation

Each finding should include evidence pointers and a recommended follow-up command.

Example output shape:

{
  "finding": "Repeated checkout clicks did not produce navigation",
  "severity": "medium",
  "window": {"fromOffsetMs": 81200, "toOffsetMs": 94600},
  "evidence": [
    {"kind": "click", "offsetMs": 83421, "segmentId": 12, "frameIndex": 184},
    {"kind": "click", "offsetMs": 87820, "segmentId": 12, "frameIndex": 211},
    {"kind": "network", "offsetMs": 88110, "status": 500, "url": "/api/checkout"}
  ],
  "nextCommand": "sentry replay window org/project/abc123 --at 01:23 --before 10s --after 30s"
}
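
One of the signals, repeated clicks on the same element with no navigation afterwards, could be detected as sketched below. The thresholds (2 clicks within 5 seconds) and the severity rule are placeholders, and only this one signal is covered.

```typescript
interface EvidenceEvent {
  kind: string;
  offsetMs: number;
  segmentId: number;
  frameIndex: number;
  selector?: string;
}

interface Finding {
  finding: string;
  severity: "low" | "medium" | "high";
  window: { fromOffsetMs: number; toOffsetMs: number };
  evidence: EvidenceEvent[];
}

function findRepeatedClicks(events: EvidenceEvent[], minClicks = 2, withinMs = 5000): Finding[] {
  const sorted = events.slice().sort((a, b) => a.offsetMs - b.offsetMs);
  const findings: Finding[] = [];
  let run: EvidenceEvent[] = [];

  // Close the current run of clicks and record a finding if it qualifies.
  const flush = (): void => {
    const first = run[0];
    const last = run[run.length - 1];
    if (first && last && run.length >= minClicks) {
      const navigated = sorted.some(
        (e) => e.kind === "navigation" && e.offsetMs >= first.offsetMs && e.offsetMs <= last.offsetMs + withinMs,
      );
      if (!navigated) {
        findings.push({
          finding: `Repeated clicks on ${first.selector ?? "unknown element"} did not produce navigation`,
          severity: run.length >= 3 ? "medium" : "low", // placeholder rule
          window: { fromOffsetMs: first.offsetMs, toOffsetMs: last.offsetMs },
          evidence: run,
        });
      }
    }
    run = [];
  };

  for (const event of sorted) {
    if (event.kind !== "click" || event.selector === undefined) continue;
    const first = run[0];
    // A different selector, or too long since the run started, begins a new run.
    if (first && (event.selector !== first.selector || event.offsetMs - first.offsetMs > withinMs)) flush();
    run.push(event);
  }
  flush();
  return findings;
}
```

Other signals from the list, such as failed requests or long idle periods after an interaction, would be separate detectors feeding the same Finding shape.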

Bundled Skill / Agent Documentation

Add generated or maintained skill docs that teach agents how to use these replay commands.

The skill should explain:

  1. Start with sentry replay fetch.
  2. Use sentry replay stats for orientation.
  3. Use sentry replay struggles to find likely friction.
  4. Use sentry replay search to locate user actions.
  5. Use sentry replay window around relevant timestamps.
  6. Use sentry replay events --kind ... for detailed evidence.
  7. Use sentry replay dom for DOM/mutation/input/scroll inspection.
  8. Cite segmentId, frameIndex, and offsets in conclusions.
  9. Treat masked text and absent DOM data as uncertainty.
  10. Do not invent user-visible text or DOM state when it was not captured.

This keeps the reasoning layer outside the CLI while making the CLI highly usable by agents.

Implementation Plan

Phase 1: Correct segment fetching

  • Add paginated segment download support with per_page=100.
  • Use count_segments from replay detail when available.
  • Add tests for multi-page segment responses.
  • Preserve current replay view behavior, but ensure it uses complete segments or explicitly reports partial data.

Phase 2: Add replay cache and fetch command

  • Add replay bundle storage helpers.
  • Add sentry replay fetch.
  • Store replay metadata and raw segment pages/segments.
  • Add cache manifest/versioning.
  • Add privacy-safe permissions.
  • Decide how replay cache interacts with generic response cache.

Phase 3: Normalize events

  • Define typed rrweb/custom event constants and minimal schemas.
  • Flatten segment frames into sorted normalized events.
  • Port relevant hydration behavior from Sentry frontend where appropriate.
  • Add fixtures for rrweb snapshots, mutations, inputs, clicks, network breadcrumbs, console breadcrumbs, perf spans, and web vitals.
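
The flattening step might look like the sketch below. It assumes the segment's array position is its segmentId and that frame timestamps are already in milliseconds; unit normalization would happen before or inside this step.

```typescript
interface RawFrame {
  timestamp: number;
  [key: string]: unknown;
}

interface FramePointer {
  segmentId: number;
  frameIndex: number;
  timestampMs: number;
  frame: RawFrame;
}

// Flatten segments into one list sorted by time, keeping stable pointers
// (segmentId, frameIndex) back to where each frame came from.
function flattenSegments(segments: RawFrame[][]): FramePointer[] {
  const flat: FramePointer[] = [];
  segments.forEach((frames, segmentId) => {
    frames.forEach((frame, frameIndex) => {
      flat.push({ segmentId, frameIndex, timestampMs: frame.timestamp, frame });
    });
  });
  // Ties on timestamp fall back to storage order so output is deterministic.
  return flat.sort(
    (a, b) =>
      a.timestampMs - b.timestampMs ||
      a.segmentId - b.segmentId ||
      a.frameIndex - b.frameIndex,
  );
}
```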

Phase 4: Add inspection commands

  • Add replay events.
  • Add replay window.
  • Add replay search.
  • Add replay stats.
  • Add replay struggles.
  • Keep JSON/JSONL output stable and evidence-first.

Phase 5: DOM inspection

  • Start with DOM event summaries: snapshots, mutations, inputs, scrolls, viewport changes, node ids/selectors where available.
  • Later add best-effort DOM reconstruction from rrweb snapshots and mutations.
  • Avoid making a browser/rrweb replayer mandatory for core CLI use.

Phase 6: Agent skill/docs

  • Add or update Sentry CLI skill documentation for replay investigation workflows.
  • Ensure generated command docs include examples for agent workflows.
  • Include sample workflows for common questions.

Open Questions

  • What should the default replay cache TTL be, especially for sensitive replay payloads?
  • Should replay cache be cleared automatically on logout, or should it be scoped only by identity fingerprint?
  • Should generic response-cache skip replay segment payloads once replay-specific cache exists?
  • How much related data should replay fetch include by default: errors, traces, logs, click selectors?
  • Should replay events --raw include payload snippets or only stable pointers by default?
  • Do we want JSONL as a first-class output mode for very large event streams?
  • How much DOM reconstruction should be implemented in the CLI before delegating to browser-backed tooling?

Acceptance Criteria

  • Agents can download and cache a full replay locally.
  • Long replays with multiple segment pages are handled correctly.
  • Agents can list normalized replay events by kind/time/search query.
  • Agents can pull a compact evidence window around an action or timestamp.
  • Agents can identify likely user struggle windows without needing an LLM.
  • Outputs include stable evidence pointers: replay id, segment id, frame index, timestamp/offset.
  • Masked or unavailable DOM/text data is represented honestly.
  • The CLI remains the data/inspection layer; agent reasoning is documented in bundled skill guidance.
