
feat: bridge web provider to agent-browser #826

Merged: thymikee merged 1 commit into main from codex/agent-browser-web-provider on Jun 19, 2026

@thymikee thymikee commented Jun 19, 2026

Summary

Bridge the semantic web provider seam to the local agent-browser CLI with session-scoped command execution, JSON envelope/error handling, and snapshot normalization for refs, labels, textbox values, parents, rects, and selector-compatible roles.

flowchart TD
  AD["agent-device web command\nopen / snapshot / press / fill / close"] --> Daemon["daemon routing and session lease"]
  Daemon --> Interactor["createWebInteractor\ncoordinate-first Interactor contract"]
  Interactor --> Provider["WebProvider seam"]
  Provider --> Adapter["AgentBrowserWebProvider"]

  Adapter --> AB["agent-browser CLI\n--json --session <name>"]
  AB --> Chrome["local Chrome session"]
  AB --> Envelope["success/error JSON envelope"]
  Envelope --> Adapter

  Adapter --> Snapshot["normalizeAgentBrowserSnapshot\naria snapshot text + refs map + bounded get box calls"]
  Snapshot --> Nodes["RawSnapshotNode\ntype, role, label, value, parent, rect"]
  Nodes --> ADOutput["agent-device user output\nPage/Snapshot header + @eN [role] lines"]
  Nodes --> Selectors["shared selector runtime\nrole selectors and replay healing"]

  AB -. direct agent-browser usage .-> ABOutput["agent-browser native output\naria tree lines + native refs"]
  Daemon --> Cleanup["web close cleanup"]
  Cleanup --> Adapter

User-visible output is intentionally agent-device output, not raw agent-browser output. Direct agent-browser prints aria-tree lines such as `- textbox "Email" [ref=e3]: ada@example.com`. agent-device reprojects the same state as `Snapshot: 3 nodes` and `@e2 [text-field] "ada@example.com" [editable]`, so refs and selectors fit the existing agent-device command model.
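To make the two output shapes concrete, here is a minimal sketch of reading one native aria-tree line into its parts. The `AriaLine` type, the `parseAriaLine` name, and the regular expression are illustrative assumptions, not the adapter's actual code; only the line format comes from the example above.

```typescript
// Hypothetical shape for one parsed aria-tree line (not the adapter's real type).
interface AriaLine {
  role: string;
  name?: string;
  ref?: string;
  value?: string;
}

// Matches lines such as: - textbox "Email" [ref=e3]: ada@example.com
// Groups: role, optional quoted name, optional [key=value] attributes, optional trailing value.
const ARIA_LINE =
  /^\s*-\s+([A-Za-z][\w-]*)(?:\s+"((?:[^"\\]|\\.)*)")?((?:\s+\[[^\]]*\])*)\s*(?::\s*(.*))?$/;

function parseAriaLine(line: string): AriaLine | undefined {
  const match = ARIA_LINE.exec(line);
  if (!match) return undefined;
  const role = match[1];
  if (role === undefined) return undefined;
  const parsed: AriaLine = { role };
  const name = match[2];
  if (name !== undefined) parsed.name = name;
  // The ref lives in the bracketed attribute run, e.g. " [ref=e3]".
  const ref = /\[ref=([^\]]+)\]/.exec(match[3] ?? "")?.[1];
  if (ref !== undefined) parsed.ref = ref;
  // A textbox value follows the ref after a colon; container lines end in a bare colon.
  const value = match[4]?.trim();
  if (value !== undefined && value !== "") parsed.value = value;
  return parsed;
}
```

A normalizer could then map each parsed line to an agent-device node and assign its own `@eN` refs; that mapping is outside this sketch.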

Add normal web session close cleanup so agent-device close calls agent-browser close even without a positional target.

Closes #820

Touched files: 12. Scope stayed within the web provider adapter, web close cleanup, and web snapshot text presentation.

Validation

Format passed with pnpm format. Typecheck passed with pnpm typecheck. Focused tests passed for src/platforms/web, session-close-shutdown, and output formatting. Fallow passed with pnpm check:fallow --base origin/main. Build passed.

Manual comparison passed on https://vercel.com/: agent-browser returned native aria-tree refs, while agent-device returned normalized Page: https://vercel.com/, Snapshot: 15 nodes, and agent-device refs for the same visible controls. Manual comparison passed on https://example.com/ with 2 normalized nodes. Manual form fixture verified textbox value projection: agent-browser showed [ref=e3]: ada@example.com; agent-device showed @e2 [text-field] "ada@example.com" [editable]. Sessions codex-web-vercel-817 and codex-web-example-817 were closed.

github-actions Bot commented Jun 19, 2026

Size Report

| Metric | Base | Current | Diff |
| --- | --- | --- | --- |
| JS raw | 1.3 MB | 1.3 MB | +5.5 kB |
| JS gzip | 417.3 kB | 419.3 kB | +2.0 kB |
| npm tarball | 552.0 kB | 554.2 kB | +2.1 kB |
| npm unpacked | 1.9 MB | 1.9 MB | +5.5 kB |

Startup median (7 runs, lower is better):

| Scenario | Base | Current | Diff |
| --- | --- | --- | --- |
| CLI --version | 26.1 ms | 25.3 ms | -0.8 ms |
| CLI --help | 44.0 ms | 43.3 ms | -0.7 ms |

Top changed chunks:

| Chunk | Raw diff | Gzip diff |
| --- | --- | --- |
| dist/src/9722.js | +5.9 kB | +2.1 kB |
| dist/src/6311.js | +118 B | +27 B |
| dist/src/5299.js | +4 B | +15 B |

@thymikee

CI finding: Fallow Code Quality is failing on this PR.

Root cause from the failing job log:

  • Duplication: 13-line clone group duplicated between src/platforms/web/agent-browser-provider.ts:216-228 and src/platforms/web/agent-browser-snapshot.ts:161-173.
  • Complexity: src/platforms/web/agent-browser-provider.ts:195 parseRect is over threshold (12 cyclomatic, 9 cognitive, CRAP 43.1).

The rest of the core CI checks shown for #826 are green; this is the blocker to address next. Please run pnpm check:fallow --base 0a29a3a2b884ee7427e218606c7305e92a2d82a2 locally after extracting/deduplicating the repeated parsing block and reducing parseRect complexity.
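One way to address both findings at once is a single shared rect parser that loops over the expected keys instead of branching per field. This is a sketch under assumptions: the `Rect` shape with `x`, `y`, `width`, and `height` and the helper names are hypothetical, and the real `get box` payload may differ.

```typescript
// Hypothetical rect shape; the real agent-browser box payload may use other keys.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

const RECT_KEYS = ["x", "y", "width", "height"] as const;

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// One loop replaces per-field branching, which keeps cyclomatic complexity low.
// Exporting this from a shared module would also remove the duplicated parsing block.
function parseRect(value: unknown): Rect | undefined {
  if (!isRecord(value)) return undefined;
  const rect: Partial<Rect> = {};
  for (const key of RECT_KEYS) {
    const n = value[key];
    if (typeof n !== "number" || !Number.isFinite(n)) return undefined;
    rect[key] = n;
  }
  // Every key was assigned a finite number above, so the cast is safe.
  return rect as Rect;
}
```

Both the provider and the snapshot normalizer could import the same helper, so the clone group and the complexity warning would be resolved by one extraction.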

thymikee force-pushed the codex/semantic-web-provider branch from 817d991 to 77377d8 (June 19, 2026 10:01)
thymikee force-pushed the codex/agent-browser-web-provider branch from d064823 to ed957f3 (June 19, 2026 10:03)
@thymikee

Review findings from delegated pass:

[P1] Parse JSON envelopes from non-zero agent-browser exits before mapping process failure. In src/platforms/web/agent-browser-provider.ts:106, runCmd is called without allowFailure, so a real agent-browser --json failure exits 1 and never reaches the { success: false } parser below. I reproduced this with agent-browser 0.27.1: get box @e999 --json --session codex-review-missing prints a JSON {"success":false,...} envelope and exits 1. With the current code, missing-box errors degrade to "agent-browser exited with code 1" and fetchRefRect cannot recognize/ignore normal no-box/not-found responses. Please allow non-zero command results through to the JSON-envelope parser while still preserving TOOL_MISSING/timeouts, then throw the normalized envelope error.
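The requested ordering can be sketched as follows: try the JSON envelope first, and fall back to the exit code only when no envelope is present. The `CommandResult` shape, the `error` and `data` field names, and the `AgentBrowserError` class are assumptions for illustration; only the `success` flag and the exit-code message come from the finding above. TOOL_MISSING and timeout handling are assumed to happen earlier, in the runner.

```typescript
// Hypothetical result of a command runner invoked with allowFailure enabled.
interface CommandResult {
  exitCode: number;
  stdout: string;
  stderr: string;
}

// Assumed envelope fields: only `success` is confirmed by the review text.
type Envelope =
  | { success: true; data?: unknown }
  | { success: false; error: string };

class AgentBrowserError extends Error {
  readonly exitCode: number;
  constructor(message: string, exitCode: number) {
    super(message);
    this.name = "AgentBrowserError";
    this.exitCode = exitCode;
  }
}

function tryParseEnvelope(stdout: string): Envelope | undefined {
  let parsed: unknown;
  try {
    parsed = JSON.parse(stdout);
  } catch {
    return undefined; // stdout was not JSON at all
  }
  if (typeof parsed !== "object" || parsed === null) return undefined;
  const record = parsed as Record<string, unknown>;
  if (record["success"] === true) return { success: true, data: record["data"] };
  if (record["success"] === false) {
    const raw = record["error"];
    return { success: false, error: typeof raw === "string" ? raw : "agent-browser reported failure" };
  }
  return undefined;
}

// Envelope errors win over the generic exit-code message.
function unwrapResult(result: CommandResult): unknown {
  const envelope = tryParseEnvelope(result.stdout);
  if (envelope !== undefined && envelope.success === false) {
    throw new AgentBrowserError(envelope.error, result.exitCode);
  }
  if (envelope === undefined || result.exitCode !== 0) {
    throw new AgentBrowserError(`agent-browser exited with code ${result.exitCode}`, result.exitCode);
  }
  return envelope.data;
}
```

With this ordering, a caller such as `fetchRefRect` could inspect the normalized error and decide whether a missing box is an expected condition.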

[P1] Populate RawSnapshotNode.type for web roles. In src/platforms/web/agent-browser-snapshot.ts:73 and :85, the adapter stores the browser role in node.role, but the shared selector engine and selector-chain builder match role=... against node.type. A normalized button "Save" node currently fails role=button label=Save through resolveSelectorChain. This breaks role-based click/fill/get/wait targeting and replay-healing selector chains for web snapshots. Please set type, or both type and role, from the agent-browser role/type metadata and add a test that resolves a normalized web snapshot through the selector runtime or resolveSelectorChain.
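The mismatch can be illustrated with a reduced model. `SnapshotNodeSketch`, `toSnapshotNode`, and `matchesSelector` below are hypothetical stand-ins, not the real `RawSnapshotNode` or selector runtime; the only behavior taken from the finding is that `role=...` is compared against `node.type`. The real adapter may also map browser roles to selector-compatible names, which this sketch skips.

```typescript
// Reduced, hypothetical stand-in for RawSnapshotNode.
interface SnapshotNodeSketch {
  ref: string;
  type: string;
  role?: string;
  label?: string;
}

// Populate both fields from the browser role so role selectors can match on type.
function toSnapshotNode(ref: string, browserRole: string, name?: string): SnapshotNodeSketch {
  const node: SnapshotNodeSketch = { ref, type: browserRole, role: browserRole };
  if (name !== undefined) node.label = name;
  return node;
}

// Simplified matcher: like the shared engine described above, role=... reads node.type.
function matchesSelector(
  node: SnapshotNodeSketch,
  selector: { role?: string; label?: string },
): boolean {
  if (selector.role !== undefined && node.type !== selector.role) return false;
  if (selector.label !== undefined && node.label !== selector.label) return false;
  return true;
}
```

A node that carries the role only in `role` and leaves `type` empty fails `role=button label=Save` in this model, which mirrors the reported bug.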

thymikee force-pushed the codex/agent-browser-web-provider branch from ed957f3 to 7754b98 (June 19, 2026 10:12)
Base automatically changed from codex/semantic-web-provider to main (June 19, 2026 10:26)
thymikee force-pushed the codex/agent-browser-web-provider branch 2 times, most recently from 04a7b54 to b2da269 (June 19, 2026 11:38)
@thymikee

Addressed the remaining review comments in b2da269:

  • agent-browser JSON commands now run with allowFailure true, so non-zero JSON envelopes are parsed before normalized errors are thrown while TOOL_MISSING and timeouts still go through the run-error mapper.
  • Web snapshot roles now populate both type and role, with coverage that builds and resolves role=button label=Save through the shared selector path.
  • Local validation passed: pnpm format, pnpm typecheck, focused Vitest for src/platforms/web plus session close shutdown, and pnpm check:fallow --base origin/main.

thymikee force-pushed the codex/agent-browser-web-provider branch from b2da269 to 2139438 (June 19, 2026 14:15)
@thymikee

Addressed the pasted architecture review in 2139438:

  • Parsed real Playwright aria-snapshot textbox values from the trailing [ref=eN]: value form, and changed the snapshot fixture so refs only carry name/role like live agent-browser output.
  • Kept web fill on the current coordinate-first seam, but fixed select-all to use Meta+a on macOS and Control+a elsewhere, with a comment naming the intentional coordinate bridge.
  • Changed snapshot rect lookup from sequential per-ref calls to bounded concurrent batches.
  • Re-ran validation: pnpm format, pnpm typecheck, focused Vitest for src/platforms/web plus session close shutdown, and pnpm check:fallow --base origin/main all pass.
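The change from sequential lookups to bounded concurrent batches can be sketched generically. `mapInBatches` is a hypothetical helper name and the batch size is an arbitrary parameter; the actual implementation and its limit are not shown in this thread.

```typescript
// Runs `worker` over `items` with at most `batchSize` calls in flight at once.
// Each batch is awaited in full before the next one starts; result order follows input order.
async function mapInBatches<T, R>(
  items: readonly T[],
  batchSize: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const results: R[] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    const settled = await Promise.all(batch.map((item) => worker(item)));
    results.push(...settled);
  }
  return results;
}
```

For rect lookup, the worker would be a per-ref call such as `fetchRefRect`, assumed here to resolve to a rect or `undefined`. Batching trades a little latency against a sliding-window pool, since a slow call holds up its whole batch, but it keeps the number of concurrent CLI processes predictable.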

thymikee force-pushed the codex/agent-browser-web-provider branch from 2139438 to 40fb0e9 (June 19, 2026 14:23)
thymikee merged commit 87a6ac7 into main (Jun 19, 2026); 19 checks passed
thymikee deleted the codex/agent-browser-web-provider branch (June 19, 2026 14:47)
@github-actions

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-06-19 14:47 UTC
