PDX-0: chore(scripts): add token-measure-vs-playwright.cjs by mrdailey99 · Pull Request #175 · ProvarTesting/provardx-cli

mrdailey99 · 2026-05-15T15:43:45Z

Summary

Promotes scripts/token-measure-vs-playwright.cjs from the PDX-482 worktree to develop so it lives as a permanent, discoverable artifact.
The script is referenced as the reproduction recipe in the methodology appendix of the published Provar MCP vs. Playwright + AI Coding Agents comparison deck. Without this PR, the methodology slide points at a path that does not exist on develop — undermining the deck's credibility.

Why a separate PR

This is a docs / marketing-support tool, not part of PDX-482's tool-description hardening. Bundling it would muddle PR scope. PDX-0 fits because there is no observable user or system behaviour change — this is purely a scripts/ addition.

What the script does

Spawns Provar MCP locally (via bin/mcp-start.js) and Playwright MCP via npx -y @playwright/mcp.
Sends the standard MCP initialize → tools/list JSON-RPC pair to each.
Serializes the tools[] array, which is exactly what a real MCP client sends to its LLM as the tool catalog, and estimates tokens as character count / 4.
Runs Provar MCP in three configurations: STANDARD, COMPACT (PROVAR_MCP_SCHEMA_MODE=compact), AUTHORING (compact + PROVAR_MCP_TOOLS=authoring,inspect,connection,validation).
For Playwright MCP, additionally issues browser_navigate + browser_snapshot against example.com to capture a representative per-interaction baseline.
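
The handshake and counting steps listed above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the helper names (`frame`, `buildHandshake`, `estimateTokens`), the `protocolVersion` value, the client name, and the choice of `Math.ceil` for rounding are all assumptions. It also assumes the newline-delimited JSON framing of the MCP stdio transport.

```javascript
// Sketch only: helper names and parameter values are illustrative assumptions,
// not taken from scripts/token-measure-vs-playwright.cjs.

// Serialize one JSON-RPC 2.0 message as a newline-delimited frame (MCP stdio transport).
function frame(message) {
  return JSON.stringify(message) + '\n';
}

// Build the initialize -> tools/list request pair sent to each server.
// A strict MCP client would also send a notifications/initialized notification in between.
function buildHandshake() {
  const initialize = {
    jsonrpc: '2.0',
    id: 1,
    method: 'initialize',
    params: {
      protocolVersion: '2024-11-05', // assumed; use whatever version the servers negotiate
      capabilities: {},
      clientInfo: { name: 'token-measure-sketch', version: '0.0.0' }, // hypothetical client
    },
  };
  const toolsList = { jsonrpc: '2.0', id: 2, method: 'tools/list', params: {} };
  return [initialize, toolsList];
}

// Approximate token count using the chars / 4 heuristic (rounding up is an assumption).
function estimateTokens(tools) {
  const chars = JSON.stringify(tools).length;
  return { chars, tokens: Math.ceil(chars / 4) };
}

// Example with a made-up, single-tool catalog.
const sampleTools = [
  { name: 'example_tool', description: 'Hypothetical tool.', inputSchema: { type: 'object' } },
];
const handshakeFrames = buildHandshake().map(frame);
const sampleEstimate = estimateTokens(sampleTools);
console.log(handshakeFrames.length, sampleEstimate);
```

In a real run, each frame would be written to the spawned server's stdin and the `tools` array would come from the `result` of the tools/list response rather than from a hard-coded sample.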

Sample output

═════════════════════════════════════════════════════════════════════════
Scenario                                              Tools  ~Tokens
═════════════════════════════════════════════════════════════════════════
Provar MCP — STANDARD                                    41   18346
Provar MCP — COMPACT                                     41   11750
Provar MCP — AUTHORING                                   21    7897
Playwright MCP — DEFAULT (out-of-the-box)                23    4271
═════════════════════════════════════════════════════════════════════════
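
For readers who want the relative sizes rather than the raw counts, the figures in the table can be turned into ratios with a few lines of arithmetic. The sketch below only reuses the numbers printed above; the variable names and the choice of derived columns are illustrative assumptions, not the script's own output format.

```javascript
// Figures copied from the sample output table above.
const rows = [
  { scenario: 'Provar MCP STANDARD', tools: 41, tokens: 18346 },
  { scenario: 'Provar MCP COMPACT', tools: 41, tokens: 11750 },
  { scenario: 'Provar MCP AUTHORING', tools: 21, tokens: 7897 },
  { scenario: 'Playwright MCP DEFAULT', tools: 23, tokens: 4271 },
];

const standard = rows[0];
const playwright = rows[3];

// Derived columns: ratio against Playwright, reduction against Provar STANDARD, tokens per tool.
const derived = rows.map((row) => ({
  scenario: row.scenario,
  vsPlaywright: row.tokens / playwright.tokens,
  savedVsStandard: 1 - row.tokens / standard.tokens,
  tokensPerTool: row.tokens / row.tools,
}));

for (const d of derived) {
  console.log(
    d.scenario.padEnd(24),
    `${d.vsPlaywright.toFixed(2)}x Playwright`,
    `${(d.savedVsStandard * 100).toFixed(1)}% below STANDARD`,
    `${Math.round(d.tokensPerTool)} tokens/tool`
  );
}
```

On these numbers, the STANDARD catalog is roughly 4.3 times the size of Playwright's default catalog, COMPACT about 2.75 times, and AUTHORING about 1.85 times. COMPACT trims about 36% from STANDARD, and AUTHORING about 57%.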

Test plan

yarn install, yarn compile, and yarn lint all run clean against the chore worktree
Script runs successfully against both servers and produces the table above
No source-tree changes; no behavioural impact; no test suite changes required (script is invocation-time only, never imported)
Reviewer can reproduce by running node scripts/token-measure-vs-playwright.cjs from the repo root after yarn install && yarn compile

Jira

PDX-0 — chore (no Jira ticket). See the comparison deck methodology slide for the customer-facing context this script supports.

🤖 Generated with Claude Code

RCA: The published Provar-vs-Playwright comparison deck references this script in its methodology appendix as the reproduction recipe for the catalog-token figures. The script lived only in the PDX-482 worktree, so external readers (analysts, customers, prospects) could not reproduce the numbers, which undermined the credibility of the methodology slide.

Fix: Promote scripts/token-measure-vs-playwright.cjs to develop as an independent chore. The script spawns both MCP servers (Provar locally via bin/mcp-start.js, Playwright via npx -y @playwright/mcp), sends identical initialize → tools/list JSON-RPC pairs, and reports catalog size in characters and approximate tokens (chars / 4). Provar MCP runs in three configurations (STANDARD / COMPACT / AUTHORING) to demonstrate the PROVAR_MCP_SCHEMA_MODE and PROVAR_MCP_TOOLS levers. The script also issues a representative browser_navigate + browser_snapshot against example.com to capture Playwright's per-interaction baseline. No source-tree changes; no test or behaviour impact.
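
The three Provar configurations mentioned above differ only in environment variables, which could be modelled roughly as below. The variable names and values come from the PR description; the object shape, the `envFor` helper, and the spawn invocation shown in the comment are illustrative assumptions rather than the script's actual code.

```javascript
// Sketch only: the object shape and helper name are illustrative assumptions.
// The environment variable names and values come from the PR description.
const PROVAR_CONFIGS = [
  { label: 'STANDARD', env: {} },
  { label: 'COMPACT', env: { PROVAR_MCP_SCHEMA_MODE: 'compact' } },
  {
    label: 'AUTHORING',
    env: {
      PROVAR_MCP_SCHEMA_MODE: 'compact',
      PROVAR_MCP_TOOLS: 'authoring,inspect,connection,validation',
    },
  },
];

// Merge a configuration's overrides onto a base environment without mutating it.
function envFor(config, baseEnv = process.env) {
  return { ...baseEnv, ...config.env };
}

// Each environment could then be handed to child_process.spawn, for example
// (assumed invocation; the real script may start the server differently):
//   spawn(process.execPath, ['bin/mcp-start.js'], { env: envFor(config) })
for (const config of PROVAR_CONFIGS) {
  const env = envFor(config, {});
  console.log(config.label, env);
}
```

Keeping the overrides as data makes it easy to add a fourth configuration later without touching the measurement loop.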

github-actions · 2026-05-15T15:44:06Z

Quality Orchestrator

🟢 LOW · 0 / 100 · All changed files have mapped tests.

No test files mapped.

⚡ quality-orchestrator · /qo stub <file> · qo analyze-local

Copilot

Pull request overview

Adds a one-off measurement script that compares MCP tool-catalog token sizes between Provar MCP (in three configurations) and Playwright MCP, used as the reproduction recipe for a marketing comparison deck.

Changes:

New scripts/token-measure-vs-playwright.cjs spawns each MCP server over stdio, drives initialize → tools/list, and reports catalog chars/tokens.
For Playwright MCP, additionally measures a browser_navigate + browser_snapshot round trip against example.com as a per-interaction baseline.
Prints a formatted comparison table, Playwright/Provar ratios, and the top-5 heaviest tools per server.
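
The "top-5 heaviest tools" report described above can be approximated by serializing each tool entry on its own and sorting by length. This is a hedged sketch: the `heaviestTools` function and the sample catalog are hypothetical, and the real script may measure or round differently.

```javascript
// Sketch only: function name and sample catalog are hypothetical.

// Rank tools by the size of their serialized JSON, largest first.
function heaviestTools(tools, limit = 5) {
  return tools
    .map((tool) => {
      const chars = JSON.stringify(tool).length;
      return { name: tool.name, chars, tokens: Math.ceil(chars / 4) };
    })
    .sort((a, b) => b.chars - a.chars)
    .slice(0, limit);
}

// Made-up catalog with deliberately different description lengths.
const catalog = [
  { name: 'tiny', description: 'x' },
  { name: 'large', description: 'x'.repeat(400) },
  { name: 'medium', description: 'x'.repeat(100) },
];
console.log(heaviestTools(catalog, 2));
```

Per-tool sizes will not sum exactly to the catalog total, because the enclosing array brackets and separating commas are only counted when the whole array is serialized.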


Copilot AI review requested due to automatic review settings May 15, 2026 15:43

Copilot started reviewing on behalf of mrdailey99 May 15, 2026 15:44

Copilot AI reviewed May 15, 2026


mrdailey99 merged commit f639754 into develop May 15, 2026
8 checks passed

mrdailey99 deleted the chore/token-measure-script branch May 15, 2026 15:47

mrdailey99 merged 1 commit into develop from chore/token-measure-script
