PDX-0: chore(scripts): add token-measure-vs-playwright.cjs #175

Merged
mrdailey99 merged 1 commit into develop from chore/token-measure-script on May 15, 2026

Conversation

@mrdailey99
Collaborator

Summary

  • Promotes scripts/token-measure-vs-playwright.cjs from the PDX-482 worktree to develop so it lives as a permanent, discoverable artifact.
  • The script is referenced as the reproduction recipe in the methodology appendix of the published Provar MCP vs. Playwright + AI Coding Agents comparison deck. Without this PR, the methodology slide points at a path that does not exist on develop — undermining the deck's credibility.

Why a separate PR

This is a docs / marketing-support tool, not part of PDX-482's tool-description hardening. Bundling it would muddle PR scope. PDX-0 fits because there is no observable user or system behaviour change — this is purely a scripts/ addition.

What the script does

  1. Spawns Provar MCP locally (via bin/mcp-start.js) and Playwright MCP via npx -y @playwright/mcp.
  2. Sends the standard MCP initialize → tools/list JSON-RPC pair to each.
  3. Serializes the tools[] array — exactly what a real MCP client sends to its LLM as the tool catalog — and counts characters → tokens at chars / 4.
  4. Runs Provar MCP in three configurations: STANDARD, COMPACT (PROVAR_MCP_SCHEMA_MODE=compact), AUTHORING (compact + PROVAR_MCP_TOOLS=authoring,inspect,connection,validation).
  5. For Playwright MCP, additionally issues browser_navigate + browser_snapshot against example.com to capture a representative per-interaction baseline.
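
Steps 2 and 3 can be sketched roughly as follows. This is an illustration under stated assumptions, not the script's actual code: the function names, the `clientInfo` values, the protocol version string, compact `JSON.stringify` serialization, and the rounding of chars / 4 are all assumptions. Only the initialize → tools/list pair and the chars / 4 heuristic come from this PR. A spec-conformant client would typically also send a `notifications/initialized` notification between the two requests.

```javascript
// Sketch only: names below are illustrative, not taken from the script.

// Assumed protocol version string; use whatever the target servers negotiate.
const PROTOCOL_VERSION = '2024-11-05';

// Build the initialize -> tools/list request pair as newline-delimited JSON,
// which is how MCP frames messages on the stdio transport.
function buildHandshake() {
  const initialize = {
    jsonrpc: '2.0',
    id: 1,
    method: 'initialize',
    params: {
      protocolVersion: PROTOCOL_VERSION,
      capabilities: {},
      clientInfo: { name: 'token-measure-sketch', version: '0.0.0' }, // hypothetical
    },
  };
  const listTools = { jsonrpc: '2.0', id: 2, method: 'tools/list', params: {} };
  return [initialize, listTools].map((msg) => JSON.stringify(msg) + '\n');
}

// Serialize the tools[] array from a tools/list result and apply chars / 4.
// Rounding is an assumption; the PR only states the chars / 4 heuristic.
function estimateCatalog(tools) {
  const chars = JSON.stringify(tools).length;
  return { tools: tools.length, chars, tokens: Math.round(chars / 4) };
}

// Hypothetical one-tool catalog, standing in for a real tools/list result.
const sample = [
  { name: 'example_tool', description: 'Hypothetical tool.', inputSchema: { type: 'object' } },
];
process.stdout.write(buildHandshake().join(''));
console.log(estimateCatalog(sample));
```

In a real run, the two handshake lines would be written to the child server's stdin and `estimateCatalog` would receive `result.tools` from the tools/list response.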

Sample output

═════════════════════════════════════════════════════════════════════════
Scenario                                              Tools  ~Tokens
═════════════════════════════════════════════════════════════════════════
Provar MCP — STANDARD                                    41   18346
Provar MCP — COMPACT                                     41   11750
Provar MCP — AUTHORING                                   21    7897
Playwright MCP — DEFAULT (out-of-the-box)                23    4271
═════════════════════════════════════════════════════════════════════════
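
For readers who want to relate the rows to each other, a small sketch like the one below derives each scenario's size relative to the Playwright baseline and its average tokens per tool. The token and tool counts are copied from the table above; the helper name and the choice of Playwright as the denominator are assumptions, and the script's own ratio output may be presented differently.

```javascript
// Figures copied from the sample output table in this PR.
const rows = [
  { scenario: 'Provar MCP STANDARD', tools: 41, tokens: 18346 },
  { scenario: 'Provar MCP COMPACT', tools: 41, tokens: 11750 },
  { scenario: 'Provar MCP AUTHORING', tools: 21, tokens: 7897 },
  { scenario: 'Playwright MCP DEFAULT', tools: 23, tokens: 4271 },
];

// Assumption: Playwright's default catalog is used as the comparison baseline.
const baseline = rows.find((r) => r.scenario.startsWith('Playwright'));

// Hypothetical helper: catalog size relative to the baseline, plus tokens per tool.
function relativeToBaseline(row) {
  return {
    scenario: row.scenario,
    ratio: row.tokens / baseline.tokens,
    tokensPerTool: row.tokens / row.tools,
  };
}

for (const r of rows.map(relativeToBaseline)) {
  console.log(
    `${r.scenario.padEnd(24)} ${r.ratio.toFixed(2)}x  ${r.tokensPerTool.toFixed(0)} tokens/tool`
  );
}
```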

Test plan

  • yarn install, yarn compile, yarn lint all clean against the chore worktree
  • Script runs successfully against both servers and produces the table above
  • No source-tree changes; no behavioural impact; no test suite changes required (script is invocation-time only, never imported)
  • Reviewer can reproduce by running node scripts/token-measure-vs-playwright.cjs from the repo root after yarn install && yarn compile

Jira

PDX-0 — chore (no Jira ticket). See the comparison deck methodology slide for the customer-facing context this script supports.

🤖 Generated with Claude Code

RCA: The published Provar-vs-Playwright comparison deck references this script in its methodology appendix as the reproduction recipe for the catalog-token figures. The script lived only in the PDX-482 worktree, so external readers (analysts, customers, prospects) could not actually reproduce the numbers — undermining the methodology slide's credibility.
Fix: Promote scripts/token-measure-vs-playwright.cjs to develop as an independent chore. Script spawns both MCP servers (Provar local via bin/mcp-start.js, Playwright via npx -y @playwright/mcp), sends identical initialize → tools/list JSON-RPC pairs, and reports catalog size in characters and approximate tokens (chars/4). Provar MCP runs three configurations (STANDARD / COMPACT / AUTHORING) to demonstrate the PROVAR_MCP_SCHEMA_MODE + PROVAR_MCP_TOOLS levers. Also issues a representative browser_navigate + browser_snapshot against example.com to capture Playwright's per-interaction baseline. No source-tree changes; no test or behaviour impact.
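
The three Provar MCP configurations differ only in environment variables, so they could be expressed as a small lookup table. The sketch below is an assumption about how such a table might look; `SCENARIOS` and `spawnSpecFor` are hypothetical names, and only the variable names, their values, and the `bin/mcp-start.js` entry point come from this PR.

```javascript
// Environment overrides per Provar MCP scenario, as described in this PR.
const SCENARIOS = {
  STANDARD: {},
  COMPACT: { PROVAR_MCP_SCHEMA_MODE: 'compact' },
  AUTHORING: {
    PROVAR_MCP_SCHEMA_MODE: 'compact',
    PROVAR_MCP_TOOLS: 'authoring,inspect,connection,validation',
  },
};

// Hypothetical helper: returns the arguments one might pass to
// child_process.spawn(command, args, options) for a given scenario.
function spawnSpecFor(scenario, baseEnv = process.env) {
  if (!Object.hasOwn(SCENARIOS, scenario)) {
    throw new Error(`Unknown scenario: ${scenario}`);
  }
  return {
    command: process.execPath, // the currently running Node binary
    args: ['bin/mcp-start.js'],
    options: {
      env: { ...baseEnv, ...SCENARIOS[scenario] },
      stdio: ['pipe', 'pipe', 'inherit'], // JSON-RPC on stdin/stdout, logs on stderr
    },
  };
}

// Empty base env here just to show the overrides on their own.
console.log(spawnSpecFor('AUTHORING', {}).options.env);
```

Keeping the overrides in one table makes it easy to add a further configuration without touching the spawn logic.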
Copilot AI review requested due to automatic review settings May 15, 2026 15:43
@github-actions

Quality Orchestrator

🟢 LOW · 0 / 100 · All changed files have mapped tests.


No test files mapped.


⚡ quality-orchestrator  ·  /qo stub <file>  ·  qo analyze-local

Contributor

Copilot AI left a comment

Pull request overview

Adds a one-off measurement script that compares MCP tool-catalog token sizes between Provar MCP (in three configurations) and Playwright MCP, used as the reproduction recipe for a marketing comparison deck.

Changes:

  • New scripts/token-measure-vs-playwright.cjs spawns each MCP server over stdio, drives initialize → tools/list, and reports catalog chars/tokens.
  • For Playwright MCP, additionally measures a browser_navigate + browser_snapshot round trip against example.com as a per-interaction baseline.
  • Prints a formatted comparison table, Playwright/Provar ratios, and the top-5 heaviest tools per server.
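
The "top-5 heaviest tools" ranking can be approximated by measuring each tool's serialized size on its own. The following is a minimal sketch assuming compact `JSON.stringify` output and rounded chars / 4; `heaviestTools` and the sample catalog are hypothetical and not taken from the script.

```javascript
// Hypothetical helper: rank tools by serialized size, largest first.
function heaviestTools(tools, limit = 5) {
  return tools
    .map((tool) => ({ name: tool.name, chars: JSON.stringify(tool).length }))
    .sort((a, b) => b.chars - a.chars)
    .slice(0, limit)
    .map((t) => ({ ...t, tokens: Math.round(t.chars / 4) })); // chars / 4 heuristic
}

// Made-up catalog entries purely for illustration.
const catalog = [
  { name: 'tiny_tool', description: 'Short.' },
  { name: 'verbose_tool', description: 'A much longer description. '.repeat(10) },
  { name: 'medium_tool', description: 'A moderately sized description.' },
];

console.table(heaviestTools(catalog, 2));
```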

@mrdailey99 mrdailey99 merged commit f639754 into develop May 15, 2026
8 checks passed
@mrdailey99 mrdailey99 deleted the chore/token-measure-script branch May 15, 2026 15:47