docs/PROVAR_TOOL_GUIDE.md (24 changes: 18 additions & 6 deletions)

@@ -77,15 +77,27 @@ provar_properties_set { file_path: "<output_path>", key: "connectionName", valu

## "I want to write a new test"

A Provar test case is a tree (scenarios → UI screens → asserts), not a flat list of steps. The agent that calls `provar_testcase_generate` is responsible for constructing the full tree in **one** call (see the payload sketch at the end of this section). Splitting authoring across many tool calls causes scenario numbering drift, flat asserts, and inconsistent step types: `provar_testcase_step_edit` is for **amending** an existing test case, not for **constructing** one.

Recommended sequence:

```
-1. provar_project_inspect { project_path }                    ← find coverage gaps first
-2. provar_testcase_generate { project_path, name, ... }
-3. provar_testcase_step_edit { test_case_path, ... }          ← repeat per step
-4. provar_testcase_validate { file_path }                     ← must pass before adding to plan
-5. provar_testplan_add-instance { project_path, plan_name, test_case_path }
-6. provar_testplan_validate { project_path, plan_name }
+1. provar_project_inspect { project_path }                    ← find coverage gaps first
+2. provar_qualityhub_examples_retrieve { object_or_scenario } ← ground in corpus examples for the step types you need
+3. provar_testcase_generate { test_case_name, steps: [<ALL steps>] } ← single call, full step tree in one payload
+4. provar_testcase_validate { file_path }                     ← must pass before adding to plan
+5. provar_testplan_add-instance { project_path, plan_name, test_case_path }
+6. provar_testplan_validate { project_path, plan_name }
```

Use `provar_testcase_step_edit` only when:

- Adding a single step to an existing, already-validated test case
- Fixing a step's attributes after a validation finding
- Targeted edits during debugging

Do **not** use `provar_testcase_step_edit` to construct a test case step-by-step from an empty skeleton — the LLM loses scenario context between calls and the resulting structure is unreliable.
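
As a concrete shape for step 3 of the sequence, here is a condensed single-call payload. This is a sketch, not a schema reference: the field names (`test_case_name`, `steps[].api_id`, `steps[].name`, `steps[].attributes`, `dry_run`, `overwrite`) match the trace script added in this change (`scripts/pdx-481-trace.cjs`), and the step list is abbreviated.

```
provar_testcase_generate {
  test_case_name: "AccountFlow",
  steps: [
    { api_id: "UiConnect",    name: "Salesforce Connect: AdminOauth", attributes: {} },
    { api_id: "UiNavigate",   name: "Scenario 1 - When: navigate to Account home", attributes: {} },
    { api_id: "SetValues",    name: "Scenario 1 - When: fill form",
      attributes: { Name: "{AccountName}", Phone: "{AccountPhone}" } },
    { api_id: "UiDoAction",   name: "Scenario 1 - When: click Save", attributes: {} },
    { api_id: "AssertValues", name: "Scenario 2 - Then: assert Name on list",
      attributes: { expectedValue: "{AccountName}", actualValue: "Name", comparisonType: "EqualTo" } }
    // ...every remaining step for scenarios 2 and 3 belongs in this same array
  ],
  dry_run: true,    // preview and validate without writing
  overwrite: false
}
```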

---

## "I want to work with Salesforce metadata"
docs/mcp-pilot-guide.md (34 changes: 34 additions & 0 deletions)

@@ -439,6 +439,40 @@ NitroX is Provar's Hybrid Model for locators — it maps Salesforce component-ba

---

### Scenario 12: Construct a Multi-Scenario Test Case in a Single Call

**Goal:** Confirm the AI authors a multi-scenario test case by passing the full step tree to `provar_testcase_generate` in **one** call — not by generating an empty skeleton and looping `provar_testcase_step_edit` per step.

**Background:** A regression in 1.5.0 (PDX-479) was traced to authoring guidance that steered LLMs toward a per-step construction pattern. Multi-call construction drops scenario numbers (e.g. Scenario 1 → Scenario 3, no Scenario 2), flattens asserts that should be nested inside `UiWithScreen` clauses, and produces inconsistent assert API IDs across the case. This scenario exists so that the regression class is exercised in pilot evaluation and cannot recur silently.

**Prompt:**

> "Create a Provar test case `AccountFlow.testcase` that covers three scenarios:
>
> 1. **Create Account** — navigate to the Account home, click New, set Name = `{AccountName}` and Phone = `{AccountPhone}`, click Save
> 2. **Verify Account on List** — navigate back to the Account list view, assert the Name and Phone values
> 3. **Open Account Detail** — open the just-created Account, assert all saved field values
>
> Use UI On Screen wrappers, AssertValues for value assertions, and reference SetValues variables with `{Name}`. Write to `<project-path>/tests/AccountFlow.testcase`."

**What to look for (PASS):**

- Exactly **one** call to `provar_testcase_generate` with a populated `steps[]` array — not a call with `steps: []` followed by N `step_edit` calls
- The generated XML lists three scenarios numbered consecutively (1, 2, 3 — no skipped numbers)
- Each scenario's UI actions and asserts are nested inside the appropriate `UiWithScreen` clause (or its equivalent grouping element) — not flat siblings under `<steps>`
- Assert step types are consistent across the case (e.g. all `AssertValues`, not mixed `AssertValues` + `UiAssert` for the same purpose)
- `provar_testcase_validate` on the result returns `is_valid: true`

**What to look for (FAIL — regression indicator):**

- Two or more calls to `provar_testcase_generate` for the same file
- A call to `provar_testcase_generate` with `steps: []` followed by `provar_testcase_step_edit` calls
- The generated case skips a scenario number, mixes assert API IDs for similar assertions, or emits asserts as flat siblings rather than nested inside the screen wrapper

If any FAIL indicator appears, file against PDX-479 (or its successor) with the prompt and the generated XML attached.
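
For reviewers who want the skipped-scenario indicator checked mechanically rather than by eye, here is a minimal Node sketch. It assumes only the `Scenario N` step-naming convention used in the prompt above; the file path argument is whatever the AI wrote.

```
// scan-scenarios.cjs — flag gaps in "Scenario N" markers of a generated test case.
const fs = require('fs');

const xml = fs.readFileSync(process.argv[2], 'utf-8');

// Collect the distinct scenario numbers mentioned in step names, sorted ascending.
const nums = [...new Set([...xml.matchAll(/Scenario (\d+)/g)].map((m) => Number(m[1])))].sort((a, b) => a - b);

// A healthy case numbers its scenarios 1..N with no holes.
const gaps = Array.from({ length: nums.length }, (_, i) => i + 1).filter((n) => !nums.includes(n));

console.log(gaps.length ? `FAIL: missing scenario number(s) ${gaps.join(', ')}` : `PASS: scenarios ${nums.join(', ')}`);
```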

---

## Security Model

### What the server does
docs/mcp.md (4 changes: 4 additions & 0 deletions)

@@ -703,6 +703,8 @@ Validates a Java Page Object source file against 30+ quality rules (structural c

Generates an XML test case skeleton with UUID v4 guids and sequential `testItemId` values.

> **Construction pattern (read first).** Pass the FULL step tree for the test case in a single call via the `steps[]` array. Do **not** call this tool with `steps: []` and then append steps via repeated `provar_testcase_step_edit` calls — that pattern drops scenarios, flattens nesting, and produces inconsistent step types. `provar_testcase_step_edit` is for **amending** an already-validated test case (single-step add, attribute fix, debug edit), not for **constructing** one from scratch.
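
One flow the single-call contract supports is dry-run-first generation, sketched here as pseudocode. The response field names (`step_count`, `written`, `validation.is_valid`, `xml_content`) are the ones exercised by `scripts/pdx-481-trace.cjs` in this change; the wrapper syntax is illustrative.

```
// Sketch: preview and validate first, then persist the identical payload.
const preview = provar_testcase_generate({ test_case_name, steps, dry_run: true, overwrite: false });
// preview.written === false        (nothing touched disk yet)
// preview.step_count               (steps emitted from the payload)
// preview.validation.is_valid      (structural verdict on preview.xml_content)
if (preview.validation.is_valid) {
  provar_testcase_generate({ test_case_name, steps, dry_run: false, overwrite: false });
}
```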

**Generated `<testCase>` element structure (Provar requirements):**

```xml
…
```

@@ -1545,6 +1547,8 @@ Salesforce DML error categories (`SALESFORCE_*`) represent test-data failures

Atomically add or remove a single step (`<apiCall>`) in a Provar XML test case file. Writes a `.bak` backup before mutating, runs structural validation after the edit, and automatically restores the backup if validation fails.

> **When to use.** This tool is for **amending** an existing, already-validated test case (single-step add, attribute fix, debug edit). It is **not** for constructing a test case from scratch by calling it repeatedly after a `steps: []` `provar_testcase_generate`. Building a case step-by-step via repeated `step_edit` calls produces structurally invalid test cases (dropped scenarios, flat asserts, inconsistent step types). For new test cases, pass the full step tree to `provar_testcase_generate` in a single call.
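
A sketch of the amend-only contract, for orientation only. `test_case_path` appears in the tool guide; `operation` and `step` are illustrative placeholder names, not confirmed inputs, so consult the input table below for the actual schema.

```
provar_testcase_step_edit {
  test_case_path: "<project-path>/tests/AccountFlow.testcase",
  operation: "add",                  // illustrative: a single-step amendment
  step: {
    api_id: "AssertValues",
    name: "Scenario 3 - Then: assert Phone on detail",
    attributes: { expectedValue: "{AccountPhone}", actualValue: "Phone", comparisonType: "EqualTo" }
  }
}
// Writes a .bak first; if post-edit validation fails, the backup is restored.
```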

Prerequisites: the test case file must exist and be valid XML with a `<testCase><steps>` structure.

| Input | Type | Required | Description |
| --- | --- | --- | --- |
| … | … | … | … |
scripts/pdx-481-trace.cjs (new file: 251 additions & 0 deletions)

@@ -0,0 +1,251 @@
// PDX-481 prompt-flow trace.
//
// Drives the patched MCP server over JSON-RPC stdio and captures the EXACT
// bytes that an MCP client (Claude Desktop / Cursor / etc.) would surface to
// its LLM at every decision point in the test-authoring flow:
//
// 1. The orchestration prompt the LLM reads when planning ("I want to author a new test case")
// 2. The tool-guide resource the LLM reads when picking the right tool
// 3. The provar_testcase_generate tool description the LLM reads at the call site
// 4. The provar_testcase_step_edit tool description (amend-only contract)
// 5. The actual XML the tool emits when given a real multi-scenario payload
//
// Run from the worktree root after `yarn compile`:
// node scripts/pdx-481-trace.cjs

'use strict';

const { spawn } = require('child_process');
const os = require('os');
const path = require('path');

const TMP = os.tmpdir();
const entry = path.resolve(__dirname, '..', 'bin', 'mcp-start.js');

const server = spawn(process.execPath, [entry, 'mcp', 'start', '--allowed-paths', TMP, '--no-update-check'], {
  stdio: ['pipe', 'pipe', 'inherit'],
});

let nextId = 1;
const pending = new Map();
let buf = '';

server.stdout.on('data', (chunk) => {
  buf += chunk.toString('utf-8');
  let nl;
  while ((nl = buf.indexOf('\n')) !== -1) {
    const line = buf.slice(0, nl).trim();
    buf = buf.slice(nl + 1);
    if (!line) continue;
    try {
      const msg = JSON.parse(line);
      const cb = pending.get(msg.id);
      if (cb) {
        pending.delete(msg.id);
        cb(msg);
      }
    } catch {
      /* ignore non-JSON */
    }
  }
});

function rpc(method, params) {
  const id = nextId++;
  const req = JSON.stringify({ jsonrpc: '2.0', id, method, params }) + '\n';
  return new Promise((resolve, reject) => {
    pending.set(id, resolve);
    setTimeout(() => {
      if (pending.has(id)) {
        pending.delete(id);
        reject(new Error(`Timeout waiting for ${method}`));
      }
    }, 10000);
    server.stdin.write(req);
  });
}

function divider(label) {
  console.log('\n' + '═'.repeat(78));
  console.log(' ' + label);
  console.log('═'.repeat(78));
}

function subdivider(label) {
  console.log('\n' + '─'.repeat(78));
  console.log(' ' + label);
  console.log('─'.repeat(78));
}

function indent(text, prefix = ' ') {
  return text
    .split('\n')
    .map((l) => prefix + l)
    .join('\n');
}

// Slice out one markdown section: from the header matched by headerRegex up
// to (but not including) the next header matched by nextHeaderRegex.
function extractSection(text, headerRegex, nextHeaderRegex) {
  const startMatch = headerRegex.exec(text);
  if (!startMatch) return '<section not found>';
  const tail = text.slice(startMatch.index);
  // Skip past the matched header before searching for the next one, using the
  // actual matched length (regex source length is only an approximation).
  const headerLen = startMatch[0].length;
  const endMatch = nextHeaderRegex.exec(tail.slice(headerLen));
  return endMatch ? tail.slice(0, endMatch.index + headerLen) : tail;
}

(async () => {
  await rpc('initialize', {
    protocolVersion: '2024-11-05',
    capabilities: {},
    clientInfo: { name: 'pdx-481-trace', version: '1.0.0' },
  });

  // ── 1. The orchestration prompt's author-test flow ────────────────────────
  divider('TRACE 1 — what the LLM reads when "planning a test-case authoring task"');
  console.log('Tool call simulated: prompts/get(provar.guide.orchestration, task=author-test)');
  console.log('This is what an MCP client surfaces to the LLM as the planning brief.\n');

  const orch = await rpc('prompts/get', {
    name: 'provar.guide.orchestration',
    arguments: { task: 'author-test' },
  });
  const orchText = orch.result?.messages?.[0]?.content?.text ?? '<empty>';
  console.log(indent(orchText));

  // ── 2. The tool-guide resource ────────────────────────────────────────────
  divider('TRACE 2 — what the LLM reads when "picking the right tool to author a test"');
  console.log('Tool call simulated: resources/read(provar://docs/tool-guide)');
  console.log('Excerpting the "I want to write a new test" section only.\n');

  const guide = await rpc('resources/read', { uri: 'provar://docs/tool-guide' });
  const guideText = guide.result?.contents?.[0]?.text ?? '<empty>';
  const section = extractSection(guideText, /## "I want to write a new test"/, /\n## "/);
  console.log(indent(section));

  // ── 3. The provar_testcase_generate tool description ──────────────────────
  divider('TRACE 3 — what the LLM reads at the call site of provar_testcase_generate');
  console.log('Tool call simulated: tools/list (filtered to provar_testcase_generate)');
  console.log('First 1000 chars of the description string surfaced to the model.\n');

  const tools = await rpc('tools/list', {});
  const toolList = tools.result?.tools ?? [];
  const gen = toolList.find((t) => t.name === 'provar_testcase_generate');
  console.log(
    indent(
      (gen?.description ?? '<not found>').slice(0, 1000) + (gen?.description?.length > 1000 ? '… (truncated)' : '')
    )
  );

  subdivider('steps[] field description (read by the LLM when filling the argument)');
  const stepsField = gen?.inputSchema?.properties?.steps;
  console.log(indent(stepsField?.description ?? '<no field description>'));

  // ── 4. The provar_testcase_step_edit tool description ─────────────────────
  divider('TRACE 4 — what the LLM reads at the call site of provar_testcase_step_edit');
  console.log('Tool call simulated: tools/list (filtered to provar_testcase_step_edit)\n');

  const edit = toolList.find((t) => t.name === 'provar_testcase_step_edit');
  console.log(
    indent(
      (edit?.description ?? '<not found>').slice(0, 1000) + (edit?.description?.length > 1000 ? '… (truncated)' : '')
    )
  );

  // ── 5. Real tool call — multi-scenario single-call generate ───────────────
  divider('TRACE 5 — real tool call: provar_testcase_generate with a 3-scenario payload');
  console.log("Tool call simulated: an LLM that follows TRACE 1-3's guidance constructs");
  console.log('the full step tree and passes it in ONE call. We capture the output:\n');

  const callResult = await rpc('tools/call', {
    name: 'provar_testcase_generate',
    arguments: {
      // eslint-disable-next-line camelcase
      test_case_name: 'AccountFlow',
      steps: [
        // Scenario 1 — Create Account
        { api_id: 'UiConnect', name: 'Salesforce Connect: AdminOauth', attributes: {} },
        {
          api_id: 'SetValues',
          name: 'Set Account Test Data',
          attributes: { AccountName: 'Acme', AccountPhone: '555-0100' },
        },
        { api_id: 'UiNavigate', name: 'Scenario 1 - When: navigate to Account home', attributes: {} },
        { api_id: 'UiDoAction', name: 'Scenario 1 - When: click New', attributes: {} },
        {
          api_id: 'SetValues',
          name: 'Scenario 1 - When: fill form',
          attributes: { Name: '{AccountName}', Phone: '{AccountPhone}' },
        },
        { api_id: 'UiDoAction', name: 'Scenario 1 - When: click Save', attributes: {} },
        // Scenario 2 — Verify on list view (the scenario that went missing on 1.5.0)
        { api_id: 'UiNavigate', name: 'Scenario 2 - Then: go to Account list', attributes: {} },
        {
          api_id: 'AssertValues',
          name: 'Scenario 2 - Then: assert Name on list',
          attributes: { expectedValue: '{AccountName}', actualValue: 'Name', comparisonType: 'EqualTo' },
        },
        {
          api_id: 'AssertValues',
          name: 'Scenario 2 - Then: assert Phone on list',
          attributes: { expectedValue: '{AccountPhone}', actualValue: 'Phone', comparisonType: 'EqualTo' },
        },
        // Scenario 3 — Open detail and assert all
        { api_id: 'UiDoAction', name: 'Scenario 3 - When: open Account detail', attributes: {} },
        {
          api_id: 'AssertValues',
          name: 'Scenario 3 - Then: assert Name on detail',
          attributes: { expectedValue: '{AccountName}', actualValue: 'Name', comparisonType: 'EqualTo' },
        },
        {
          api_id: 'AssertValues',
          name: 'Scenario 3 - Then: assert Phone on detail',
          attributes: { expectedValue: '{AccountPhone}', actualValue: 'Phone', comparisonType: 'EqualTo' },
        },
      ],
      dry_run: true,
      overwrite: false,
    },
  });

  const content = callResult.result?.content?.[0]?.text ?? '{}';
  const body = JSON.parse(content);

  subdivider('Tool response — top-level fields');
  console.log(indent(`step_count: ${body.step_count}`));
  console.log(indent(`written: ${body.written}`));
  console.log(indent(`is_valid: ${body.validation?.is_valid}`));
  console.log(indent(`validity: ${body.validation?.validity_score}`));
  console.log(indent(`quality: ${body.validation?.quality_score}`));
  console.log(indent(`errors: ${body.validation?.error_count}`));

  subdivider('Generated XML — assertions a reviewer can run by eye');
  const xml = body.xml_content;

  const checks = [
    [
      'Sequential testItemIds 1..12, no gaps',
      [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12].every((n) => xml.includes(`testItemId="${n}"`)),
    ],
    ['No spurious testItemId="13"', !xml.includes('testItemId="13"')],
    ['Scenario 1 - When marker present', xml.includes('Scenario 1 - When: navigate to Account home')],
    ['Scenario 2 - Then marker present (the one 1.5.0 dropped)', xml.includes('Scenario 2 - Then: go to Account list')],
    ['Scenario 3 - When marker present', xml.includes('Scenario 3 - When: open Account detail')],
    ['All 4 AssertValues steps emitted', (xml.match(/AssertValues/g) ?? []).length >= 4],
    ['No silent UiAssert substitution', !xml.includes('com.provar.plugins.forcedotcom.core.ui.UiAssert')],
    ['{VarName} placeholders emit class="variable"', xml.includes('class="variable"')],
  ];
  for (const [label, ok] of checks) {
    console.log(indent(`${ok ? '✅' : '❌'} ${label}`));
  }

  subdivider('Raw XML — first 80 lines of what the LLM gets back');
  const xmlLines = xml.split('\n').slice(0, 80);
  console.log(indent(xmlLines.join('\n')));

  server.stdin.end();
  process.exit(0);
})().catch((err) => {
  console.error('trace error:', err);
  server.kill();
  process.exit(1);
});