diff --git a/docs/PROVAR_TOOL_GUIDE.md b/docs/PROVAR_TOOL_GUIDE.md index 651e1bc7..1f62a823 100644 --- a/docs/PROVAR_TOOL_GUIDE.md +++ b/docs/PROVAR_TOOL_GUIDE.md @@ -77,15 +77,27 @@ provar_properties_set { file_path: "", key: "connectionName", valu ## "I want to write a new test" +A Provar test case is a tree (scenarios → UI screens → asserts), not a flat list of steps. The agent that calls `provar_testcase_generate` is responsible for constructing the full tree in **one** call. Splitting authoring across many tool calls causes scenario numbering drift, flat asserts, and inconsistent step types — `provar_testcase_step_edit` is for **amending** an existing test case, not for **constructing** one. + +Recommended sequence: + ``` -1. provar_project_inspect { project_path } ← find coverage gaps first -2. provar_testcase_generate { project_path, name, ... } -3. provar_testcase_step_edit { test_case_path, ... } ← repeat per step -4. provar_testcase_validate { file_path } ← must pass before adding to plan -5. provar_testplan_add-instance { project_path, plan_name, test_case_path } -6. provar_testplan_validate { project_path, plan_name } +1. provar_project_inspect { project_path } ← find coverage gaps first +2. provar_qualityhub_examples_retrieve { object_or_scenario } ← ground in corpus examples for the step types you need +3. provar_testcase_generate { test_case_name, steps: [] } ← single call, full step tree in one payload +4. provar_testcase_validate { file_path } ← must pass before adding to plan +5. provar_testplan_add-instance { project_path, plan_name, test_case_path } +6. 
provar_testplan_validate { project_path, plan_name } ``` +Use `provar_testcase_step_edit` only when: + +- Adding a single step to an existing, already-validated test case +- Fixing a step's attributes after a validation finding +- Making targeted edits during debugging + +Do **not** use `provar_testcase_step_edit` to construct a test case step-by-step from an empty skeleton — the LLM loses scenario context between calls and the resulting structure is unreliable. + --- ## "I want to work with Salesforce metadata" diff --git a/docs/mcp-pilot-guide.md b/docs/mcp-pilot-guide.md index c5d2085b..9fbaa55b 100644 --- a/docs/mcp-pilot-guide.md +++ b/docs/mcp-pilot-guide.md @@ -439,6 +439,40 @@ NitroX is Provar's Hybrid Model for locators — it maps Salesforce component-ba --- +### Scenario 12: Construct a Multi-Scenario Test Case in a Single Call + +**Goal:** Confirm the AI authors a multi-scenario test case by passing the full step tree to `provar_testcase_generate` in **one** call — not by generating an empty skeleton and looping `provar_testcase_step_edit` per step. + +**Background:** A regression in 1.5.0 (PDX-479) was traced to authoring guidance that steered LLMs toward a per-step construction pattern. Multi-call construction drops scenario numbers (e.g. Scenario 1 → Scenario 3, no Scenario 2), flattens asserts that should be nested inside `UiWithScreen` clauses, and produces inconsistent assert API IDs across the case. This scenario exists so the regression class is exercised in pilot evaluation and cannot recur silently. + +**Prompt:** + +> "Create a Provar test case `AccountFlow.testcase` that covers three scenarios: +> +> 1. **Create Account** — navigate to the Account home, click New, set Name = `{AccountName}` and Phone = `{AccountPhone}`, click Save +> 2. **Verify Account on List** — navigate back to the Account list view, assert the Name and Phone values +> 3. 
**Open Account Detail** — open the just-created Account, assert all saved field values +> +> Use UI On Screen wrappers, AssertValues for value assertions, and reference SetValues variables with `{Name}`. Write to `/tests/AccountFlow.testcase`." + +**What to look for (PASS):** + +- Exactly **one** call to `provar_testcase_generate` with a populated `steps[]` array — not a call with `steps: []` followed by N `step_edit` calls +- The generated XML lists three scenarios numbered consecutively (1, 2, 3 — no skipped numbers) +- Each scenario's UI actions and asserts are nested inside the appropriate `UiWithScreen` clause (or its equivalent grouping element) — not flat siblings under `` +- Assert step types are consistent across the case (e.g. all `AssertValues`, not mixed `AssertValues` + `UiAssert` for the same purpose) +- `provar_testcase_validate` on the result returns `is_valid: true` + +**What to look for (FAIL — regression indicator):** + +- Two or more calls to `provar_testcase_generate` for the same file +- A call to `provar_testcase_generate` with `steps: []` followed by `provar_testcase_step_edit` calls +- The generated case skips a scenario number, mixes assert API IDs for similar assertions, or emits asserts as flat siblings rather than nested inside the screen wrapper + +If any FAIL indicator appears, file against PDX-479 (or its successor) with the prompt and the generated XML attached. + +--- + ## Security Model ### What the server does diff --git a/docs/mcp.md b/docs/mcp.md index 07d9b3e0..7a08fed2 100644 --- a/docs/mcp.md +++ b/docs/mcp.md @@ -703,6 +703,8 @@ Validates a Java Page Object source file against 30+ quality rules (structural c Generates an XML test case skeleton with UUID v4 guids and sequential `testItemId` values. +> **Construction pattern (read first).** Pass the FULL step tree for the test case in a single call via the `steps[]` array. 
Do **not** call this tool with `steps: []` and then append steps via repeated `provar_testcase_step_edit` calls — that pattern drops scenarios, flattens nesting, and produces inconsistent step types. `provar_testcase_step_edit` is for **amending** an already-validated test case (single-step add, attribute fix, debug edit), not for **constructing** one from scratch. + **Generated `` element structure (Provar requirements):** ```xml @@ -1545,6 +1547,8 @@ Salesforce DML error categories (`SALESFORCE_*`) represent test-data failures Atomically add or remove a single step (``) in a Provar XML test case file. Writes a `.bak` backup before mutating, runs structural validation after the edit, and automatically restores the backup if validation fails. +> **When to use.** This tool is for **amending** an existing, already-validated test case (single-step add, attribute fix, debug edit). It is **not** for constructing a test case from scratch by calling it repeatedly after a `steps: []` `provar_testcase_generate`. Building a case step-by-step via repeated `step_edit` calls produces structurally invalid test cases (dropped scenarios, flat asserts, inconsistent step types). For new test cases, pass the full step tree to `provar_testcase_generate` in a single call. + Prerequisites: the test case file must exist and be valid XML with a `` structure. | Input | Type | Required | Description | diff --git a/scripts/pdx-481-trace.cjs b/scripts/pdx-481-trace.cjs new file mode 100644 index 00000000..8e5296aa --- /dev/null +++ b/scripts/pdx-481-trace.cjs @@ -0,0 +1,251 @@ +// PDX-481 prompt-flow trace. +// +// Drives the patched MCP server over JSON-RPC stdio and captures the EXACT +// bytes that an MCP client (Claude Desktop / Cursor / etc.) would surface to +// its LLM at every decision point in the test-authoring flow: +// +// 1. The orchestration prompt the LLM reads when planning ("I want to author a new test case") +// 2. 
The tool-guide resource the LLM reads when picking the right tool +// 3. The provar_testcase_generate tool description the LLM reads at the call site +// 4. The provar_testcase_step_edit tool description (amend-only contract) +// 5. The actual XML the tool emits when given a real multi-scenario payload +// +// Run from the worktree root after `yarn compile`: +// node scripts/pdx-481-trace.cjs + +'use strict'; + +const { spawn } = require('child_process'); +const os = require('os'); +const path = require('path'); + +const TMP = os.tmpdir(); +const entry = path.resolve(__dirname, '..', 'bin', 'mcp-start.js'); + +const server = spawn(process.execPath, [entry, 'mcp', 'start', '--allowed-paths', TMP, '--no-update-check'], { + stdio: ['pipe', 'pipe', 'inherit'], +}); + +let nextId = 1; +const pending = new Map(); +let buf = ''; + +server.stdout.on('data', (chunk) => { + buf += chunk.toString('utf-8'); + let nl; + while ((nl = buf.indexOf('\n')) !== -1) { + const line = buf.slice(0, nl).trim(); + buf = buf.slice(nl + 1); + if (!line) continue; + try { + const msg = JSON.parse(line); + const cb = pending.get(msg.id); + if (cb) { + pending.delete(msg.id); + cb(msg); + } + } catch { + /* ignore non-JSON */ + } + } +}); + +function rpc(method, params) { + const id = nextId++; + const req = JSON.stringify({ jsonrpc: '2.0', id, method, params }) + '\n'; + return new Promise((resolve, reject) => { + pending.set(id, resolve); + setTimeout(() => { + if (pending.has(id)) { + pending.delete(id); + reject(new Error(`Timeout waiting for ${method}`)); + } + }, 10000); + server.stdin.write(req); + }); +} + +function divider(label) { + console.log('\n' + '═'.repeat(78)); + console.log(' ' + label); + console.log('═'.repeat(78)); +} + +function subdivider(label) { + console.log('\n' + '─'.repeat(78)); + console.log(' ' + label); + console.log('─'.repeat(78)); +} + +function indent(text, prefix = ' ') { + return text + .split('\n') + .map((l) => prefix + l) + .join('\n'); +} + +function 
extractSection(text, headerRegex, nextHeaderRegex) { + const startMatch = headerRegex.exec(text); + if (!startMatch) return '
'; + const start = startMatch.index; + const tail = text.slice(start); + const endMatch = nextHeaderRegex.exec(tail.slice(startMatch[0].length)); + return endMatch ? tail.slice(0, endMatch.index + startMatch[0].length) : tail; +} + +(async () => { + await rpc('initialize', { + protocolVersion: '2024-11-05', + capabilities: {}, + clientInfo: { name: 'pdx-481-trace', version: '1.0.0' }, + }); + + // ── 1. The orchestration prompt's author-test flow ──────────────────────── + divider('TRACE 1 — what the LLM reads when "planning a test-case authoring task"'); + console.log('Tool call simulated: prompts/get(provar.guide.orchestration, task=author-test)'); + console.log('This is what an MCP client surfaces to the LLM as the planning brief.\n'); + + const orch = await rpc('prompts/get', { + name: 'provar.guide.orchestration', + arguments: { task: 'author-test' }, + }); + const orchText = orch.result?.messages?.[0]?.content?.text ?? ''; + console.log(indent(orchText)); + + // ── 2. The tool-guide resource ──────────────────────────────────────────── + divider('TRACE 2 — what the LLM reads when "picking the right tool to author a test"'); + console.log('Tool call simulated: resources/read(provar://docs/tool-guide)'); + console.log('Excerpting the "I want to write a new test" section only.\n'); + + const guide = await rpc('resources/read', { uri: 'provar://docs/tool-guide' }); + const guideText = guide.result?.contents?.[0]?.text ?? ''; + const section = extractSection(guideText, /## "I want to write a new test"/, /\n## "/); + console.log(indent(section)); + + // ── 3. 
The provar_testcase_generate tool description ────────────────────── + divider('TRACE 3 — what the LLM reads at the call site of provar_testcase_generate'); + console.log('Tool call simulated: tools/list (filtered to provar_testcase_generate)'); + console.log('First 1000 chars of the description string surfaced to the model.\n'); + + const tools = await rpc('tools/list', {}); + const toolList = tools.result?.tools ?? []; + const gen = toolList.find((t) => t.name === 'provar_testcase_generate'); + console.log( + indent( + (gen?.description ?? '').slice(0, 1000) + (gen?.description?.length > 1000 ? '… (truncated)' : '') + ) + ); + + subdivider('steps[] field description (read by the LLM when filling the argument)'); + const stepsField = gen?.inputSchema?.properties?.steps; + console.log(indent(stepsField?.description ?? '')); + + // ── 4. The provar_testcase_step_edit tool description ───────────────────── + divider('TRACE 4 — what the LLM reads at the call site of provar_testcase_step_edit'); + console.log('Tool call simulated: tools/list (filtered to provar_testcase_step_edit)\n'); + + const edit = toolList.find((t) => t.name === 'provar_testcase_step_edit'); + console.log( + indent( + (edit?.description ?? '').slice(0, 1000) + (edit?.description?.length > 1000 ? '… (truncated)' : '') + ) + ); + + // ── 5. Real tool call — multi-scenario single-call generate ─────────────── + divider('TRACE 5 — real tool call: provar_testcase_generate with a 3-scenario payload'); + console.log("Tool call simulated: an LLM that follows TRACE 1-3's guidance constructs"); + console.log('the full step tree and passes it in ONE call. 
We capture the output:\n'); + + const callResult = await rpc('tools/call', { + name: 'provar_testcase_generate', + arguments: { + // eslint-disable-next-line camelcase + test_case_name: 'AccountFlow', + steps: [ + // Scenario 1 — Create Account + { api_id: 'UiConnect', name: 'Salesforce Connect: AdminOauth', attributes: {} }, + { + api_id: 'SetValues', + name: 'Set Account Test Data', + attributes: { AccountName: 'Acme', AccountPhone: '555-0100' }, + }, + { api_id: 'UiNavigate', name: 'Scenario 1 - When: navigate to Account home', attributes: {} }, + { api_id: 'UiDoAction', name: 'Scenario 1 - When: click New', attributes: {} }, + { + api_id: 'SetValues', + name: 'Scenario 1 - When: fill form', + attributes: { Name: '{AccountName}', Phone: '{AccountPhone}' }, + }, + { api_id: 'UiDoAction', name: 'Scenario 1 - When: click Save', attributes: {} }, + // Scenario 2 — Verify on list view (the scenario that went missing on 1.5.0) + { api_id: 'UiNavigate', name: 'Scenario 2 - Then: go to Account list', attributes: {} }, + { + api_id: 'AssertValues', + name: 'Scenario 2 - Then: assert Name on list', + attributes: { expectedValue: '{AccountName}', actualValue: 'Name', comparisonType: 'EqualTo' }, + }, + { + api_id: 'AssertValues', + name: 'Scenario 2 - Then: assert Phone on list', + attributes: { expectedValue: '{AccountPhone}', actualValue: 'Phone', comparisonType: 'EqualTo' }, + }, + // Scenario 3 — Open detail and assert all + { api_id: 'UiDoAction', name: 'Scenario 3 - When: open Account detail', attributes: {} }, + { + api_id: 'AssertValues', + name: 'Scenario 3 - Then: assert Name on detail', + attributes: { expectedValue: '{AccountName}', actualValue: 'Name', comparisonType: 'EqualTo' }, + }, + { + api_id: 'AssertValues', + name: 'Scenario 3 - Then: assert Phone on detail', + attributes: { expectedValue: '{AccountPhone}', actualValue: 'Phone', comparisonType: 'EqualTo' }, + }, + ], + dry_run: true, + overwrite: false, + }, + }); + + const content = 
callResult.result?.content?.[0]?.text ?? '{}'; + const body = JSON.parse(content); + + subdivider('Tool response — top-level fields'); + console.log(indent(`step_count: ${body.step_count}`)); + console.log(indent(`written: ${body.written}`)); + console.log(indent(`is_valid: ${body.validation?.is_valid}`)); + console.log(indent(`validity: ${body.validation?.validity_score}`)); + console.log(indent(`quality: ${body.validation?.quality_score}`)); + console.log(indent(`errors: ${body.validation?.error_count}`)); + + subdivider('Generated XML — assertions a reviewer can run by eye'); + const xml = body.xml_content; + + const checks = [ + [ + 'Sequential testItemIds 1..12, no gaps', + [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12].every((n) => xml.includes(`testItemId="${n}"`)), + ], + ['No spurious testItemId="13"', !xml.includes('testItemId="13"')], + ['Scenario 1 - When marker present', xml.includes('Scenario 1 - When: navigate to Account home')], + ['Scenario 2 - Then marker present (the one 1.5.0 dropped)', xml.includes('Scenario 2 - Then: go to Account list')], + ['Scenario 3 - When marker present', xml.includes('Scenario 3 - When: open Account detail')], + ['All 4 AssertValues steps emitted', (xml.match(/AssertValues/g) ?? []).length >= 4], + ['No silent UiAssert substitution', !xml.includes('com.provar.plugins.forcedotcom.core.ui.UiAssert')], + ['{VarName} placeholders emit class="variable"', xml.includes('class="variable"')], + ]; + for (const [label, ok] of checks) { + console.log(indent(`${ok ? 
'✅' : '❌'} ${label}`)); + } + + subdivider('Raw XML — first 80 lines of what the LLM gets back'); + const xmlLines = xml.split('\n').slice(0, 80); + console.log(indent(xmlLines.join('\n'))); + + server.stdin.end(); + process.exit(0); +})().catch((err) => { + console.error('trace error:', err); + server.kill(); + process.exit(1); +}); diff --git a/scripts/pdx-481-validate.cjs b/scripts/pdx-481-validate.cjs new file mode 100644 index 00000000..98aa6f61 --- /dev/null +++ b/scripts/pdx-481-validate.cjs @@ -0,0 +1,157 @@ +// PDX-481: server-side validation that the rewritten author-test guidance is +// reachable and contains the canonical single-call construction copy. Runs +// without requiring sf CLI to be linked to the local plugin. +// +// yarn compile +// node scripts/pdx-481-validate.cjs + +'use strict'; + +const { spawn } = require('child_process'); +const os = require('os'); +const path = require('path'); + +const TMP = os.tmpdir(); +const entry = path.resolve(__dirname, '..', 'bin', 'mcp-start.js'); + +const server = spawn(process.execPath, [entry, 'mcp', 'start', '--allowed-paths', TMP, '--no-update-check'], { + stdio: ['pipe', 'pipe', 'inherit'], +}); + +let nextId = 1; +const pending = new Map(); +let buf = ''; + +server.stdout.on('data', (chunk) => { + buf += chunk.toString('utf-8'); + let nl; + while ((nl = buf.indexOf('\n')) !== -1) { + const line = buf.slice(0, nl).trim(); + buf = buf.slice(nl + 1); + if (!line) continue; + try { + const msg = JSON.parse(line); + const cb = pending.get(msg.id); + if (cb) { + pending.delete(msg.id); + cb(msg); + } + } catch { + /* ignore */ + } + } +}); + +function rpc(method, params) { + const id = nextId++; + const req = JSON.stringify({ jsonrpc: '2.0', id, method, params }) + '\n'; + return new Promise((resolve, reject) => { + pending.set(id, resolve); + setTimeout(() => { + if (pending.has(id)) { + pending.delete(id); + reject(new Error(`Timeout waiting for ${method}`)); + } + }, 5000); + server.stdin.write(req); + 
}); +} + +const results = []; +function record(label, ok, detail) { + results.push({ label, ok, detail }); +} + +(async () => { + await rpc('initialize', { + protocolVersion: '2024-11-05', + capabilities: {}, + clientInfo: { name: 'pdx-481-validate', version: '1.0.0' }, + }); + + // The orchestration prompt should still be registered (PDX-481 keeps it, + // unlike PDX-480 which disabled it). + const orch = await rpc('prompts/get', { + name: 'provar.guide.orchestration', + arguments: { task: 'author-test' }, + }); + const text = orch.result?.messages?.[0]?.content?.text ?? ''; + + record( + 'orchestration(author-test) is reachable', + text.length > 0, + text.length > 0 ? `received ${text.length} chars` : `no text returned` + ); + + // Canonical single-call construction copy + const mustInclude = ['single call', 'ALL steps', 'amend']; + for (const phrase of mustInclude) { + const present = text.includes(phrase); + record( + `author-test includes "${phrase}"`, + present, + present ? `present` : `MISSING — fix would not stop the regression` + ); + } + + // PDX-479 anti-patterns + const mustExclude = ['repeat per step']; + for (const phrase of mustExclude) { + const present = text.includes(phrase); + record(`author-test excludes "${phrase}"`, !present, present ? `STILL PRESENT — regression risk` : `removed`); + } + + // General orchestration flow's prerequisite graph + const general = await rpc('prompts/get', { + name: 'provar.guide.orchestration', + arguments: {}, + }); + const gtext = general.result?.messages?.[0]?.content?.text ?? ''; + record( + 'prerequisite graph splits generate and step_edit', + !gtext.includes('provar_testcase_generate OR provar_testcase_step_edit'), + gtext.includes('provar_testcase_generate OR provar_testcase_step_edit') + ? `STILL CONFLATED — fix incomplete` + : `split confirmed` + ); + + // Tool-guide resource should still serve content (PDX-481 keeps it). 
+ const guide = await rpc('resources/read', { uri: 'provar://docs/tool-guide' }); + const gcontent = guide.result?.contents?.[0]?.text ?? ''; + record( + 'tool-guide resource is reachable', + gcontent.length > 0, + gcontent.length > 0 ? `received ${gcontent.length} chars` : `not served` + ); + record( + 'tool-guide author-test section recommends single call', + gcontent.includes('single call') || gcontent.includes('one payload'), + gcontent.includes('single call') || gcontent.includes('one payload') + ? `recommended phrasing found` + : `MISSING canonical phrasing in resource` + ); + record( + 'tool-guide author-test section excludes "repeat per step"', + !gcontent.includes('repeat per step'), + gcontent.includes('repeat per step') ? `STILL PRESENT — regression risk` : `removed` + ); + + let pass = 0; + let fail = 0; + for (const r of results) { + console.log(`${r.ok ? '[PASS]' : '[FAIL]'} ${r.label} — ${r.detail}`); + if (r.ok) { + pass++; + } else { + fail++; + } + } + console.log(`\nPDX-481 validation: ${pass} passed, ${fail} failed`); + + server.stdin.end(); + process.exit(fail > 0 ? 1 : 0); +})().catch((err) => { + console.error('Validation script error:', err); + server.kill(); + process.exit(2); +}); diff --git a/src/mcp/prompts/guidePrompts.ts b/src/mcp/prompts/guidePrompts.ts index 855ac585..14fde43f 100644 --- a/src/mcp/prompts/guidePrompts.ts +++ b/src/mcp/prompts/guidePrompts.ts @@ -263,16 +263,23 @@ Required sequence — do not skip steps: 'author-test': `## Author a New Test Case -1. provar_project_inspect → find coverage gaps before writing -2. provar_automation_metadata_download → if SF metadata is stale (missing fields/objects) -3. provar_pageobject_generate → if a new page object is needed -4. provar_pageobject_validate → validate before compile -5. provar_automation_compile → after any page object change -6. provar_testcase_generate → create the test case file -7. provar_testcase_step_edit → add steps (repeat as needed) -8. 
provar_testcase_validate → MUST pass before adding to a plan -9. provar_testplan_add-instance → add to an existing plan -10. provar_testplan_validate → validate the plan`, +Construct the full step tree in a single \`provar_testcase_generate\` call. +\`provar_testcase_step_edit\` is for amending an existing case, not for +building one step-by-step (that pattern drops scenarios and flattens nesting). + +1. provar_project_inspect → find coverage gaps before writing +2. provar_qualityhub_examples_retrieve → ground in corpus examples for the step types you need +3. provar_automation_metadata_download → if SF metadata is stale (missing fields/objects) +4. provar_pageobject_generate → only if a new page object is needed +5. provar_pageobject_validate → validate before compile +6. provar_automation_compile → after any page object change +7. provar_testcase_generate → single call, pass ALL steps in one payload +8. provar_testcase_validate → MUST pass before adding to a plan +9. provar_testplan_add-instance → add to an existing plan +10. 
provar_testplan_validate → validate the plan + +Use provar_testcase_step_edit only to amend an existing validated test case +(single-step add, attribute fix, debug edit) — never to construct one from scratch.`, 'debug-failures': `## Debug Failing Tests @@ -319,11 +326,14 @@ provar_pageobject_validate provar_nitrox_generate OR provar_nitrox_patch └── provar_nitrox_validate (always validate after) -provar_testcase_generate OR provar_testcase_step_edit +provar_testcase_generate (construct full case — pass ALL steps in one call) └── provar_testcase_validate └── provar_testplan_add-instance └── provar_testplan_validate +provar_testcase_step_edit (amend an existing validated case only — never construct) + └── provar_testcase_validate + ### Safe to run in parallel (no dependency between them) - provar_project_inspect + provar_connection_list - provar_pageobject_validate on multiple files diff --git a/test/unit/mcp/guidePrompts.test.ts b/test/unit/mcp/guidePrompts.test.ts new file mode 100644 index 00000000..c11e190c --- /dev/null +++ b/test/unit/mcp/guidePrompts.test.ts @@ -0,0 +1,146 @@ +/* + * Copyright (c) 2024 Provar Limited. + * All rights reserved. + * Licensed under the BSD 3-Clause license. 
+ * For full license text, see LICENSE.md file in the repo root or https://opensource.org/licenses/BSD-3-Clause + */ + +import { strict as assert } from 'node:assert'; +import { describe, it, beforeEach } from 'mocha'; +import { + registerOnboardingPrompt, + registerTroubleshootPrompt, + registerOrchestrationPrompt, +} from '../../../src/mcp/prompts/guidePrompts.js'; + +// ── Minimal McpServer mock ───────────────────────────────────────────────────── + +type PromptHandler = (args: Record) => { + messages: Array<{ role: string; content: { type: string; text: string } }>; +}; + +interface PromptRegistration { + name: string; + description: string; + handler: PromptHandler; +} + +class MockMcpServer { + public registrations: PromptRegistration[] = []; + + public prompt(name: string, description: string, _schema: unknown, handler: PromptHandler): void { + this.registrations.push({ name, description, handler }); + } + + public call(name: string, args: Record): ReturnType { + const reg = this.registrations.find((r) => r.name === name); + if (!reg) throw new Error(`Prompt not registered: ${name}`); + return reg.handler(args); + } +} + +function getMessageText(result: ReturnType): string { + assert.ok(result.messages.length > 0, 'Expected at least one message'); + assert.equal(result.messages[0].role, 'user'); + assert.equal(result.messages[0].content.type, 'text'); + return result.messages[0].content.text; +} + +// ── Tests ────────────────────────────────────────────────────────────────────── + +let server: MockMcpServer; + +beforeEach(() => { + server = new MockMcpServer(); + registerOnboardingPrompt(server as never); + registerTroubleshootPrompt(server as never); + registerOrchestrationPrompt(server as never); +}); + +describe('guidePrompts — registration', () => { + it('registers all 3 guide prompts', () => { + assert.equal(server.registrations.length, 3); + }); + + it('registers provar.guide.onboarding', () => { + const reg = server.registrations.find((r) => r.name 
=== 'provar.guide.onboarding'); + assert.ok(reg, 'provar.guide.onboarding should be registered'); + }); + + it('registers provar.guide.troubleshoot', () => { + const reg = server.registrations.find((r) => r.name === 'provar.guide.troubleshoot'); + assert.ok(reg, 'provar.guide.troubleshoot should be registered'); + }); + + it('registers provar.guide.orchestration', () => { + const reg = server.registrations.find((r) => r.name === 'provar.guide.orchestration'); + assert.ok(reg, 'provar.guide.orchestration should be registered'); + }); +}); + +// ── Regression guard: the PDX-481 single-call construction copy ──────────────── +// These assertions protect the canonical phrasing that fixes PDX-479. If you +// rewrite the author-test flow in guidePrompts.ts, you MUST keep equivalent +// guidance — otherwise the 1.5.0 regression returns. + +describe('guidePrompts — author-test flow (PDX-481 regression guard)', () => { + it('author-test flow recommends single-call construction', () => { + const text = getMessageText(server.call('provar.guide.orchestration', { task: 'author-test' })); + assert.ok( + text.includes('single call') || text.includes('one call') || text.includes('in one payload'), + 'author-test flow must recommend single-call construction (search: "single call" / "one call" / "in one payload")' + ); + assert.ok( + text.includes('ALL steps') || text.includes('full step tree') || text.includes('full tree'), + 'author-test flow must call out passing the full step tree at once' + ); + }); + + it('author-test flow does NOT recommend per-step construction', () => { + const text = getMessageText(server.call('provar.guide.orchestration', { task: 'author-test' })); + assert.ok( + !text.includes('repeat per step'), + 'author-test flow must not say "repeat per step" — that pattern caused PDX-479' + ); + // Unconditional check — the old OR-clause "|| text.includes('amend')" short-circuited to pass + // (because "amend" appears repeatedly elsewhere in the flow), so it 
provided no real protection + // against the "repeat as needed" phrasing being reintroduced. + assert.ok( + !text.includes('repeat as needed'), + 'author-test flow must not say "repeat as needed" — that pattern caused PDX-479' + ); + }); + + it('author-test flow marks step_edit as amendment-only', () => { + const text = getMessageText(server.call('provar.guide.orchestration', { task: 'author-test' })); + assert.ok( + text.includes('amend') || text.includes('Amend') || text.includes('AMENDING'), + 'author-test flow must mark provar_testcase_step_edit as for amending existing test cases' + ); + }); +}); + +describe('guidePrompts — orchestration general flow (PDX-481 regression guard)', () => { + it('prerequisite graph splits generate and step_edit into distinct entry points', () => { + const text = getMessageText(server.call('provar.guide.orchestration', {})); + // The pre-fix string was: "provar_testcase_generate OR provar_testcase_step_edit" + // The post-fix split lists them on separate lines with distinct annotations. + assert.ok( + !text.includes('provar_testcase_generate OR provar_testcase_step_edit'), + 'prerequisite graph must not equate generate and step_edit — they have different purposes' + ); + // Regex tied to the exact annotation punctuation used in the prompt body — + // "provar_testcase_generate (construct …" / "provar_testcase_step_edit (amend …". + // Allowing only whitespace (`\s*`) between the tool name and the opening parenthesis + // avoids the loose-`[^\n]*` false-positive where unrelated tokens between + // the two words on the same line would still match. 
+    assert.ok(
+      /provar_testcase_generate\s*\(construct/i.test(text),
+      'prerequisite graph must annotate provar_testcase_generate as the construct entry point'
+    );
+    assert.ok(
+      /provar_testcase_step_edit\s*\(amend/i.test(text),
+      'prerequisite graph must annotate provar_testcase_step_edit as the amend entry point'
+    );
+  });
+});
diff --git a/test/unit/mcp/testCaseGenerate.test.ts b/test/unit/mcp/testCaseGenerate.test.ts
index c6ba33df..31290067 100644
--- a/test/unit/mcp/testCaseGenerate.test.ts
+++ b/test/unit/mcp/testCaseGenerate.test.ts
@@ -951,4 +951,168 @@ describe('provar_testcase_generate', () => {
       assert.ok(!xml.includes('class="compound"'), 'Pure {VarName} must NOT use class="compound"');
     });
   });
+
+  // ── PDX-481 regression guard ─────────────────────────────────────────────────
+  // The 1.5.0 regression (PDX-479) happened when agents authored test cases
+  // step-by-step via repeated tool calls instead of constructing the full step
+  // tree in a single provar_testcase_generate call. This block proves that
+  // when the full tree IS passed in one call, the output is structurally clean:
+  // scenarios numbered consecutively, asserts emitted with consistent types,
+  // and testItemIds sequential.
+
+  describe('multi-scenario single-call construction (PDX-481 regression guard)', () => {
+    it('emits consecutive testItemIds across a 3-scenario, multi-step payload', () => {
+      const result = server.call('provar_testcase_generate', {
+        test_case_name: 'AccountFlow',
+        steps: [
+          // Scenario 1 — Create Account
+          { api_id: 'UiConnect', name: 'Salesforce Connect', attributes: {} },
+          {
+            api_id: 'SetValues',
+            name: 'Set Account Test Data',
+            attributes: { AccountName: 'Acme', AccountPhone: '555-0100' },
+          },
+          { api_id: 'UiNavigate', name: 'Scenario 1: navigate to Account home', attributes: {} },
+          { api_id: 'UiDoAction', name: 'Scenario 1: click New', attributes: {} },
+          {
+            api_id: 'SetValues',
+            name: 'Scenario 1: fill form',
+            attributes: { Name: '{AccountName}', Phone: '{AccountPhone}' },
+          },
+          { api_id: 'UiDoAction', name: 'Scenario 1: click Save', attributes: {} },
+          // Scenario 2 — Verify on list view (the scenario that went missing on 1.5.0)
+          { api_id: 'UiNavigate', name: 'Scenario 2: go to Account list', attributes: {} },
+          {
+            api_id: 'AssertValues',
+            name: 'Scenario 2: assert Name on list',
+            attributes: { expectedValue: '{AccountName}', actualValue: 'Name', comparisonType: 'EqualTo' },
+          },
+          {
+            api_id: 'AssertValues',
+            name: 'Scenario 2: assert Phone on list',
+            attributes: { expectedValue: '{AccountPhone}', actualValue: 'Phone', comparisonType: 'EqualTo' },
+          },
+          // Scenario 3 — Open detail and assert all
+          { api_id: 'UiDoAction', name: 'Scenario 3: open Account detail', attributes: {} },
+          {
+            api_id: 'AssertValues',
+            name: 'Scenario 3: assert Name on detail',
+            attributes: { expectedValue: '{AccountName}', actualValue: 'Name', comparisonType: 'EqualTo' },
+          },
+          {
+            api_id: 'AssertValues',
+            name: 'Scenario 3: assert Phone on detail',
+            attributes: { expectedValue: '{AccountPhone}', actualValue: 'Phone', comparisonType: 'EqualTo' },
+          },
+        ],
+        dry_run: true,
+        overwrite: false,
+      });
+
+      assert.equal(isError(result), false, 'single-call multi-scenario generate must succeed');
+      const body = parseText(result);
+      assert.equal(body['step_count'], 12, 'all 12 steps must be present (no scenarios dropped)');
+
+      const xml = body['xml_content'] as string;
+      // testItemIds must be exactly 1..12 — gaps indicate dropped steps.
+      for (let i = 1; i <= 12; i++) {
+        assert.ok(
+          xml.includes(`testItemId="${i}"`),
+          `expected sequential testItemId="${i}" — gap means a scenario step was dropped`
+        );
+      }
+      // No higher testItemIds emitted (would indicate spurious appends from an internal step_edit loop).
+      assert.ok(!xml.includes('testItemId="13"'), 'no spurious testItemIds beyond the payload count');
+    });
+
+    it('preserves every step name from the payload — no scenario marker is silently dropped', () => {
+      const result = server.call('provar_testcase_generate', {
+        test_case_name: 'ScenarioMarkers',
+        steps: [
+          { api_id: 'UiDoAction', name: 'Scenario 1: When create', attributes: {} },
+          { api_id: 'UiDoAction', name: 'Scenario 1: Then verify', attributes: {} },
+          { api_id: 'UiDoAction', name: 'Scenario 2: When edit', attributes: {} },
+          { api_id: 'UiDoAction', name: 'Scenario 2: Then verify', attributes: {} },
+          { api_id: 'UiDoAction', name: 'Scenario 3: When delete', attributes: {} },
+          { api_id: 'UiDoAction', name: 'Scenario 3: Then absent', attributes: {} },
+        ],
+        dry_run: true,
+        overwrite: false,
+      });
+
+      assert.equal(isError(result), false);
+      const xml = parseText(result)['xml_content'] as string;
+      for (const marker of [
+        'Scenario 1: When create',
+        'Scenario 1: Then verify',
+        'Scenario 2: When edit',
+        'Scenario 2: Then verify',
+        'Scenario 3: When delete',
+        'Scenario 3: Then absent',
+      ]) {
+        assert.ok(xml.includes(marker), `scenario marker "${marker}" must be preserved verbatim`);
+      }
+    });
+
+    it('emits consistent assert API IDs for repeated AssertValues — no drift between calls', () => {
+      const result = server.call('provar_testcase_generate', {
+        test_case_name: 'AssertConsistency',
+        steps: [
+          {
+            api_id: 'AssertValues',
+            name: 'Assert 1',
+            attributes: { expectedValue: '{a}', actualValue: 'x', comparisonType: 'EqualTo' },
+          },
+          {
+            api_id: 'AssertValues',
+            name: 'Assert 2',
+            attributes: { expectedValue: '{b}', actualValue: 'y', comparisonType: 'EqualTo' },
+          },
+          {
+            api_id: 'AssertValues',
+            name: 'Assert 3',
+            attributes: { expectedValue: '{c}', actualValue: 'z', comparisonType: 'EqualTo' },
+          },
+        ],
+        dry_run: true,
+        overwrite: false,
+      });
+
+      assert.equal(isError(result), false);
+      const xml = parseText(result)['xml_content'] as string;
+      const assertValuesMatches = xml.match(/apiId="com\.provar\.plugins\.bundled\.apis\.AssertValues"/g) ?? [];
+      assert.equal(assertValuesMatches.length, 3, 'all 3 asserts must use AssertValues — no API ID drift');
+      // None of them should silently become UiAssert.
+      assert.ok(
+        !xml.includes('apiId="com.provar.plugins.forcedotcom.core.ui.UiAssert"'),
+        'no AssertValues should be substituted with UiAssert'
+      );
+    });
+
+    it('wraps a non-SF target_uri in UiWithScreen with nested steps — full tree in one call', () => {
+      const result = server.call('provar_testcase_generate', {
+        test_case_name: 'PageObjectNested',
+        target_uri: 'ui:pageobject:target?pageId=pageobjects.AccountPage',
+        steps: [
+          { api_id: 'UiDoAction', name: 'Click new', attributes: {} },
+          {
+            api_id: 'AssertValues',
+            name: 'Assert created',
+            attributes: { expectedValue: '{x}', actualValue: 'y', comparisonType: 'EqualTo' },
+          },
+        ],
+        dry_run: true,
+        overwrite: false,
+      });
+
+      assert.equal(isError(result), false);
+      const xml = parseText(result)['xml_content'] as string;
+      assert.ok(xml.includes('UiWithScreen'), 'non-SF target_uri must wrap in UiWithScreen');
+      assert.ok(xml.includes(''), 'wrapper must contain ');
+      assert.ok(xml.includes(''), 'substeps clause must have testItemId="2"');
+      // Inner steps start at testItemId=3 per builder convention.
+      assert.ok(xml.includes('testItemId="3"'), 'first nested step must have testItemId="3"');
+      assert.ok(xml.includes('testItemId="4"'), 'second nested step must have testItemId="4"');
+    });
+  });
 });
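The guard tests in this diff check sequential `testItemId` values one number at a time with `xml.includes`, plus a single check that `testItemId="13"` is absent. A stricter variant could extract every id from the generated XML and report gaps, duplicates, and out-of-range values in one pass. The sketch below is only an illustration of that idea: `extractTestItemIds` and `findTestItemIdProblems` are hypothetical helper names that do not exist in the Provar codebase, and the regex assumes ids appear as `testItemId="N"` attributes, as the assertions in this diff suggest.

```typescript
// Hypothetical helpers (not part of the Provar codebase): a sketch of a
// stricter sequential-id check for generated test case XML.

// Collect every testItemId="N" value in document order.
function extractTestItemIds(xml: string): number[] {
  const ids: number[] = [];
  const pattern = /testItemId="(\d+)"/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(xml)) !== null) {
    ids.push(Number(match[1]));
  }
  return ids;
}

// Return a list of human-readable problems; an empty list means the ids
// are exactly 1..expectedCount with no gaps, duplicates, or extras.
function findTestItemIdProblems(xml: string, expectedCount: number): string[] {
  const seen = new Set<number>();
  const problems: string[] = [];
  for (const id of extractTestItemIds(xml)) {
    if (seen.has(id)) {
      problems.push(`duplicate testItemId="${id}"`);
    }
    seen.add(id);
    if (id < 1 || id > expectedCount) {
      problems.push(`unexpected testItemId="${id}"`);
    }
  }
  for (let i = 1; i <= expectedCount; i++) {
    if (!seen.has(i)) {
      problems.push(`missing testItemId="${i}"`);
    }
  }
  return problems;
}
```

In a test like the first one in this block, this could be used as `assert.deepEqual(findTestItemIdProblems(xml, 12), [])`, so a failure message lists every offending id rather than only the first missing one.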