From 590c3fcf32d6a111198f0f527e204fdb7f21d6b1 Mon Sep 17 00:00:00 2001
From: Michael Dailey <michael.dailey@provartesting.com>
Date: Fri, 15 May 2026 11:03:47 -0500
Subject: [PATCH 1/2] PDX-483: feat(mcp): add STEPS_REQUIRED runtime guard on
 testcase_generate
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

RCA: PDX-482 hardened the testcase_generate tool description, but the contract is passive — an LLM that ignores it pays no price. Calling generate with steps:[]+dry_run:false+output_path still wrote a TODO-only skeleton file, reproducing the PDX-479 regression class. The contract was enforced only by the agent's reading comprehension.
Fix: Add an active runtime guard that rejects the exact shape (steps:[] + non-dry-run + output_path) with STEPS_REQUIRED before any side effects. details.suggestion tells the LLM to pass the FULL step tree in one call and notes the dry_run=true escape hatch for skeleton inspection. Other empty-steps shapes (dry-run preview, no output_path) remain allowed. Docs, smoke entry, and pdx-482-validate.cjs all updated.
---
 docs/mcp-pilot-guide.md                |   8 ++
 docs/mcp.md                            |  20 ++-
 scripts/mcp-smoke.cjs                  |  13 ++
 scripts/pdx-482-validate.cjs           | 169 +++++++++++++++++++++++--
 src/mcp/tools/testCaseGenerate.ts      |  26 ++++
 test/unit/mcp/testCaseGenerate.test.ts | 145 ++++++++++++++++++++-
 6 files changed, 364 insertions(+), 17 deletions(-)

diff --git a/docs/mcp-pilot-guide.md b/docs/mcp-pilot-guide.md
index 9fbaa55b..50cb1ae7 100644
--- a/docs/mcp-pilot-guide.md
+++ b/docs/mcp-pilot-guide.md
@@ -445,6 +445,14 @@ NitroX is Provar's Hybrid Model for locators — it maps Salesforce component-ba
 
 **Background:** A regression in 1.5.0 (PDX-479) traced to authoring guidance that steered LLMs toward a per-step construction pattern. Multi-call construction drops scenario numbers (e.g. Scenario 1 → Scenario 3, no Scenario 2), flattens asserts that should be nested inside `UiWithScreen` clauses, and produces inconsistent assert API IDs across the case. This scenario exists so the regression class is exercised in pilot evaluation and cannot recur silently.
 
+**Defense in depth.** Three layers protect against the regression class:
+
+1. **Prompt + resource guidance** (PDX-481) — upstream authoring prompts no longer steer toward per-step construction.
+2. **Tool-description contract** (PDX-482) — `provar_testcase_generate` and `provar_testcase_step_edit` descriptions explicitly mark generate as constructor-only and step_edit as amendment-only, so the LLM reads the contract at every call site (including compact schema mode).
+3. **Runtime guard** (PDX-483) — `provar_testcase_generate` actively rejects the exact shape that produces the bad file: `steps:[]` + `dry_run:false` + `output_path`. The rejection returns `STEPS_REQUIRED` with `details.suggestion` telling the LLM to pass the full step tree in one call. Empty-steps shapes that don't write a file (dry-run preview, no output_path) remain allowed.
+
+If a pilot LLM falls into the multi-call pattern despite the description contract, the runtime guard converts the failure into an actionable error rather than a silently broken file on disk.
+
 **Prompt:**
 
 > "Create a Provar test case `AccountFlow.testcase` that covers three scenarios:
diff --git a/docs/mcp.md b/docs/mcp.md
index 7a08fed2..437a8a6e 100644
--- a/docs/mcp.md
+++ b/docs/mcp.md
@@ -765,10 +765,22 @@ AssertValues uses **flat** argument structure (`expectedValue`, `actualValue`, `
 
 **Error codes**
 
-| Code               | Meaning                                                               |
-| ------------------ | --------------------------------------------------------------------- |
-| `TESTCASE_INVALID` | Generated XML failed structural validation (see `details.validation`) |
-| `FILE_EXISTS`      | `output_path` already exists and `overwrite=false`                    |
+| Code               | Meaning                                                                                                                                                            |
+| ------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| `TESTCASE_INVALID` | Generated XML failed structural validation (see `details.validation`)                                                                                              |
+| `FILE_EXISTS`      | `output_path` already exists and `overwrite=false`                                                                                                                 |
+| `STEPS_REQUIRED`   | Called with `steps:[]` + `dry_run:false` + `output_path` — the PDX-479 multi-call construction pattern. `details.suggestion` tells the caller how to self-correct. |
+
+**`STEPS_REQUIRED` (PDX-483 runtime guard).** The rejected shape is `steps:[]` + `dry_run:false` + `output_path` — the exact call signature that, before this guard, produced a contract-violating skeleton on disk (the PDX-479 regression class). All other empty-steps shapes remain allowed:
+
+| `steps.length` | `dry_run`     | `output_path` | Result                                                  |
+| -------------- | ------------- | ------------- | ------------------------------------------------------- |
+| 0              | `true`        | any           | Allowed — preserves skeleton inspection / IDE preview   |
+| 0              | `false`       | absent        | Allowed — no file would be written anyway               |
+| 0              | `false`       | **present**   | **Rejected** with `STEPS_REQUIRED` (no file is written) |
+| ≥ 1            | true or false | any           | Allowed — normal happy path                             |
+
+`details.suggestion` instructs the caller to pass the FULL step tree in a single call, clarifies that `provar_testcase_step_edit` is for amendment-only, and notes the `dry_run=true` escape hatch for skeleton inspection.
 
 ---
 
diff --git a/scripts/mcp-smoke.cjs b/scripts/mcp-smoke.cjs
index 7d138dc1..50cf628d 100644
--- a/scripts/mcp-smoke.cjs
+++ b/scripts/mcp-smoke.cjs
@@ -177,6 +177,19 @@ async function runTests() {
       dry_run: true,
     });
 
+  // ── 6b. provar_testcase_generate STEPS_REQUIRED runtime guard (PDX-483) ───
+  // Drives the rejected shape (steps:[] + dry_run:false + output_path) so the
+  // PDX-479 regression-class shape is exercised on every smoke run. The smoke
+  // framework counts any JSON-RPC response as PASS; the assertion that the
+  // body carries error_code='STEPS_REQUIRED' lives in scripts/pdx-482-validate.cjs.
+  if (inGroup('authoring'))
+    await callTool('provar_testcase_generate', {
+      test_case_name: 'PDX-483 Guard Smoke',
+      steps: [],
+      dry_run: false,
+      output_path: path.join(TMP, 'pdx483-smoke-rejected.testcase'),
+    });
+
   // ── 7. provar_testcase_validate ───────────────────────────────────────────
   if (inGroup('validation')) await callTool('provar_testcase_validate', { content: '<testCase/>' });
 
diff --git a/scripts/pdx-482-validate.cjs b/scripts/pdx-482-validate.cjs
index 5b4e5473..8685e9d6 100644
--- a/scripts/pdx-482-validate.cjs
+++ b/scripts/pdx-482-validate.cjs
@@ -1,16 +1,26 @@
-// PDX-482 validation: confirm the construct/amend contract is reachable at the
-// MCP protocol surface in BOTH standard and compact schema modes.
+// PDX-482 / PDX-483 validation: confirm the construct/amend contract is reachable
+// at the MCP protocol surface and that the PDX-483 runtime guard rejects the
+// PDX-479 multi-call pattern shape.
 //
-// The LLM reads tools/list before every tool call, so every assertion here is
-// on bytes the LLM literally sees at the call site. Compact mode coverage is
-// critical because the adversarial review identified that PROVAR_MCP_SCHEMA_MODE=compact
-// silently swapped the description for a contract-free one-liner.
+// PDX-482 (standard + compact modes): assertions on tools/list — every byte the
+// LLM literally sees at the call site. Compact mode coverage is critical because
+// the adversarial review identified that PROVAR_MCP_SCHEMA_MODE=compact silently
+// swapped the description for a contract-free one-liner.
+//
+// PDX-483 (runtime-guard mode): drives a real tools/call with the rejected shape
+// (steps:[]+dry_run:false+output_path) and asserts the response is a structured
+// STEPS_REQUIRED error with a non-empty details.suggestion. This catches a
+// regression class that the tools/list assertions cannot reach: the passive
+// contract surviving in the description while the active guard silently
+// regresses (e.g. a refactor reorders the handler so writes happen before the
+// check).
 //
 //   yarn compile
 //   node scripts/pdx-482-validate.cjs
 
 'use strict';
 
+const fs = require('fs');
 const { spawn } = require('child_process');
 const os = require('os');
 const path = require('path');
@@ -238,13 +248,156 @@ function compactAssertions(toolList, record) {
   }
 }
 
+// ── PDX-483 runtime guard: tools/call assertion ─────────────────────────────
+// Drives a real tools/call(provar_testcase_generate, ...) with the rejected
+// shape (steps:[] + dry_run:false + output_path) and asserts the response is
+// a structured STEPS_REQUIRED error. This is the only check that catches a
+// silent regression where the passive description survives but the active
+// runtime guard is removed or reordered after a side effect.
+function runRuntimeGuardValidation() {
+  return new Promise((resolve, reject) => {
+    const server = spawn(process.execPath, [entry, 'mcp', 'start', '--allowed-paths', TMP, '--no-update-check'], {
+      stdio: ['pipe', 'pipe', 'inherit'],
+      env: { ...process.env },
+    });
+
+    let nextId = 1;
+    const pending = new Map();
+    let buf = '';
+
+    server.stdout.on('data', (chunk) => {
+      buf += chunk.toString('utf-8');
+      let nl;
+      while ((nl = buf.indexOf('\n')) !== -1) {
+        const line = buf.slice(0, nl).trim();
+        buf = buf.slice(nl + 1);
+        if (!line) continue;
+        try {
+          const msg = JSON.parse(line);
+          const cb = pending.get(msg.id);
+          if (cb) {
+            pending.delete(msg.id);
+            cb(msg);
+          }
+        } catch {
+          /* ignore */
+        }
+      }
+    });
+
+    const rpc = (method, params) => {
+      const id = nextId++;
+      const req = JSON.stringify({ jsonrpc: '2.0', id, method, params }) + '\n';
+      return new Promise((rpcResolve, rpcReject) => {
+        pending.set(id, rpcResolve);
+        setTimeout(() => {
+          if (pending.has(id)) {
+            pending.delete(id);
+            rpcReject(new Error(`Timeout waiting for ${method}`));
+          }
+        }, 10000);
+        server.stdin.write(req);
+      });
+    };
+
+    const results = [];
+    const record = (label, ok, detail) => {
+      results.push({ label: `[runtime-guard] ${label}`, ok, detail });
+    };
+
+    (async () => {
+      await rpc('initialize', {
+        protocolVersion: '2024-11-05',
+        capabilities: {},
+        clientInfo: { name: 'pdx-483-validate', version: '1.0.0' },
+      });
+
+      // Use a unique tmp path so a leftover file from a prior run can't mask the assertion.
+      const outPath = path.join(TMP, `pdx483-validate-${Date.now()}.testcase`);
+      try {
+        if (fs.existsSync(outPath)) fs.unlinkSync(outPath);
+      } catch {
+        /* best-effort */
+      }
+
+      const callRes = await rpc('tools/call', {
+        name: 'provar_testcase_generate',
+        arguments: {
+          test_case_name: 'PDX-483 validate',
+          steps: [],
+          dry_run: false,
+          output_path: outPath,
+        },
+      });
+
+      // MCP tools/call returns { result: { content: [{ type, text }], isError? } }.
+      // The tool's error body is JSON-encoded in content[0].text.
+      const result = callRes.result;
+      record(
+        'tools/call returned a result (no protocol-level error)',
+        !!result && !callRes.error,
+        callRes.error ? JSON.stringify(callRes.error).slice(0, 120) : 'protocol OK'
+      );
+      record(
+        'result.isError === true (tool-level rejection)',
+        result?.isError === true,
+        `isError: ${String(result?.isError)} — rejection must surface at content level`
+      );
+
+      let body = null;
+      try {
+        body = JSON.parse(result?.content?.[0]?.text ?? '{}');
+      } catch (parseErr) {
+        record('content[0].text parses as JSON', false, parseErr.message);
+      }
+      record(
+        'error_code === "STEPS_REQUIRED"',
+        body?.error_code === 'STEPS_REQUIRED',
+        `error_code: ${body?.error_code} — must match the documented code from docs/mcp.md`
+      );
+      record(
+        'retryable === false',
+        body?.retryable === false,
+        'STEPS_REQUIRED is a contract violation — retrying with the same payload would never succeed'
+      );
+      record(
+        'details.suggestion is a non-empty string',
+        typeof body?.details?.suggestion === 'string' && body.details.suggestion.length > 0,
+        'details.suggestion must tell the LLM how to self-correct (canonical multi-call rejection text)'
+      );
+      record(
+        'details.suggestion mentions "FULL step tree"',
+        typeof body?.details?.suggestion === 'string' && body.details.suggestion.includes('FULL step tree'),
+        'suggestion must point the LLM at the single-call pattern'
+      );
+      record(
+        'details.suggestion mentions dry_run=true escape hatch',
+        typeof body?.details?.suggestion === 'string' && body.details.suggestion.includes('dry_run=true'),
+        'suggestion must mention dry_run=true for legitimate skeleton-inspection callers'
+      );
+      record(
+        'no file written at output_path (zero side effects)',
+        !fs.existsSync(outPath),
+        'STEPS_REQUIRED must run BEFORE fs.writeFileSync — no skeleton on disk'
+      );
+
+      server.stdin.end();
+      resolve(results);
+    })().catch((err) => {
+      server.kill();
+      reject(err);
+    });
+  });
+}
+
 (async () => {
   const standardResults = await runValidation('standard', {}, standardAssertions);
   // Explicitly null out the env var on the standard pass to ensure no leakage.
   // For compact, set PROVAR_MCP_SCHEMA_MODE=compact via the spawn env.
   const compactResults = await runValidation('compact', { PROVAR_MCP_SCHEMA_MODE: 'compact' }, compactAssertions);
+  const runtimeGuardResults = await runRuntimeGuardValidation();
 
-  const allResults = [...standardResults, ...compactResults];
+  const allResults = [...standardResults, ...compactResults, ...runtimeGuardResults];
 
   let pass = 0;
   let fail = 0;
@@ -256,7 +409,7 @@ function compactAssertions(toolList, record) {
       fail++;
     }
   }
-  console.log(`\nPDX-482 validation: ${pass} passed, ${fail} failed`);
+  console.log(`\nPDX-482/PDX-483 validation: ${pass} passed, ${fail} failed`);
   process.exit(fail > 0 ? 1 : 0);
 })().catch((err) => {
   console.error('Validation script error:', err);
diff --git a/src/mcp/tools/testCaseGenerate.ts b/src/mcp/tools/testCaseGenerate.ts
index 4d7d64dc..3249cb3d 100644
--- a/src/mcp/tools/testCaseGenerate.ts
+++ b/src/mcp/tools/testCaseGenerate.ts
@@ -256,6 +256,32 @@ export function registerTestCaseGenerate(server: McpServer, config: ServerConfig
         target_uri: input.target_uri,
       });
 
+      // PDX-483: active runtime guard for the PDX-479 regression pattern.
+      // Rejects the exact shape that produces a contract-violating skeleton on
+      // disk: empty steps[] + non-dry-run + persistence target. Other empty-
+      // steps shapes (dry_run preview, no output_path) remain allowed.
+      if (input.steps.length === 0 && !input.dry_run && input.output_path) {
+        const err = makeError(
+          'STEPS_REQUIRED',
+          'provar_testcase_generate was called with an empty steps[] array and a target output_path. ' +
+            'This produces a contract-violating skeleton (the PDX-479 regression pattern) and is rejected.',
+          requestId,
+          false,
+          {
+            suggestion:
+              'Pass the FULL step tree to provar_testcase_generate in a single call. ' +
+              'provar_testcase_step_edit is for amending an already-validated test case ' +
+              '(single-step add, attribute fix, debug edit), not for constructing one from scratch. ' +
+              'If you genuinely want a skeleton for inspection, set dry_run=true.',
+          }
+        );
+        log('warn', 'provar_testcase_generate: STEPS_REQUIRED', {
+          requestId,
+          output_path: input.output_path,
+        });
+        return { isError: true, content: [{ type: 'text' as const, text: JSON.stringify(err) }] };
+      }
+
       try {
         const xmlContent = buildTestCaseXml(input);
         const filePath: string | undefined = input.output_path ? path.resolve(input.output_path) : undefined;
diff --git a/test/unit/mcp/testCaseGenerate.test.ts b/test/unit/mcp/testCaseGenerate.test.ts
index b2049463..af65578d 100644
--- a/test/unit/mcp/testCaseGenerate.test.ts
+++ b/test/unit/mcp/testCaseGenerate.test.ts
@@ -382,11 +382,16 @@ describe('provar_testcase_generate', () => {
   });
 
   describe('writing to disk', () => {
+    // Each disk-write test uses a non-empty steps[] so the PDX-483 STEPS_REQUIRED
+    // guard (which rejects steps:[]+dry_run:false+output_path) does not fire.
+    // These tests assert *other* behaviour: file write, overwrite, mkdirp, path policy.
+    const SMOKE_STEPS = [{ api_id: 'UiConnect', name: 'Connect', attributes: {} }];
+
     it('writes file when dry_run=false and output_path provided', () => {
       const outPath = path.join(tmpDir, 'Login.testcase');
       const result = server.call('provar_testcase_generate', {
         test_case_name: 'Login',
-        steps: [],
+        steps: SMOKE_STEPS,
         output_path: outPath,
         dry_run: false,
         overwrite: false,
@@ -415,7 +420,7 @@ describe('provar_testcase_generate', () => {
 
       const result = server.call('provar_testcase_generate', {
         test_case_name: 'Existing',
-        steps: [],
+        steps: SMOKE_STEPS,
         output_path: outPath,
         dry_run: false,
         overwrite: false,
@@ -431,7 +436,7 @@ describe('provar_testcase_generate', () => {
 
       const result = server.call('provar_testcase_generate', {
         test_case_name: 'Existing',
-        steps: [],
+        steps: SMOKE_STEPS,
         output_path: outPath,
         dry_run: false,
         overwrite: true,
@@ -446,7 +451,7 @@ describe('provar_testcase_generate', () => {
       const outPath = path.join(tmpDir, 'tests', 'suite', 'Login.testcase');
       server.call('provar_testcase_generate', {
         test_case_name: 'Login',
-        steps: [],
+        steps: SMOKE_STEPS,
         output_path: outPath,
         dry_run: false,
         overwrite: false,
@@ -456,14 +461,144 @@ describe('provar_testcase_generate', () => {
     });
   });
 
+  // ── PDX-483 runtime guard: reject empty steps[] on non-dry-run with output_path ──
+  // The PDX-479 regression class arose from agents calling generate with steps:[]
+  // intending to append later via step_edit. The passive contract (PDX-482) lives in
+  // the description; the active runtime guard rejects the exact shape that produces
+  // a contract-violating file on disk. The 6 edge cases below pin down which empty-
+  // steps shapes are allowed (dry-run preview, inspection-only) vs rejected (file write).
+  describe('STEPS_REQUIRED runtime guard (PDX-483)', () => {
+    const SINGLE_STEP = [{ api_id: 'UiConnect', name: 'Connect', attributes: {} }];
+
+    it('allows steps:[] + dry_run:true + no output_path (skeleton inspection)', () => {
+      const result = server.call('provar_testcase_generate', {
+        test_case_name: 'Skeleton Inspect',
+        steps: [],
+        dry_run: true,
+        overwrite: false,
+      });
+      assert.equal(isError(result), false, 'dry-run skeleton inspection must remain allowed');
+      assert.equal(parseText(result)['written'], false);
+    });
+
+    it('allows steps:[] + dry_run:true + output_path provided (dry-run preview wins)', () => {
+      const outPath = path.join(tmpDir, 'DryRunWithPath.testcase');
+      const result = server.call('provar_testcase_generate', {
+        test_case_name: 'DryRun With Path',
+        steps: [],
+        output_path: outPath,
+        dry_run: true,
+        overwrite: false,
+      });
+      assert.equal(isError(result), false, 'dry-run wins over output_path — no file is written');
+      assert.equal(fs.existsSync(outPath), false, 'file must not be written in dry_run mode');
+    });
+
+    it('allows steps:[] + dry_run:false + no output_path (no persistence target)', () => {
+      const result = server.call('provar_testcase_generate', {
+        test_case_name: 'No Output Path',
+        steps: [],
+        dry_run: false,
+        overwrite: false,
+      });
+      assert.equal(isError(result), false, 'no output_path means no file write — TODO-only XML is harmless');
+      assert.equal(parseText(result)['written'], false);
+    });
+
+    it('REJECTS steps:[] + dry_run:false + output_path with STEPS_REQUIRED', () => {
+      const outPath = path.join(tmpDir, 'Empty.testcase');
+      const result = server.call('provar_testcase_generate', {
+        test_case_name: 'Empty Build',
+        steps: [],
+        output_path: outPath,
+        dry_run: false,
+        overwrite: false,
+      });
+      assert.equal(isError(result), true, 'multi-call construction pattern must be rejected');
+      const body = parseText(result);
+      assert.equal(body['error_code'], 'STEPS_REQUIRED');
+      assert.equal(body['retryable'], false);
+      const details = body['details'] as Record<string, unknown>;
+      assert.ok(details, 'error must include details');
+      const suggestion = details['suggestion'];
+      assert.ok(typeof suggestion === 'string', 'details.suggestion must be a string');
+      assert.ok(suggestion.length > 0, 'details.suggestion must be non-empty');
+      assert.ok(
+        suggestion.includes('FULL step tree'),
+        'suggestion must instruct passing the FULL step tree in a single call'
+      );
+      assert.ok(
+        suggestion.includes('dry_run=true'),
+        'suggestion must mention the dry_run=true escape hatch for skeleton inspection'
+      );
+    });
+
+    it('STEPS_REQUIRED rejection writes NO file (assertion: fs.existsSync === false)', () => {
+      const outPath = path.join(tmpDir, 'NeverWritten.testcase');
+      server.call('provar_testcase_generate', {
+        test_case_name: 'Never Written',
+        steps: [],
+        output_path: outPath,
+        dry_run: false,
+        overwrite: false,
+      });
+      assert.equal(
+        fs.existsSync(outPath),
+        false,
+        'STEPS_REQUIRED rejection must run BEFORE fs.writeFileSync — no skeleton on disk'
+      );
+    });
+
+    it('allows non-empty steps + dry_run:false + output_path (happy path — normal write)', () => {
+      const outPath = path.join(tmpDir, 'HappyPath.testcase');
+      const result = server.call('provar_testcase_generate', {
+        test_case_name: 'Happy Path',
+        steps: SINGLE_STEP,
+        output_path: outPath,
+        dry_run: false,
+        overwrite: false,
+      });
+      assert.equal(isError(result), false, 'normal write path must remain unchanged');
+      assert.equal(parseText(result)['written'], true);
+      assert.equal(fs.existsSync(outPath), true, 'happy-path file must be written');
+    });
+
+    // Path-policy ordering check: the guard must fire BEFORE assertPathAllowed
+    // so that a caller in the rejected shape gets STEPS_REQUIRED (the actionable
+    // root-cause error), not PATH_NOT_ALLOWED (which would mislead about the fix).
+    it('STEPS_REQUIRED fires BEFORE path policy when both would reject', () => {
+      const strictServer = new MockMcpServer();
+      registerTestCaseGenerate(strictServer as never, { allowedPaths: [tmpDir] });
+      const result = strictServer.call('provar_testcase_generate', {
+        test_case_name: 'Outside And Empty',
+        steps: [],
+        // Path outside allowedPaths AND empty steps — STEPS_REQUIRED must win
+        // because its suggestion is the actionable one (path is moot if no steps).
+        output_path: path.join(os.tmpdir(), 'outside-and-empty.testcase'),
+        dry_run: false,
+        overwrite: false,
+      });
+      assert.equal(isError(result), true);
+      assert.equal(
+        parseText(result)['error_code'],
+        'STEPS_REQUIRED',
+        'STEPS_REQUIRED must fire before assertPathAllowed — the empty-payload root cause is what the LLM needs to see'
+      );
+    });
+  });
+
   describe('path policy', () => {
+    // Uses a non-empty steps[] to bypass the PDX-483 STEPS_REQUIRED guard so
+    // the assertion targets the PATH_NOT_ALLOWED branch specifically.
+    const SMOKE_STEPS = [{ api_id: 'UiConnect', name: 'Connect', attributes: {} }];
+
     it('returns PATH_NOT_ALLOWED when output_path is outside allowedPaths', () => {
       const strictServer = new MockMcpServer();
       registerTestCaseGenerate(strictServer as never, { allowedPaths: [tmpDir] });
 
       const result = strictServer.call('provar_testcase_generate', {
         test_case_name: 'Evil',
-        steps: [],
+        steps: SMOKE_STEPS,
         output_path: path.join(os.tmpdir(), 'evil.testcase'),
         dry_run: false,
         overwrite: false,

From da43f1be1222f08d43de1e2bd2649f6c59563d15 Mon Sep 17 00:00:00 2001
From: Michael Dailey <michael.dailey@provartesting.com>
Date: Fri, 15 May 2026 11:12:37 -0500
Subject: [PATCH 2/2] PDX-483: docs(mcp): scrub PDX-XXX refs and internal dev
 notes from customer docs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

RCA: The initial PDX-483 commit added internal Jira IDs (PDX-479, PDX-481, PDX-482, PDX-483) and engineering-process phrasing ("regression class", "before this guard") to docs/mcp.md and docs/mcp-pilot-guide.md. These are customer-facing surfaces — pilot guides and MCP reference docs — and must not leak internal ticket plumbing or dev-process language to integrators.
Fix: Rewrite the new STEPS_REQUIRED section in docs/mcp.md and the Defense-in-depth block in docs/mcp-pilot-guide.md in terms of observable behaviour and the API contract. Pre-existing PDX-479 references in the Scenario 12 background and bug-filing line are out of scope for this PR and left as-is.
---
 docs/mcp-pilot-guide.md |  8 ++++----
 docs/mcp.md             | 12 ++++++------
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/docs/mcp-pilot-guide.md b/docs/mcp-pilot-guide.md
index 50cb1ae7..45dd7502 100644
--- a/docs/mcp-pilot-guide.md
+++ b/docs/mcp-pilot-guide.md
@@ -445,11 +445,11 @@ NitroX is Provar's Hybrid Model for locators — it maps Salesforce component-ba
 
 **Background:** A regression in 1.5.0 (PDX-479) traced to authoring guidance that steered LLMs toward a per-step construction pattern. Multi-call construction drops scenario numbers (e.g. Scenario 1 → Scenario 3, no Scenario 2), flattens asserts that should be nested inside `UiWithScreen` clauses, and produces inconsistent assert API IDs across the case. This scenario exists so the regression class is exercised in pilot evaluation and cannot recur silently.
 
-**Defense in depth.** Three layers protect against the regression class:
+**Defense in depth.** Three layers protect against the multi-call construction pattern:
 
-1. **Prompt + resource guidance** (PDX-481) — upstream authoring prompts no longer steer toward per-step construction.
-2. **Tool-description contract** (PDX-482) — `provar_testcase_generate` and `provar_testcase_step_edit` descriptions explicitly mark generate as constructor-only and step_edit as amendment-only, so the LLM reads the contract at every call site (including compact schema mode).
-3. **Runtime guard** (PDX-483) — `provar_testcase_generate` actively rejects the exact shape that produces the bad file: `steps:[]` + `dry_run:false` + `output_path`. The rejection returns `STEPS_REQUIRED` with `details.suggestion` telling the LLM to pass the full step tree in one call. Empty-steps shapes that don't write a file (dry-run preview, no output_path) remain allowed.
+1. **Prompt and resource guidance** — authoring prompts and the MCP step-reference resource describe single-call construction as the contract.
+2. **Tool-description contract** — `provar_testcase_generate` and `provar_testcase_step_edit` descriptions explicitly mark generate as constructor-only and step_edit as amendment-only, so the LLM reads the contract at every call site (including compact schema mode).
+3. **Runtime guard** — `provar_testcase_generate` rejects the exact shape that would produce a skeleton-only file: `steps:[]` + `dry_run:false` + `output_path`. The rejection returns `STEPS_REQUIRED` with `details.suggestion` telling the LLM to pass the full step tree in one call. Empty-steps shapes that don't write a file (dry-run preview, no `output_path`) remain allowed.
 
 If a pilot LLM falls into the multi-call pattern despite the description contract, the runtime guard converts the failure into an actionable error rather than a silently broken file on disk.
 
diff --git a/docs/mcp.md b/docs/mcp.md
index 437a8a6e..45a2ec51 100644
--- a/docs/mcp.md
+++ b/docs/mcp.md
@@ -765,13 +765,13 @@ AssertValues uses **flat** argument structure (`expectedValue`, `actualValue`, `
 
 **Error codes**
 
-| Code               | Meaning                                                                                                                                                            |
-| ------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
-| `TESTCASE_INVALID` | Generated XML failed structural validation (see `details.validation`)                                                                                              |
-| `FILE_EXISTS`      | `output_path` already exists and `overwrite=false`                                                                                                                 |
-| `STEPS_REQUIRED`   | Called with `steps:[]` + `dry_run:false` + `output_path` — the PDX-479 multi-call construction pattern. `details.suggestion` tells the caller how to self-correct. |
+| Code               | Meaning                                                                                                                                                                              |
+| ------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| `TESTCASE_INVALID` | Generated XML failed structural validation (see `details.validation`)                                                                                                                |
+| `FILE_EXISTS`      | `output_path` already exists and `overwrite=false`                                                                                                                                   |
+| `STEPS_REQUIRED`   | Called with `steps:[]` + `dry_run:false` + `output_path` — constructing a test case requires the full step tree on the write path. `details.suggestion` tells the caller how to fix. |
 
-**`STEPS_REQUIRED` (PDX-483 runtime guard).** The rejected shape is `steps:[]` + `dry_run:false` + `output_path` — the exact call signature that, before this guard, produced a contract-violating skeleton on disk (the PDX-479 regression class). All other empty-steps shapes remain allowed:
+**`STEPS_REQUIRED`.** The rejected shape is `steps:[]` + `dry_run:false` + `output_path`. Constructing a test case requires the full step tree in a single call; passing an empty array on the write path would produce a skeleton-only file. All other empty-steps shapes remain allowed:
 
 | `steps.length` | `dry_run`     | `output_path` | Result                                                  |
 | -------------- | ------------- | ------------- | ------------------------------------------------------- |