feat: validate command with five fail-fast schema/lockstep/shape checks #17

Open
dhruva-reddy wants to merge 1 commit into main from dhruva-reddy/feat/validate-command

ELI5

Problem. The Vapi API rejects bad configs at PATCH time with terse
400s ("property speed should not exist") — and by then the push has
already partially completed against other resources. We watched the
same five classes of mistake hit production over and over:

  1. Assistant names (or eval names) longer than 40 chars (silent cap).
  2. Structured-output ↔ assistant lockstep mismatch — one side declares
    the relationship, the other doesn't, dashboard ends up inconsistent.
  3. Prompts duplicated by paste-on-top dashboard edits (10kB prompt
    with two identical headers stacked, agent follows both).
  4. maxTokens set lower than the JSON-schema size of the attached
    tools' arguments — assistant looks fine on push, bricks on first
    tool-using call.
  5. Voice fields nested wrong for the provider (voice.speed on
    Cartesia, where it lives at voice.generationConfig.speed).

What this fix does. Five client-side validators, all running off
the same LoadedResources shape that push.ts would actually ship —
so the lint runs against exactly what would be pushed, no separate
parser to drift. Surfaces as warnings by default (one bad spec doesn't
block an otherwise-good push); promote to abort with --strict. Run
standalone via npm run validate -- <org>.

Outcome you'll notice. Most schema-class mistakes get caught
locally in seconds instead of mid-push 400s. Voice provider field
mismatch gets a specific message pointing at the right path. CI can
add npm run push -- <env> --strict as a gate before any deploy.


Catch the classes of errors that today only surface when the API returns
a 400 mid-push. The push pipeline runs validation in warn-only mode by
default; --strict promotes errors to a blocking abort before any API
call. Standalone runner via npm run validate -- <org>.
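A minimal sketch of the warn-versus-strict gate described above. The `Finding` shape, the `GateResult` type, and the `gateFindings` helper are hypothetical names chosen for illustration; the actual wiring in push.ts may differ.

```typescript
// Hypothetical, trimmed-down finding shape used only in this sketch.
interface Finding {
  severity: "error" | "warning";
  message: string;
}

interface GateResult {
  abort: boolean;    // true means: stop before any API call is made
  printed: string[]; // lines a caller would log
}

// Assumed behavior: findings are always reported; only --strict turns
// error-severity findings into a blocking abort.
function gateFindings(findings: Finding[], strict: boolean): GateResult {
  const printed = findings.map((f) => `[${f.severity}] ${f.message}`);
  const hasError = findings.some((f) => f.severity === "error");
  return { abort: strict && hasError, printed };
}
```

Under these assumptions, a push without `--strict` never aborts on findings, while a strict push aborts only when at least one finding has error severity.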

Validators implemented:

  1. Name length cap (40 chars). Walks every assistant.name and every
    evaluations[].structuredOutput.name in scenarios. Closes #18.
  2. SO ↔ assistant bidirectional lockstep. For every SO file's
    assistant_ids, checks the named assistant's structuredOutputIds
    mirrors it; reverse direction too. Closes #11.
  3. Prompt duplication heuristics. Same H1 heading appearing twice,
    repeated CONTINUITY ON ENTRY / CLOSEOUT FLOW STRUCTURE blocks.
    Partial fix for #8 (paste-on-top dashboard duplications).
  4. maxTokens floor for tool-using assistants. Computes
    floor ≈ 25 + sum(len(JSON.stringify(tool.function.parameters)))
    per attached tool. Warns under floor. Closes #19.
  5. Per-provider voice schema. Cartesia rejects top-level speed /
    stability / similarityBoost / enableSsmlParsing (point at
    generationConfig.* / drop the field). 11labs rejects
    generationConfig (it's a Cartesia path). Closes #9 (engine half).
  • src/validate.ts (NEW): validateResources(loadedResources) returning
    ValidationFinding[] with severity / type / resourceId / rule / message
    / fieldPath. Pure data; safe to test directly.
  • src/validate-cmd.ts (NEW): CLI entry. Loads same resource shape as
    push.ts so the lint runs against exactly what would ship. Exit non-zero
    on any error finding.
  • src/config.ts: --strict flag.
  • src/push.ts: validators run in default-warn mode; --strict aborts.
  • package.json: validate script.
  • AGENTS.md: document npm run validate and --strict.
  • tests/validate.test.ts: per-rule fixtures (golden + bad inputs)
    covering all five checks.
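
As a rough illustration of the single-resource rules (1, 3, and 4), here is a sketch under stated assumptions. Only the field names on `ValidationFinding` and the floor formula come from the description above; the `ToolLike` shape, the rule id `"name-length"`, the severity chosen for it, and the helper names are hypothetical.

```typescript
// Field names come from the PR description; concrete types and literal
// values are assumptions made for this sketch.
interface ValidationFinding {
  severity: "error" | "warning";
  type: string;
  resourceId: string;
  rule: string;
  message: string;
  fieldPath: string;
}

// Hypothetical tool shape: only function.parameters matters here.
interface ToolLike {
  function?: { parameters?: unknown };
}

const NAME_MAX = 40;

// Rule 1: names longer than 40 characters (severity is an assumption).
function checkNameLength(resourceId: string, name: string): ValidationFinding[] {
  if (name.length <= NAME_MAX) return [];
  return [{
    severity: "error",
    type: "assistant",
    resourceId,
    rule: "name-length",
    message: `name is ${name.length} chars; cap is ${NAME_MAX}`,
    fieldPath: "name",
  }];
}

// Rule 3 (one heuristic only): H1 headings that appear more than once.
function duplicateH1Headings(prompt: string): string[] {
  const seen = new Set<string>();
  const duplicates = new Set<string>();
  for (const line of prompt.split("\n")) {
    const match = /^# +(.+?)\s*$/.exec(line);
    if (!match) continue;
    const heading = match[1];
    if (seen.has(heading)) duplicates.add(heading);
    else seen.add(heading);
  }
  return Array.from(duplicates);
}

// Rule 4: floor ≈ 25 + sum of serialized parameter-schema lengths.
// Tools without parameters contribute nothing (an assumption).
function maxTokensFloor(tools: ToolLike[]): number {
  return tools.reduce((sum, tool) => {
    const params = tool.function?.parameters;
    return sum + (params === undefined ? 0 : JSON.stringify(params).length);
  }, 25);
}
```

A caller would compare an assistant's configured `maxTokens` against `maxTokensFloor(tools)` and emit a warning-severity finding when it falls below the floor.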

Closes improvements.md #11, #18, #19. Resolves engine half of #9.
Partial #8, #20 (heuristic only).
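
The two cross-field rules (2 and 5) could look roughly like the following sketch. The flattened `StructuredOutputLike` and `AssistantLike` shapes, the provider strings `"cartesia"` and `"11labs"`, and the helper names are assumptions for illustration, not the shapes used in src/validate.ts.

```typescript
// Hypothetical, flattened shapes keyed by resource id.
interface StructuredOutputLike {
  assistant_ids: string[];
}
interface AssistantLike {
  structuredOutputIds: string[];
}

// Rule 2: every SO -> assistant edge needs the mirror edge, and vice versa.
function lockstepMismatches(
  structuredOutputs: Record<string, StructuredOutputLike>,
  assistants: Record<string, AssistantLike>,
): string[] {
  const problems: string[] = [];
  for (const [soId, so] of Object.entries(structuredOutputs)) {
    for (const aId of so.assistant_ids) {
      const assistant = assistants[aId];
      if (!assistant || !assistant.structuredOutputIds.includes(soId)) {
        problems.push(`${soId} lists ${aId}, but ${aId} does not list ${soId}`);
      }
    }
  }
  for (const [aId, assistant] of Object.entries(assistants)) {
    for (const soId of assistant.structuredOutputIds) {
      const so = structuredOutputs[soId];
      if (!so || !so.assistant_ids.includes(aId)) {
        problems.push(`${aId} lists ${soId}, but ${soId} does not list ${aId}`);
      }
    }
  }
  return problems;
}

// Rule 5: fields that the description says each provider rejects.
const CARTESIA_TOP_LEVEL_REJECTED = [
  "speed",
  "stability",
  "similarityBoost",
  "enableSsmlParsing",
];

// Returns the offending field paths; provider strings are assumed values.
function voiceFieldProblems(voice: Record<string, unknown>): string[] {
  const provider = voice["provider"];
  if (provider === "cartesia") {
    return CARTESIA_TOP_LEVEL_REJECTED
      .filter((field) => field in voice)
      .map((field) => `voice.${field}`);
  }
  if (provider === "11labs") {
    return "generationConfig" in voice ? ["voice.generationConfig"] : [];
  }
  return [];
}
```

For example, a Cartesia voice with a top-level `speed` yields `voice.speed`, which a finding message can then redirect to `voice.generationConfig.speed`.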

🤖 Generated with Claude Code

dhruva-reddy force-pushed the dhruva-reddy/feat/push-dry-run branch from 392855d to d9d9477 (May 1, 2026 22:56)
dhruva-reddy force-pushed the dhruva-reddy/feat/validate-command branch from b62592a to cb8079a (May 1, 2026 22:56)
dhruva-reddy added a commit that referenced this pull request May 2, 2026
## ELI5

**Problem.** Every push rewrites `.vapi-state.<env>.json`. JavaScript's
`JSON.stringify` keeps whatever order keys happened to land in — and
state sections get rebuilt from multiple sources (push, pull, bootstrap)
with unpredictable insertion order. Result: about half of every state
diff is just lines moving up and down without any actual change.
Reviewers stopped reading state diffs because they were mostly noise,
which defeats the point of versioning the file.

**What this fix does.** Adds a `sortedKeysReplacer` that runs during
`JSON.stringify` and emits object keys alphabetically at every nesting
level. Arrays stay in their original order (squad member ordering, tool
destination priority, etc. are semantic). State writes go through this
replacer.
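
A minimal sketch of what such a replacer could look like, assuming the standard `JSON.stringify(value, replacer, space)` signature. The body is an illustration of the technique, not the code in src/state-serialize.ts.

```typescript
// JSON.stringify calls the replacer for every key/value pair, including
// the root (with key ""). Returning a fresh object whose keys were
// inserted in sorted order makes the serializer emit them in that order,
// and it then recurses into the returned object's values.
function sortedKeysReplacer(_key: string, value: unknown): unknown {
  // Primitives, null, and arrays pass through untouched; array order is
  // semantic, so it must be preserved.
  if (value === null || typeof value !== "object" || Array.isArray(value)) {
    return value;
  }
  const source = value as Record<string, unknown>;
  const sorted: Record<string, unknown> = {};
  for (const key of Object.keys(source).sort()) {
    sorted[key] = source[key];
  }
  return sorted;
}

// Two objects with different insertion orders serialize identically.
const first = JSON.stringify({ b: 1, a: { d: 2, c: 3 } }, sortedKeysReplacer);
const second = JSON.stringify({ a: { c: 3, d: 2 }, b: 1 }, sortedKeysReplacer);
console.log(first === second); // true
```

One caveat of this approach: JavaScript always enumerates integer-like keys (such as `"10"`) first and in numeric order, so those keys are not placed by string sort; the output is still deterministic across insertion orders.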

**Outcome you'll notice.** The first push after this lands produces a
**big one-time diff** of pure reordering across every customer. That's
the cost of landing the fix — please don't read the first state diff
post-merge, it's churn. Every diff after that shows only real changes:
new UUIDs, removed entries, hashes changing. Reviewing state files
becomes useful again.

---

JS's JSON.stringify honors insertion order. State sections get rebuilt
from multiple sources (push, pull, bootstrap) with unpredictable
insertion order, so ~half of every state-file diff is pure reorderings
that hide the real changes.

- src/state-serialize.ts (NEW): sortedKeysReplacer (recursive alphabetical
  key sort, arrays untouched) + canonicalize (also drops null/undefined
  leaves; reused by Stack F/G). Kept config-free so tests can import
  without triggering config.ts's CLI parser.
- src/state.ts: saveState now passes sortedKeysReplacer to JSON.stringify.
  Atomic-write pattern preserved.
- tests/state-key-order.test.ts: pin byte-identical serialization across
  insertion orders, recursion, array preservation, primitive handling,
  idempotence.

Closes improvements.md #17.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
dhruva-reddy force-pushed the dhruva-reddy/feat/push-dry-run branch from d9d9477 to 714523f (May 2, 2026 01:21)
dhruva-reddy force-pushed the dhruva-reddy/feat/validate-command branch from cb8079a to b1f91f7 (May 2, 2026 01:21)
dhruva-reddy force-pushed the dhruva-reddy/feat/push-dry-run branch 2 times, most recently from bf5161c to 87fb394 (May 2, 2026 01:27)
dhruva-reddy force-pushed the dhruva-reddy/feat/validate-command branch from b1f91f7 to 3558d10 (May 2, 2026 01:27)
dhruva-reddy changed the base branch from dhruva-reddy/feat/push-dry-run to graphite-base/17 (May 2, 2026 01:31)
dhruva-reddy force-pushed the dhruva-reddy/feat/validate-command branch from 3558d10 to bcd23de (May 2, 2026 01:31)
graphite-app (bot) changed the base branch from graphite-base/17 to main (May 2, 2026 01:31)
dhruva-reddy force-pushed the dhruva-reddy/feat/validate-command branch from bcd23de to 5cb218f (May 2, 2026 01:31)