This document describes the harness around openboot.dev: the set of controls that catch drift and steer both human and AI contributors toward correct outputs. It is based on Martin Fowler's Harness Engineering for Coding Agents.
If you are adding a new control or reasoning about why an existing one exists, start here.
Agent = Model + Harness. The harness is everything you can change.
We can't change the underlying LLM. We can change what guidance it gets before writing code (feedforward) and what feedback it gets after (feedback). When a class of issue recurs, the right reaction is not "tell the agent again" — it's to encode the rule into the harness so the next agent (or the next refactor by a human) cannot drift the same way.
Two execution flavors:
- Computational — deterministic, fast, free:
eslint,svelte-check,vitest,src/archtest/*. Run on every change. - Inferential — non-deterministic, slower, paid: AI code review,
/security-review,/ultrareview. Run on integration boundaries.
Three regulation categories:
- Maintainability — code style, complexity, dead code.
- Architecture fitness — project-specific invariants (the "do X, not Y" rules in CLAUDE.md).
- Behaviour — does the code actually do the right thing.
| Category | Control | Trigger | File |
|---|---|---|---|
| Maint. | prettier --check |
save / npm run format:check |
.prettierrc |
| Maint. | eslint (ts/svelte) — no-explicit-any, no-unused-vars, no-console, no-restricted-globals (banned: process), Svelte best practices |
npm run lint / .claude/hooks/post-tool-use.sh / CI |
eslint.config.js |
| Maint. | npm audit --audit-level=high (drift) |
informational CI | .github/workflows/harness.yml |
| Maint. | knip dead-code (drift) |
informational CI | .github/workflows/harness.yml |
| Maint. | required-checks alignment (drift) — .github/required-checks.txt ↔ workflow job names |
informational CI | .github/workflows/harness.yml |
| Arch. | db-access-scoping — .prepare() / .exec() / .batch() only in +server.ts or src/lib/server/db/ |
vitest (src/archtest/) |
src/archtest/db-access.test.ts |
| Arch. | no-process-env — Cloudflare Workers uses platform.env, not process.env |
vitest + eslint no-restricted-globals |
src/archtest/env.test.ts + eslint.config.js |
| Arch. | no-console-in-server — server code uses console.error only (no console.log) |
vitest | src/archtest/server-console.test.ts |
| Behav. | svelte-check (TypeScript across .ts and .svelte) |
npm run check / .claude/hooks/stop.sh / CI |
tsconfig.json |
| Behav. | vitest run (unit + smoke) |
npm test / pre-push / CI |
vitest.config.ts |
| Behav. | vitest --coverage → Codecov (informational) |
CI | .github/workflows/ci.yml |
| Behav. | Contract schema validation against openboot-contract |
CI check job + post-deploy |
.github/workflows/ci.yml, .github/workflows/deploy.yml |
| Behav. | Post-deploy health check (/api/health) |
CD deploy job |
.github/workflows/deploy.yml |
| Behav. | Post-deploy smoke test + contract round-trip | CD deploy job |
scripts/smoke-test-api.sh |
| Feedfwd. | Agent conventions | every AI turn | CLAUDE.md, AGENTS.md |
| Feedfwd. | Session-start hook (warm svelte-kit sync) |
every Claude session | .claude/hooks/session-start.sh |
| Feedfwd. | ship-pr skill — push → CI → review → triage → squash → cleanup; no --auto |
model-loaded | .claude/skills/ship-pr/SKILL.md |
| Feedback (agent) | eslint on edited file |
after every Edit/Write/MultiEdit | .claude/hooks/post-tool-use.sh |
| Feedback (agent) | svelte-check + archtest |
end of every Claude turn (if ts/svelte dirty) | .claude/hooks/stop.sh |
| Maint. | eslint on staged diff + prettier --check |
local git pre-commit | scripts/hooks/pre-commit |
| Behav. | npm run validate (lint + check + test + build) |
local git pre-push | scripts/hooks/pre-push |
| Drift loop | Failed harness sensor → open/update GitHub issue | on main / nightly | .github/workflows/drift-to-issue.yml |
| Format | Conventional Commits subject check | push / PR | .github/workflows/conventional-commits.yml |
When you observe a recurring issue, decide where to encode the fix:
| Observation | Encode it as |
|---|---|
"Agent keeps using as any in production code." |
Promote @typescript-eslint/no-explicit-any from warn to error in the relevant files: override. The rule is already wired; only the severity changes. |
| "Agent reads/writes D1 from a non-server file." | Already enforced by src/archtest/db-access.test.ts. Update the test if the allowed-paths list legitimately changes. |
"Agent reaches for process.env on Cloudflare." |
Already enforced by eslint no-restricted-globals + src/archtest/env.test.ts. |
| "Agent introduces a new lint failure that ESLint should have caught." | Enable the relevant rule in eslint.config.js. |
| "Agent breaks behaviour that has no test." | Write the test next to existing ones (src/**/*.test.ts). The pattern is vitest + happy-dom + the helpers in src/lib/test/. |
| "Agent missed a CLAUDE.md rule we keep restating." | Make it a lint rule or an archtest test. A docs rule that doesn't fail is a docs rule that drifts. |
| "Agent guessed at an API contract." | Update openboot-contract fixtures + schemas. CI runs schema validation against them in the check job and after deploy. |
| "Agent's PR description was off." | Tighten .github/pull_request_template.md (if added) or the ship-pr skill. |
| "Drift sensor failed on main but nobody noticed." | Already handled: .github/workflows/drift-to-issue.yml opens/updates a tracking issue per failed sensor. |
Rule of thumb: if you reach for a doc edit, first ask whether a test or lint rule would catch the same drift mechanically. Mechanical wins because it survives doc rot.
ESLint rules currently set to warn (so validate stays green on
existing code). Promote to error once the call sites are cleaned up,
ideally one rule per PR so the diff is reviewable:
| Rule | Why warn today | Promote when |
|---|---|---|
@typescript-eslint/no-explicit-any |
~30 occurrences in Svelte components and routes | Components are typed properly (most are (p: any) => in .map()s with a known shape) |
@typescript-eslint/no-unused-vars |
~20 leftover imports | One sweep with manual review |
svelte/require-each-key |
~10 unkeyed {#each} blocks |
Each route owns its fix |
svelte/no-navigation-without-resolve |
~10 <a href> / goto() without resolve() |
Svelte 5 routing migration |
svelte/prefer-svelte-reactivity |
new Set() instead of SvelteSet in 2 files |
Replace inline |
no-useless-escape |
bash-template strings escape \$ for shell interpolation |
False-positive review — likely stay warn |
- No coverage gate that fails PRs. Codecov is informational
(
fail_ci_if_error: false). Hard coverage gates push toward test-shaped code without raising actual quality. - No husky / lint-staged.
scripts/hooks/symlinked vianpm run install:hooksdoes the same job in ~30 lines, no runtime dep. - No baseline file for archtest rules. Repo is small enough to clean up violations directly when a new rule is added.
- No agent-driven changes to
mainwithout human review. All AI changes go through PR review and the existing CI matrix. - No auto-release / tag automation. Push to
maintriggersci.yml; on successdeploy.ymlfires viaworkflow_runand ships to production. There is no separate release cadence to automate. - No "stale baseline" sensor. N/A while there are no baselines.
If you are reading this as an AI agent: this file tells you where to
add a control, not what to check. The actual checks fire from npm run validate, the .claude/hooks/, and the CI jobs. The most useful
contribution you can make is, when a review reveals a recurring issue,
proposing the row in the table above where the new control belongs — that
is how the harness improves over time.