feat(run): emit tool_catalog NDJSON event at session start by byapparov · Pull Request #88 · aictrl-dev/cli

byapparov · 2026-06-22T03:49:02Z

Summary

Emits a tool_catalog NDJSON event immediately after session_start and before the first model turn, listing every resolved builtin tool, MCP tool, and skill with version.
Adds MCP.toolEntries() — a lightweight sibling to MCP.tools() that returns {toolKey, serverName}[] without building full AI-SDK tool objects.
Threads version from SKILL.md frontmatter through Skill.Info (new optional field; null in the event when not declared).
New src/cli/cmd/tool-catalog.ts with injectable deps for testing; 9 TDD tests cover builtin/mcp shape, the consumer gate ("tool exposed vs missing"), skill version presence/absence, and tools/skills separation.
EVENTS.md updated with full schema documentation.

Test plan

bun test test/tool-catalog.test.ts — 9/9 pass (RED→GREEN TDD)
bun test full suite — 1294 pass, 0 fail
bun run typecheck (turbo, all packages) — clean

Skill version gap

Skill.Info.version is now threaded through from the SKILL.md frontmatter version field. Skills that don't declare a version emit "version": null. No fabricated versions — if a skill pack needs version tracking, add version: x.y.z to its frontmatter.

Closes #85

Co-Authored-By: Claude Sonnet 4.6 noreply@anthropic.com
Claude-Session: https://claude.ai/code/session_0187MsfK1upr6K2BKVbmaebQ

Emits a single `tool_catalog` event immediately after `session_start` and before the first model turn so the server-side completion gate can structurally verify that required MCP tools (e.g. `record_finding`) were actually exposed to the model — without string-matching prose output. Changes: - `src/cli/cmd/tool-catalog.ts` (new): `buildToolCatalogItems()` collects builtin tool IDs via `ToolRegistry.ids()` and MCP entries via the new `MCP.toolEntries()`, returning `{ tools, skills }` with injectable deps for unit testing. - `src/mcp/index.ts`: adds `MCP.toolEntries()` — lightweight sibling to `tools()` that returns `{ toolKey, serverName }[]` without building full AI-SDK tool objects; used by `buildToolCatalogItems()`. - `src/skill/skill.ts`: extends `Skill.Info` with optional `version` field threaded from SKILL.md frontmatter; `null` when not declared. - `src/cli/cmd/run.ts`: `await buildToolCatalogItems()` + `emit("tool_catalog", ...)` inserted between `session_start` and the first `loop()` call. - `test/tool-catalog.test.ts` (new, TDD): 9 tests covering builtin/mcp tool shape, the "tool exposed vs missing" consumer gate, skill version presence/absence, and skills/tools separation. - `EVENTS.md`: documents the new event schema including the version gap note. Skill version gap: `Skill.Info.version` is now threaded through from frontmatter. Skills without a `version` field in their SKILL.md emit `version: null`. Closes #85 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_0187MsfK1upr6K2BKVbmaebQ

…capability (#88) Instead of a one-off `version` field bolt-on, `Skill.Info` now carries a `metadata: Record<string, string>` field that retains every SKILL.md frontmatter key beyond `name`/`description`, with values string-coerced (handles YAML numbers like `version: 1.0` → `"1.0"`). `version` becomes a derived convenience accessor (`metadata.version`) rather than a parallel extraction path. The dual parse path is gone — one coherent pass builds `metadata`, then derives `version` from it. Changes: - `Skill.Info`: add `metadata: z.record(z.string(), z.string()).default({})`; keep `version: z.string().optional()` as derived accessor - Loader: replace separate `"version" in md.data` extraction with a single `Object.fromEntries(…filter name/description…map String)` pass - Tests: add "non-version frontmatter keys survive in skill.metadata" test (license+author keys survive; name/description excluded; version still resolves via metadata.version) — written RED first, then GREEN `tool_catalog` event payload unchanged — still emits `version` sourced from `metadata.version`. `metadata` not added to event payload (kept lean per spec). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_0187MsfK1upr6K2BKVbmaebQ

aictrl-dev · 2026-06-22T07:58:54Z

Code review

Verdict: Address the major findings before merging. · 🔴 0 · 🟠 2 · 🟡 3 · ⚪ 0 · 0/5 resolved

🟡 packages/cli/src/cli/cmd/run.ts:710 — Catalog built even when output is not JSON
🟠 packages/cli/src/cli/cmd/run.ts:716-718 — Catalog errors swallowed with no logging
🟠 packages/cli/src/cli/cmd/tool-catalog.ts:90 — Catalog builtins are unfiltered, not the dispatched set
🟡 packages/cli/src/mcp/index.ts:670-677 — listTools() runs twice per session start
🟡 packages/cli/src/mcp/index.ts:680-687 — MCP key derivation duplicated in toolEntries() and tools()

🤖 Fix all 5 open findings with your agent

Fix the following code review findings on aictrl-dev/cli PR #88 (head branch).
Run the relevant tests/linters after each change.

1. packages/cli/src/cli/cmd/run.ts:710 — Catalog built even when output is not JSON
   Detail: `emit()` only writes when `args.format === "json"`, but the catalog block unconditionally awaits `buildToolCatalogItems()` — which enumerates every MCP server via `listTools()` and scans the skill filesystem — even in the default human-readable mode where the result is never emitted. For interactive (non-JSON) sessions this adds unnecessary MCP round-trips and startup latency before the first turn.

Guard the collection on `args.format === "json"` (matching how the result is consumed), e.g. `if (args.format === "json") { await buildToolCatalogItems().then(...).catch(...) }`.
2. packages/cli/src/cli/cmd/run.ts:716-718 — Catalog errors swallowed with no logging
   Detail: The `.catch(() => {})` handler silently drops every error from `buildToolCatalogItems()`. The entire purpose of this event (per EVENTS.md) is to let the server-side gate detect "silent success" failure modes — but if catalog collection itself fails (e.g. `Skill.all()` throws on a malformed SKILL.md, `MCP.clients()` rejects, or `ToolRegistry.ids()` fails), the run completes with NO `tool_catalog` event at all and NO signal that anything went wrong. That is a brand-new silent-success path created by the very feature meant to eliminate them.

Because `buildToolCatalogItems` uses `Promise.all` over builtins, MCP, and skills, a single failing source (most likely `Skill.all()` hitting the filesystem) suppresses the whole catalog including the unrelated, healthy tool lists.

Please log the error (at least `log.warn`/`log.error`) inside the catch so operators can diagnose missing catalogs, while still keeping the session alive.
   Suggested fix:         .catch((err) => {
          // Never let catalog collection block or crash the session, but
          // surface the failure so a missing event is diagnosable.
          log.error("failed to emit tool_catalog", { error: err instanceof Error ? err.message : String(err) })
        })
3. packages/cli/src/cli/cmd/tool-catalog.ts:90 — Catalog builtins are unfiltered, not the dispatched set
   Detail: `buildToolCatalogItems()` resolves builtins via `ToolRegistry.ids()`, which returns the instance-level superset with NO model/agent filtering. But `resolveTools` (session/prompt.ts) dispatches via `ToolRegistry.tools({modelID, providerID}, agent, sessionID)`, which applies real per-model filters:

- `apply_patch` is only exposed for gpt-* models, and `edit`/`write` are HIDDEDEN for those same models — they are mutually exclusive at dispatch, yet the catalog will always report BOTH.
- `codesearch`/`websearch` are gated on `providerID === "aictrl" || AICTRL_ENABLE_EXA`, but the catalog always lists them.

EVENTS.md overstates the guarantee: it claims the gate "uses `tools[]` to verify that [tools] were actually in the model's function list at dispatch time." That is true for MCP tools (record_finding, etc.), but NOT for the builtin portion. A consumer that structurally checks for `edit` (or absence of `apply_patch`) on a gpt run will get wrong answers.

The author already notes "instance-level superset, without model-specific filtering" in a code comment — so the fix is either (a) pass model+agent into `buildToolCatalogItems` and call `ToolRegistry.tools(...)` so the catalog matches dispatch, or (b) correct EVENTS.md to scope the claim to MCP tools only.
4. packages/cli/src/mcp/index.ts:670-677 — listTools() runs twice per session start
   Detail: `MCP.toolEntries()` calls `client.listTools()` on every connected client at catalog time (before the first turn). Moments later, the first turn's `resolveTools` → `MCP.tools()` calls `client.listTools()` again on the same clients. For sessions with several MCP servers this doubles the round-trips to each server at session start, and for slow/remote MCP servers it adds latency directly on the critical path before the first model turn (the catalog `await` blocks `loop()`).

Consider memoizing the listTools result on the client/state for the session, or having `tools()` produce the entries so the work happens once.
5. packages/cli/src/mcp/index.ts:680-687 — MCP key derivation duplicated in toolEntries() and tools()
   Detail: `MCP.toolEntries()` re-implements the connected-client iteration and `sanitizedClientName + "_" + sanitizedToolName` key derivation that already lives in `MCP.tools()`. The catalog's value depends entirely on those keys matching what `resolveTools` exposes to the model (which comes from `MCP.tools()`'s keys). The two implementations must stay byte-for-byte in sync; if either sanitization changes, the catalog silently stops matching dispatch and the consumer gate produces false negatives/positives.

Extract a single shared helper (e.g. `mcpToolKey(clientName, toolName)` plus a shared `connectedClients()` iterator) so there is one source of truth, or have `tools()` delegate to `toolEntries()` for keying.

📋 Out-of-diff findings (5)

Sev	Location	Finding
🟡	`packages/cli/src/cli/cmd/run.ts:710`	Catalog built even when output is not JSON
🟠	`packages/cli/src/cli/cmd/run.ts:716-718`	Catalog errors swallowed with no logging
🟠	`packages/cli/src/cli/cmd/tool-catalog.ts:90`	Catalog builtins are unfiltered, not the dispatched set
🟡	`packages/cli/src/mcp/index.ts:670-677`	listTools() runs twice per session start
🟡	`packages/cli/src/mcp/index.ts:680-687`	MCP key derivation duplicated in toolEntries() and tools()

Reviewed 6 files · 0 inline · view all 5 findings ↗

aictrl · AI code review for fast-moving teams · aictrl.dev

- run.ts: guard buildToolCatalogItems on args.format==="json" so MCP listTools + skill scan are skipped for interactive sessions (#88 🟡) - run.ts: log catalog build failures in catch instead of silencing them so a missing tool_catalog event is diagnosable (#88 🟠) - mcp/index.ts: extract mcpToolKey() helper shared by tools() and toolEntries() — single source of truth for key derivation (#88 🟡) - EVENTS.md: document that builtin tools[] reflects the instance-level superset, not the per-model filtered dispatch set (#88 🟠) - test: regression tests for error propagation and key sanitization Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_0187MsfK1upr6K2BKVbmaebQ

byapparov · 2026-06-22T08:06:21Z

Review response — PR #88

All 5 findings triaged. 2 fixed directly, 1 fixed via EVENTS.md correction, 1 fixed by extracting shared key helper (🟡 duplication), 1 deferred (🟡 double-listTools — substantially mitigated by the JSON-mode guard).

Issues addressed (pushed to this PR)

🟠 Catalog errors swallowed with no logging — packages/cli/src/cli/cmd/run.ts:715-718: replaced .catch(() => {}) with .catch((err) => { console.error("failed to emit tool_catalog", ...) }). Error is now surfaced so a missing tool_catalog event is diagnosable, while still keeping the session alive. (commit 1a10f2306)
🟠 Catalog builtins are unfiltered, not the dispatched set — EVENTS.md: option (b) — corrected the documentation to explicitly state that source:"builtin" entries reflect the instance-level superset and that per-model filters (apply_patch/edit/write gating, codesearch/websearch providerID gate) are applied at dispatch time and not reflected in the catalog. The strong structural guarantee now scopes correctly to MCP tools only. (commit 1a10f2306)
🟡 Catalog built even when output is not JSON — packages/cli/src/cli/cmd/run.ts:711: wrapped the entire buildToolCatalogItems() block in if (args.format === "json"). MCP listTools() round-trips and skill filesystem scan no longer run in interactive (non-JSON) sessions. (commit 1a10f2306)
🟡 MCP key derivation duplicated in toolEntries() and tools() — packages/cli/src/mcp/index.ts: extracted mcpToolKey(clientName, toolName) private helper; both tools() and toolEntries() now call it. One source of truth — if the sanitization regex ever changes, both paths update together. (commit 1a10f2306)

Review claims verified false (no change needed)

None — all 5 findings verified TRUE.

Not addressed here

🟡 listTools() runs twice per session start — packages/cli/src/mcp/index.ts:670-677: TRUE but substantially mitigated by the JSON-format guard (fix feat: lean headless flags #3 above). toolEntries() now only runs when args.format==="json", so the double listTools() only affects json-mode sessions. Proper fix (memoizing listTools or having toolEntries derive from tools()) requires more careful coupling of tools()'s error-handling side-effects (s.status[clientName]=failed/delete s.clients[clientName]) with the catalog path. Deferring to a follow-up — the regression impact is now limited to json-mode startup latency.

byapparov self-assigned this Jun 22, 2026

byapparov added this to the Enterprise Observability milestone Jun 22, 2026

aictrl-dev Bot reviewed Jun 22, 2026

View reviewed changes

Comment thread packages/cli/src/cli/cmd/run.ts Outdated

Comment thread packages/cli/src/cli/cmd/tool-catalog.ts

Comment thread packages/cli/src/cli/cmd/run.ts

Comment thread packages/cli/src/mcp/index.ts

Comment thread packages/cli/src/mcp/index.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(run): emit tool_catalog NDJSON event at session start#88

feat(run): emit tool_catalog NDJSON event at session start#88
byapparov wants to merge 3 commits into
mainfrom
feat/tool-catalog-session-start

byapparov commented Jun 22, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

aictrl-dev Bot commented Jun 22, 2026

Uh oh!

byapparov commented Jun 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

byapparov commented Jun 22, 2026

Summary

Test plan

Skill version gap

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

aictrl-dev Bot commented Jun 22, 2026

Code review

Uh oh!

byapparov commented Jun 22, 2026

Review response — PR #88

Issues addressed (pushed to this PR)

Review claims verified false (no change needed)

Not addressed here

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant