feat(events): emit 5-way token breakdown + context-window utilization in message_complete by byapparov · Pull Request #87 · aictrl-dev/cli

byapparov · 2026-06-22T03:41:23Z

Summary

5-way token breakdown in message_complete: expands tokens from an opaque info.tokens passthrough to an explicit object — input / output / reasoning / cache.read / cache.write — mirroring upstream LLM.Usage. The data was already captured in MessageV2.Assistant.tokens via StepFinishPart accumulation; this change surfaces all fields explicitly.
Context-window utilization added as context: { used, limit, ratio } where used = input + cache.read, limit comes from Provider.getModel() → model.limit.context (models.dev registry), and ratio = used / limit. Emits null when the model's context limit is not known (unregistered custom endpoint).
cost: 0 trap avoided: info.cost is kept as-is — it accumulates real per-step cost from StepFinishPart. The new upstream step.ended event emits cost: 0 (reconciled later by a projector); we do not touch that path.
EVENTS.md updated with the extended schema, field-by-field documentation, and non-overlap invariant note.

How the cache split was already captured

MessageV2.StepFinishPart (the legacy step-finish message part) already has tokens: { input, output, reasoning, cache: { read, write } }. The assistant message info.tokens is accumulated from these step parts — cache split included. No new provider-level capture was needed; we just stop dropping it in the emit call.

Context limit source

Provider.getModel(providerID, modelID) returns the model record from the models.dev registry, which has model.limit.context. The lookup is wrapped in a .catch(() => null) so an unknown model (custom endpoint) gracefully emits context: null rather than throwing.

Test plan

TDD: wrote failing test in packages/cli/test/cli/usage-token-breakdown.test.ts (8 cases) — confirmed RED before implementation
Implemented changes in packages/cli/src/cli/cmd/run.ts
bun test test/cli/usage-token-breakdown.test.ts — 8/8 GREEN
bun test test/cli/ — 77/77 GREEN (no regressions)
bun run typecheck — clean
Pre-push hook ran bun turbo typecheck across all 5 packages — 6/6 tasks successful

Closes #86

🤖 Generated with Claude Code

https://claude.ai/code/session_0187MsfK1upr6K2BKVbmaebQ

… in message_complete (#86) - Expand `tokens` in `message_complete` from an opaque `info.tokens` passthrough to an explicit object with all 5 fields: input / output / reasoning / cache.read / cache.write — mirroring upstream LLM.Usage shape. The data was already captured in MessageV2.Assistant.tokens via StepFinishPart accumulation; this change surfaces it explicitly. - Add `context: { used, limit, ratio }` to `message_complete`: - `used = input + cache.read` (tokens occupying the context window this turn) - `limit` sourced from Provider.getModel() → model.limit.context (models.dev) - `ratio = used / limit`; emits `null` when limit is unknown (unregistered endpoint) - Cost kept correct: `info.cost` accumulates real per-step cost from StepFinishPart, NOT from the new step.ended event which emits cost:0 (the cost:0 trap). - Update EVENTS.md with the extended schema and field-by-field documentation. - Add TDD test file (RED→GREEN): `test/cli/usage-token-breakdown.test.ts`. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_0187MsfK1upr6K2BKVbmaebQ

aictrl-dev · 2026-06-22T07:56:29Z

Code review

Verdict: Address the major findings before merging. · 🔴 0 · 🟠 1 · 🟡 1 · ⚪ 0 · 0/2 resolved

🟠 packages/cli/src/cli/cmd/run.ts:488-494 — limit:0 yields Infinity ratio, breaks null contract
🟡 packages/cli/test/cli/usage-token-breakdown.test.ts:63 — Tests grep source text instead of running emit path

🤖 Fix all 2 open findings with your agent

Fix the following code review findings on aictrl-dev/cli PR #87 (head branch).
Run the relevant tests/linters after each change.

1. packages/cli/src/cli/cmd/run.ts:488-494 — limit:0 yields Infinity ratio, breaks null contract
   Detail: The guard `contextLimit != null` does not catch the case where the model's context limit is `0`. `Provider.Model.limit.context` is a required `z.number()` (provider.ts:703), and config-defined custom models without a limit default `context` to `0` (provider.ts:929: `context: model.limit?.context ?? existingModel?.limit?.context ?? 0`). For such models `Provider.getModel` succeeds and returns `context: 0`, so `contextLimit` is `0` (not `null`), the guard passes, and `ratio = contextUsed / 0` evaluates to `Infinity` (or `NaN` when `contextUsed` is also `0`). `JSON.stringify` then serializes `Infinity`/`NaN` as `null` *inside* the context object, yielding `{ used: <N>, limit: 0, ratio: null }` instead of the documented top-level `null`. EVENTS.md explicitly promises: "null — emitted when the model's context limit is not known (e.g. unregistered custom endpoint)". This diverges from the documented contract and emits a misleading object for the exact scenario it claims to handle. Fix: also require `contextLimit > 0` before emitting the object.
   Suggested fix: Change the guard from `contextLimit != null` to `contextLimit != null && contextLimit > 0` so a context limit of `0` (config/custom models without a known limit) is treated as "unknown" and emits `null` as documented, avoiding `Infinity`/`NaN` in the ratio.
2. packages/cli/test/cli/usage-token-breakdown.test.ts:63 — Tests grep source text instead of running emit path
   Detail: The new tests assert behavior by reading `run.ts` as a string and matching regexes against the source text rather than exercising the emit path. Several assertions are loose enough to pass even if the logic is wrong:\n\n- `test("context.ratio is used / limit")` (line 63) uses `/ratio.*\\/|\\/.*ratio|used\\s*\\/\\s*limit|contextLimit/`. The `contextLimit` alternative matches the bare variable declaration on line 483, so the test stays green even if `ratio` is changed to a constant (e.g. `ratio: 0`).\n- `test("context.limit is sourced from Provider.getModel context limit")` (line 57) uses `/Provider\\.getModel|limit\\.context|contextLimit/` — any one of those tokens anywhere in the file passes; it does not verify the value is actually wired into `context.limit`.\n\nThese give a false sense of coverage: a refactor that renames a variable or removes the real computation but leaves a stray token would not be caught. Prefer asserting against actual emitted JSON (e.g. stub `Provider.getModel` and capture `process.stdout` writes from `emit("message_complete", …)`) so the 5-way `tokens` and `context` shapes and the `used = input + cache.read` / `ratio = used/limit` computations are verified at runtime.

📋 Out-of-diff findings (2)

Sev	Location	Finding
🟠	`packages/cli/src/cli/cmd/run.ts:488-494`	limit:0 yields Infinity ratio, breaks null contract
🟡	`packages/cli/test/cli/usage-token-breakdown.test.ts:63`	Tests grep source text instead of running emit path

Reviewed 3 files · 0 inline · view all 2 findings ↗

aictrl · AI code review for fast-moving teams · aictrl.dev

…dContextWindow Custom models without a registered context limit default to `limit.context = 0` (provider.ts:929). The old guard `contextLimit != null` passed for 0, causing `ratio = used / 0 = Infinity`, which JSON.stringify serialises as `null` *inside* the context object — diverging from the documented top-level null contract in EVENTS.md ("null — emitted when the model's context limit is not known"). Fix: extract pure helper `buildContextWindow(limit, used)` that returns null when limit is null or <=0. This also makes the computation unit-testable. Replace source-grep tests (which could pass even with wrong logic, per bot review) with 10 behavioural unit tests of `buildContextWindow` covering: null limit, zero limit (🟠 regression case), ratio computation, JSON-serialisability, and the top-level-null contract. Retain slim source-text checks for structural wiring. Fixes review findings from PR #87 (aictrl-dev bot): - 🟠 limit:0 yields Infinity ratio, breaks null contract - 🟡 Tests grep source text instead of running emit path

byapparov · 2026-06-22T08:05:39Z

Review response — PR #87

Triaged 2 findings (🟠 1, 🟡 1). Both verified TRUE; both fixed.

Issues addressed (pushed to this PR)

limit:0 yields Infinity ratio, breaks null contract — packages/cli/src/cli/cmd/run.ts:488: Verified TRUE. provider.ts:929 confirms custom models default context to 0. The old contextLimit != null guard passes for 0, so ratio = used / 0 = Infinity, which JSON.stringify serialises as null inside the object — breaking the documented top-level null contract in EVENTS.md line 113. Fixed by extracting buildContextWindow(limit, used) which returns null when limit == null || limit <= 0. (commit 7bbf2045d)
Tests grep source text instead of running emit path — packages/cli/test/cli/usage-token-breakdown.test.ts:63: Verified TRUE. context.ratio is used / limit matched the bare contextLimit variable declaration, not the computation — the test stayed green with ratio: 0 hardcoded. Fixed by replacing the 5 loose regex tests with 10 behavioural unit tests of the extracted buildContextWindow helper: null limit, zero limit (regression case), ratio math, JSON-serialisability, top-level-null contract. Source-text checks retained only for structural wiring that can't be covered by the pure helper. (commit 7bbf2045d)

Review claims verified false (no change needed)

(none — both findings were genuine)

Not addressed here

(none — all findings fixed)

aictrl-dev Bot reviewed Jun 22, 2026

View reviewed changes

Comment thread packages/cli/src/cli/cmd/run.ts Outdated

Comment thread packages/cli/test/cli/usage-token-breakdown.test.ts Outdated

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(events): emit 5-way token breakdown + context-window utilization in message_complete#87

feat(events): emit 5-way token breakdown + context-window utilization in message_complete#87
byapparov wants to merge 2 commits into
mainfrom
feat/usage-token-breakdown

byapparov commented Jun 22, 2026

Uh oh!

Uh oh!

Uh oh!

aictrl-dev Bot commented Jun 22, 2026

Uh oh!

byapparov commented Jun 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

byapparov commented Jun 22, 2026

Summary

How the cache split was already captured

Context limit source

Test plan

Uh oh!

Uh oh!

Uh oh!

aictrl-dev Bot commented Jun 22, 2026

Code review

Uh oh!

byapparov commented Jun 22, 2026

Review response — PR #87

Issues addressed (pushed to this PR)

Review claims verified false (no change needed)

Not addressed here

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant