feat(review): custom reviews as Agent Skills + whole-file/general findings by backnotprop · Pull Request #955 · backnotprop/plannotator

backnotprop · 2026-06-23T02:10:34Z

What

Two related changes that make code review configurable and lossless.

1. Custom reviews as Agent Skills. Pick a skill and that skill becomes the review. Plannotator reads the skill's SKILL.md body live from your global skill folders (~/.claude/skills, ~/.codex/skills, ~/.agents/skills, XDG) and uses it as the entire review prompt — the default review prompt is dropped for custom reviews, and the user message is trimmed to just the git/PR context. Nothing is copied; skills that carry references//scripts//assets/ are read on demand from where they already live (the agent is pointed at the real folder). No frontmatter parsing: name = folder name, instructions = body with the leading ---…--- block stripped. The built-in default review stays byte-for-byte unchanged.

2. Agent findings can be a line, a whole file, or general — and are never silently dropped. A finding that couldn't be pinned to a diff line used to be filtered out with no trace; now it becomes a whole-file or review-level (general) comment instead. The verdict count is recomputed from what actually rendered, so the card can't claim more than it shows.

UI: a Review picker in the launch panel (Default, your enabled reviews, "+ add a review") and a "General" section in the review sidebar.

Design / invariants

Skill becomes the prompt for custom reviews; default review byte-identical. A short output-contract reminder is appended to custom prompts so a verdict-style skill returns discrete, line-attached findings plus a general summary, instead of one block.
Read live, never copied, never writes your repo. Edit a skill the normal way and the next review uses it.
Curation: ~/.plannotator/review-skills.json ({ version: 1, enabled: [names] }). Global skills only (fork-trust boundary).
Placement inferred from what a finding carries: file + line → line, file only → whole-file, neither → general. Nothing is dropped.
Output schemas are relaxed the strict-safe way (nullable but still required), so codex exec --output-schema and Claude's --json-schema both accept them. A test asserts the strict invariant on both.
Selection is authoritative: a picked review that can't be resolved fails loud instead of quietly running Default; launch is held until the review list has loaded.
Bun/Pi parity via vendor.sh (one shared resolver/composer, vendored to the Pi extension).

Verification

Typecheck clean; 1686 tests pass / 0 fail — covering the placement classifier, the scope-aware validator, transform routing, prompt composition, the strict-schema invariant, and launch-time profile resolution.

Docs

docs/custom-reviews.md — user-facing behavior and migration from the old JSON-profile design (supersedes #913).

A custom review is an Agent Skill, read live from the user's global skill folders and injected as the review's focus. Skills with reference/script files are pointed at where they live, never copied. Backend plus a launch-panel picker. - Loader: discover global skills, strip the SKILL.md body (no frontmatter parse), curate via review-skills.json, map to the existing ResolvedReviewProfile. - Skills with extra files: point the agent at the real skill dir (no copy). - Endpoints: GET /api/agents/review-profiles (enabled), GET /api/agents/skills (all, for the picker), POST /api/agents/review-skills (enable one). - UI: a Review dropdown in the launch panel (Default, enabled reviews, separator, "Add new review") and a type-ahead dialog over every discovered skill. - Replaces the old ~/.plannotator/reviews JSON-profile design. Default review stays byte-identical. Bun and Pi in parity.

When a user picks a custom review skill, its SKILL.md body now fully replaces the default review system prompt instead of being appended as a trailing section, so the picked review actually drives the review. The user message is stripped to git/PR context only. The built-in default review stays byte-for-byte unchanged, and an empty skill body now fails loud instead of silently running the default.

…ently drop a finding A review finding can now be placed three ways — a line, a whole file, or general (review-level). Findings that can't be pinned to a line become whole-file or general comments instead of being filtered out, and the verdict count is recomputed from what actually rendered so the card can't claim more than it shows. Server: relax both agent output schemas so location is optional; route every finding by what it carries via classifyFindingPlacement (no drop); recompute the Claude verdict from rendered findings (mirrored in Pi). UI: a General section in the review sidebar, scope-aware card badges and copy text, and general/file comments carried in the agent feedback and PR body. Validator is scope-aware and stays strict on line findings.

…equired OpenAI strict structured output (codex exec --output-schema) requires every property of an object with additionalProperties:false to appear in `required`; a field is made optional by being nullable, not by being dropped from required. ADR 004 removed code_location/line_range (Codex) from required, which made codex exec fail with a 400 invalid_json_schema before the review ran. Restore them to required and mark them nullable; apply the same nullable pattern to the Claude schema for parity. Add a strict-mode invariant test over both schemas.

The job detail panel rendered every finding as a line finding — general comments showed an empty path with 'L0' and copied ':0', file comments showed a fake line 0. Make AnnotationRow scope-aware (general badge, file path without a line, line as before) and don't open a diff file for a general comment. Extract the shared copy-prefix helper so the sidebar and the panel can't drift.

A custom review skill carries its own methodology but doesn't know Plannotator's output contract, so a verdict-style skill (e.g. interrogate) collapses good, line-locatable findings into one block. Append a short output-contract reminder to custom review prompts: one finding per issue, attach to file+line when about specific code, whole-file or general otherwise, and emit any final verdict as its own general finding in addition to the specific ones. Covers only how to report, never what to look for, and is appended only for custom skills — the default review already states its own contract and stays byte-identical.

…t Default Two ways a picked review could silently become Default: 1. Server: an unresolvable non-default reviewProfileId (renamed/removed skill, stale cookie, malformed request) fell through to the built-in default. Resolution now lives in one shared resolveRequestedReviewProfile() used by both Bun and Pi, which throws a clear error instead of downgrading. The previous F3 guard (skill found but unreadable/empty/oversized) is preserved as one of its branches. 2. Client: a saved custom pick was coerced to Default during the window before /api/agents/review-profiles loaded, so a launch in that instant ran Default. Hold launch for a custom pick until the profile list has loaded; a Default pick never waits. Adds loader tests: unknown id throws, empty-body skill throws, the reserved default id and absent id resolve to default.

Two code-quality fixes from PR review: - launchJob's inner params type was missing reviewProfileId (the public UseAgentJobsReturn interface and callers already had it). Add it so the type matches the contract. - reviewProfileLabel was threaded server→client but never rendered, so it was dead on the wire and its comment ('the UI can show a profile tag') overstated reality. Add the field to CodeAnnotation and render it as a small tag on agent findings in the sidebar and job detail panel, so a custom review's findings show which review produced them.

Replace the Claude/Codex engine dropdown in the launcher with a small two-icon toggle — tap an agent's brand mark to select it. Icons are inlined as components (new AgentIcons.tsx) so the single-file review bundle keeps no external asset dependency: Claude as a vector path, Codex as a base64 data-URI (it only ships as a PNG in the repo). Falls back to a single static icon+label when only one engine is available.

The open-in-app control in the code-review file header showed the app label inline when the header was wide. Drop showLabel so it renders icon-only like the plan/annotate side (DocBadges). The picked app's name still shows in the dropdown.

PR #942 moved the semantic diff from a dockview panel into a resizable sidebar accordion. Put it back as its own dock tab: recover the deleted ReviewSemanticDiffPanel verbatim from before #942, re-register it in the dock component/type maps, restore the App.tsx wiring (panel-open + active tracking, code-nav reference panel, hide-headers, auto-fallback, and the advert handler) and the ReviewStateContext callbacks the panel needs, and re-add the 'Semantic diff' button in the file tree. Remove the sidebar accordion. Shared pieces (semanticDiffShared, /api/semantic-diff, the data hooks) are reused unchanged; #942's other changes are untouched.

Sweep of code this PR orphaned: the .semantic-diff-accordion CSS (the accordion is gone), loadSemanticDiff's now-unnecessary export (the deleted accordion was its only external caller), the unused React import in the generated AgentIcons, and the ResolvedReviewProfile type import in review.ts and the Pi serverReview.ts mirror (orphaned when the inline type annotation gave way to resolveRequestedReviewProfile's inferred return).

backnotprop mentioned this pull request Jun 23, 2026

feat(review): custom review profiles (backend) #913

Closed

backnotprop force-pushed the feat/review-skills branch 4 times, most recently from 6333ec7 to 4a00af5 Compare June 23, 2026 03:54

backnotprop force-pushed the feat/review-skills branch from 4a00af5 to eef2c7c Compare June 23, 2026 04:36

backnotprop added 6 commits June 23, 2026 14:22

backnotprop changed the title ~~feat(review): custom reviews as Agent Skills (backend)~~ feat(review): custom reviews as Agent Skills + whole-file/general findings Jun 23, 2026

backnotprop added 5 commits June 23, 2026 18:22

backnotprop merged commit affeaa0 into main Jun 24, 2026
13 checks passed

backnotprop mentioned this pull request Jun 26, 2026

[Feature-request] Agent-agnostic custom review skill/prompt #897

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(review): custom reviews as Agent Skills + whole-file/general findings#955

feat(review): custom reviews as Agent Skills + whole-file/general findings#955
backnotprop merged 12 commits into
mainfrom
feat/review-skills

backnotprop commented Jun 23, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

backnotprop commented Jun 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Design / invariants

Verification

Docs

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

backnotprop commented Jun 23, 2026 •

edited

Loading