feat: surface gate detail in the workflow run/resume --json payload by doquanghuy · Pull Request #2965 · github/spec-kit

doquanghuy · 2026-06-12T17:37:44Z

Description

Reference implementation for #2964 — for discussion, direction welcome.

When a run pauses at a gate, the --json outcome now carries a gate block (step_id / message / options / choice) so orchestrators can detect "human review needed" and present the options without parsing the human-facing stream. Two small pieces:

- The engine records each step's type in the run state's step results (one added line in step_data; previously the type was not recoverable from state).
- _workflow_run_payload adds the gate block via a _gate_outcome helper when the run's current step is a gate. choice populates when the outcome ends at the gate with a decision recorded (e.g. an interactive rejection with on_reject: abort → a failed payload carrying "choice": "reject"; an on_reject: retry pause likewise). A mid-flow approval proceeds past the gate, so the block clears; this is by design. Non-gate runs and runs that end elsewhere are unchanged: no gate key, payload byte-identical to today.
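
For readers wiring this into an orchestrator, the consumer side might look roughly like the sketch below. It relies only on the field names mentioned in this description (`status`, `gate`, `step_id`, `message`, `options`, `choice`); the function name `gate_needing_review` and the sample values (including the option labels) are hypothetical and are not captured CLI output.

```python
import json
from typing import Any, Optional


def gate_needing_review(payload_text: str) -> Optional[dict[str, Any]]:
    """Return the gate block if the run is paused at a gate, else None."""
    payload = json.loads(payload_text)
    gate = payload.get("gate")
    # Non-gate runs carry no "gate" key at all, so absence means nothing to review.
    if payload.get("status") != "paused" or not isinstance(gate, dict):
        return None
    return gate


# Illustrative payload only: these values are assumptions, not real CLI output.
sample = json.dumps({
    "status": "paused",
    "gate": {
        "step_id": "review",
        "message": "Approve the plan?",
        "options": ["approve", "reject"],
        "choice": None,
    },
})

gate = gate_needing_review(sample)
if gate is not None:
    # An orchestrator would present these options to a human reviewer here.
    print(f"{gate['step_id']}: {gate['message']} {gate['options']}")
```

Keying on the presence of `gate` rather than on the text of the human-facing stream is the point of the change: the orchestrator never has to parse stdout meant for people.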

The issue lists alternatives (a generic paused_step block; a dedicated status value) — happy to rework toward either.

Testing

- Ran existing tests with uv sync && uv run pytest: full suite, 3727 passed
- Two new CLI-level tests (TestWorkflowRunGateOutcomeJson): a gate pause carries the exact block (CliRunner stdin is non-TTY, so the gate pauses); a completed run has no gate key. The gate-pause test is red against current main and green with the change (verified in both directions)
- uvx ruff check src/: clean
- Tested locally with uv run specify --help
- Tested with a sample project (covered by the CLI-level tests, which drive a real gate workflow through workflow run --json)

AI Disclosure

- [ ] I did not use AI assistance for this contribution
- [x] I did use AI assistance (describe below)

Code, tests, and this description were authored with AI assistance (Claude); verified by running the repo's test suite and ruff locally in both red and green directions.

A run paused at a gate was indistinguishable from any other pause in the machine-readable outcome, and the gate's prompt/options/choice never left the human-facing stream.

Record each step's type in the run state's step results (one engine line) and, when the run sits at a gate, add a gate block (step_id/message/options/choice) to the payload so orchestrators can drive review gates without parsing stdout.

Reference implementation for the proposal in github#2964.

Addresses github#2964

doquanghuy · 2026-06-12T17:38:44Z

@mnriem when you have a moment, would appreciate your thoughts on the direction here — the issue lists the alternatives considered, and I'm happy to rework toward whichever shape fits Spec Kit best.

Copilot

Pull request overview

This PR extends the workflow CLI’s --json run/resume outcome payload to include structured details when the run is paused at a gate step, enabling external orchestrators to detect “human review needed” without parsing stdout.

Changes:

- Record each executed step’s type into persisted step_results so step types are recoverable from run state.
- Add an optional gate block to the workflow run --json / workflow resume --json payload when the current step is a gate.
- Add CLI-level tests covering a non-interactive gate pause (includes gate block) and a non-gate completed run (no gate key).

Show a summary per file

| File | Description |
| --- | --- |
| `tests/test_workflows.py` | Adds CLI-level tests asserting `--json` includes a structured `gate` block on gate pauses and omits it for a normal completed run. |
| `src/specify_cli/workflows/engine.py` | Persists `type` in each step’s recorded `step_results` entry so step-type introspection is possible from run state. |
| `src/specify_cli/__init__.py` | Builds the `--json` outcome payload and conditionally injects `gate` details via a helper when the current step is a gate. |

Copilot's findings


Files reviewed: 3/3 changed files
Comments generated: 2

+    step = (getattr(state, "step_results", None) or {}).get(state.current_step_id)
+    if not isinstance(step, dict) or step.get("type") != "gate":
+        return None


+        runner = CliRunner()
+        result = runner.invoke(app, ["workflow", "run", str(path), "--json"])
+        return _json.loads(result.stdout)


mnriem · 2026-06-16T20:37:12Z

Please address Copilot feedback

Address review (github#2965): _gate_outcome() emitted a gate block whenever current_step_id pointed at a gate step. Since RunState.current_step_id is never cleared on completion, a completed/failed run whose last step was a gate leaked stale gate detail in run/resume/status --json. Guard on status == paused.

Also assert CLI success in the _run_json test helper before JSON-parsing, and add direct coverage for the suppression guard.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

doquanghuy · 2026-06-17T02:16:49Z

@mnriem Thanks for the review — addressed the Copilot feedback:

- _gate_outcome() now only surfaces the gate block while the run is actually paused (guards on status == paused). Since RunState.current_step_id isn't cleared on completion, a completed/failed run whose last step was a gate no longer leaks stale gate detail in run/resume/status --json.
- Hardened the _run_json test helper to assert CLI success before JSON-parsing, and added direct coverage for the suppression guard.
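
The "assert success before parsing" idea can be illustrated with a standalone analog. This is not the repo's `_run_json` helper (which goes through `CliRunner`, per the diff excerpt above); it is a minimal sketch that uses `subprocess` from the standard library, and `run_json` and `fake_cli` are hypothetical names introduced for illustration.

```python
import json
import subprocess
import sys
from typing import Any, Sequence


def run_json(cmd: Sequence[str]) -> Any:
    """Run a command and parse its stdout as JSON, failing loudly on a non-zero exit."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    # Check the exit code first: parsing the output of a crashed command would
    # otherwise surface as a confusing JSONDecodeError instead of the real failure.
    assert result.returncode == 0, f"exit {result.returncode}: {result.stderr}"
    return json.loads(result.stdout)


# Stand-in command that prints a JSON document; a real test would invoke the CLI.
fake_cli = [
    sys.executable,
    "-c",
    "import json; print(json.dumps({'status': 'paused'}))",
]
payload = run_json(fake_cli)
print(payload["status"])
```

With the exit-code check in place, a broken invocation fails with the command's stderr in the assertion message rather than with a parse error on empty or partial output.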

Full suite green; ruff check src/ clean. Ready for another look.

Copilot

Copilot's findings

Files reviewed: 3/3 changed files
Comments generated: 3

Address Copilot review:

- `_gate_outcome` now also surfaces the gate block when a run is `aborted` by a gate rejection (`on_reject: abort`), not only when `paused`. Abort is the only path that sets ABORTED and it leaves current_step_id on the gate, so an orchestrator can read the recorded `choice` for the stop.
- Coerce `message` to a string (it may be a non-string YAML literal that GateStep only coerces for interpolation) so the JSON schema stays stable.
- Tests: add a CLI-level aborted-path test, a message-coercion test, and extend the suppression test to allow `aborted`; share the run helper via `_invoke_json` to avoid duplicating the invoke boilerplate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

doquanghuy · 2026-06-17T13:46:11Z

@mnriem Pushed 5fd0f85 addressing the latest Copilot round:

- Abort path now surfaces the gate block. _gate_outcome() emits the gate detail for aborted runs too, not only paused. Abort is the only path that sets ABORTED (gate rejection with on_reject: abort) and it leaves current_step_id on that gate, so an orchestrator can read the recorded choice for the stop. completed/failed stay suppressed.
- Stable JSON schema. message is coerced to a string, because GateStep only coerces it for expression interpolation and a non-string YAML literal could otherwise leak into the payload.
- Tests: added a CLI-level aborted-path test (test_gate_abort_carries_gate_block, asserts status == aborted and choice == reject), a message-coercion test, and extended the suppression test to allow aborted. Shared the run helper via _invoke_json to avoid duplicating invoke boilerplate.
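
Taken together, the status guard and the message coercion described in this round can be sketched as follows. This is an illustration under stated assumptions, not the code in `src/specify_cli/__init__.py`: the function name `gate_block` is hypothetical, the status is modeled as a plain string (the real state may use an enum), and the assumption that `message`, `options`, and `choice` live in the step's result entry is mine. Only the two lookup lines mirror the diff excerpt quoted earlier in the thread.

```python
from types import SimpleNamespace
from typing import Any, Optional

# Statuses at which the run is still sitting on the gate, per the thread above.
_GATE_VISIBLE_STATUSES = {"paused", "aborted"}


def gate_block(state: Any) -> Optional[dict[str, Any]]:
    """Build the gate block for a run state, or None when it should be suppressed."""
    # current_step_id is not cleared on completion, so check the status first.
    if getattr(state, "status", None) not in _GATE_VISIBLE_STATUSES:
        return None
    step = (getattr(state, "step_results", None) or {}).get(state.current_step_id)
    if not isinstance(step, dict) or step.get("type") != "gate":
        return None
    message = step.get("message")
    return {
        "step_id": state.current_step_id,
        # A YAML literal such as `message: 42` would otherwise reach the payload as an int.
        # Mapping a missing message to "" is an assumption made for this sketch.
        "message": "" if message is None else str(message),
        "options": list(step.get("options") or []),
        "choice": step.get("choice"),
    }


# Hypothetical step results: a gate whose message was written as a bare number.
steps = {
    "review": {
        "type": "gate",
        "message": 42,
        "options": ["approve", "reject"],
        "choice": "reject",
    }
}
aborted = SimpleNamespace(status="aborted", current_step_id="review", step_results=steps)
completed = SimpleNamespace(status="completed", current_step_id="review", step_results=steps)

print(gate_block(aborted))
print(gate_block(completed))
```

Both states point at the same gate step, yet only the aborted one yields a block, which is the stale-detail case the suppression guard exists to prevent.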

Copilot

Copilot's findings

Files reviewed: 3/3 changed files
Comments generated: 1

+        result = self._invoke_json(tmp_path, monkeypatch, self._WF_GATE)
+        payload = _json.loads(result.stdout)
+        assert payload["status"] == "aborted"


doquanghuy requested a review from mnriem as a code owner June 12, 2026 17:37

mnriem requested a review from Copilot June 16, 2026 13:45

Copilot started reviewing on behalf of mnriem June 16, 2026 13:45

Copilot AI reviewed Jun 16, 2026

mnriem requested a review from Copilot June 17, 2026 12:25

Copilot started reviewing on behalf of mnriem June 17, 2026 12:25

Copilot AI reviewed Jun 17, 2026

Comment thread src/specify_cli/__init__.py Outdated

Comment thread tests/test_workflows.py Outdated

Comment thread tests/test_workflows.py

mnriem requested a review from Copilot June 17, 2026 14:28

Copilot started reviewing on behalf of mnriem June 17, 2026 14:29

Copilot AI reviewed Jun 17, 2026

Comment thread tests/test_workflows.py

Comment on lines +4023 to +4025

result = self._invoke_json(tmp_path, monkeypatch, self._WF_GATE)

payload = _json.loads(result.stdout)

assert payload["status"] == "aborted"

doquanghuy wants to merge 3 commits into github:main from doquanghuy:feat/2964-gate-outcome-json
