feat(agent-runtime): honor per-step gate on the coding lane (#301) by krokoko · Pull Request #307 · aws-samples/sample-autonomous-cloud-coding-agents

krokoko · 2026-06-10T16:34:26Z

What this does (plain English)

After a coding agent finishes its work, the platform runs checks (like "does the build still pass?") before opening a pull request. Workflow authors can declare how strict each check is in the workflow YAML:

strict — any failure fails the task.
regression_only — fails only if the agent broke something that was working before.
informational — reported, but never fails the task.

The bug: on coding tasks, that setting was silently ignored — the pipeline used hardcoded regression-only logic instead. You could write gate: strict and a build the agent broke would still pass. The three built-in workflows behaved correctly only by coincidence (their declared gates happened to match the hardcoded behavior).

The fix: the coding path now actually reads and applies the declared gate, using the same single piece of logic the repo-less lane already used — one definition of gate semantics, not two copies that can drift. Even when a check fails the task, the PR is still opened so the agent's work stays visible and reviewable, just marked failed. Tests prove strict/informational now work as declared and that the three shipped workflows behave exactly as before.

Summary

Fixes #301. On the coding lane, steps[].gate was declarative-only: the post-hook block in pipeline.run_task hardcoded regression-only build gating, ignoring each verify_* step's declared gate. This PR makes the coding lane honor the declared gate through the same implementation the repo-less lane uses — the runner's gate_status (formerly _gate_status, now public) — so gate semantics live in exactly one place.

Decision (documented in _apply_post_hook_gates and WORKFLOWS.md §"Step kinds"): the inline path consults step.gate via gate_status rather than routing the post-hooks through the runner's step handlers. Routing through the runner would change failure-path side effects (a gating verify_build with on_failure: fail stops the runner before ensure_pr, stranding committed work with no PR) — that is the broader half-migrated-runner unification #301 explicitly defers. The inline ordering (verify → ensure_pr always runs) is preserved; only the task verdict honors the gate.

Changes

agent/src/workflow/runner.py — _gate_status → gate_status (public; exported from workflow).
agent/src/pipeline.py — new _apply_post_hook_gates(workflow, ...) replaces the hardcoded build_ok block:
- declared verify_build/verify_lint gate per their gate (strict | regression_only | informational; unset = regression_only), only when on_failure: fail (advisory continue/skip_remaining never gate, matching the runner);
- undeclared verify_build keeps legacy regression-only gating; undeclared verify_lint never gates (legacy);
- read_only workflows never gate (legacy);
- workflow reload failure falls back to legacy gating (work is never stranded — preserves PR feat(tasks): workflow-driven tasks — declarative steps + repo-less workflows (#248) #296 finding feat: add Iteration 3e for memory security and integrity (OWASP ASI06) #5 behavior).
agent/tests/test_pipeline_post_hook_gates.py — 29 new tests.
docs/design/WORKFLOWS.md — documents the gate field, the one-implementation rule, and the deferred runner unification (+ synced Starlight mirror).

Acceptance criteria

Coding lane verify_build/verify_lint honor declared gate via the same gate_status path the repo-less lane uses — one implementation, not two.
gate: strict fails the task on a build regression (and even on a pre-existing failure); informational never gates — proven end-to-end through run_task (TestRunTaskHonorsGate).
No behavior change for the three shipped coding workflows — TestShippedWorkflowParity asserts inline == runner == legacy verdicts across the full (build_passed, build_before) matrix for regression_only and read_only cases.
Choice (consult step.gate inline vs route through runner) documented in code and WORKFLOWS.md.

Test plan

mise //agent:quality — 1004 passed (29 new), ruff clean, coverage 76.05%
mise run build — full monorepo build green (CDK 1844 tests, CLI, docs sync + build)

The coding lane's inline post-hook gating now consults each declared verify_build / verify_lint step's `gate` through the runner's gate_status (made public from _gate_status) — one gate implementation for both lanes. A coding workflow declaring `gate: strict` now fails the task on a build failure; `informational` never gates. The three shipped coding workflows keep their exact pre-change effective gating, locked by a parity test matrix (inline == runner == legacy verdicts). The inline ordering is preserved: ensure_pr still runs after a gating verify failure so the agent's work surfaces as a reviewable PR. Refs #301

isadeks · 2026-06-10T17:28:36Z

Code Review: PR #307 — honor per-step `gate` on the coding lane

Verdict: Approve. No blockers, no nits.

This is a clean, well-scoped PR. I reviewed the actual diff (7 files, +512/−51) against main, read the full enclosing functions, ran independent finder passes, and verified the key claims empirically — including running the test suite — rather than trusting the description.

What I checked

Risk area	Verdict
Inline verdict vs. runner verdict parity	The PR's central claim is that `_apply_post_hook_gates` agrees with the runner's `gate_status` + `on_failure` outcome. Traced all `(gate, on_failure, passed, was_passing_before, read_only)` combos — they agree. ✓
No behavior change for the 3 shipped workflows	Confirmed against the real YAMLs (`new-task-v1`/`pr-iteration-v1`: `regression_only`; `pr-review-v1`: `informational` + `read_only`). New code reproduces the old `build_ok = passed or not build_before` exactly across the full `(build_passed, build_before)` matrix. ✓
Removed-behavior re-establishment	The deleted hardcoded block's invariants (read-only never gates, regression-only default, undeclared lint never gates, "already failing before" not counted as a regression) are all re-established. ✓
`workflow=None` fallback	Reload failure → build keeps legacy regression-only, lint never gates; work is never stranded (preserves #296 finding #5). ✓
Cross-file breakage (`_gate_status` → `gate_status` rename)	`grep -rn "_gate_status"` repo-wide: zero leftover references; correctly exported from `workflow/__init__.py`. ✓
`Workflow \| None` annotation under `TYPE_CHECKING`	Safe — `pipeline.py` has `from __future__ import annotations`. ✓
Tests vacuous / wrong patch targets?	Ran them: `63 passed` (29 new + 34 in the runner suite). `TestShippedWorkflowParity` loads the real YAMLs; `TestRunTaskHonorsGate` exercises `run_task` end-to-end and asserts `status == "error"` requires `build_ok == False` — not vacuous. ✓

Quality notes (not findings)

The "consult step.gate inline rather than route post-hooks through the runner" decision is the intentional feat(agent-runtime): honor per-step gate on the coding lane (declarative steps decorative for verify_build/ensure_pr) #301 scope boundary (routing would strand committed work before ensure_pr). It's documented in both the code and WORKFLOWS.md, with the deferred runner-unification rationale called out. Right altitude.
Gate semantics genuinely live in one place now (gate_status), and the diff respects on_failure (advisory continue/skip_remaining don't gate the verdict), matching the runner. No special-casing drift introduced.
Minor: build_ok keeps its name even though it now also folds in lint gating — slightly under-named, but the function name (_apply_post_hook_gates) and call-site comment make intent clear. Not worth changing.

Ship it.

krokoko requested a review from a team as a code owner June 10, 2026 16:34

Merge branch 'main' into feat/301-honor-gate-coding-lane

6aa357e

krokoko requested a review from isadeks June 10, 2026 17:34

krokoko enabled auto-merge June 10, 2026 17:36

isadeks approved these changes Jun 10, 2026

View reviewed changes

krokoko added this pull request to the merge queue Jun 10, 2026

Merged via the queue into main with commit 14820c8 Jun 10, 2026
6 of 7 checks passed

krokoko deleted the feat/301-honor-gate-coding-lane branch June 10, 2026 17:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(agent-runtime): honor per-step gate on the coding lane (#301)#307

feat(agent-runtime): honor per-step gate on the coding lane (#301)#307
krokoko merged 2 commits into
mainfrom
feat/301-honor-gate-coding-lane

krokoko commented Jun 10, 2026 •

edited

Loading

Uh oh!

isadeks commented Jun 10, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

krokoko commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What this does (plain English)

Summary

Changes

Acceptance criteria

Test plan

Uh oh!

isadeks commented Jun 10, 2026

Code Review: PR #307 — honor per-step gate on the coding lane

What I checked

Quality notes (not findings)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

krokoko commented Jun 10, 2026 •

edited

Loading

Code Review: PR #307 — honor per-step `gate` on the coding lane