feat(notifications): preview-deploy screenshot pipeline (provider-agnostic) by isadeks · Pull Request #241 · aws-samples/sample-autonomous-cloud-coding-agents

isadeks · 2026-06-02T20:51:40Z

Closes #240.

Summary

New deploy-preview screenshot pipeline. Listens for GitHub deployment_status: success events from any provider (Vercel, Amplify Hosting, Netlify, GitHub Actions custom CD); captures the preview URL via AgentCore Browser; posts a markdown image comment on the open PR.
Optional Linear-side comment: if a Linear identifier is in the PR title/body and Linear is configured (per feat(linear): OAuth migration with per-workspace token storage (Phase 2.0b) #160), the same screenshot lands on the Linear issue.
New bgagent github CLI subcommand: webhook-info + set-webhook-secret.
Provider-agnostic. The repo's worked example uses Vercel because that's what was smoke-tested.

Architecture

Lambda-only post-PR (no LLM). Receiver verifies HMAC, dedupes, async-invokes processor.
AgentCore Browser via SigV4-presigned WSS + Chrome DevTools Protocol — no Playwright in the bundle.
Private S3 + CloudFront with Origin Access Control. Bucket fully private; CloudFront serves images anonymously over HTTPS so GitHub markdown image embeds render without auth.
WAF exemption on /v1/github/webhook for GenericRFI_BODY (deployment_status payloads embed absolute URLs).

Test plan

Full CDK suite green locally: 1810 tests, 101 suites passing
Smoke-tested end-to-end on dev stack: Vercel-connected repo → push commit → preview builds → screenshot lands on PR ~12s after Vercel reports success → also lands on linked Linear issue
Provider-agnostic verified: no Vercel-specific code paths, only deployment_status filtering by configurable SCREENSHOT_TARGET_ENVIRONMENT (default Preview)
CI: build + tests pass on PR

Out of scope (followups)

IAM scoping — currently bedrock-agentcore:* for the processor; tighter action set is followup.
Tests for the screenshot pipeline (mock for AgentCore Browser WSS, GitHub PR-lookup, S3, comment posting).
Deploy-protection bypass tokens for providers that gate previews behind auth.
Multi-workspace Linear lookup prefix-routing.

… yet) Lambda + AgentCore Browser plumbing for capturing screenshots of preview deployments. Provider-agnostic — listens for GitHub deployment_status events from any source (Vercel, Amplify Hosting, Netlify, GitHub Actions custom CD). This commit lands the handler / construct code only. Stack wiring follows in the next commit.

- New `GitHubScreenshotIntegration` construct (mirrors `LinearIntegration`): bundles the screenshot bucket, dedup table, signing-secret placeholder, receiver Lambda, processor Lambda, and the API Gateway route. cdk-nag suppressions added inline (HMAC auth instead of Cognito; AgentCore Browser sessions have no per-resource ARN; Secrets Manager rotation is owned by GitHub). - Wired into `agent.ts` after the LinearIntegration block. Reuses the existing `githubTokenSecret` (the processor uses ABCA's main GitHub token to look up which PR a deploy SHA belongs to and post the screenshot comment — no new credential). - Three new stack outputs: `GitHubWebhookUrl`, `GitHubWebhookSecretArn`, `ScreenshotBucketName`. - Bumped agent.test.ts table count from 13 to 14 to account for the new dedup table. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ot bucket cdk-nag's S2 fires on any bucket that has `blockPublicPolicy: false` even when the policy is intentionally permissive. Add the suppression with the same rationale as S1/S5 — public reads are required by GitHub Markdown renderers and Linear `imageUploadFromUrl`, and the read grant is prefix-scoped to `screenshots/*`. Caught when the first deploy attempt aborted at synth-time on the new GitHubScreenshotIntegration construct. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The first deploy attempt failed at CFN-execute time on the bucket policy: s3:PutBucketPolicy ... because public policies are prevented by the BlockPublicPolicy setting in S3 Block Public Access. Account-level Block Public Access is on for this AWS account, which overrides per-bucket BPA settings. Disabling it would change the security posture of the whole account, so route around the constraint with the AWS-recommended pattern: private S3 + CloudFront with Origin Access Control. Changes: - `ScreenshotBucket` is now `BLOCK_ALL` BPA, no public bucket policy. Adds a `cloudfront.Distribution` whose origin is the bucket via `S3BucketOrigin.withOriginAccessControl`. The distribution policy is scoped to the CloudFront service principal only, so account-level BPA accepts it. - Processor reads `SCREENSHOT_PUBLIC_HOST` (the CloudFront domain) instead of building an S3 URL. PR comments now embed `https://<dist>.cloudfront.net/screenshots/...` URLs. - New stack output `ScreenshotCloudFrontDomain`. - Bucket-level S2/S5 suppressions removed (no longer applicable — bucket is private). Distribution gets CFR1/CFR2/CFR3/CFR4/CFR7 suppressions with rationales. Heads up on deploy time: CloudFront distributions take 5-15 min to provision on first create. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The CommonRuleSet was 403'ing GitHub deployment_status webhooks before the request reached our Lambda — the deployment payload contains absolute Vercel preview URLs in the body, which trips GenericRFI_BODY. Mirror the Linear webhook exemption: the GitHub webhook path is HMAC-verified in the Lambda, parsed as strict JSON, never interpolated into SQL/HTML, and rate-limited by the priority-3 rule. CRS still applies to every other route. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…loyment GitHub's `deployment_status` webhook puts the deployed URL on the *status* object, not the deployment itself. The deployment object is immutable per (sha, environment); the status changes through the deploy lifecycle (`pending` → `success`) and carries the URL only once the deploy finishes. Symptom: receiver kept short-circuiting `success` events from Vercel with `{ok: true, skipped_no_url: true}` because we read the wrong field. Verified by inspecting the webhook delivery payload via `gh api .../deliveries/<id> --jq .request.payload.deployment_status` — URL was there all along. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…dshake Node 24's global WebSocket (from undici) does NOT support arbitrary HTTP headers on the upgrade request — passing them as the second arg gets silently ignored. AgentCore Browser's WSS handshake requires SigV4-signed Authorization + X-Amz-* headers, so the connection was opening but then getting rejected, which surfaced as an empty `error` event ("AgentCore Browser WebSocket error: "). Switch to the `ws` package which natively supports `options.headers`. Also add an `unexpected-response` handler so HTTP-level handshake failures (403, 400) surface with status codes instead of empty errors. Smoke verified locally — the ws-based path opens cleanly against example.com and Vercel preview URLs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Lambda runtime returned a 403 on the WSS upgrade despite well-formed SigV4 headers — `ws` rewrites the Host header during the upgrade GET, which invalidates the canonical-request signature we computed against the original Host. This works locally because Node's tooling on macOS keeps the original Host through the handshake, but the Lambda runtime's TLS stack normalizes differently. Switch to query-parameter SigV4 (presigned URL): SignatureV4.presign returns a wss://...?X-Amz-Algorithm=...&X-Amz-Signature=... URL where the auth lives in the URL itself, so any Host-header rewriting downstream doesn't break the signature. Smoke verified locally — presigned URL connects cleanly to AgentCore Browser and the screenshot pipeline runs end-to-end (6.3s, valid PNG, captures example.com correctly). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The minimal IAM I shipped earlier (`StartBrowserSession`, `StopBrowserSession`, `GetBrowserSession`, `UpdateBrowserStream`) wasn't enough — the WSS automation-stream connect requires an additional `ConnectBrowserAutomationStream`-flavored action that isn't in the public CLI command list. Lambda invocations were opening sessions cleanly but 403'ing on the WSS upgrade. Widen to `bedrock-agentcore:*` to unblock the e2e flow. Followup: scope back down to the specific connect action once it's documented or surfaced via CloudTrail decoded-message-on-deny. Smoke verified: PR #1 on isadeks/vercel-abca-linear now receives a screenshot comment within ~7s of the deployment_status webhook. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Extends the screenshot processor to find a Linear issue via the PR's title/body and post the same image comment there. Approach (no GSI write-back needed): - Regex-extract Linear identifier (e.g. `ABCA-42`) from PR title/body. These are present whether the agent put them there (`task_description` carries the identifier) or Linear's own GitHub integration auto-injected the back-reference on PR open. - Scan `LinearWorkspaceRegistryTable` for `status=active` workspaces. Per-workspace, query Linear's `issueVcsBranchSearch` (which accepts the human-readable identifier) and accept the first exact-match hit. - Post the markdown image comment via the existing `postIssueComment` helper from Phase 2.0b. The Linear post is best-effort — if the registry table isn't wired, the identifier doesn't extract, or the lookup misses, the GitHub PR comment still lands. New env var `LINEAR_WORKSPACE_REGISTRY_TABLE_NAME` is optional on the processor; the construct only sets it when the prop is provided. CDK: `GitHubScreenshotIntegrationProps` gains an optional `linearWorkspaceRegistryTable`. When provided, the processor's IAM grows: ReadData on the registry, GetSecretValue+PutSecretValue on `bgagent-linear-oauth-*`. `agent.ts` wires `linearIntegration.workspaceRegistryTable` into the screenshot construct. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Some providers (Vercel, Netlify) post deployment_status faster than the agent can run `gh pr create`. Retry the GitHub PR-lookup with backoff so the screenshot finds the open PR rather than dropping the event when the timing is reversed.

…issue comment spam Move the trigger-label check ahead of every user-facing comment path in the Linear webhook processor, and switch the default trigger label from 'bgagent' to 'abca'. An unlabeled issue is now a true no-op: no comment, no reaction, no createTaskCore, no DDB writes — regardless of whether the project is onboarded. Why: workspace webhooks fire workspace-wide. A single un-onboarded team in the same Linear workspace produced 47 identical "❌ project isn't onboarded" comments on GRO-783 in 5 minutes because every Issue event (create/update/label-change) hit the not-onboarded gate before the label gate. With the gate order flipped, only issues that explicitly opt in via the trigger label can ever generate user-facing feedback. Per-project label_filter override is still respected — the project mapping lookup now happens once, before the label gate, instead of after. Tests: two new regression tests pin the spam scenario (unlabeled issue in a non-onboarded project, and unlabeled issue with no projectId) to zero side effects. Full CDK suite (89 suites / 1572 tests) passes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds the operator walkthrough for wiring up the AgentCore-Browser preview-deploy screenshot pipeline.

Mirrors the Linear webhook-info pattern so docs and onboarding don't have to embed stack-specific URLs or copy-paste aws CLI invocations. Two subcommands: - `webhook-info` — read-only. Reads GitHubWebhookUrl + GitHubWebhookSecretArn from the CFN stack outputs and prints values to paste into a GitHub repo's webhook config (Settings → Webhooks → Add webhook). Includes the event-type ('Deployment statuses') and content-type guidance that operators consistently miss. - `set-webhook-secret` — interactive PutSecretValue against the stack output ARN. Replaces the cargo-cult `aws secretsmanager put-secret- value` operators were copy-pasting from the screenshot setup notes. Warns before overwriting an existing real secret (heuristic: a CDK- seeded JSON placeholder starts with `{`; a real GitHub secret won't). No CDK changes — both stack outputs were already there. Pure CLI add.

The pipeline was always provider-agnostic — it listens for GitHub deployment_status events, which Vercel, AWS Amplify, Netlify, and any GitHub-Actions-driven CD pipeline all post. Code comments, inline strings, and the setup guide referenced Vercel as if it were the only supported path; this commit aligns the surfacing with what the code actually does. Code: - Linear comment body: "after the Vercel preview deploy finished" → "after the deploy finished" (the GitHub PR comment already said this; just the Linear path was inconsistent) - Webhook receiver doc-comment + envelope interface comment: drop Vercel-only language; explain that the `environment` filter (`SCREENSHOT_TARGET_ENVIRONMENT` env var) is configurable per- provider, with a table of common values - Processor PR-race comment: explain that the gap is also seen on Netlify/Amplify, not unique to Vercel - AgentCore Browser comment: drop Vercel-specific phrasing on "what we don't try to be clever about" - GitHubScreenshotIntegration construct prop docstring: explain the per-provider env-name conventions Docs: - Rename VERCEL_SETUP_GUIDE.md → DEPLOY_PREVIEW_SCREENSHOTS_GUIDE.md - Lead with a "works with any provider that posts deployment_status" table (Vercel / Amplify / Netlify / GitHub Actions custom CD, with "out-of-the-box?" yes/no per provider) - Keep Vercel as the worked example since it's what we smoke-tested, but add a "skip Steps 1-2" callout for non-Vercel providers - New "Configuring for non-Vercel providers" section with the SCREENSHOT_TARGET_ENVIRONMENT override pointer - Replace 4a/4b's CFN-output spelunking with `bgagent github webhook-info` + `bgagent github set-webhook-secret` (commands shipped in 1c1b618) - Troubleshooting: mention that 401 "Invalid signature" is the set-webhook-secret-mismatch case - Sync registration: register as DEPLOY_PREVIEW_SCREENSHOTS_GUIDE in sync-starlight.mjs route map + the explicit mirror call; added to astro.config.mjs sidebar after the PAK runbook No CDK structural changes — the construct prop, env-var, and code behaviour were already provider-agnostic. Pure surfacing fix.

…eamble Step 3 (repo onboarding + Linear project mapping) duplicated work the Prerequisites section already establishes ('Linear OAuth installed for at least one workspace'). If the user followed the Linear setup guide, both are done. If they didn't, Step 4's smoke test fails fast and the troubleshooting routes them back. Net: 30 lines of doc gone, no information lost. Renumbered Step 4 → 3 and Step 5 → 4 (and the 4a/b/c → 3a/b/c sub-steps). Also dropped the 'demo configuration optimizes for "look, it works" rather than security posture' framing on the production-hardening section. The list of followups stands on its own; the framing reads as condescending toward someone reaching the bottom of the guide.

… state Public docs that say 'followup' read as commitments to do that work. Reframe gaps as current limitations with neutral language: - 'Production hardening (followups)' → 'Production hardening considerations'; bullets describe what to think about, not what ABCA promises to ship - Netlify table row: 'followup to support pattern matching' → '⚠ workable today only by picking one specific PR's environment string; broader pattern matching isn't shipped' - Vercel auth callout: 'tracked as a followup' → 'currently not implemented' - Non-Vercel providers table: drop 'followup aws-samples#96 covers prefix routing' reference (issue numbers don't belong in user-facing docs) Net: same information, no implicit roadmap commitments.

The screenshot pipeline only needs GitHub. Linear-side posting was phrased as a hard requirement throughout the guide because the demo flow happens to use Linear, but a non-Linear team gets a perfectly useful integration: screenshots land on GitHub PRs, the Linear lookup silently no-ops. Reframings: - Lead-in: 'on both the open GitHub PR AND the linked Linear issue' → 'on the open GitHub PR. If you also have Linear configured, the same screenshot is posted to the linked Linear issue as a bonus.' Plus a note on the gating (LinearWorkspaceRegistryTable having active rows is what flips the Linear path on). - 'How it works': step 4 (Linear post) marked optional with the silent-skip behaviour spelled out - Architecture comment: 'GitHub PR comment + Linear issue comment' → '... (+ Linear issue comment if linked)' - Prerequisites: Linear OAuth marked optional with rationale - Smoke test: rewritten as PR-driven by default ('open any PR on the configured repo'), with Linear-driven path as a follow-on paragraph ('If you also have Linear configured...') - Troubleshooting: 'Linear is best-effort' → 'opt-in and best- effort', explicit note that skipping is normal without Linear

GitHub PR comment now reads 'From [preview link](url)' and Linear comment reads '[Preview link](url)' instead of pasting the bare URL. Cleaner visual when the same comment is posted on both surfaces.

Closes the doc gaps from the screenshot feature followup list: - USER_GUIDE.md: new 'Preview-deploy screenshots (optional)' subsection under Notifications, points at DEPLOY_PREVIEW_SCREENSHOTS_GUIDE.md. - COST_MODEL.md: 'Optional: deploy-preview screenshots' table covering AgentCore Browser session, Lambda processor, S3, CloudFront line items (~$0.01 per screenshot, dominated by Browser session time). - ROADMAP.md: marks the feature shipped under Notification plane with a one-line description of the trigger model and post-deploy latency. Mirrors regenerated via docs/scripts/sync-starlight.mjs.

The 'Inviting teammates' section was missing the prerequisite that the teammate needs their own ABCA account (Cognito user + configured CLI) before they can redeem a Linear invite code. New flow walks through: Admin: invite-user (Cognito) → invite-user <slug> (Linear) Teammate: configure --from-bundle → login → linear link <code> with cross-references to USER_GUIDE.md's 'Joining an existing deployment' for the Cognito-side details. Also corrects the stale 'auto-links the person running the wizard' claim — setup now offers an inline picker (opt-in by admin), not an automatic mapping.

Last batch of stale 'Vercel' framing in CLI command output, missed in the original de-Vercel-ize sweep. Provider-agnostic now: webhook-info header reads 'preview-deploy screenshot pipeline', the closing note lists Vercel/Amplify/Netlify/GitHub Actions as example providers, and the smoke-test instruction says 'push to a PR-attached branch' rather than 'trigger a Vercel preview deploy'. No behaviour change; pure copy.

…mutation) The local build's eslint --fix step rewrote a no-interpolation template literal to single quotes; CI's 'Fail build on mutation' guard caught that the mutation wasn't committed. Apply the fix.

krokoko

Review — Preview-deploy screenshot pipeline

Thanks for this — it's a really well-built feature. The receiver/processor topology faithfully mirrors LinearIntegration, the verify module follows linear-verify.ts, the private-S3 + CloudFront-OAC design is the correct answer to account-level Block Public Access, and the SigV4-presigned-WSS handshake is nicely reasoned (the commit trail documenting why you landed there is genuinely helpful). The docs are thorough and the Starlight mirrors are regenerated and committed. A few things I'd like to see addressed before merge, then some non-blocking notes.

Requesting changes on

1. Tests on the new security boundary. POST /v1/github/webhook is authorizationType: NONE, so the in-Lambda HMAC is the entire authentication path — and it currently ships with no unit tests. The repo already has copy-pasteable templates for exactly this shape (slack-verify.test.ts, linear-webhook.test.ts). The load-bearing branches I'd want pinned:

verifyGitHubSignature — the timingSafeEqual unequal-length throw guard, the sha256= prefix check, and the rotation re-fetch in verifyGitHubRequest.
github-webhook.ts receiver — state/environment filter gates, and the dedup ConditionalCheckFailedException + rollback-DeleteCommand path.
extractLinearIdentifier — pure function, trivial to test, and the g-flag lastIndex reset is the kind of thing that breaks silently on back-to-back calls.

The browser/CDP plumbing in agentcore-browser.ts is a fair followup (it's smoke-tested), but the HMAC + receiver routing + regex are ~2–3h with the existing templates. Worth noting the coverage gate added in b277cc6 may reject this as-is.

2. Unrelated default-label change (bgagent → abca). linear-webhook-processor.ts:34 flips DEFAULT_LABEL_FILTER, but cli/src/commands/linear.ts:50, the mapping-table doc, and LINEAR_SETUP_GUIDE.md:89 ("Default trigger label is bgagent") still say bgagent. Any onboarded project whose row lacks an explicit label_filter would silently switch trigger label on deploy. Could we either split this out of the screenshot PR, or align all four sites and add a migration note? (The label-gate reorder itself looks correct and the two new regression tests are great — it's just the default-value change + drift.)

3. Processor failures are invisible. WebhookProcessorFn is invoked InvocationType: 'Event' with no DLQ / onFailure / Errors alarm, and every failure path logs-and-returns. Since the receiver already 200'd, GitHub won't redeliver — so a systemic break (IAM, AgentCore quota, token rotation) stops 100% of screenshots with no signal anywhere. Could we add an SQS DLQ + a CloudWatch alarm on processor Errors?

Non-blocking suggestions

WAF rationale vs. code. /v1/github/webhook is added to the priority-1 scope-down, which excludes only SizeRestrictions_BODY — so GenericRFI_BODY is still evaluated on the path, which doesn't match the commit message ("trips GenericRFI_BODY"). Since the smoke test passed, the real blocker may have been body-size rather than RFI. Worth reconciling the comment/commit with what's exempted (and the troubleshooting doc, which tells operators the path is RFI-exempt).
Error/login-page screenshots. Page.navigate only checks errorText; an HTTP 4xx/5xx or auth wall navigates "successfully" and the 404/login PNG gets posted as if it were the app (the guide acknowledges this). Enabling the Network domain and skipping the post when the main-document status isn't 2xx would avoid posting confidently-wrong output.
WebSocket leak on open-timeout (agentcore-browser.ts:1469-1494): the open promise rejects without ws.close()/terminate(), and the throw escapes before the try/finally, leaving a dangling socket per failed attempt. Wrapping the open in the same finally would tidy this up.
Unguarded res.json() (github-webhook-processor.ts:854): a 2xx non-array/HTML body throws, and since findPullRequestForShaWithRetry has no try/catch, it crashes the (un-DLQ'd) processor. An Array.isArray guard returning null keeps the function's existing contract.
SSRF note: environment_url flows from the payload straight into Page.navigate. It's HMAC-gated and runs in the managed AgentCore session (not the Lambda VPC), so blast radius is bounded — but a scheme/host allowlist before navigating would be a cheap hardening follow-up.
bedrock-agentcore:* on * is broader than the project precedent (task-orchestrator.ts scopes specific actions). Acknowledged in the guide as a known gap — worth a tracked follow-up issue.
Stale strings: a couple of CfnOutput descriptions still say "Vercel-preview", and the construct doc-comment still calls it a "Public-read screenshot S3 bucket" though it's now private + OAC.

Really nice work overall — the bulk of this is the test coverage on the auth boundary and making failures observable; the rest is polish. Happy to pair on the test scaffolding if useful.

codecov-commenter · 2026-06-02T23:49:58Z

⚠️ Please install the to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 90.99284% with 88 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@14820c8). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
cdk/src/handlers/shared/linear-issue-lookup.ts	53.47%	87 Missing ⚠️
...dk/src/constructs/github-screenshot-integration.ts	99.65%	1 Missing ⚠️
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #241   +/-   ##
=======================================
  Coverage        ?   86.47%           
=======================================
  Files           ?      181           
  Lines           ?    41764           
  Branches        ?     4157           
=======================================
  Hits            ?    36116           
  Misses          ?     5648           
  Partials        ?        0

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

…mention Closes aws-samples#94 (the existing 'scope IAM down from bedrock-agentcore:*' followup task). Addresses krokoko's PR aws-samples#241 review: 1. (BLOCKING per review #1) IAM action wildcard — narrow bedrock-agentcore:* to the three calls the screenshot processor actually makes: - StartBrowserSession (control plane, public CLI command) - StopBrowserSession (control plane, public CLI command) - ConnectBrowserAutomationStream (SigV4-presigned WSS dial; not in the public CLI list but verified live against the deployed dev stack — IAM accepts the action name) Resource wildcard remains because AgentCore Browser sessions are ephemeral with no stable ARN; the IAM5 suppression on the construct already documents that. Previous behaviour granted every AgentCore action surface (memory, runtime, gateway, identity, code-interpreter) the screenshot path doesn't use. Tightening to the call set leaves a precise audit surface; if a future API change needs another action, IAM denies with the action name in CloudTrail and we add it explicitly. 2. (NIT per review aws-samples#7) Stale 'Vercel' wording on ScreenshotBucketName CfnOutput description, plus an adjacent comment in agent.ts that said 'Vercel-style preview deploys'. Both replaced with provider-agnostic phrasing — the pipeline listens for any provider that posts deployment_status (Vercel, Amplify, Netlify, GitHub Actions custom CD). No behavioural change in either fix.

Two non-blocking review nits: - agentcore-browser.ts: failure paths in the WS open-promise (error, unexpected-response, timeout) now call ws.terminate() so a hung handshake doesn't leak the underlying TCP connection per failed attempt. - github-webhook-processor.ts: findPullRequestForSha now guards the GitHub commit-pulls JSON parse and the Array.isArray contract. A transient 2xx HTML body or malformed payload would have crashed the un-DLQ'd processor; treat non-array as no-PR.

isadeks · 2026-06-08T15:06:06Z

@krokoko thanks for the review. Status by item.

Shipped on this branch

bedrock-agentcore:* IAM scope-down (e791e62) — narrowed to StartBrowserSession, StopBrowserSession, ConnectBrowserAutomationStream
Cosmetic Vercel CfnOutput (e791e62)
WebSocket leak on open-timeout (be7b527) — error, unexpected-response, and timeout paths in the open-promise now call ws.terminate() before reject
findPullRequestForSha JSON parse + Array.isArray guard (be7b527) — non-array body no longer crashes the un-DLQ'd processor
WAF rationale reconciled (873ecea) — CloudWatch BlockedRequests metric for TaskApiWebAcl shows SizeRestrictions_BODY fired 2x on 2026-05-21 (matching commit 36e8d14's smoke-test window 1:1); GenericRFI_BODY has never fired on this WebACL. Code was correct (only excludes SizeRestrictions_BODY); just the inline comment + DEPLOY_PREVIEW_SCREENSHOTS_GUIDE.md rationale were wrong. Also fixed the doc wording that overstated the scope (was "excluded from the CommonRuleSet" → now "exempted from SizeRestrictions_BODY; LFI/RFI/XSS/SQLi still evaluate"). Starlight mirror re-synced.
DEFAULT_LABEL_FILTER reverted to bgagent (8a6fc86) — the bgagent → abca rename is unrelated to this PR; fix(linear): split bgagent → abca default-label change out of PR #241 + align CLI/docs #285 owns landing it across all four sites (processor / CLI / setup guide / mapping doc) in its own PR. Test fixture updated to match the restored default.

Shipped in stacked PRs

Tests on the security boundary → test(screenshot): cover the screenshot pipeline #275 (53 tests). Covers verifyGitHubSignature (timingSafeEqual unequal-length guard, sha256= prefix check, rotation re-fetch in verifyGitHubRequest), receiver state/environment filters and dedup ConditionalCheckFailedException + rollback DeleteCommand path, regex extract including the g-flag lastIndex reset on back-to-back calls, processor happy/failure paths with CloudFront-host URL assertion, and the AgentCore Browser CDP exchange (Stop invoked in finally even on error, 403 unexpected-response surfaced, navigate errorText raised). 385/385 across 16 suites.
Bonus → feat(linear): prefix-route multi-workspace issue lookup by team key #273 (multi-workspace prefix-routing for the findLinearIssueByIdentifier scan). Smoke-validated on dev with two real workspaces — ABCA-78 from a PR title routed directly to the matching workspace on the first try (single GraphQL call, not N).

Tracked as follow-up issues (won't hold this PR)

DLQ + alarm on processor → feat(infra): add SQS DLQ + CloudWatch alarm to GitHub webhook processor #284 (blocking item Docs: user guide hard-codes us-east-1 #3 — observability gap; better as its own PR with cdk-nag scrubs and a smoke test of the alarm firing)
Default-label split (bgagent → abca) + four-site alignment → fix(linear): split bgagent → abca default-label change out of PR #241 + align CLI/docs #285 (blocking item Docs: specify that you can't use the agent with the canned repo #2 — needs a migration note for projects relying on the default; cleaner as its own PR)
SSRF allowlist on environment_url → feat(security): scheme + host allowlist on screenshot processor's environment_url #286 (non-blocking — defense-in-depth; HMAC-gated path means it's not load-bearing on this PR)
Skip screenshot when preview URL returns non-2xx → feat(screenshot): skip screenshot when preview URL returns non-2xx #287 (non-blocking — Network domain status check; out of scope for the receiver/processor wiring under review here)

Re-requesting review.

CloudWatch BlockedRequests metric for TaskApiWebAcl (us-east-1) shows SizeRestrictions_BODY fired 2x on 2026-05-21, matching commit 36e8d14's smoke-test window 1:1. GenericRFI_BODY has never fired on this WebACL — the original commit message + scope-down comment + screenshots guide blamed RFI when the actual blocker was the 8 KB body-size limit (the deployment_status payload carries workflow run history + deploy URLs + deployment metadata). The code is correct as written (only excludes SizeRestrictions_BODY); only the rationale was wrong. Updated the inline comment in task-api.ts and the WAF-exemption section of the screenshots guide; the guide wording also overstated the scope ('excluded from the CommonRuleSet') when only one CRS rule is exempted, so reworded to make clear LFI/RFI/XSS/SQLi still evaluate. Starlight mirror re-synced.

…o screenshot pipeline Per krokoko PR-241 review item #2: the bgagent->abca rename is unrelated to the screenshot pipeline and was bundled into this PR by accident. Issue aws-samples#285 owns landing the rename in its own PR with all four sites (processor / CLI / setup guide / mapping doc) aligned in one commit. Restores DEFAULT_LABEL_FILTER to 'bgagent' so PR aws-samples#241 only carries the screenshot pipeline. Test fixture in linear-webhook-processor.test.ts updated from 'lbl-abca'/'abca' to 'lbl-bgagent'/'bgagent' to match the restored default; behavior under test is unchanged.

Closes #97. Adds 53 jest tests across the four screenshot files that landed with PR #241 + #273 with no existing coverage: - github-webhook-verify.test.ts (14): SHA256 sign/verify, sm cache TTL + forceRefresh, ResourceNotFound, transparent re-fetch on signature mismatch / null fresh. - github-webhook.test.ts (15): missing body/sig, ping ack, non-deploy events ignored, malformed JSON, state/environment filters, SCREENSHOT_TARGET_ENVIRONMENT override, missing fields, dedup hit, happy path, rollback-on-invoke-failure, non-condition DDB error. - linear-issue-lookup.test.ts (18): regex covers extract / multi / bounds / case-sensitivity, prefix-routing happy path, case-insensitive prefix match, fallback for legacy rows + post-prefix-miss, null token skip, fuzzy-match guard, GraphQL errors / non-2xx / network failure. - github-webhook-processor.test.ts (15): empty / malformed body, missing fields, token resolve failure, PR retry exhaustion, OPEN-only filter, happy path with CloudFront-host URL assertion, screenshot/S3/comment failure modes (each non-fatal where appropriate), Linear branch fires / falls back to body / skips on no-id / no-resolve / non-fatal post. - agentcore-browser.test.ts (6): StartBrowserSession failures, full CDP exchange (Target.getTargets -> attach -> enable -> navigate -> loadEventFired -> captureScreenshot) returning PNG bytes, Stop invoked in finally even on CDP error, Stop's own failure logged not thrown, 403 unexpected-response surfaced, navigate errorText raised. All tests use jest mocks for AWS SDK clients + an in-test FakeWebSocket for the CDP stream so they run hermetically without real AWS or network. Existing 286/286 handler tests still pass.

…screenshots # Conflicts: # cdk/package.json

Closes #97. Adds 53 jest tests across the four screenshot files that landed with PR #241 + #273 with no existing coverage: - github-webhook-verify.test.ts (14): SHA256 sign/verify, sm cache TTL + forceRefresh, ResourceNotFound, transparent re-fetch on signature mismatch / null fresh. - github-webhook.test.ts (15): missing body/sig, ping ack, non-deploy events ignored, malformed JSON, state/environment filters, SCREENSHOT_TARGET_ENVIRONMENT override, missing fields, dedup hit, happy path, rollback-on-invoke-failure, non-condition DDB error. - linear-issue-lookup.test.ts (18): regex covers extract / multi / bounds / case-sensitivity, prefix-routing happy path, case-insensitive prefix match, fallback for legacy rows + post-prefix-miss, null token skip, fuzzy-match guard, GraphQL errors / non-2xx / network failure. - github-webhook-processor.test.ts (15): empty / malformed body, missing fields, token resolve failure, PR retry exhaustion, OPEN-only filter, happy path with CloudFront-host URL assertion, screenshot/S3/comment failure modes (each non-fatal where appropriate), Linear branch fires / falls back to body / skips on no-id / no-resolve / non-fatal post. - agentcore-browser.test.ts (6): StartBrowserSession failures, full CDP exchange (Target.getTargets -> attach -> enable -> navigate -> loadEventFired -> captureScreenshot) returning PNG bytes, Stop invoked in finally even on CDP error, Stop's own failure logged not thrown, 403 unexpected-response surfaced, navigate errorText raised. All tests use jest mocks for AWS SDK clients + an in-test FakeWebSocket for the CDP stream so they run hermetically without real AWS or network. Existing 286/286 handler tests still pass.

theagenticguy · 2026-06-10T00:57:31Z

🏛️ Principal-architect review — preview-deploy screenshot pipeline

Reviewed with the ABCA review_pr workflow: vision and tenet alignment, the automated agent passes, principal judgment, then docs and tests. Thank you for a well-built PR. The provider-agnostic framing, the Lambda-only post-PR path, the CloudFront-plus-OAC pattern, and the care from prior review rounds all show through. Most findings below are hygiene. Two are worth fixing before merge.

1. Verdict

Request changes, on a small and well-scoped set. The architecture is sound and vision-aligned. The blockers are a thin Lambda-timeout budget and a fail-open edge in the HMAC verifier, plus a few stale comments that mislead in a security direction. None is an active exploit on the default configuration.

2. Vision alignment

Strongly aligned. It keeps the LLM out of a deterministic post-PR job (tenet 3) and preserves fire-and-forget for the submitter (tenet 1). It stays provider-agnostic through config rather than hardcoded provider names (tenet 8). It updates ROADMAP and COST_MODEL so the change is attributable (tenets 7 and 10). No tenet is traded away, so no ADR is needed.

3. Blocking issues

B1. The processor Lambda timeout budget can be exceeded on the worst-case path, and the budget comment is inaccurate.
cdk/src/constructs/github-screenshot-integration.ts:143-145, and the lookup-then-capture order in cdk/src/handlers/github-webhook-processor.ts.
The processor runs findPullRequestForShaWithRetry first. Its delay schedule is [0, 5s, 10s, 20s], so it can spend up to about 35s when a PR is found only on the final attempt. Only then does it call captureScreenshot, which has its own independent 60s default budget, followed by the S3 PUT and the comment POST. The inline comment accounts for "60s screenshot + … = 95s" but never counts the 35s retry that precedes it. The realistic worst case (PR found on the last retry, plus a slow render) is roughly 95 to 100s against a 120s timeout. Slow GitHub round-trips during the retry can push it over. A hard timeout then dies mid-capture or mid-comment. There is no DLQ (InvocationType: 'Event', no onFailure), and the dedup row stays written, so GitHub's redelivery is deduped and the work is silently lost.
Suggested fix: thread one shared wall-clock deadline. Compute deadline = now + budget at handler start, then pass the remaining budget into both the retry helper and captureScreenshot(url, { timeoutMs: remaining - reserve }). Alternatively, shorten the retry ladder and raise the Lambda timeout. Either way, correct the budget comment to count the retry that runs first.

B2. HMAC verification has no empty-or-blank-secret guard, which is a fail-open edge.
cdk/src/handlers/shared/github-webhook-verify.ts, in verifyGitHubSignature.
crypto.createHmac('sha256', webhookSecret) runs with no check that the secret is non-empty. If webhookSecret is '', an attacker who computes HMAC-SHA256('', body) produces a signature that passes. That is the only authentication on a public, unauthenticated endpoint. It is not reachable on the default path: a fresh CDK Secret gets a random value and so fails closed, and set-webhook-secret trims and rejects empty input. It only bites if an operator writes a blank secret out of band. That is still the fail-closed-on-risk case (tenet 4), and the fix is about three lines.
Suggested fix: in getGitHubWebhookSecret, treat an empty or whitespace SecretString as null so the receiver returns 401. Optionally also early-return false in verifyGitHubSignature when the secret is empty.

The rest of this path fails closed correctly. A missing signature returns 401. A malformed or length-mismatched signature is rejected. A null or unfetchable secret yields false. Transient Secrets Manager errors re-throw and return 500, so GitHub retries. The comparison uses timingSafeEqual, so there is no timing oracle. There is no way to reach the processor without a valid signature on the default configuration.

B3. Three "public-read" comments contradict the implementation, in a security-misleading direction.
cdk/src/constructs/github-screenshot-integration.ts:86 (topology JSDoc), :93 (field doc), and :117 (inline).
The bucket is fully private. It sets BlockPublicAccess.BLOCK_ALL and is served only via CloudFront OAC. screenshot-bucket.ts gets this right and explains why. Only these three comments still say "Public-read." On a security-sensitive construct, that could lead a future maintainer to "simplify" by weakening Block Public Access. Reword to "Private bucket; served anonymously via CloudFront OAC."

4. Non-blocking suggestions and nits

The primary artifact can fail quietly. The PR-comment catch in github-webhook-processor.ts logs at warn with no metric, after the browser session and S3 PUT are already spent. A systematic break such as a token-scope regression or a comment-size limit produces no operator signal. Suggest an error-level log plus a ScreenshotCommentPostFailed CloudWatch metric and alarm. The same idea applies to "no PR after retries": a …PrLookupExhausted metric would let you tune the ladder.

SSRF is present but bounded. The processor navigates the browser to deployment_status.environment_url with no validation. AgentCore Browser is AWS-managed and sits outside the customer VPC, so IMDS and private-subnet pivots are neutralized. The residual risk is that whatever renders is published to a public URL, which amplifies any read. Suggest validating the URL (require https:, reject literal-IP, localhost, and link-local hosts) and adding an SSRF bullet to the guide's hardening section, which currently omits it.

Screenshot URLs are public and enumerable. The bucket is private, but CloudFront serves anonymously and the key screenshots/<org_repo>/<sha>.png is guessable, since the repo is public and the SHA appears in the PR. Preview deploys can render PII. The guide's "Sensitive content" note understates this. It suggests WAF and logs, which do not stop a correctly-formed anonymous GET. The cheapest hardening is a high-entropy suffix on the key, which makes the URL unguessable while staying anonymously cacheable.

A few smaller items:

GitHubDeploymentStatusEnvelope and GitHubDeploymentStatusPayload describe the same wire payload in two files and already disagree on target_url (declared in the processor, never read). Hoist one shared interface, ideally a single validate(payload) narrowing function. That also closes the gap where the receiver admits a payload missing deployment.sha that the processor then drops.
The bedrock-agentcore IAM PutSecretValue grant is correct but reads as read-only. I traced the chain: resolveLinearOauthToken rotates Linear's refresh token via PutSecretValueCommand, so the write grant is necessary and matches the orchestrator. Add a one-line note on the construct comment (:168-171) and the IAM5 suppression so the write intent is explicit. This is not an over-grant.
extractLinearIdentifier uses a g-flagged module-level regex. It is handled correctly today via the manual lastIndex = 0 reset, but it deserves a back-to-back-call test (see section 6).
CdpMessage.result is Record<string, unknown>, which forces four unchecked as casts in agentcore-browser.ts, for example targetInfos as Array<…>. linear-issue-lookup.ts casts DynamoDB attributes as string. Prefer per-command result interfaces, or a typed unmarshal plus an Array.isArray guard.
buildScreenshotKey uses repo.replace('/', '_'), which replaces only the first slash. It is fine for owner/repo, but replaceAll is more honest. SCREENSHOT_KEY_PREFIX is exported but unused, since the processor hardcodes 'screenshots/'. nextCdpId is a needless module global; scope it to runCdpScreenshot. delays.indexOf(delay) in the retry log breaks if a delay value repeats, so use an indexed loop. The step-6 header comment in agentcore-browser.ts says Page.frameStoppedLoading, but the code waits on Page.loadEventFired. The GitHub fetch calls lack the AbortController timeout the Linear path uses.

5. Documentation

Coverage is solid. There is a new operator guide, and USER_GUIDE, COST_MODEL, ROADMAP, and Task-lifecycle are all updated. The Starlight mirror is in sync: re-running docs/scripts/sync-starlight.mjs produced zero diff, so CI's "Fail build on mutation" gate will pass. Three spots to fix:

DEPLOY_PREVIEW_SCREENSHOTS_GUIDE.md near line 203 still says the processor IAM grants bedrock-agentcore:*. The code already scoped it to the three Browser actions, so the doc describes a worse posture than what ships.
Step 3c references "the same secret you used in 4b." The secret is generated in step 3b. There is no 4b.
In the PR description only, the WAF exemption is described as GenericRFI_BODY. The code and all in-tree comments and docs correctly exempt SizeRestrictions_BODY. Please correct the description so the merge commit is accurate.

6. Tests and CI

This is the most consistent theme across the automated passes. The two new linear-webhook-processor.test.ts regression tests are good, because they assert the absence of side effects rather than just "doesn't throw." The 13→14 table-count bump is correct. But the new pipeline ships with zero dedicated tests, and several deferred units are pure, cheap, and security-relevant. The issue fairly scopes out the expensive AgentCore-WSS mock, but these are not expensive:

verifyGitHubSignature and verifyGitHubRequest are the only auth on a public endpoint. A table-driven test (valid, wrong-secret, no-prefix, length-mismatch, empty-secret, post-rotation) is the cheapest possible and would have caught B2 directly.
extractLinearIdentifier needs a back-to-back-call test to pin the lastIndex reset.
buildScreenshotKey is a pure string builder that determines the public URL.
CDK synth assertions for ScreenshotBucket (BlockPublicAccess all-true) and the IAM scope-down both have copy-adaptable sibling precedents in trace-artifacts-bucket.test.ts and linear-integration.test.ts. Nothing currently locks in the scope-down that a prior review round explicitly asked for.

On CI: Validate PR title passed. build (agentcore) was still pending at review time, so please confirm it is green before merge.

7. Review agents run

The named pr-review-toolkit plugin agents are not installed as agent definitions in this environment. To honor the skill's mandatory-plugin step, I ran a dedicated subagent for each toolkit role against a checkout of the PR head, and I am noting that here for transparency:

code-reviewer: surfaced B1 and the DRY and cast nits.
silent-failure-hunter: confirmed the HMAC path fails closed; flagged the warn-level comment-delivery gap.
type-design-analyzer: the duplicated payload interface and the unsafe casts.
comment-analyzer: the public-read, IAM, timeout, and step-4b stale comments.
pr-test-analyzer: the coverage assessment in section 6.
/security-review: the empty-secret fail-open, SSRF, enumerable-URL, and IAM findings.

None in scope was omitted. I did not run build, lint, or cdk-nag locally, because the review checkout has no node_modules. I am flagging that as an explicit skip, deferred to the PR's own CI.

8. Human heuristics

Proportionality. Pass. Complexity matches the problem. The hand-rolled CDP loop is justified, since it avoids roughly 150 MB of Playwright in the bundle, and it stays minimal.
Coherence. Concern. The duplicated deployment_status interface and the three "public-read" comments describe one concept two ways.
Clarity. Concern. The timeout-budget comment (B1) and the stale IAM doc would mislead the next maintainer. Names are otherwise clear and intent-revealing.
Appropriateness. Concern. The code is maintainable and idiomatic, but the absence of tests on the pure security-critical units is below this repo's bar. The empty-secret case is exactly the "assert what the code should do" test that is missing.

Automated review by Bonk via the ABCA review_pr workflow. Happy to dig into any of these or open follow-up issues for the deferred items.

Closes the principal-architect review on PR aws-samples#241. B1. Shared wall-clock deadline. github-webhook-processor.ts now threads one TOTAL_BUDGET_MS=110s deadline across findPullRequestForShaWithRetry + captureScreenshot + S3 PUT + comment POST. PR lookup is capped so the screenshot half is guaranteed at least MIN_CAPTURE_BUDGET_MS=15s; if PR lookup eats the budget we fail fast with budget_exhausted instead of starting an AgentCore session that's already doomed. Lambda timeout stays 120s; the 10s headroom covers SDK retries + shutdown grace. Construct comment updated to reflect the actual math. B2. Empty-secret HMAC fail-open closed. getGitHubWebhookSecret now returns null when SecretString is empty or whitespace-only (logs error, drops cache entry). verifyGitHubSignature adds defense-in-depth early-return when secret is empty. New table-driven test covers the exact attacker shape: HMAC('', body) signature against an empty stored secret. B3. Stale public-read comments reworded. github-screenshot-integration.ts topology JSDoc + field doc + inline all said 'Public-read screenshot S3 bucket'; the bucket has been BlockPublicAccess.BLOCK_ALL the whole time. Reworded to 'Private bucket; served via CloudFront OAC.' Non-blocking nits. - SSRF allowlist on environment_url (https-only, no literal-IP / localhost / link-local / loopback) — new shared module src/handlers/shared/screenshot-url.ts. - High-entropy 64-bit suffix on screenshot S3 keys (screenshots/<owner>_<repo>/<sha>-<id>-<suffix>.png) so the public CloudFront URL is unguessable from the public PR. - Single GitHubDeploymentStatusPayload + validateDeploymentStatusPayload hoisted to src/handlers/shared/github-deployment-status.ts; receiver and processor share one validation contract (closes the gap where the receiver admitted payloads missing deployment.sha that the processor would drop). - AbortController on the GitHub commit-pulls fetch (5s per-attempt cap; caller still owns the wall-clock). - Replaced delays.indexOf-based retry log with an indexed loop (broken if delays repeats). - replaceAll('/', '_') for repo slug. - Scoped nextCdpId to runCdpScreenshot (was module-global). - Stale 'Page.frameStoppedLoading' docstring fixed (code waits on Page.loadEventFired). - IAM PutSecretValue intent comment added (Linear refresh-token rotation; not a typo). - Type-narrowed CDP result accessors instead of `as` casts in agentcore-browser.ts. - Promoted warn → error with tagged event_id on screenshot.* paths (pr_lookup_exhausted, capture_failed, s3_put_failed, pr_comment_post_failed) so CloudWatch metric filters / alarms can fire on what was previously invisible. IAM scope rationale (with Service Authorization Reference cite). The reference confirms ConnectBrowserAutomationStream is a real published IAM action under bedrock-agentcore. Scope is narrowed to the three actions the handler calls; resource is '*' because the two Connect*Stream actions declare no resource types or condition keys (must be granted on Resource:'*'), and Browser sessions are ephemeral anyway. Source link added in both the construct comment and the screenshots guide. Doc fixes. - DEPLOY_PREVIEW_SCREENSHOTS_GUIDE.md hardening section: was 'IAM grants bedrock-agentcore:*' (stale); now describes the scope-down with the auth-reference citation, plus SSRF and screenshot-URL enumerability mitigations. - step 3c reference from '4b' → '3b' (the secret is generated in 3b, not 4b which doesn't exist). Tests (4 new files, 58 tests). - github-webhook-verify.test.ts: table-driven verifier including the exact empty-secret attack shape that B2 fixes. - screenshot-url.test.ts: buildScreenshotKey shape + entropy + replaceAll; isAllowedScreenshotUrl with public hosts, schemes, IPv4/IPv6 literals, localhost, RFC1918, IMDS. - linear-issue-lookup.test.ts: extractLinearIdentifier including the back-to-back-call test that pins the g-flag lastIndex reset. - screenshot-bucket.test.ts: synth-time assertion that BlockPublicAccess is all-true so a future 'simplify' can't drop public-access protection. Full handler suite still green: 1387/1387 across 70 suites.

…key suffix After cherry-picking the theagenticguy review fixes onto linear-vercel: - github-webhook.ts: validateDeploymentStatusPayload was 400-rejecting payloads missing environment_url, but GitHub fires success events without the URL during intermediate states. Restore the explicit 200-skip on env_url BEFORE validate so we don't make GitHub retry. - Test fixture updates for: - captureScreenshot now called with { timeoutMs } (B1 budget) - S3 key now ends with -<16hex>.png (high-entropy suffix nit)

isadeks · 2026-06-10T15:03:28Z

@theagenticguy thanks for the principal-architect pass — addressed everything end-to-end. Smoke-tested live on dev: 8.1s end-to-end, narrowed IAM holds, screenshots landed on the PR and the linked Linear issue. Re-requesting review.

Blockers (all in `2892673`)

B1 — Shared wall-clock deadline. github-webhook-processor.ts now threads one TOTAL_BUDGET_MS=110_000ms deadline across findPullRequestForShaWithRetry + captureScreenshot + S3 PUT + comment POST. PR-lookup is capped so the screenshot half always gets at least MIN_CAPTURE_BUDGET_MS=15s; if PR-lookup eats the budget on a slow-GitHub day we fail-fast with screenshot.budget_exhausted instead of starting an AgentCore session that's already doomed. Lambda timeout stays 120s; the 10s headroom covers SDK retries + shutdown grace. Construct comment updated to count the retry that runs first.

B2 — Empty-secret HMAC fail-open closed. getGitHubWebhookSecret now returns null when SecretString is empty or whitespace-only (logs error, drops cache entry). verifyGitHubSignature adds defense-in-depth early-return when the secret is empty. New table-driven test covers the exact attacker shape: HMAC('', body) against an empty stored secret.

B3 — Stale public-read comments reworded. github-screenshot-integration.ts topology JSDoc + field doc + inline all said "Public-read screenshot S3 bucket"; the bucket has been BlockPublicAccess.BLOCK_ALL the whole time. Reworded to "Private bucket; served via CloudFront OAC."

IAM action-name verification

You called out ConnectBrowserAutomationStream as a possible SDK type-name guess. I had a subagent verify against the AWS Service Authorization Reference for bedrock-agentcore: it is a real published action. The reference also notes it has no resource types or condition keys — Resource:"*" is mandatory. Construct comment + screenshots-guide hardening section now cite the reference. Smoke-tested on dev with the narrowed {StartBrowserSession, StopBrowserSession, ConnectBrowserAutomationStream} set: full pipeline succeeds in 8.1s.

Non-blocking nits

SSRF allowlist on environment_url — new cdk/src/handlers/shared/screenshot-url.ts with isAllowedScreenshotUrl: https-only, rejects literal-IP / localhost / link-local / loopback. Unit tests cover Vercel/Amplify/Netlify/GitHub-Pages happy paths plus IPv4 / IPv6 / RFC1918 / IMDS rejections.
High-entropy 64-bit key suffix — screenshots/<owner>_<repo>/<sha>-<deploymentId>-<16hex>.png. Public CloudFront URL is now unguessable from the public PR; suffix entropy verified at 100/100 distinct over 100 calls.
Single shared GitHubDeploymentStatusPayload + validateDeploymentStatusPayload in cdk/src/handlers/shared/github-deployment-status.ts. Receiver and processor share one validation contract. Closes the gap where the receiver admitted payloads missing deployment.sha that the processor would drop.
AbortController on the GitHub commit-pulls fetch (5s per-attempt cap; caller still owns the wall-clock). Mirrors the Linear path.
Type-narrowed CDP result accessors — replaced the four as casts in agentcore-browser.ts with narrowTargetInfos / narrowSessionId / narrowNavigateError / narrowScreenshotData checked accessors.
replaceAll('/', '_') for the repo slug.
Scoped nextCdpId to runCdpScreenshot (was module-global).
Stale Page.frameStoppedLoading docstring fixed (the code waits on Page.loadEventFired).
PutSecretValue IAM intent comment — explicit note that the write grant is for Linear refresh-token rotation, not a typo.
Promoted warn → error with tagged event_id on screenshot.* paths (pr_lookup_exhausted, capture_failed, s3_put_failed, pr_comment_post_failed). CloudWatch metric filters / alarms can now fire on what was previously invisible.
Indexed retry loop replaces delays.indexOf(delay) (broken if delays repeats).

Doc fixes

DEPLOY_PREVIEW_SCREENSHOTS_GUIDE.md hardening section: was "IAM grants bedrock-agentcore:*" (stale); now describes the scope-down with the auth-reference citation, plus SSRF and screenshot-URL enumerability mitigations.
Step 3c reference from "4b" → "3b" (the secret is generated in 3b; there is no 4b).

Tests (4 new files, 58 tests)

github-webhook-verify.test.ts — table-driven verifier, including the exact empty-secret attack shape that B2 fixes.
screenshot-url.test.ts — buildScreenshotKey shape + 100/100 entropy + replaceAll; isAllowedScreenshotUrl covers schemes, IPv4 / IPv6 literals, localhost, RFC1918, IMDS.
linear-issue-lookup.test.ts — extractLinearIdentifier including the back-to-back-call test that pins the g-flag lastIndex reset.
screenshot-bucket.test.ts — synth-time assertion that BlockPublicAccess is all-true so a future "simplify" can't drop public-access protection.

Full handler suite still green: 1387/1387 across 70 suites.

Smoke verification on dev

Cherry-picked onto linear-vercel, deployed, direct-invoked the processor with a real PR SHA (PR #52 on isadeks/vercel-abca-linear). Logs:

{"message":"Screenshot pipeline starting","budget_ms":110000}
{"message":"AgentCore Browser session started","session_id":"01KTS0GDGJ8204G22P97EHSQCY"}
{"message":"Posted screenshot comment to PR","pr_number":52,"public_url":".../000106996c4c5315de7fa43cf2ba0ab475cf2670-42-aa913ff858180ede.png"}
{"message":"Posted screenshot comment to Linear issue","identifier":"ABCA-108","workspace_slug":"maguireb"}
Duration: 8110.56 ms

Confirms the narrowed IAM (3 actions) is sufficient for the SigV4-presigned WSS handshake on real traffic. Earlier dev hang turned out to be transient and is not reproducing.

krokoko · 2026-06-10T18:13:03Z

🏛️ Principal-architect review — round 3

Thank you @isadeks for the thorough follow-through across both prior review rounds — this is exemplary review responsiveness. I verified every claimed fix against the code at head e553b2c rather than taking the response comments on faith, and everything checks out.

Verdict

Approve with nits. All blocking items from both prior rounds (@krokoko's and @theagenticguy's) are verifiably fixed in code, CI is fully green, and the follow-up issues (#284, #285, #286, #287) and stacked PRs (#275, #273) all exist and are correctly scoped. What remains is a small set of new, non-blocking findings.

Prior feedback — verification status

Item	Status
B1 shared wall-clock budget	✅ `github-webhook-processor.ts:55-70,165,185` — arithmetic verified consistent with the 120s Lambda timeout
B2 empty-secret HMAC fail-open	✅ Fixed at both fetch (`github-webhook-verify.ts:59`) and verify (`:107`) layers, with the exact attack-shape test
B3 stale "public-read" comments	✅ All three reworded
IAM scope-down to 3 `bedrock-agentcore` actions	✅ With Service Authorization Reference citation
`bgagent→abca` label revert	✅ Reverted; split tracked in #285
WS leak, `Array.isArray` guard, WAF rationale, SSRF allowlist, key entropy, shared payload type, warn→error promotion, CDP cast removal, doc fixes	✅ All verified
Security-boundary tests	✅ 58 in this PR (all pass; verifier at 100% statements), rest in stacked #275

Starlight mirror verified in sync (re-ran sync-starlight.mjs — zero diff). Governance ✅ (issue #240 approved, branch naming correct).

New non-blocking findings

Markdown comment injection via environment_url — github-webhook-processor.ts:457-483. previewUrl is interpolated raw into [![preview](…)](${previewUrl}). Node's URL parser preserves ) in the path, so a URL like https://preview.vercel.app/x)](https://evil/a.png) passes isAllowedScreenshotUrl (hostname is clean) yet breaks out of the markdown link syntax, injecting attacker content into a comment posted under ABCA's token. Reachable without the webhook secret in configs where a fork-PR author influences the preview path. Cheap fix: percent-encode (/) before interpolation. Fits feat(security): scheme + host allowlist on screenshot processor's environment_url #286's scope, or a one-liner here.
SSRF allowlist: IPv6 unique-local passes — screenshot-url.ts:60-68. Good news: decimal/octal/hex IPv4 literals (https://2130706433) are normalized by WHATWG URL and correctly blocked (verified empirically). But https://[fc00::1] (RFC 4193) and NAT64 literals pass — only ::1, fe80:, ::ffff: are checked. Since preview URLs are always DNS names, rejecting any IPv6 literal (hostname.includes(':')) is the simplest closure. Suggest folding into feat(security): scheme + host allowlist on screenshot processor's environment_url #286.
Stale key-layout comments — screenshot-bucket.ts:31-35,67-69 still describe screenshots/<repo>/<sha>.png; the actual key now has the slug, deployment id, and 16-hex suffix. Same drift at DEPLOY_PREVIEW_SCREENSHOTS_GUIDE.md:189 (line 205 of the same guide is correct). Also SCREENSHOT_KEY_PREFIX is exported but unused. Two-minute fix + mirror sync.
validateDeploymentStatusPayload has zero direct or indirect coverage — it's the receiver's 400-gate and pure/synchronous; a ~15-line table-driven test needs no mocks. Worth pulling forward from test(screenshot): cover the screenshot pipeline #275, along with the uncovered ::ffff: branch at screenshot-url.ts:68.
Smaller items: (a) linear-issue-lookup.ts:102-103 casts DynamoDB attributes as string unchecked — the one spot not meeting the narrowing standard this PR itself set in agentcore-browser.ts; (b) if (!resolved) continue; at :118 skips a workspace with no log — if every token breaks, Linear comments stop silently; (c) the null-secret path surfaces as "Invalid signature" 401s, sending an operator to GitHub config instead of Secrets Manager — a distinct tagged log would help; (d) cli/src/commands/github.ts:126 placeholder heuristic is inverted (CDK's default Secret is a random password, not JSON), so the overwrite warning fires on a fresh deploy — UX-only; (e) the review-citation comments ("theagenticguy PR-241 review item B1") are rot risks — keep the rationale, drop the attribution.

Human heuristics

All four dimensions now pass (coherence/clarity/appropriateness were concerns in round 2 — the shared payload type, corrected budget comments, and attack-shape tests resolved them).

Bottom line

Mergeable. My suggestion: fix the two cheapest items (1's paren-encoding, 3's comment sync) before merge, fold 2 into #286 and 4 into #275 — or merge as-is and track 1 in #286 too, since its prerequisite is configuration-dependent. Either way, really nice work — three rounds in, this pipeline is in great shape.

Review agents run: code-reviewer, silent-failure-hunter, type-design-analyzer, comment-analyzer, pr-test-analyzer, security-review (subagent against a checkout of the PR head). None in scope omitted. All 58 new tests executed locally — green.

…aws-samples#240) krokoko PR aws-samples#241 round-3 review: - Finding 1 (security): percent-encode `(`/`)` in the payload-derived `environment_url` before interpolating it into the PR/Linear comment markdown. A clean-hostname URL like `https://preview.vercel.app/x)](https://evil/a.png)` passes `isAllowedScreenshotUrl` yet breaks out of the `](…)` link and injects content into a comment posted under ABCA's token (reachable in fork-PR configs). New `encodeMarkdownUrl` helper + unit tests. - Finding 3 (cosmetic): screenshot-bucket.ts and the deploy guide still described the old `screenshots/<repo>/<sha>.png` layout; updated both to the actual `screenshots/<owner>_<repo>/<sha>-<deploymentId>-<16hex>.png`. Removed the unused `SCREENSHOT_KEY_PREFIX` export. Findings 2 (IPv6 unique-local) and 4 (validateDeploymentStatusPayload coverage) folded into aws-samples#286 and aws-samples#275 respectively per the reviewer's routing.

isadeks · 2026-06-10T18:31:25Z

@krokoko thanks for the round-3 pass — and for verifying against the code at head rather than the response comments. Addressed the two you flagged for "before merge," routed the other two to where they belong, and noted the finding-5 items below. New head: f284e04.

Fixed on this branch (`f284e04`)

Finding 1 — markdown injection via environment_url ✅ New encodeMarkdownUrl helper in screenshot-url.ts percent-encodes (→%28 / )→%29 before previewUrl is interpolated into either comment renderer (renderCommentBody, renderLinearCommentBody). The browser decodes the escapes so the link still resolves to the real preview, but it can't close the ](…) early. publicUrl is our own CloudFront key (no parens by construction) so it's left as-is. Covered by 3 new unit tests including your exact …/x)](https://evil/a.png) attack shape — asserts no ]( delimiter survives encoding.
Finding 3 — stale key-layout comments + unused export ✅ screenshot-bucket.ts and DEPLOY_PREVIEW_SCREENSHOTS_GUIDE.md:189 now describe the real screenshots/<owner>_<repo>/<sha>-<deploymentId>-<16hex>.png layout (with a pointer to buildScreenshotKey and a note that the random suffix can't be hand-constructed). Removed the unused SCREENSHOT_KEY_PREFIX export. Starlight mirror re-synced.

Routed to the stacked work, per your suggestion

Finding 2 — IPv6 unique-local (fc00::/7) / NAT64 pass → folded into feat(security): scheme + host allowlist on screenshot processor's environment_url #286 (it owns the environment_url allowlist). Proposed closure there: reject any IPv6 literal (hostname.includes(':')) since preview URLs are always DNS names — also closes the uncovered ::ffff: branch.
Finding 4 — validateDeploymentStatusPayload has no coverage → folded into test(screenshot): cover the screenshot pipeline #275 (it owns the pipeline test gap). Table-driven, no mocks; happy path + each missing-field rejection, plus the ::ffff: branch.

Finding 5 (smaller items) — disposition

(a) unchecked as string casts at linear-issue-lookup.ts:102-103 — agreed it's the one spot below the narrowing bar this PR set; folding into test(screenshot): cover the screenshot pipeline #275's scope alongside the other lookup work.
(b) silent continue on unresolved workspace token (:118) — real silent-failure risk; tracking in feat(security): scheme + host allowlist on screenshot processor's environment_url #286's neighborhood (observability on the resolve path). Will add a tagged warn so a fully-broken workspace isn't invisible.
(c) null-secret surfaces as "Invalid signature" 401 — fair; a distinct tagged log ("secret empty/missing" vs "signature mismatch") points operators at Secrets Manager instead of GitHub config. Small observability follow-up, not blocking.
(d) inverted CLI placeholder heuristic (github.ts:126) — UX-only, fires the overwrite warning on a fresh deploy. Tracking as a CLI nit.
(e) review-citation comments ("theagenticguy PR-241 review item B1") — agreed these are rot risks. I kept the rationale and dropped the attribution on the new finding-1 comments; happy to sweep the older ones in a docs/comment pass rather than churn this diff again.

CI re-running on f284e04. Thanks again — this was a genuinely sharpening review.

…us validator (aws-samples#240) krokoko PR aws-samples#241 round-3 findings 2 and 4, brought onto aws-samples#240 because the code they touch (screenshot-url.ts, the shared github-deployment-status validator) lives only on this branch — aws-samples#275 predates the aws-samples#240 refactor that extracted them, and aws-samples#286 is an issue with no code branch. - Finding 2 (security): isAllowedScreenshotUrl enumerated IPv6 ranges (::1, fe80:, ::ffff:) and missed unique-local fc00::/7 (e.g. [fc00::1]) and NAT64. Replace with a blanket reject of any IPv6 literal — a bracketed host or a `:` in the host — since preview URLs are always DNS names. Also documents that the WHATWG parser normalizes integer-form IPv4 (2130706433, 0x7f000001) to dotted-quad, so the existing regex catches those too. New table-driven tests for unique-local, NAT64, IPv4-mapped, and integer IPv4; screenshot-url.ts now at 100% branch coverage (closes the old ::ffff: gap). - Finding 4 (test gap): validateDeploymentStatusPayload had zero coverage. New github-deployment-status.test.ts — happy path, each missing/empty required field, absent nested objects, and the id:0 type-vs-truthiness edge. Validator at 100%.

isadeks · 2026-06-10T18:42:27Z

Follow-up — findings 2 and 4 are now also fixed on this branch (335eea4), not deferred after all.

When I went to fold them into #275/#286 as suggested, the routing didn't hold up: #275 (feat/97-screenshot-tests) predates the #240 refactor that extracted screenshot-url.ts and the shared github-deployment-status validator — neither module exists on #275's branch, and #240 isn't an ancestor of it. #286 is an issue with no code branch. So both findings' code lives only here; fixing them here was the only place they'd actually land against the right code.

Finding 2 — IPv6 unique-local / NAT64 SSRF gap ✅ isAllowedScreenshotUrl now rejects any IPv6 literal (bracketed host, or a : in the host) instead of enumerating ranges — covers fc00::/7 ([fc00::1]), NAT64, IPv4-mapped, loopback, and link-local in one rule, since preview URLs are always DNS names. Verified empirically that the WHATWG parser normalizes integer-form IPv4 (2130706433, 0x7f000001 → 127.0.0.1) so the dotted-quad regex catches those too. screenshot-url.ts now at 100% branch coverage — closes the old uncovered ::ffff: branch you flagged.
Finding 4 — validateDeploymentStatusPayload coverage ✅ New github-deployment-status.test.ts: happy path, each missing/empty required field (state, statusId, environmentUrl, deploymentId, sha, environment, repoFullName), absent nested objects, and the id: 0 type-vs-truthiness edge. Validator at 100%.

That leaves only the finding-5 items (small observability/UX nits) as genuine follow-ups; everything blocking-or-cheap from rounds 1–3 is now on 335eea4. CI re-running.

isadeks and others added 21 commits June 2, 2026 13:21

docs(screenshots): add the screenshot pipeline guide

bce3aa6

Adds the operator walkthrough for wiring up the AgentCore-Browser preview-deploy screenshot pipeline.

feat(screenshot): hide URL behind 'preview link' label in comments

6e57515

GitHub PR comment now reads 'From [preview link](url)' and Linear comment reads '[Preview link](url)' instead of pasting the bare URL. Cleaner visual when the same comment is posted on both surfaces.

isadeks requested a review from a team as a code owner June 2, 2026 20:51

isadeks added 2 commits June 2, 2026 13:56

fix(github-cli): replace template literal with single quotes (eslint …

dac4e31

…mutation) The local build's eslint --fix step rewrote a no-interpolation template literal to single quotes; CI's 'Fail build on mutation' guard caught that the mutation wasn't committed. Apply the fix.

isadeks mentioned this pull request Jun 2, 2026

feat(notifications): Linear dispatcher in fanout consumer (cost/turns/duration) #243

Open

5 tasks

Merge branch 'main' into feat/240-agentcore-screenshots

8c8b7e3

krokoko reviewed Jun 2, 2026

View reviewed changes

isadeks added 2 commits June 3, 2026 18:37

Merge branch 'main' into feat/240-agentcore-screenshots

4be999f

isadeks added 2 commits June 8, 2026 10:23

Merge branch 'main' into feat/240-agentcore-screenshots

984f4fc

isadeks added 2 commits June 8, 2026 11:21

isadeks requested a review from krokoko June 8, 2026 15:23

isadeks mentioned this pull request Jun 8, 2026

feat(screenshot): skip screenshot when preview URL returns non-2xx #289

Open

8 tasks

Merge remote-tracking branch 'upstream/main' into feat/240-agentcore-…

9c35160

…screenshots # Conflicts: # cdk/package.json

Merge branch 'main' into feat/240-agentcore-screenshots

57167ff

isadeks added 3 commits June 9, 2026 21:58

Merge branch 'main' into feat/240-agentcore-screenshots

79ef5b7

isadeks requested a review from theagenticguy June 10, 2026 15:03

krokoko added 2 commits June 10, 2026 11:42

Merge branch 'main' into feat/240-agentcore-screenshots

0b8fb15

Merge branch 'main' into feat/240-agentcore-screenshots

e553b2c

krokoko approved these changes Jun 10, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(notifications): preview-deploy screenshot pipeline (provider-agnostic)#241

feat(notifications): preview-deploy screenshot pipeline (provider-agnostic)#241
isadeks wants to merge 39 commits into
aws-samples:mainfrom
isadeks:feat/240-agentcore-screenshots

isadeks commented Jun 2, 2026 •

edited

Loading

Uh oh!

krokoko left a comment

Uh oh!

codecov-commenter commented Jun 2, 2026 •

edited

Loading

Uh oh!

isadeks commented Jun 8, 2026 •

edited

Loading

Uh oh!

theagenticguy commented Jun 10, 2026 •

edited

Loading

Uh oh!

isadeks commented Jun 10, 2026

Uh oh!

krokoko commented Jun 10, 2026

Uh oh!

isadeks commented Jun 10, 2026

Uh oh!

isadeks commented Jun 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

isadeks commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Architecture

Test plan

Out of scope (followups)

Uh oh!

krokoko left a comment

Choose a reason for hiding this comment

Review — Preview-deploy screenshot pipeline

Requesting changes on

Non-blocking suggestions

Uh oh!

codecov-commenter commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

isadeks commented Jun 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Shipped on this branch

Shipped in stacked PRs

Tracked as follow-up issues (won't hold this PR)

Uh oh!

theagenticguy commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🏛️ Principal-architect review — preview-deploy screenshot pipeline

1. Verdict

2. Vision alignment

3. Blocking issues

4. Non-blocking suggestions and nits

5. Documentation

6. Tests and CI

7. Review agents run

8. Human heuristics

Uh oh!

isadeks commented Jun 10, 2026

Blockers (all in 2892673)

IAM action-name verification

Non-blocking nits

Doc fixes

Tests (4 new files, 58 tests)

Smoke verification on dev

Uh oh!

krokoko commented Jun 10, 2026

🏛️ Principal-architect review — round 3

Verdict

Prior feedback — verification status

New non-blocking findings

Human heuristics

Bottom line

Uh oh!

isadeks commented Jun 10, 2026

Fixed on this branch (f284e04)

Routed to the stacked work, per your suggestion

Finding 5 (smaller items) — disposition

Uh oh!

isadeks commented Jun 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

isadeks commented Jun 2, 2026 •

edited

Loading

codecov-commenter commented Jun 2, 2026 •

edited

Loading

isadeks commented Jun 8, 2026 •

edited

Loading

theagenticguy commented Jun 10, 2026 •

edited

Loading

Blockers (all in `2892673`)

Fixed on this branch (`f284e04`)