Skip to content

feat(jira): Jira Cloud integration -- parity with Linear (#288)#302

Open
mayakost wants to merge 20 commits into
aws-samples:mainfrom
mayakost:feat/288-jira-integration
Open

feat(jira): Jira Cloud integration -- parity with Linear (#288)#302
mayakost wants to merge 20 commits into
aws-samples:mainfrom
mayakost:feat/288-jira-integration

Conversation

@mayakost

Copy link
Copy Markdown
Contributor

Summary

Full Jira Cloud integration, bringing Jira to parity with the existing Linear adapter: a Jira issue gets the bgagent label → ABCA picks it up → an agent run produces a PR → progress flows back as comments on the originating issue. Implements #288.

The Linear adapter was the file-for-file template; this PR diverges only where Jira's API forces it (see Jira-specific divergences below). Design rationale is captured in ADR-014 (docs/decisions/ADR-014-jira-integration.md).

Scope: Jira Cloud only (REST v3 + Cloud webhooks), per-tenant OAuth 3LO, label-only trigger. Inbound-only adapter — no DynamoDB Streams consumer, no outbound-notify Lambda. Out of scope: Jira Server/Data Center, Forge/Connect distribution, status-transition/comment-command triggers, bidirectional state sync.

Related

This PR resolves issue #288 to add a Jira integration

Architecture

Inbound (Jira → ABCA):

Jira Cloud webhook
  → POST /jira/webhook (API GW, no Cognito, HMAC-verified)
  → JiraWebhookFn      (verify X-Hub-Signature, dedup, async-invoke processor)
  → JiraWebhookProcessorFn (resolve cloudId→tenant, project→repo, build task, createTaskCore)
  → existing orchestrator pipeline (unchanged)

Outbound (Agent → Jira): progress comments posted on the originating issue at task start and completion.

What's included

Contracts (Phase 1)

  • 'jira' added to the ChannelSource union on both sides of the wire (cdk/src/handlers/shared/types.ts, cli/src/types.ts) plus the agent/src/models.py doc comment. check-types-sync.ts allowlists JiraLinkResponse as CLI-only (parity with Slack/Linear link responses).

CDK constructs & DynamoDB (Phases 2 & 4)

  • JiraIntegration construct (cdk/src/constructs/jira-integration.ts) — 3 mapping tables + an 8-hour-TTL webhook-dedup table, 3 Lambdas (webhook / processor / link), /jira/* API routes, per-tenant bgagent-jira-oauth-* IAM grants, and cdk-nag suppressions.
  • Three DDB table constructs, all keyed on cloudId as the tenant prefix so the same project key / account id stays unambiguous across tenants:
    Table PK
    JiraProjectMappingTable {cloudId}#{projectKey}owner/repo
    JiraUserMappingTable {cloudId}#{accountId} (+ PlatformUserIndex GSI)
    JiraWorkspaceRegistryTable jira_cloud_id → OAuth provider name
  • Stack wiring (cdk/src/stacks/agent.ts): grants the orchestrator read on the workspace registry + Get/PutSecretValue on the per-tenant prefix (mirrors Linear's pre-container failure-feedback path).

Lambda handlers & shared helpers (Phase 3)

  • jira-webhook.ts — HMAC-SHA256 verify over the raw body (X-Hub-Signature: sha256=<hex>, constant-time compare), event filtering (jira:issue_created / jira:issue_updated), dedup, async invoke. Silent 200 for unsupported events.
  • jira-webhook-processor.ts — label-add detection by diffing changelog.items[] (not issue.fields.labels), ADF→markdown for the task description, cloudId→tenant resolution, createTaskCore with channelSource: 'jira'.
  • jira-link.ts — Cognito-authenticated Jira-account → platform-user linking with dry-run preview.
  • Shared: jira-oauth-resolver.ts (per-tenant token resolve/refresh), jira-verify.ts (signature), jira-feedback.ts (REST-based failure comment at the orchestrator boundary).

Agent runtime (Phase 5)

  • channel_mcp.py refactored from a single-channel gate to a CHANNEL_MCP_BUILDERS dispatch dict — adding a channel is now one entry, not a rewrite.
  • resolve_jira_oauth_token() added to config.py, mirroring the Linear resolver's race-handling and fail-closed semantics (differs only in endpoint auth.atlassian.com + JSON body + JIRA_API_TOKEN).

Outbound progress comments — agent/src/jira_reactions.py (Phase 7)

Jira-origin tasks comment on the issue at start (🤖 picked up…) and completion (✅ finished — PR: <url> / ❌ …), wired into pipeline.py at start / finish / crash paths, parallel to the Linear hooks. Comments are advisory — all network/auth errors are swallowed and never gate the pipeline (with an auth circuit-breaker mirroring linear_reactions).

Note — outbound is REST, not MCP. ADR-014 originally specified the Atlassian Remote MCP server for outbound. In practice the hosted MCP (mcp.atlassian.com) requires an interactive browser-based OAuth 2.1 flow with dynamic client registration and will not accept the stored REST OAuth token as a Bearer header, so it fails to load from a headless agent. The Jira REST API accepts the same token (it carries write:jira-work), so comments go via POST /rest/api/3/issue/{key}/comment. This is the ADR's documented "Plan B" fallback, promoted to the implemented path.

CLI (Phase 6)

bgagent jira with 4 v1 subcommands (app-template, setup, link, map) + jira-oauth.ts (ports linear-oauth.ts; Atlassian uses a JSON token endpoint and requires offline_access for a refresh token). Deferred: add-workspace, update-webhook-secret, invite-user, list-projects.

Docs

JIRA_SETUP_GUIDE.md, ADR-014, README / USER_GUIDE / ROADMAP channel listings, and regenerated Starlight mirrors.

Jira-specific divergences from the Linear copy

  1. Label-add on updates — Jira reports label changes in changelog.items[] (field: "labels", fromString/toString), not as a full label list; the processor diffs the changelog so re-saving an already-labeled issue doesn't re-trigger.
  2. Operator-chosen signing secret — Atlassian doesn't auto-generate a per-subscription secret; the operator sets it at webhook-create time and pastes it during bgagent jira setup. Stored on the per-tenant OAuth bundle with a stack-wide fallback.
  3. cloudId is the tenant key everywhere — not domain or site name.
  4. UI-created webhooks omit cloudId — the processor falls back to the sole active tenant in the registry, and deliberately drops (rather than guesses) when zero or multiple active tenants exist, preserving multi-tenant safety.
  5. ADF descriptions — flattened to markdown (text/headings/lists), not a full ADF converter.
  6. Dedup key {issueKey}#{webhookEventTimestamp}, 8-hour TTL.

Testing

  • mise //cdk:compile — clean; check-types-sync.ts — passing (verified on this branch).
  • Per the final integration commit: 294 CLI + 837 agent + 1896 CDK tests passing, mise run build green.
  • New suites: jira-webhook.test.ts, jira-webhook-processor.test.ts, jira-link.test.ts, jira-oauth-resolver.test.ts (32), test_jira_reactions.py, test_channel_mcp.py (jira cases), plus agent.test.ts updated for the 4 new tables (13→17).

Branch rebased onto the fork's main so the diff is Jira-only (46 files); the 4 unrelated upstream commits it was originally cut from have been dropped.

Acknowledgment

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.

bgagent added 12 commits June 5, 2026 14:09
Phase 1 of Jira Cloud integration (aws-samples#288). Extends the ChannelSource
discriminant on both sides of the wire and updates the agent-side
comment so the runtime knows 'jira' is a recognized channel value;
no behavior changes yet.
Phase 2 of Jira Cloud integration (aws-samples#288). Mirrors the Linear constructs
file-for-file. Composite PKs use cloudId as the tenant prefix
(`{cloudId}#{projectKey}`, `{cloudId}#{accountId}`) so the same project
key or account id stays unambiguous across distinct Atlassian tenants.
Tables are unwired until Phase 4 — JiraIntegration instantiates and
grants them.
Phase 3 of Jira Cloud integration (aws-samples#288). Mirrors Linear's adapter
shape: per-tenant OAuth resolver (auth.atlassian.com), X-Hub-Signature
HMAC verify with per-tenant + stack-wide fallback, REST-based feedback
poster (ADF-wrapped, no reaction primitive — marker folded into text),
and three Lambdas (webhook, processor, link).

Non-trivial bit: the processor diffs `changelog.items[]` where
`field === 'labels'` and tokenizes the space-separated `fromString` /
`toString` to detect a label add — Atlassian's diff format differs
from Linear's `updatedFrom.labelIds`. Includes a minimal ADF→markdown
walker for issue descriptions.

Handlers reference JIRA_* env vars set by the JiraIntegration construct
in Phase 4; they don't deploy yet.
Phase 4 of Jira Cloud integration (aws-samples#288). Mirrors LinearIntegration:
3 DDB tables, dedup table (8h TTL), 3 Lambdas (webhook/processor/link),
API routes under /jira/*, per-tenant `bgagent-jira-oauth-*` IAM grants,
cdk-nag suppressions.

Stack wiring grants the agent runtime GetSecretValue on the per-tenant
prefix and pipes the workspace registry table + Get/Put grant into the
orchestrator (matches Linear's path for pre-container failure feedback).
Synth confirms clean CloudFormation + no nag findings.
Phase 5 of Jira Cloud integration (aws-samples#288). Refactors channel_mcp.py from
a single-channel gate to a CHANNEL_MCP_BUILDERS dispatch dict so adding
future channels stays one-entry. Adds resolve_jira_oauth_token() to
config.py mirroring the Linear resolver — same race-handling, same
fail-closed semantics; only differences are the endpoint
(auth.atlassian.com, JSON body) and the env-var name (JIRA_API_TOKEN).

Pipeline now dispatches to the right resolver based on channel_source.
JIRA_MCP_URL is flagged in-source as needs-verification — Atlassian's
Remote MCP may still be preview-gated; if so, fall back to a REST shim
in a future jira_reactions.py module (Plan B).

Tests: 6 new Jira test cases in test_channel_mcp.py; full agent suite
remains green (825 passed).
Phase 6 of Jira Cloud integration (aws-samples#288). Minimal v1 surface (4 of 10
Linear subcommands), per scoping decision. Mirrors the Linear CLI shape
where the contracts are similar:

- jira-oauth.ts ports linear-oauth.ts. Atlassian's token endpoint takes
  JSON (Linear takes form-encoded). offline_access scope is required
  for a refresh_token. fetchAccessibleResources() resolves cloudId +
  siteUrl post-consent.
- commands/jira.ts: app-template prints dev-console values; setup
  drives the OAuth dance + writes the per-tenant secret + registry row
  + webhook signing secret; link does dry-run preview UX; map writes
  the project → repo row.

Deferred to follow-ups: add-workspace, update-webhook-secret,
invite-user (with self-link picker), list-projects.
Covers signature verify pass/fail, dedup, event filtering, label-add
detection (create vs update changelog), and Cognito-authenticated linking.
56 tests, mirrors the Linear handler test surface.
- docs/guides/JIRA_SETUP_GUIDE.md — OAuth 3LO app, scopes, webhook
  registration, label trigger, project mapping, troubleshooting
- docs/decisions/ADR-014-jira-integration.md — Jira Cloud only, OAuth 3LO,
  label trigger, MCP outbound; documents the Jira-vs-Linear divergences
- README, USER_GUIDE, ROADMAP — add Jira to channel listings
- sync-starlight.mjs + astro.config.mjs — register the Jira guide mirror;
  regenerate Starlight content under docs/src/content/docs/

Completes the docs phase of aws-samples#288.
Bring `mise run build` green on the jira integration branch:

- check-types-sync: allowlist JiraLinkResponse as CLI-only, matching
  SlackLinkResponse/LinearLinkResponse (link responses are inlined
  server-side; no CDK source-of-truth type)
- channel_mcp.py: move Callable into a TYPE_CHECKING block (ruff TC003;
  safe under `from __future__ import annotations`)
- agent.test.ts: bump expected DynamoDB table count 13 -> 17 for the
  four new Jira tables (project/user/workspace-registry/webhook-dedup)
- test_config.py: cover resolve_jira_oauth_token (cache, fallback,
  refresh, concurrent-refresh, malformed/expiry paths); agent coverage
  70.41% -> 72.91%
- jira-oauth-resolver.test.ts: new suite (32 tests) mirroring the Linear
  resolver tests; clears the CDK statement/line/function/branch gates
- jira.ts / jira-oauth.ts: ESLint --fix cosmetic edits (quote-props,
  redundant template literals)

Tests: 294 CLI + 837 agent + 1896 CDK, all passing.
Jira webhooks created via the Settings → System → Webhooks UI do not
include a top-level `cloudId` in their payload (only app/OAuth-registered
dynamic webhooks do). Without it the processor can't resolve the tenant,
so it dropped the event and never created a task — the inbound trigger
silently failed for the common single-tenant, UI-webhook setup.

Add a safe fallback: when `payload.cloudId` is absent, scan the workspace
registry and use the sole `active` tenant. Deliberately refuses to guess
when zero or multiple active tenants exist (returns undefined → event
dropped), so the multi-tenant design is preserved — a multi-tenant
operator must use a webhook that carries its own cloudId.

`grantReadData` on the registry table already covers the Scan, so no IAM
change is needed.

Adds tests for: sole-tenant recovery (task created), empty registry
(drop), and multiple active tenants (ambiguous → drop).
Jira-origin tasks now comment on the originating issue at start
("🤖 picked up…") and on completion ("✅ finished — PR: <url>" / "❌ …"),
matching the Linear integration's progress UX.

Why a REST shim instead of the Atlassian Remote MCP: the hosted MCP
(mcp.atlassian.com) requires an interactive, browser-based OAuth 2.1 flow
with dynamic client registration — it does NOT accept the stored Jira REST
OAuth token as a Bearer header, so it fails to connect from a headless
agent ("claude mcp list" → Failed to connect; no mcp__jira-server__* tools
load). The Jira REST API accepts the same stored token (it carries
write:jira-work), so comments go via POST /rest/api/3/issue/{key}/comment
on the cross-region api.atlassian.com/ex/jira/{cloudId} base.

- New `jira_reactions.py`: gated by channel_source=='jira' + required
  metadata; swallows all network/auth errors (comments are advisory, never
  gate the pipeline); auth circuit-breaker mirrors linear_reactions.
- Wired into pipeline.py at task start, normal finish (with PR url), and
  the crash path — parallel to the existing Linear reaction hooks.
- prompt_builder: Jira tasks now get NO MCP-comment addendum (the earlier
  Linear-only gate already skipped them); instructing the agent to use the
  non-loading MCP tools would just waste turns. Comments are out-of-band.

Adds test_jira_reactions.py (gate, ADF body, success/failure/PR variants,
error-swallowing, auth circuit breaker) and channel-addendum tests.
@mayakost mayakost requested a review from a team as a code owner June 10, 2026 13:39
mayakost and others added 2 commits June 10, 2026 09:47
The 'Merge branch main' commit (c84cc66) left an invalid import block:
a missing comma after PR_WORKFLOW_IDS (syntax error) plus a stale
PR_TASK_TYPES import that main's aws-samples#248 removed from config.py. ruff
rejected the file, aborting the agentcore build.

Drop the orphaned PR_TASK_TYPES import and fix the comma.

@krokoko krokoko left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR — it's an impressively thorough port of the Linear template. The OAuth refresh/rotation handling in jira-oauth-resolver.ts (including the Atlassian rotated-refresh-token race) and the four-way verifyJiraRequestForTenant outcome union are genuinely excellent, and the divergences from the Linear copy (changelog diffing, cloudId tenancy, ADF flattening) are all well-justified. The review below is detailed because the PR is large and security-relevant, not because the foundation is weak.

Verdict: Request changes — CI is currently red, the docs mirror will fail the mutation gate, there's a coupled multi-tenant signature-binding gap, and the ADR/docs still describe the pre-pivot MCP outbound design.

Blocking issues

  1. CI is red — agent/tests/test_config.py:11 is missing a comma after PR_WORKFLOW_IDS. This is a SyntaxError at pytest collection time, so the entire agent suite (including the new test_jira_reactions.py circuit-breaker tests and the 12-case TestResolveJiraOauthToken suite) is currently not running. Once fixed, those suites look well-designed.

  2. Stale Starlight mirrors — the "Fail build on mutation" gate will reject. Running node docs/scripts/sync-starlight.mjs at PR HEAD dirties docs/src/content/docs/using/Overview.md and getting-started/Quick-start.md (the latter inherited via the main merge, but it needs a re-sync here). Please run mise //docs:sync and commit the regenerated mirrors.

  3. Multi-tenant signature binding (coupled pair):

    • cli/src/commands/jira.ts:539-547 mirrors the stack-wide signing secret into every per-tenant OAuth bundle, so one shared secret effectively verifies for all tenants.
    • A payload verified via the stack-wide fallback carries no binding between the verified secret and payload.cloudId — the processor then trusts the body-supplied cloudId to select the tenant, project→repo mapping, and OAuth bundle (cdk/src/handlers/jira-webhook.ts:136-147, jira-webhook-processor.ts:208,234). A holder of the stack-wide secret can steer a webhook at any tenant's mappings.
    • Related fail-open: jira-webhook.ts:152 skips the replay-window check entirely when timestamp is absent (Linear rejects a missing timestamp), and timestamp-less deliveries collapse to a single …#unknown dedup key.

    Suggested fix: don't mirror the stack-wide secret into per-tenant bundles; after a stack-wide-fallback verification, refuse any cloudId other than the sole-tenant fallback result; and log (or reject) when the replay check is skipped so the bypass is observable. Given the project's fail-closed tenet, I'd treat this as blocking.

  4. ADR-014 and several docs/docstrings still describe the abandoned MCP-outbound design. The final commit correctly pivoted outbound to REST (jira_reactions.py) — and agent/src/prompt_builder.py:135-146 documents the reality perfectly — but the following still claim MCP outbound: docs/decisions/ADR-014-jira-integration.md (still says "No jira_reactions.py REST module", status proposed), JIRA_SETUP_GUIDE.md:31-40,163-168, USER_GUIDE.md:14, the new ROADMAP.md entry, agent/src/channel_mcp.py:55-96, agent/src/config.py:333-344, jira-webhook-processor.ts:172, and jira-integration.ts:70-76. Additionally, channel_mcp.py still writes a Jira MCP entry that the PR's own docstrings say cannot connect from a headless agent (and emits the SSE URL with "type": "http") — please either drop the "jira" entry from CHANNEL_MCP_BUILDERS or gate it with an in-band log explaining it's expected to fail. The ADR should also be updated for the actual dedup key ({issueKey}#{webhookEvent}#{timestamp} — the ADR says timestamp-only) and, since #296 already merged an ADR-014, renumbered to ADR-015.

  5. User-facing instructions reference a non-existent CLI command. The unmapped-project feedback comment (jira-webhook-processor.ts:247) tells admins to run bgagent jira onboard-project … — the real command is map and requires a <cloud-id> positional. The CLI's own "Next steps" hint (cli/src/commands/jira.ts:555) also omits the required <cloud-id>. Both printed commands fail verbatim.

  6. Dead orchestrator wiring with surplus IAM. cdk/src/stacks/agent.ts:878-898 grants the orchestrator registry read + GetSecretValue/PutSecretValue on bgagent-jira-oauth-* "so the concurrency-cap rejection path can post a Jira comment" — but orchestrate-task.ts only implements notifyLinearOnConcurrencyCap; there is no Jira equivalent. Please either implement notifyJiraOnConcurrencyCap (parity) or remove the grant until the feature lands — unused write-capable IAM on every tenant's credentials is worth avoiding.

Test coverage gaps (would love to see these here, given the surfaces they guard)

  • No CLI Jira tests (~993 lines across cli/src/jira-oauth.ts + commands/jira.ts) — Linear has both linear-oauth.test.ts and command tests. The token exchange/refresh-rotation and the missing-refresh_token branch are the highest-consequence untested code in the PR.
  • No multi-tenant signature test — Linear has a 9-case linear-webhook-multi-workspace.test.ts covering exactly the mismatch/revoked/no-downgrade distinctions that matter for blocking issue 3.
  • jira-feedback.ts is mocked in every test that touches it — the ADF body shape (which Jira REST v3 will 400 on if wrong), timeout, and non-2xx paths have zero direct coverage. Linear ships linear-feedback.test.ts.
  • No construct test for jira-integration.ts — the only construct-level coverage is the table-count bump in agent.test.ts; Linear's construct test asserts key schemas, the dedup TTL attribute, and env wiring.
  • Suggest parameterizing cdk/test/contracts/stored-oauth-token-parity.test.ts to also cover the Jira StoredOauthTokenStoredJiraOauthToken pair — currently that cross-language contract is prose-only.

Non-blocking suggestions

  • jira-link.ts:96-112: the Put (active mapping) + Delete (pending row) aren't atomic — consider TransactWriteItems, or return 200 with a WARN if the delete fails after a successful Put (today the user is told linking failed when it actually succeeded).
  • USER_GUIDE.md:7: "There are five ways to interact" — the list now has six.
  • Consider keeping client_secret out of the per-tenant bundle the agent role can read — a prompt-injected agent could exfiltrate client_secret + refresh_token and mint tokens outside AWS. Splitting client credentials into a separate secret would harden this.
  • resolveSoleTenantCloudId (jira-webhook-processor.ts:89-109): the ScanCommand is unguarded and the processor has no top-level catch/DLQ (parity with Linear, so platform-wide rather than this PR's regression) — a try/catch + a metric on fallback usage would make silent drops countable.
  • Tiny key-builder helpers for the composite keys ({cloudId}#{projectKey}, {cloudId}#{accountId}, pending#{code}) would single-source formats currently interpolated independently at write and read sites (jira-webhook-processor.ts:234,585, jira-link.ts:69,99,111) — the existing jiraOauthSecretName() is the precedent.
  • jira-webhook-processor.ts:123: the | string in the webhookEvent union collapses the literal narrowing; and the cloudId doc comment at jira-webhook.ts:61 sits above matchedWebhookIds rather than the field it describes.
  • The 24h replay freshness window is generous; ~1h matches Atlassian's actual retry behavior.

What's notably good

  • jira-oauth-resolver.test.ts (37 cases) is a model suite — invalid_grant with rotated vs. same refresh token, concurrent-refresh recovery that skips the second POST, non-fatal PutSecretValue persistence failure.
  • The strict (throwing) registry/secret lookups on the verification path, so a transient infra error can't silently downgrade a per-tenant-secured tenant to the stack-wide fallback — exactly the right fail-closed choice.
  • Dedup rollback on processor-invoke failure (jira-webhook.ts:200-227), including the don't-mask-the-original-error nested case.
  • The advisory comment swallowing in jira_reactions.py is logged, circuit-broken, and faithfully mirrors linear_reactions.py.
  • prompt_builder.py:135-146 is the single source of truth on the REST pivot — every other doc just needs to be brought in line with it.

Happy to re-review quickly once the CI fix + mirror sync land; the security binding (issue 3) and the ADR reconciliation (issue 4) are the two I'd most like to see addressed before merge.

test_prompts.py:_config built base from a homogeneous str literal, so ty
inferred dict[str, str] and rejected the spread into TaskConfig's
bool/int/list fields (17 invalid-argument-type errors). Annotate base as
dict[str, Any], matching the existing helper in test_runner.py. This
failure was previously masked by the lint syntax error that aborted the
build before typecheck ran.
@ayushtr-aws

ayushtr-aws commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Code Review — Jira Cloud Integration

Reviewed the security-critical paths (webhook HMAC verification, OAuth resolution/refresh, task creation, agent runtime). Overall this is solid, production-minded work that faithfully mirrors the Linear adapter with well-documented divergences. Below are the findings.

Significant issues

1. Agent-side token refresh can permanently break a tenant (Atlassian rotating refresh tokens)
agent/src/config.py::resolve_jira_oauth_token refreshes an expiring token in-memory but, by design, cannot persist it (the agent role has GetSecretValue only). The TS resolver itself documents that "Atlassian rotates refresh_tokens on every use" (jira-oauth-resolver.ts). So when the agent path actually fires a refresh:

  • It consumes the stored refresh_token; Atlassian invalidates it.
  • The new refresh_token lives only in the agent's memory and is discarded at task end.
  • The Secrets Manager copy now holds a dead refresh token → the next Lambda/agent resolve gets invalid_grant → tenant requires re-onboarding.

This differs from Linear, whose copied trust model assumes the stored refresh token survives. It only triggers if the agent catches a token within 60s of expiry before any Lambda refreshes it, but when it hits it silently breaks the tenant. Consider having the agent never refresh — fail closed and let the orchestrator/Lambda path (which has PutSecretValue) own all refreshes. Worth confirming against Atlassian's rotating-refresh-token semantics before merge.

2. Image-attachment extraction is likely dead code for real Jira issues
jira-webhook-processor.ts::extractImageUrlAttachments scans the converted markdown for ![](https://…) syntax. But the ADF walker (walkAdf) has no media/mediaSingle case — Jira embeds images as media nodes with attrs (id/url), never as markdown image text. The default branch descends into content, and media nodes carry no text, so no image URL is ever emitted. Net effect: the regex matches almost nothing for typical Jira issues. Low severity (documented as a minimal converter), but it gives a false impression of working. Either add a media case or drop the extraction until it's real.

Minor issues

3. ADF→markdown conversion runs twice per issue. buildTaskDescription calls extractDescriptionMarkdown, then the attachments line calls it again on the same issue.fields?.description. Compute once and reuse.

4. Multi-tenant security of the stack-wide fallback secret. When a cloudId isn't in the registry (or its bundle lacks webhook_signing_secret), verification falls back to the single stack-wide secret, which is not bound to any cloudId. In a multi-tenant install, anyone holding the stack-wide secret can forge events with an arbitrary cloudId. This is documented as single-tenant back-compat, but consider explicitly disabling the stack-wide fallback when more than one active tenant is registered (mirroring the resolveSoleTenantCloudId safety logic).

5. resolveSoleTenantCloudId does a full table Scan on every UI-created (cloudId-less) webhook. Fine at current scale and the >1 active tenant short-circuit helps; worth a comment noting the table is expected to stay small, or a GSI on status if it could grow.

6. isWebhookTimestampFresh uses Math.abs, so far-future timestamps (clock skew or crafted) pass the freshness check. Post-signature it's only advisory, so low risk — but a one-sided check (Date.now() - timestamp <= MAX_AGE) is more correct.

7. HMAC over event.body and isBase64Encoded. If API Gateway ever delivers the webhook base64-encoded (binary media settings), both the JSON parse and HMAC will fail. Likely fine given JSON content-type and parity with the Linear template, but worth a guard or an explicit assumption comment.

Done well

  • HMAC verification is correct: raw-body HMAC-SHA256, sha256= prefix stripped, timingSafeEqual with length-mismatch caught.
  • Strict vs non-strict registry/secret lookups prevent a transient DDB/SM throttle from silently downgrading per-tenant verification to stack-wide — a thoughtful trust-boundary detail.
  • Dedup happens after verification via conditional PutItem, with rollback on processor-invoke failure so retries aren't permanently swallowed.
  • Fail-closed parsing: parseRegistryRow treats unknown/missing status as revoked; parseOauthSecret validates required fields.
  • Per-tenant mismatch/revoked are fatal (no fallback); only no-per-tenant-secret falls through — exactly the right asymmetry.
  • Secret redaction extended to JIRA_API_TOKEN; reactions/feedback are truly best-effort with an auth circuit breaker and never gate the pipeline.

Test coverage

Strong — OAuth refresh state machine (concurrent-refresh race, invalid_grant, malformed expiry, network failure), circuit breaker, channel gating, dry-run-must-not-write, and case-sensitive link codes are all covered. Suggested additions: a test asserting the agent-side refresh's non-persistence behavior (issue # 1), and a processor test against a realistic ADF media node (would surface the dead-code behavior in # 2).

Recommendation

I'd treat # 1 (agent refresh burning the rotating refresh token) as a blocker pending confirmation of Atlassian's token semantics, and # 2 as should-fix-or-remove. The rest are fine as follow-ups.

🤖 Generated with Claude Code

@codecov-commenter

codecov-commenter commented Jun 10, 2026

Copy link
Copy Markdown

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 94.34925% with 155 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@8c0f5e3). Learn more about missing BASE report.

Files with missing lines Patch % Lines
cdk/src/handlers/shared/jira-verify.ts 81.55% 38 Missing ⚠️
cdk/src/handlers/jira-webhook-processor.ts 94.51% 36 Missing ⚠️
cdk/src/handlers/jira-webhook.ts 88.27% 32 Missing ⚠️
cdk/src/handlers/shared/jira-oauth-resolver.ts 96.44% 18 Missing ⚠️
cdk/src/handlers/orchestrate-task.ts 83.01% 9 Missing ⚠️
agent/src/pipeline.py 50.00% 6 Missing ⚠️
cdk/src/handlers/jira-link.ts 95.58% 6 Missing ⚠️
cli/src/api-client.ts 58.33% 5 Missing ⚠️
agent/src/config.py 94.73% 3 Missing ⚠️
agent/src/channel_mcp.py 92.85% 1 Missing ⚠️
... and 1 more
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #302   +/-   ##
=======================================
  Coverage        ?   86.81%           
=======================================
  Files           ?      186           
  Lines           ?    43540           
  Branches        ?     4288           
=======================================
  Hits            ?    37798           
  Misses          ?     5742           
  Partials        ?        0           

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

bgagent added 2 commits June 10, 2026 11:34
Three more issues that were masked behind the earlier lint/typecheck
failures, all surfaced once the build progressed to its 'fail on
mutation' gate:

- ruff format reflowed two long boolean/string lines in jira_reactions.py
  and test_jira_reactions.py that were committed unformatted.
- USER_GUIDE.md still referenced the retired `pr_review` task_type on the
  intro 'For example' line (a aws-samples#248 merge leftover); the rest of the guide
  uses `coding/pr-review-v1`. Fixed the source and regenerated the
  Starlight mirror (using/Overview.md).
- Quick-start mirror was missing the Node.js prerequisite line present in
  the QUICK_START.md source; docs-sync adds it.

Full `mise run build` now completes with no working-tree mutation.
…n refresh, ADR/docs

Blocking (krokoko):
- Multi-tenant signature binding: webhook receiver flags stack-wide
  verification to the processor, which then ignores the body cloudId and
  binds to the sole active tenant (drops when ambiguous). CLI no longer
  mirrors the stack-wide secret into new per-tenant bundles; stack-wide is
  seeded once from the first tenant. Missing-timestamp replay skip is logged.
- Renumber ADR-014 -> ADR-015 (collides with workflow-driven-tasks); rewrite
  to the implemented REST-outbound reality (status accepted), correct dedup
  key, binding + refresh-ownership sections. Reconcile JIRA_SETUP_GUIDE,
  USER_GUIDE ("six ways"), ROADMAP, channel_mcp.py (placeholder + in-band
  log), jira-webhook-processor.ts, jira-integration.ts, agent.ts.
- Fix non-existent CLI command in feedback: onboard-project -> map
  <cloud-id> <project-key> --repo (processor + CLI next-steps hint).
- Implement notifyJiraOnConcurrencyCap (Linear parity) so the orchestrator
  IAM grant is used and Jira users aren't silently dropped on the cap.

Significant (ayushtr):
- Agent never refreshes the Jira token (Atlassian rotates refresh_tokens;
  agent has GetSecretValue only). Use the Lambda-written token verbatim and
  fail closed when expiring; Lambda path owns all refreshes.
- ADF media nodes (external images) now render to markdown so attachment
  extraction works; ADF->markdown computed once and reused.

Minor: one-sided clock-skew-tolerant timestamp freshness, base64-body guard,
replay window 24h->1h, resolveSoleTenantCloudId Scan comment.

Tests: agent no-refresh/fail-closed suite, multi-tenant binding tests,
base64/missing-timestamp/stack-wide-flag webhook tests, ADF media-node test,
notifyJiraOnConcurrencyCap parity suite. Docs mirrors regenerated.

Relates to aws-samples#288
@mayakost

Copy link
Copy Markdown
Contributor Author

Thanks both — this was an excellent, security-focused review. Pushed 2dcfb3c addressing it. Point-by-point below.

@krokoko (blocking)

  1. CI red (missing comma). Fixed in 1e4a4a0; the agent suite collects and runs again (including TestResolveJiraOauthToken* and test_jira_reactions.py). CI is green.
  2. Stale Starlight mirrors. Re-synced in ae7e70f and again here; node docs/scripts/sync-starlight.mjs now produces no diff.
  3. Multi-tenant signature binding. Fixed end-to-end:
    • The receiver now passes verified_via_stack_wide to the processor. On a stack-wide-verified delivery the processor ignores the body cloudId and binds to the sole active tenant (dropping when zero/multiple are active), so the stack-wide secret can no longer steer a webhook at an arbitrary tenant's mappings.
    • bgagent jira setup no longer mirrors the stack-wide secret into new per-tenant bundles — each tenant stores its own operator-chosen secret, and the stack-wide fallback is seeded once from the first tenant for single-tenant back-compat.
    • A missing timestamp no longer silently skips the replay check — it's logged (replay-window check skipped), and the window is tightened 24h → 1h.
  4. ADR/docs still describing MCP outbound. Renumbered ADR-014 → ADR-015 (the feat(tasks): workflow-driven tasks — declarative steps + repo-less workflows (#248) #296 collision) and rewrote it to accepted with the implemented REST-outbound design, the actual dedup key ({issueKey}#{webhookEvent}#{timestamp}), and new Multi-tenant signature binding + Token refresh ownership sections. Reconciled JIRA_SETUP_GUIDE.md, USER_GUIDE.md ("six ways"), ROADMAP.md, channel_mcp.py, config.py, jira-webhook-processor.ts, and jira-integration.ts. The jira-server MCP entry is kept as an explicitly-labelled non-functional placeholder and now emits an in-band log saying it's expected to fail and that outbound goes via jira_reactions.py.
  5. Non-existent CLI command. Both the processor feedback and the CLI "Next steps" hint now print bgagent jira map <cloud-id> <project-key> --repo … (the real command, with the required positionals).
  6. Dead orchestrator IAM grant. Implemented notifyJiraOnConcurrencyCap (Linear parity) so the grant is used and Jira users get feedback on the concurrency cap instead of a silent drop; the grant comment now explains why PutSecretValue is needed (the orchestrator is the trusted refresh path).

@ayushtr-aws (significant)

  1. Agent burning the rotating refresh token — confirmed and fixed by removing agent-side refresh entirely. You're right about Atlassian's rotation semantics. The agent has GetSecretValue only, so a refresh would consume the stored refresh_token, keep the replacement in memory for one task, and leave Secrets Manager holding a dead token. The agent now uses the Lambda-written token verbatim and fails closed (skips the advisory comment) when it's expiring; the trusted Lambda path owns all refreshes. The old refresh-path tests are replaced by TestResolveJiraOauthTokenNoRefresh, which asserts no /oauth/token POST is ever made.
  2. ADF image extraction was dead code — fixed. Added media/mediaSingle/mediaGroup cases to the ADF walker; external media now render to ![](url) (so extraction works), while file-type attachment media (needing a Jira API round-trip) are skipped. New processor test covers a realistic media node. Also fixed the double ADF→markdown conversion (Docs: user guide hard-codes us-east-1 #3) — it's computed once and reused.

Minor: timestamp freshness is now one-sided with a clock-skew allowance (#6); a base64-encoded body is rejected before verification (#7); resolveSoleTenantCloudId's Scan has a "table stays small / add a GSI if it grows" note (#5).

Deferred (happy to do in a follow-up)

The broader Linear-parity test suites — full CLI jira-oauth/command tests, direct jira-feedback.ts tests, a jira-integration construct test, and parameterizing stored-oauth-token-parity for the Jira pair — aren't in this push. I added the security-critical tests tied to the blockers (multi-tenant binding, agent no-refresh, base64/timestamp, ADF media, concurrency-cap parity) and would rather land the parity-coverage sweep separately so this review's fixes can merge. Also left as follow-ups: the jira-link.ts Put+Delete atomicity and splitting client_secret out of the agent-readable bundle.

Verification: agent 1007 tests green; CDK Jira handler suites + synth suites pass under the pinned Node (22); check:types-sync and ESLint clean; docs mirrors in sync.

mayakost and others added 3 commits June 10, 2026 15:07
The 'Merge branch main' commit (0f47343) dropped the closing ');' on the
Jira mirrorMarkdownFile() call where it interleaved with main's new
'Deploy preview screenshots' mirror block, producing a SyntaxError that
broke the //docs:sync build step (and thus the whole build job).
The 'Merge branch main' (0f47343) dropped the closing '});' on the
JiraWorkspaceRegistryTableName CfnOutput where main's new
GitHubScreenshotIntegration block was spliced in, cascading into ~40
TS1005 errors and breaking //cdk:compile.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants