(orchestration): fleet task fan-out across multiple repositories #392

@krokoko

Description

Component

API or orchestration

Describe the feature

Provide a first-class way to submit one logical unit of work against many onboarded repositories in parallel—for example, the same workflow and task description applied to each repo in a set—with aggregated status, per-repo outcomes, and fleet-level concurrency controls.

Fleet jobs should compose with existing Blueprints, workflows, admission limits, and triggers rather than requiring external scripts that loop over bgagent submit.

Use case

I'm always frustrated when org-wide maintenance (dependency upgrades, lint rule adoption, API migrations) requires N manual submissions or bespoke automation outside the platform. Teams running migrations across dozens or hundreds of repos need a batch abstraction: one intent, many PRs, a single dashboard of progress, and clear failure isolation per repo. Without fleet orchestration, ABCA stays optimized for single-repo tasks and under-delivers on high-ROI maintenance workloads.

Proposed solution

  1. Fleet job entity: fleet_id, creator, workflow_ref, shared task_description / parameters, repo list (explicit or selector), concurrency cap, created/completed timestamps, rollup status.
  2. Fan-out worker: Durable Lambda or step function that enqueues per-repo child tasks via create-task core; child tasks reference fleet_id and inherit idempotency semantics (fleet_id, repo).
  3. API / CLI: POST /v1/fleets, bgagent fleet submit --repos-file ... --workflow ..., bgagent fleet status <id> with per-repo terminal states and PR links.
  4. Selectors (phase 2): Optional repo filters: Blueprint tag, org-wide manifest, or CSV from operator tooling.
  5. Observability: Metrics for fleet completion rate, cost rollup, and failure taxonomy; optional notification when fleet reaches terminal state.
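
To make items 1 and 2 concrete, here is a minimal Python sketch of the fleet entity and an idempotent fan-out. It is illustrative only, not the platform's implementation. The field names come from the list above, but their types, the `FleetJob` class, the `idempotency_key` and `fan_out` helpers, the `create_task` callable, and the in-memory `seen` dict (standing in for a durable store) are all assumptions. Concurrency caps and selectors are not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class FleetJob:
    # Field names follow the proposal; the types are assumptions.
    fleet_id: str
    creator: str
    workflow_ref: str
    task_description: str
    repos: List[str]
    concurrency_cap: int = 5  # hypothetical default, not enforced in this sketch


def idempotency_key(fleet_id: str, repo: str) -> str:
    """Derive a stable key from (fleet_id, repo) so a retried fan-out does not duplicate a child."""
    return f"{fleet_id}:{repo}"


def fan_out(
    job: FleetJob,
    create_task: Callable[[dict], str],
    seen: Dict[str, str],
) -> Dict[str, str]:
    """Create at most one child task per repo and return a repo -> task_id mapping.

    `create_task` is a stand-in for the existing create-task core.
    `seen` maps idempotency keys to task ids and stands in for durable storage.
    """
    children: Dict[str, str] = {}
    for repo in dict.fromkeys(job.repos):  # drop duplicate repos, keep order
        key = idempotency_key(job.fleet_id, repo)
        if key not in seen:
            seen[key] = create_task(
                {
                    "fleet_id": job.fleet_id,
                    "repo": repo,
                    "workflow_ref": job.workflow_ref,
                    "task_description": job.task_description,
                }
            )
        children[repo] = seen[key]
    return children
```

Calling `fan_out` a second time with the same `seen` store returns the same task ids without invoking `create_task` again, which is the property a durable retry would rely on.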

Acceptance criteria

  • Operator can create a fleet job targeting at least 2 repos and receive distinct child task_ids with stable linkage to the fleet.
  • Fleet-level status reflects per-repo COMPLETED/FAILED/CANCELLED counts and links to PRs where applicable.
  • Fleet fan-out respects system and per-user concurrency; behavior when limits hit is documented (queue vs fail).
  • Cancelling a fleet cancels or stops not-yet-started child tasks per documented rules.
  • Tests cover fan-out, partial failure, and fleet cancellation.
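
The rollup and cancellation criteria above could be exercised with something like the following Python sketch. The terminal states COMPLETED, FAILED, and CANCELLED come from the criteria. The non-terminal states `QUEUED` and `RUNNING`, the fleet-level status `IN_PROGRESS`, the precedence of FAILED over CANCELLED, and the rule that only queued children are cancelled are all assumptions that the documented rules would need to confirm or replace.

```python
from collections import Counter
from typing import Dict, Tuple

# Terminal child states named in the acceptance criteria.
TERMINAL = ("COMPLETED", "FAILED", "CANCELLED")
# Assumed name for a child task that has not started yet.
NOT_STARTED = "QUEUED"


def rollup(child_states: Dict[str, str]) -> Tuple[str, Dict[str, int]]:
    """Return (fleet_status, per-state counts) from a repo -> child state mapping."""
    tally = Counter(child_states.values())
    counts = {state: tally[state] for state in TERMINAL}
    if any(state not in TERMINAL for state in child_states.values()):
        status = "IN_PROGRESS"  # assumed fleet-level name
    elif counts["FAILED"]:
        status = "FAILED"
    elif counts["CANCELLED"]:
        status = "CANCELLED"
    else:
        status = "COMPLETED"
    return status, counts


def cancel_not_started(child_states: Dict[str, str]) -> Dict[str, str]:
    """Return a copy where not-yet-started children are cancelled; others are left untouched."""
    return {
        repo: "CANCELLED" if state == NOT_STARTED else state
        for repo, state in child_states.items()
    }
```

Under these assumptions, a fleet with one failed repo and several completed repos rolls up to FAILED while the per-state counts still show how many repos succeeded, which keeps the per-repo failure isolation visible.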

Other information

  • Related roadmap: Scheduled triggers, Agent swarm (different problem—single repo DAG vs multi-repo fan-out).
  • Related design: docs/design/ORCHESTRATOR.md, docs/design/REPO_ONBOARDING.md, docs/design/API_CONTRACT.md.
  • Child tasks should reuse existing task lifecycle; avoid a parallel state machine where possible.
  • Alternatives considered: CLI wrapper only (no aggregation, no governance); one mega-task cloning many repos in one session (poor isolation and blast radius).

Acknowledgements

  • I may be able to implement this feature
  • This might be a breaking change

Metadata

Labels: enhancement, orchestration
