Skip to content

Add ContextBuilder for AgentContext assembly (CS-10567)#4301

Merged
habdelra merged 9 commits intomainfrom
cs-10567-implement-context-builder-for-agentcontext-assembly
Apr 2, 2026
Merged

Add ContextBuilder for AgentContext assembly (CS-10567)#4301
habdelra merged 9 commits intomainfrom
cs-10567-implement-context-builder-for-agentcontext-assembly

Conversation

@habdelra
Copy link
Copy Markdown
Contributor

@habdelra habdelra commented Apr 1, 2026

Note: This PR is based on #4292 (worktree-cs-10560-cache-prepare), which needs to be reviewed and merged first.

Summary

  • Add scripts/lib/factory-context-builder.ts — assembles a complete AgentContext from ticket, project, knowledge articles, and resolved skills
  • Aligned with CS-10566's executable tool functions model: tools are no longer in AgentContext (provided separately as FactoryTool[] to agent.run())
  • AgentContext.tools marked optional/deprecated; toolResults, previousActions, iteration no longer set by the builder
  • Supports testResults threading for iteration passes after failed test runs
  • Skill budget enforcement via enforceSkillBudget() when maxSkillTokens is configured
  • Guard context.tools access in factory-prompt-loader.ts with ?? [] for backward compat
  • Add tests/factory-context-builder.test.ts with 14 tests covering skill resolution, budget enforcement, tool exclusion, test results threading, and core fields
  • Add scripts/factory-context-smoke.ts — exercises the full pipeline with real skill files from disk

Try it out

From packages/software-factory/:

# Basic run — exercises skill resolution, loading, context assembly
pnpm factory:context-smoke

Expected output:

=== Context Builder Smoke Test ===

--- Card definition (.gts work) ---
  Ticket: Define StickyNote card

  First pass (no test results):
  ✓ project.id set
  ✓ ticket.id set
  ✓ knowledge: 2 article(s)
  ✓ skills: 3 loaded
  ✓ tools not set (provided separately as FactoryTool[])
  ✓ testResults not set
  ✓ targetRealmUrl set
  ✓ testRealmUrl set
  Skill breakdown (~69269 total tokens):
    - boxel-development: ~6969 tokens + 5 ref(s)
    - boxel-file-structure: ~1964 tokens
    - ember-best-practices: ~60336 tokens + 1 ref(s)

  Iteration pass (with failed test results):
  ✓ testResults.status = failed
  ✓ testResults.failedCount = 1
  ✓ testResults.failures[0] has error
  ✓ skills still loaded on iteration
  ✓ deprecated fields not set

--- Factory workflow ticket ---
  Ticket: Improve factory delivery pipeline

  ... (same checks, different skills resolved) ...

--- Minimal ticket (base case) ---
  Ticket: Add timestamp fields

  ... (same checks, fewer skills for a generic ticket) ...

===========================
  39 passed, 0 failed
===========================

With skill token budget enforcement:

pnpm factory:context-smoke --max-tokens 8000

This trims skills to fit within the budget. You'll see [SkillBudget] Dropping skill ... warnings as skills are trimmed, and a budget verification section at the end:

[SkillBudget] Dropping skill "boxel-file-structure" (1964 tokens) — would exceed budget of 8000 (used: 6969)
[SkillBudget] Dropping skill "ember-best-practices" (60336 tokens) — would exceed budget of 8000 (used: 6969)
  First pass (no test results):
  ✓ project.id set
  ...
  ✓ skills: 1 loaded
  ...
  Skill breakdown (~6969 total tokens):
    - boxel-development: ~6969 tokens + 5 ref(s)

  ... (more tickets, some fit within budget without trimming) ...

--- Budget enforcement (8000 tokens) ---

[SkillBudget] Dropping skill "boxel-file-structure" (1964 tokens) — would exceed budget of 8000 (used: 6969)
[SkillBudget] Dropping skill "ember-best-practices" (60336 tokens) — would exceed budget of 8000 (used: 6969)
  ✓ 1 skills within budget (~6969 tokens <= 8000)
    - boxel-development: ~6969 tokens

===========================
  40 passed, 0 failed
===========================

Test plan

  • pnpm test:node — all 321 tests pass (14 new)
  • pnpm lint — clean (only pre-existing ../base/ glint errors)
  • pnpm factory:context-smoke — 39 passed, 0 failed
  • pnpm factory:context-smoke --max-tokens 8000 — 40 passed, 0 failed

🤖 Generated with Claude Code

habdelra and others added 6 commits March 31, 2026 21:18
cache:prepare now works in a completely fresh worktree with no running
services. Previously it would fail if host dist, boxel-ui dist,
boxel-icons dist, or postgres weren't already available.

Changes:

- Auto-build host dist: when no pre-built host dist exists in the
  worktree or root repo, build it automatically instead of erroring.
  In worktrees, symlinks from the root repo's built dist when available.

- Auto-provision boxel-ui and boxel-icons dist: symlink from root repo
  in worktrees (fast path), or build from source as fallback.

- Managed icon server lifecycle: instead of detecting an external icon
  server and hoping it stays alive, always start our own managed
  process. Falls back to the existing dev server if port 4206 is
  already taken.

- Indexing progress bar: polls the boxel_index table every 2s during
  template builds and renders an in-place progress bar on stderr:

    indexing [===============               ] 199/373 files (53%) 80s
    indexing [==============================] 603/373 files (100%) 212s
    indexing complete: 603 files indexed in 229.5s

- Graceful pg unavailability: databaseExists() now returns false
  instead of throwing when postgres isn't running yet, allowing
  startFactorySupportServices() to start it.

- Stale context validation: cache-realm.ts now validates both hostURL
  and matrixURL in cached support.json, discarding stale contexts
  where either service is unreachable.

Tested: cache:prepare succeeds in a fresh worktree with zero services
running, and also succeeds alongside a running dev-all without
disrupting it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace spawnSync('ln', ...) with Node fs.symlinkSync for
  cross-platform symlink creation in ensureBoxelUIDist and
  ensureBoxelIconsDist
- Re-validate boxel-icons dist after build to catch partial output
- Add child.on('error', ...) handler to icon server process to catch
  spawn failures
- Use fallback child.kill() when process group kill fails in icon
  server stop
- Keep polling probe URL when icon server exits with EADDRINUSE
  instead of throwing immediately (allows external server startup time)
- Guard against overlapping DB polls in progress reporter with
  in-flight flag

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pre-allocate the worker-manager port via findAvailablePort() and pass
it to both startIsolatedRealmStack and the progress reporter. This
lets the reporter poll the /_indexing-status JSON endpoint which has
the exact invalidation graph computed by the index runner.

Progress output shows the current realm, file counts, and queued realms:

    indexing: waiting for status 3s
    indexing base: discovering files... 8s (2 realms remaining)
    indexing base [=====>                        ] 50/153 files (33%) 30s (2 realms remaining)
    indexing software-factory [========>             ] 22/204 files (11%) 94s
    indexing test [==============================] 13/13 files (100%) 210s
    indexing complete: 13/13 files in 229.5s

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Each harness instance now gets a fresh port via findAvailablePort()
instead of falling back to the static DEFAULT_WORKER_MANAGER_PORT
env var, which caused EADDRINUSE collisions in parallel test runs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Each harness instance now gets a fresh port via findAvailablePort()
instead of using a pre-allocated port from the test fixture, which
caused EADDRINUSE collisions due to TOCTOU races.

- Remove DEFAULT_WORKER_MANAGER_PORT constant and
  SOFTWARE_FACTORY_WORKER_MANAGER_PORT env var entirely
- Remove workerManagerPort from TestWorkerPortSet (no longer
  pre-allocated per worker)
- startIsolatedRealmStack always uses findAvailablePort() when no
  explicit port is provided
- cache:prepare path unchanged: still pre-allocates port for progress
  monitoring via /_indexing-status

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implement factory-context-builder.ts that assembles a complete AgentContext
by resolving skills, loading them from disk, applying token budgets, and
gathering tool manifests. ToolRegistry is constructed with SCRIPT_TOOLS +
REALM_API_TOOLS only (boxel-cli tools excluded per CS-10520 constraint).

Tests cover skill resolution delegation, budget enforcement, tool manifest
filtering (no boxel-cli tools), and iteration state threading (testResults,
toolResults, previousActions, iteration).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a reusable ContextBuilder to assemble full AgentContext objects (skills, tools, iteration state, and core ticket/project fields) and extends the software-factory harness to be more self-sufficient in fresh worktrees (auto-building/symlinking dist assets, improving cached-context validation, and adding indexing progress output).

Changes:

  • Add ContextBuilder (factory-context-builder.ts) plus comprehensive QUnit coverage for skill resolution/budgeting, tool manifests, and iteration-state threading.
  • Update harness startup to auto-ensure host/boxel-ui/boxel-icons build artifacts exist (symlink from root checkout when possible; otherwise build).
  • Improve harness/template build ergonomics: remove worker-manager env port plumbing in tests, add worker-manager port override support, and print indexing progress by polling /_indexing-status.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
packages/software-factory/scripts/lib/factory-context-builder.ts New ContextBuilder for assembling AgentContext (skills/tools + iteration state).
packages/software-factory/tests/factory-context-builder.test.ts New test suite (17 tests) covering context assembly behavior.
packages/software-factory/src/harness/support-services.ts Auto-ensure/build/symlink host + boxel-ui + boxel-icons dist artifacts; manage icon server lifecycle.
packages/software-factory/src/harness/database.ts Add indexing progress reporter; adjust PG connection behavior for existence checks.
packages/software-factory/src/harness/isolated-realm-stack.ts Allow explicit worker-manager port (used for progress polling).
packages/software-factory/src/harness/shared.ts Remove exported default worker-manager port env constant.
packages/software-factory/tests/fixtures.ts Remove stable worker-manager port allocation/env wiring for Playwright harness; shift prerender port.
packages/software-factory/src/cli/cache-realm.ts Validate cached support context URLs (host + matrix) before reuse.
Comments suppressed due to low confidence (1)

packages/software-factory/src/harness/isolated-realm-stack.ts:304

  • actualWorkerManagerPort and actualRealmServerPort can be derived from separate findAvailablePort() calls without reserving either port. This can (rarely but definitively) result in the same port being selected for both processes, causing one to fail to bind. Consider ensuring uniqueness (e.g., if the second selection equals the first, pick again) or switching to a port-allocation helper that reserves ports until the child processes are listening.
  let actualWorkerManagerPort =
    explicitWorkerManagerPort ?? (await findAvailablePort());
  let actualRealmServerPort =
    DEFAULT_REALM_SERVER_PORT === 0
      ? await findAvailablePort()
      : DEFAULT_REALM_SERVER_PORT;
  let actualRealmServerURL = withPort(realmServerURL, actualRealmServerPort);

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread packages/software-factory/src/harness/support-services.ts
Comment thread packages/software-factory/src/harness/support-services.ts
Comment thread packages/software-factory/tests/fixtures.ts
Comment thread packages/software-factory/src/harness/database.ts
habdelra and others added 3 commits April 1, 2026 14:14
Align with CS-10566's shift to executable FactoryTool[]: the context
builder no longer assembles tool manifests or threads toolResults,
previousActions, or iteration number. Tools are provided separately
to agent.run(), and tool call history is managed by the orchestrator.

AgentContext.tools is now optional (deprecated), and the context builder
only sets: project, ticket, knowledge, skills, testResults, realm URLs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Guard context.tools access with ?? [] in factory-prompt-loader.ts
since tools is now optional on AgentContext.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Exercises the full ContextBuilder pipeline using real skill files from
disk: skill resolution, loading, budget enforcement, test results
threading, and verification that deprecated fields are not set.

Run: pnpm factory:context-smoke
With budget: pnpm factory:context-smoke --max-tokens 8000

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@habdelra habdelra requested a review from a team April 1, 2026 18:56
@habdelra habdelra merged commit b281123 into main Apr 2, 2026
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants