Implement executable tool functions for factory agent by habdelra · Pull Request #4302 · cardstack/boxel

habdelra · 2026-04-01T16:15:33Z

Note: This PR is based on #4292 which needs to be reviewed and merged first.

Summary

Architecture shift from declarative AgentAction[] DSL to executable tool functions:

Adds factory-tool-builder.ts that builds FactoryTool[] — each tool has a JSON Schema definition (for the LLM) and an execute function wrapped with auth + safety middleware
The agent calls tools directly via native tool-use protocol instead of returning a flat action array for a dispatcher to interpret
Updates ToolExecutor.execute() to accept (toolName, toolArgs) directly — removes the old AgentAction object form
Refactors executeRealmApi to delegate to realm-operations functions instead of raw HTTP calls
Adds deleteCard(), atomicOperation(), getServerSession() to realm-operations.ts
Updates all test files (factory-tool-executor.test.ts, .spec.ts, .integration.test.ts) to use new signature
Updates factory-tools-smoke.ts to use the new call signature and exercise the ToolBuilder
Updates one-shot-factory-go-plan.md design doc with the new architecture
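
As a rough illustration of the shift summarized above, a tool can be modeled as a schema plus an `execute` function, with middleware wrapped around the function. This is a minimal sketch, not the code in factory-tool-builder.ts: the `ToolParameters`, `ToolResult`, and `FactoryTool` shapes, the `findMissingArgs`, `withRequiredArgs`, and `callTool` helpers, and the `example_tool` name are all assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration; not copied from factory-tool-builder.ts.
interface ToolParameters {
  type: "object";
  properties: Record<string, { type: string; description?: string }>;
  required: string[];
}

interface ToolResult {
  ok: boolean;
  output?: unknown;
  error?: string;
}

interface FactoryTool {
  name: string;
  description: string;
  parameters: ToolParameters; // JSON Schema definition shown to the LLM
  execute(args: Record<string, unknown>): Promise<ToolResult>;
}

// Returns the names of required arguments that the caller did not supply.
function findMissingArgs(tool: FactoryTool, args: Record<string, unknown>): string[] {
  return tool.parameters.required.filter((key) => !(key in args));
}

// Safety middleware: return an error result instead of running the handler.
function withRequiredArgs(tool: FactoryTool): FactoryTool {
  return {
    ...tool,
    async execute(args: Record<string, unknown>): Promise<ToolResult> {
      const missing = findMissingArgs(tool, args);
      if (missing.length > 0) {
        return { ok: false, error: `missing required args: ${missing.join(", ")}` };
      }
      return tool.execute(args);
    },
  };
}

// The agent names a tool and passes arguments; no action array is interpreted.
async function callTool(
  tools: FactoryTool[],
  toolName: string,
  toolArgs: Record<string, unknown>,
): Promise<ToolResult> {
  const tool = tools.find((candidate) => candidate.name === toolName);
  if (!tool) {
    return { ok: false, error: `unknown tool: ${toolName}` };
  }
  return tool.execute(toolArgs);
}

// A hypothetical tool that echoes back the path it was given.
const exampleTool: FactoryTool = withRequiredArgs({
  name: "example_tool",
  description: "Echoes the path it was given",
  parameters: {
    type: "object",
    properties: { path: { type: "string" } },
    required: ["path"],
  },
  async execute(args: Record<string, unknown>): Promise<ToolResult> {
    return { ok: true, output: args["path"] };
  },
});
```

Under these assumptions, `callTool([exampleTool], "example_tool", { path: "my-card.gts" })` resolves to a successful result, while omitting `path` resolves to an error result without running the handler.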

Key Design

write_file routes .gts/.ts → writeModuleSource(), .json → writeCardSource()
read_file, search_realm wrap realm operations with per-realm JWT auth
update_ticket, create_knowledge write card source to target realm
signal_done, request_clarification return control flow signals
Registered script/realm-api tools are wrapped as FactoryTool delegating to ToolExecutor
Auth model: per-realm JWTs via realmTokens for realm-url tools, server JWT via serverToken for realm-server-url tools (realm-create, realm-auth, realm-server-session)
realm-read/realm-write always use application/vnd.card+source Accept header — no custom overrides
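
The auth model and the fixed Accept header can be sketched as a pair of small helpers. This is an assumption-laden illustration: `realmTokens`, `serverToken`, the tool names, the `realm-url` argument, and the MIME type come from the description above, while `AuthConfig`, `resolveToken`, `buildHeaders`, the exact-match token lookup, and the `Bearer` header format are hypothetical.

```typescript
// Sketch only; helper names and the Authorization format are assumptions.
interface AuthConfig {
  realmTokens: Record<string, string>; // realm URL -> per-realm JWT
  serverToken: string; // realm-server JWT
}

// Tools that address the realm server rather than a single realm.
const SERVER_LEVEL_TOOLS = new Set(["realm-create", "realm-auth", "realm-server-session"]);
const CARD_SOURCE_MIME = "application/vnd.card+source";

// Picks the server JWT for server-level tools, otherwise the JWT for the realm.
function resolveToken(
  toolName: string,
  args: Record<string, unknown>,
  auth: AuthConfig,
): string | undefined {
  if (SERVER_LEVEL_TOOLS.has(toolName)) {
    return auth.serverToken;
  }
  const realmUrl = args["realm-url"];
  // Exact-match lookup, kept simple for the sketch.
  return typeof realmUrl === "string" ? auth.realmTokens[realmUrl] : undefined;
}

// Builds request headers; realm-read and realm-write always get the same Accept value.
function buildHeaders(
  toolName: string,
  args: Record<string, unknown>,
  auth: AuthConfig,
): Record<string, string> {
  const headers: Record<string, string> = {};
  if (toolName === "realm-read" || toolName === "realm-write") {
    headers["Accept"] = CARD_SOURCE_MIME;
  }
  const token = resolveToken(toolName, args, auth);
  if (token !== undefined) {
    headers["Authorization"] = `Bearer ${token}`; // assumed header format
  }
  return headers;
}

const exampleAuth: AuthConfig = {
  realmTokens: { "https://realms.example.test/user/target/": "realm-jwt" },
  serverToken: "server-jwt",
};
```

With `exampleAuth`, a `realm-read` call against the target realm picks up the per-realm JWT and the card+source Accept header, while `realm-create` picks up the server JWT.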

Try it out

No running services needed. From packages/software-factory/:

pnpm factory:tools-smoke

Expected output:

=== Tool Registry ===

Registered tools: 18

  script (4):
    - search-realm  [json]  required: realm  ...
    - pick-ticket  [json]  required: realm  ...
    ...

  realm-api (8):
    - realm-read  [json]  required: realm-url, path  ...
    - realm-write  [json]  required: realm-url, path, content  ...
    ...

  ✓ has script tools
  ✓ has boxel-cli tools
  ✓ has realm-api tools
  ✓ all names unique

=== Argument Validation ===

  ✓ valid args -> no errors
  ✓ missing required arg -> error
  ✓ unknown tool -> error

=== Safety Constraints ===

  ✓ rejects unregistered tool
  ✓ rejects source realm
  ✓ rejects unknown realm

=== Realm API Round-Trip (mock) ===

  -> GET https://realms.example.test/user/target/CardDef/hello.gts
  ✓ realm-read exitCode=0
  ✓ realm-read has output
  ✓ realm-read duration 34ms
  -> QUERY https://realms.example.test/user/target/_search
  ✓ realm-search exitCode=0
  -> POST https://realms.example.test/user/target/CardDef/new.gts
  ✓ realm-write exitCode=0
  ✓ mock fetch called 3 times

=== Factory Tool Builder ===

  Built 19 tools:
    factory: write_file, read_file, search_realm, update_ticket,
             create_knowledge, signal_done, request_clarification
    registered: search-realm, pick-ticket, get-session, run-realm-tests,
                realm-read, realm-write, realm-delete, realm-atomic, ...

  ✓ has write_file tool
  ✓ has read_file tool
  ✓ has search_realm tool
  ✓ has signal_done tool
  ✓ has request_clarification tool
  ✓ includes registered script tools
  ✓ includes registered realm-api tools
  -> POST https://realms.example.test/user/target/my-card.gts
  ✓ write_file .gts succeeds
  ✓ write_file .gts made HTTP call
  -> POST https://realms.example.test/user/target/Card/1.json
  ✓ write_file .json succeeds
  ✓ write_file .json made HTTP call
  -> POST https://realms.example.test/user/target-tests/Tests/spec.ts
  ✓ write_file to test realm made HTTP call
  ✓ signal_done returns DONE_SIGNAL
  ✓ request_clarification returns CLARIFICATION_SIGNAL
  ✓ request_clarification has message

===========================
  31 passed, 0 failed
===========================

Test plan

Closes CS-10566

🤖 Generated with Claude Code

Copilot

Pull request overview

Implements a new ActionDispatcher for applying AgentAction[] side-effects (realm writes + tool invocations) and enhances the software-factory harness to be more self-sufficient in fresh worktrees (auto-building/symlinking dist artifacts, improved template build progress reporting, and more resilient cached-context validation).

Changes:

Added factory-action-dispatcher.ts plus a comprehensive QUnit test suite covering routing, auth, signals, tool delegation, and error isolation.
Updated harness startup/template-building flow to auto-build/symlink required dist artifacts and to report indexing progress via worker-manager’s /_indexing-status.
Simplified/remodeled worker-manager port handling (dynamic by default; optional explicit port for monitoring) and improved stale cached context URL validation.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

File	Description
packages/software-factory/scripts/lib/factory-action-dispatcher.ts	New dispatcher that executes AgentAction side effects (writes/tools/signals) across target/test realms.
packages/software-factory/tests/factory-action-dispatcher.test.ts	New unit tests covering dispatcher routing, auth, tool execution, signals, and error isolation.
packages/software-factory/tests/fixtures.ts	Removes fixed worker-manager port allocation/env wiring in test fixture setup.
packages/software-factory/src/harness/support-services.ts	Adds auto symlink/build paths for host + boxel-ui/boxel-icons dist; adjusts icon server lifecycle management.
packages/software-factory/src/harness/shared.ts	Removes `DEFAULT_WORKER_MANAGER_PORT` env-derived constant.
packages/software-factory/src/harness/isolated-realm-stack.ts	Adds optional explicit `workerManagerPort`; otherwise picks dynamically and passes through to worker-manager.
packages/software-factory/src/harness/database.ts	Makes `databaseExists` more resilient, adds worker-manager indexing progress reporter during template builds.
packages/software-factory/src/cli/cache-realm.ts	Validates cached `support.json` context by probing hostURL and matrixURL before reuse.

packages/software-factory/scripts/lib/factory-action-dispatcher.ts

packages/software-factory/src/harness/database.ts

Copilot

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 6 comments.

packages/software-factory/scripts/lib/factory-tool-builder.ts

packages/software-factory/src/harness/support-services.ts

packages/software-factory/docs/one-shot-factory-go-plan.md

packages/software-factory/tests/fixtures.ts

Add factory-tool-builder.ts that builds FactoryTool[]: executable tool functions the agent calls directly via the LLM's native tool-use protocol. Each tool has a JSON Schema definition and an execute function wrapped with auth + safety middleware.

Tools provided:
- write_file: routes .gts/.ts → writeModuleSource, .json → writeCardSource
- read_file: reads from realm via readCardSource
- search_realm: wraps searchRealm() with per-realm JWT auth
- update_ticket, create_knowledge: card writes to target realm
- signal_done, request_clarification: control flow signals
- All registered script/realm-api tools wrapped as FactoryTool delegates

Also updates:
- ToolExecutor.execute() accepts (toolName, toolArgs) directly
- factory-tools-smoke.ts uses the new string-based call signature
- one-shot-factory-go-plan.md design doc reflects the new architecture

Closes CS-10566

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Resolves TS2345 lint error: hoist matrixAuth variable above if/else branches so both the ok and error paths can use the narrowed local. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- write_file uses writeModuleSource for ALL files (no JSON.parse routing). Card+source MIME type means the realm server accepts raw content as-is regardless of extension. Caller is responsible for file extensions.
- buildFetchOptions uses resolveAuthForUrl for trailing-slash-safe token lookup
- Add run_tests tool wrapping executeTestRunFromRealm with both per-realm JWT and server JWT
- Update doc ToolExecutor.execute signature to include options param
- Remove unused isModuleFile/MODULE_EXTENSIONS
- Update tests: remove JSON routing assertions, remove invalid-JSON test for write_file

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
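
One way to read "trailing-slash-safe token lookup" is that a realm URL should resolve to the same token whether or not it ends in a slash. The sketch below illustrates that idea only; it is not the implementation of resolveAuthForUrl, and `normalizeRealmUrl` and `lookupRealmToken` are hypothetical names.

```typescript
// Hypothetical helpers; the real resolveAuthForUrl is not shown in this PR page.

// Ensures a realm URL ends with exactly the form "…/" used for comparison.
function normalizeRealmUrl(url: string): string {
  return url.endsWith("/") ? url : `${url}/`;
}

// Finds the token for a realm URL, ignoring a missing or extra trailing slash
// on either the configured key or the requested URL.
function lookupRealmToken(
  realmTokens: Record<string, string>,
  url: string,
): string | undefined {
  const wanted = normalizeRealmUrl(url);
  for (const realmUrl of Object.keys(realmTokens)) {
    if (normalizeRealmUrl(realmUrl) === wanted) {
      return realmTokens[realmUrl];
    }
  }
  return undefined;
}

const exampleTokens: Record<string, string> = {
  "https://realms.example.test/user/target/": "target-jwt",
  "https://realms.example.test/user/target-tests": "tests-jwt",
};
```

Here both `https://realms.example.test/user/target` and `https://realms.example.test/user/target/` resolve to the same token, which avoids a silent auth miss when callers and configuration disagree about the slash.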

- Make executeTestRun injectable via ToolBuilderConfig for testing
- Add 4 tests: tool shape/parameters, full options threading (auth, serverToken, matrixAuth, testResultsModuleUrl), default module URL fallback, and target realm JWT verification

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
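
Making the test runner injectable typically means the builder accepts an optional function in its config and falls back to the real one otherwise. The following is a minimal sketch of that pattern under assumed types: only the names `executeTestRun`, `ToolBuilderConfig`, and `run_tests` come from the commit messages, and everything else (`TestRunSummary`, `targetRealmUrl`, `buildRunTestsTool`, the fake runner) is hypothetical.

```typescript
// Hypothetical types; the real ToolBuilderConfig carries more options.
interface TestRunSummary {
  passed: number;
  failed: number;
}

type ExecuteTestRun = (targetRealmUrl: string) => Promise<TestRunSummary>;

interface ToolBuilderConfig {
  targetRealmUrl: string;
  executeTestRun?: ExecuteTestRun; // optional override used by tests
}

// Stand-in for the real runner, which is not part of this sketch.
const defaultExecuteTestRun: ExecuteTestRun = async () => {
  throw new Error("real test runner is not part of this sketch");
};

// Builds a run_tests tool that delegates to the injected runner when present.
function buildRunTestsTool(config: ToolBuilderConfig) {
  const run = config.executeTestRun ?? defaultExecuteTestRun;
  return {
    name: "run_tests",
    async execute(): Promise<TestRunSummary> {
      return run(config.targetRealmUrl);
    },
  };
}

// In a test, a fake records how it was called instead of running real tests.
const recordedCalls: string[] = [];
const fakeRunner: ExecuteTestRun = async (targetRealmUrl) => {
  recordedCalls.push(targetRealmUrl);
  return { passed: 1, failed: 0 };
};

const runTestsTool = buildRunTestsTool({
  targetRealmUrl: "https://realms.example.test/user/target/",
  executeTestRun: fakeRunner,
});
```

A test can then call `runTestsTool.execute()` and assert on `recordedCalls` to verify which realm URL was threaded through, without starting any services.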

Copilot

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.

packages/software-factory/scripts/lib/factory-tool-executor.ts

packages/software-factory/scripts/lib/realm-operations.ts

packages/software-factory/src/harness/support-services.ts

packages/software-factory/src/harness/database.ts

packages/software-factory/docs/one-shot-factory-go-plan.md

…ssion

- realm-search: fail with exitCode=1 and clear error on non-string or invalid JSON query args instead of silently coercing to {}
- getServerSession: return explicit error when 200 OK response has no token (empty body, non-JSON body, or missing token field)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
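
The two behaviors described in this commit amount to failing loudly instead of guessing. A sketch of what such checks might look like is below; `StepResult`, `parseSearchQuery`, and `extractSessionToken` are hypothetical names, and the error strings and the assumption that the token arrives in a `token` field of the JSON body are illustrative rather than taken from realm-operations.ts.

```typescript
// Hypothetical result shape and helpers for illustration.
interface StepResult {
  exitCode: number;
  value?: unknown;
  error?: string;
}

// Rejects non-string or unparseable query args rather than coercing them to {}.
function parseSearchQuery(rawQuery: unknown): StepResult {
  if (typeof rawQuery !== "string") {
    return { exitCode: 1, error: "query must be a JSON string" };
  }
  try {
    return { exitCode: 0, value: JSON.parse(rawQuery) };
  } catch {
    return { exitCode: 1, error: "query is not valid JSON" };
  }
}

// Treats a 200 OK without a usable token as an explicit error.
function extractSessionToken(status: number, body: string): StepResult {
  if (status !== 200) {
    return { exitCode: 1, error: `unexpected status ${status}` };
  }
  if (body.trim() === "") {
    return { exitCode: 1, error: "200 OK response had an empty body" };
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    return { exitCode: 1, error: "200 OK response body was not JSON" };
  }
  const token =
    typeof parsed === "object" && parsed !== null
      ? (parsed as Record<string, unknown>)["token"]
      : undefined;
  if (typeof token !== "string" || token === "") {
    return { exitCode: 1, error: "200 OK response had no token field" };
  }
  return { exitCode: 0, value: token };
}
```

The design choice in both helpers is the same: an invalid input produces a non-zero exit code and a message naming the problem, so the agent sees the failure instead of an empty search or a missing session.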

The doc still described write_file as routing .gts/.ts to writeModuleSource and .json to writeCardSource. Updated to reflect the current implementation: all files written via writeModuleSource with card+source MIME type. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Keep our updated write_file description (raw content, no routing), ToolExecutor.execute options parameter, and simplified example code. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

habdelra requested a review from Copilot April 1, 2026 17:37

Copilot started reviewing on behalf of habdelra April 1, 2026 17:38

Copilot AI reviewed Apr 1, 2026

habdelra changed the title from "Implement action dispatcher for AgentAction[] realm writes" to "Implement executable tool functions for factory agent" Apr 1, 2026

habdelra force-pushed the worktree-cs-10566-action-dispatcher branch 4 times, most recently from 9c3f80f to 92f4489 Compare April 1, 2026 18:25

habdelra requested a review from Copilot April 1, 2026 18:29

Copilot started reviewing on behalf of habdelra April 1, 2026 18:30

habdelra force-pushed the worktree-cs-10566-action-dispatcher branch from 92f4489 to 970aab3 Compare April 1, 2026 18:32

Copilot AI reviewed Apr 1, 2026

habdelra force-pushed the worktree-cs-10566-action-dispatcher branch 3 times, most recently from 0594305 to 8684cc3 Compare April 1, 2026 18:53

habdelra force-pushed the worktree-cs-10566-action-dispatcher branch from 8684cc3 to 6bceca7 Compare April 1, 2026 19:05

habdelra and others added 3 commits April 1, 2026 15:14

Merge main into worktree-cs-10566-action-dispatcher

42a701e

Simplify write_file tool description

5e69649

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

habdelra requested a review from Copilot April 1, 2026 19:47

Copilot started reviewing on behalf of habdelra April 1, 2026 19:48

Copilot AI reviewed Apr 1, 2026

habdelra and others added 3 commits April 1, 2026 15:58

Fix prettier formatting on design doc

c8e83cc

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

habdelra requested a review from a team April 1, 2026 20:29

backspace approved these changes Apr 2, 2026

Merge main and resolve doc conflicts

c9f4939

habdelra merged commit 33c3e50 into main Apr 2, 2026
19 checks passed

habdelra merged 9 commits into main from worktree-cs-10566-action-dispatcher

3 participants
