agentHost/claude: ground Phase 3 in the production reference #313686

Closed
TylerLeonhardt wants to merge 1 commit into main from tyler/claude-spike-phase3

TylerLeonhardt (Member) commented May 1, 2026

The Copilot extension at extensions/copilot/src/extension/chatSessions/claude/ already ships a working integration of @anthropic-ai/claude-agent-sdk 0.2.112 with a local proxy. That implementation is higher-fidelity evidence than any ad-hoc spike could produce.

This replaces the original "Phase 3 spike + findings" framing with a reference-grounded scoping for Phase 4. The Copilot extension is the reference and the guiding path — not a verbatim blueprint. It has accreted ~20 concerns (MCP gateway, plugins, edit tracker, settings change tracker, OTel forwarding, hook events, debug file logger, ripgrep PATH munging, runtime data caching, folder MRU, …) layered on top of the core SDK ↔ proxy contract. Copying the whole shape on day one would obscure which pieces are essential and break incremental review.

What the new Phase 3 section captures

  1. Required-for-Phase-4 Options table. The minimum set of Options fields to produce a working turn (cwd, executable, abortController, allowDangerouslySkipPermissions + canUseTool pair, model, permissionMode, systemPrompt preset, settings.env with proxy URL/auth/disable-nonessential-traffic, disallowedTools: ['WebSearch'], stderr). Each row cites the line in claudeCodeAgent.ts where the production code uses it.
  2. Deferred-concerns table. Every additional knob the extension uses (mcpServers, plugins, settingSources, OTel env, ripgrep PATH, hook events, edit/settings trackers, etc.) mapped to the phase that should pull it in. Phase 4 starts small.
  3. One genuine open question the extension cannot answer for us: byte-equivalence between our ClaudeProxyService and the extension's ClaudeLanguageModelServer. Closed by a Phase 4 unit test that points the SDK (with a stubbed CAPI) at our proxy and asserts the same message sequence the extension's tests assert. No throw-away spike committed.
  4. Phase 2 design divergences with the extension noted (e.g. our proxy's /v1/models and count_tokens handling, the anthropic-beta whitelist size). Phase 4 decides whether to converge.
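
As a rough illustration of item 1, the required fields could be gathered along the following lines. This is a sketch under stated assumptions, not the code in claudeCodeAgent.ts: `MinimalOptionsSketch` and `buildMinimalOptions` are hypothetical names, the interface is a local stand-in rather than the SDK's real `Options` type, and the environment variable names, the preset name, and the `permissionMode` value are assumptions to be checked against the reference.

```typescript
// Local stand-in listing the fields named in the Required-for-Phase-4 table.
// It is NOT the SDK's real Options type; the value shapes are assumptions.
interface MinimalOptionsSketch {
	cwd: string;
	executable: string;
	abortController: AbortController;
	allowDangerouslySkipPermissions: boolean;
	canUseTool: (toolName: string, input: unknown) => Promise<{ behavior: 'allow' | 'deny' }>;
	model: string;
	permissionMode: string;
	systemPrompt: { type: 'preset'; preset: string };
	settings: { env: Record<string, string> };
	disallowedTools: string[];
	stderr: (data: string) => void;
}

// Hypothetical helper: builds the minimum option set for one turn through a local proxy.
function buildMinimalOptions(proxyUrl: string, proxyToken: string, cwd: string): MinimalOptionsSketch {
	return {
		cwd,
		executable: 'node', // assumption: the runtime used to launch the SDK subprocess
		abortController: new AbortController(),
		allowDangerouslySkipPermissions: true, // paired with canUseTool below
		canUseTool: async () => ({ behavior: 'deny' }), // placeholder policy: deny everything
		model: 'hypothetical-model-id',
		permissionMode: 'default', // assumed value
		systemPrompt: { type: 'preset', preset: 'claude_code' }, // assumed preset name
		settings: {
			env: {
				// Assumed variable names for proxy URL, auth, and disabling nonessential traffic.
				ANTHROPIC_BASE_URL: proxyUrl,
				ANTHROPIC_AUTH_TOKEN: proxyToken,
				CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC: '1',
			},
		},
		disallowedTools: ['WebSearch'],
		stderr: data => console.error(data),
	};
}
```

Keeping the set this small is the point of the table: each field has a stated reason to exist, and anything else waits for the phase that owns it.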

Downstream phases aligned

  • Phase 6: enableFileCheckpointing note now points at Phase 8 owning the validation step.
  • Phase 8: undo mechanism keeps the SDK rewind path as preferred and the in-agent edit-history mechanism as the fallback if rewind misbehaves.
  • Phase 9: abortSession defaults to _abortController.abort() (matching the reference), with Query.interrupt() evaluated as a follow-up only if the default path orphans the subprocess.
  • Phase 15 + Open questions: dropped "Phase 3 spike answers" framing in favor of phase-owned validation steps.
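
The Phase 9 default could look roughly like the sketch below. `SessionSketch` and `QueryLike` are hypothetical names introduced for illustration; only the `_abortController.abort()` default and the existence of `Query.interrupt()` come from the roadmap, and the ordering in `interruptThenAbort` is one possible shape for the follow-up, not a decided design.

```typescript
// Hypothetical stand-in for the SDK's Query object; only interrupt() is modeled.
interface QueryLike {
	interrupt(): Promise<void>;
}

class SessionSketch {
	private readonly _abortController = new AbortController();
	private readonly _query: QueryLike | undefined;

	constructor(query?: QueryLike) {
		this._query = query;
	}

	/** Signal that would be handed to the SDK alongside the controller. */
	get signal(): AbortSignal {
		return this._abortController.signal;
	}

	/** Default path: abort the controller, matching the reference. */
	abortSession(): void {
		this._abortController.abort();
	}

	/** Possible follow-up if the default orphans the subprocess: interrupt first, then abort. */
	async interruptThenAbort(): Promise<void> {
		try {
			await this._query?.interrupt();
		} finally {
			// Abort even if interrupt() rejects, so the session never stays half-cancelled.
			this._abortController.abort();
		}
	}
}
```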

What's NOT in this PR

  • No spike code. The earlier draft committed a small throwaway script; that was over-confident given how much the production extension already proves. The script is deleted.
  • No code changes to ClaudeProxyService or anything else under src/vs/platform/agentHost/.

Just a roadmap update.

Copilot AI review requested due to automatic review settings May 1, 2026 07:58
Copilot AI (Contributor) left a comment

Pull request overview

Updates the Claude agent-host roadmap to record the Phase 3 SDK-integration spike results, so future implementation phases (Phase 4+) can proceed with validated assumptions about the Claude Agent SDK/proxy integration.

Changes:

  • Marks Phase 3 as completed in the Claude roadmap.
  • Adds a “Phase 3 — Findings (2026-05-01)” section documenting validated behaviors and deferred follow-ups.
| File | Description |
| --- | --- |
| src/vs/platform/agentHost/node/claude/roadmap.md | Marks Phase 3 as done and appends detailed spike findings + open items for Phase 4+. |

Copilot's findings

  • Files reviewed: 1/1 changed files
  • Comments generated: 2

Comment on lines +269 to +270
already validated proxy ↔ real CAPI). Run with `tsx --tsconfig src/tsconfig.json`
from `extensions/copilot/node_modules/.bin/`.
Copilot AI May 1, 2026

The run instructions are ambiguous/incorrect: "Run with tsx --tsconfig src/tsconfig.json from extensions/copilot/node_modules/.bin/" suggests changing the working directory to .bin, but then src/tsconfig.json won’t resolve. Consider rephrasing to run from repo root while pointing at the tsx binary under extensions/copilot/node_modules/.bin, or clarify the intended cwd explicitly.

Suggested change
- already validated proxy ↔ real CAPI). Run with `tsx --tsconfig src/tsconfig.json`
- from `extensions/copilot/node_modules/.bin/`.
+ already validated proxy ↔ real CAPI). Run from the repo root with
+ `extensions/copilot/node_modules/.bin/tsx --tsconfig src/tsconfig.json`.

Exit criteria: an SDK-driven session completes one turn through the proxy,
including at least one tool call round-trip, with no traffic to anthropic.com.

#### Phase 3 — Findings (2026-05-01) ✅ DONE
Copilot AI May 1, 2026

Status formatting is inconsistent with earlier phases (which use "✅ **DONE**"). Consider using the same bolded DONE marker here as well so headings render consistently when scanning the roadmap.

Suggested change
- #### Phase 3 — Findings (2026-05-01) ✅ DONE
+ #### Phase 3 — Findings (2026-05-01) ✅ **DONE**

github-actions Bot (Contributor) commented May 1, 2026

Screenshot Changes

Base: 061bc2c8 Current: abfcbb1f

Changed (1)

agentSessionsViewer/BackgroundProvider/Light

TylerLeonhardt force-pushed the tyler/claude-spike-phase3 branch from 40fa284 to 02d543a May 1, 2026 15:52
TylerLeonhardt changed the title from "agentHost/claude: capture Phase 3 SDK spike findings in roadmap" to "agentHost/claude: ground Phase 3 in the production reference" May 1, 2026
The Copilot extension at extensions/copilot/src/extension/chatSessions/claude/
already ships a working integration of @anthropic-ai/claude-agent-sdk 0.2.112
with a local proxy. That implementation is higher-fidelity evidence than any
ad-hoc spike could produce.

Replace the Phase 3 'spike + findings' narrative with a reference-grounded
scoping for Phase 4:

1. Required-for-Phase-4 Options table — the minimum set to produce a working
   turn, each row citing the line in claudeCodeAgent.ts where the production
   code uses it and the reason it's required.
2. Deferred-concerns table — every additional knob the extension uses
   (mcpServers, plugins, settingSources, OTel env, ripgrep PATH, hook events,
   edit/settings trackers, etc.) mapped to the phase that should pull it in.
3. The one genuine open question the extension cannot answer for us:
   byte-equivalence between our ClaudeProxyService and the extension's
   ClaudeLanguageModelServer, to be closed by a Phase 4 unit test that points
   the SDK (with a stubbed CAPI) at our proxy.
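
A minimal sketch of the comparison such a Phase 4 test could make, assuming both runs record their messages into plain arrays. `RecordedMessage` and `firstDivergence` are hypothetical names, the message shape is a simplification of whatever the SDK actually emits, and how the sequences get recorded (stubbed CAPI, our proxy vs. the extension's server) is left out.

```typescript
// Hypothetical recorded-message shape; the real SDK message type is richer.
interface RecordedMessage {
	type: string;
	payload?: unknown;
}

/**
 * Returns the index of the first position where the two sequences differ,
 * or -1 if they match in length, type, and serialized payload.
 * Payloads are compared via JSON.stringify, so key order matters.
 */
function firstDivergence(actual: RecordedMessage[], expected: RecordedMessage[]): number {
	const len = Math.max(actual.length, expected.length);
	for (let i = 0; i < len; i++) {
		const a = actual[i];
		const e = expected[i];
		if (!a || !e || a.type !== e.type || JSON.stringify(a.payload) !== JSON.stringify(e.payload)) {
			return i;
		}
	}
	return -1;
}
```

Reporting the first diverging index, rather than a bare pass/fail, makes it easier to see which step of the turn the two proxies disagree on.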

Stance: use the extension as a reference and a guiding path, not a verbatim
blueprint. The extension has accreted ~20 concerns layered on top of the core
SDK <-> proxy contract; copying it whole on day one would obscure which pieces
are essential and make incremental review impossible.

Also align downstream phases:
- Phase 6: enableFileCheckpointing note now points at Phase 8 owning the
  validation step rather than a spike.
- Phase 8: undo mechanism keeps the rewind path as preferred and the
  in-agent edit-history mechanism as the fallback if rewind misbehaves.
- Phase 9: abortSession defaults to _abortController.abort() (matching the
  reference), with Query.interrupt() as a follow-up only if the default
  path orphans the subprocess.
- Phase 15 + Open questions: drop 'Phase 3 spike answers' framing in favor
  of phase-owned validation steps.
TylerLeonhardt (Member, Author) commented

Folded into #313780 — the Phase 3 roadmap commit is included as the first commit on that branch and is now reviewable alongside the Phase 4 code.

TylerLeonhardt (Member, Author) commented

Folded into #313780 — the Phase 3 roadmap commit is the first commit on that branch and is reviewable alongside the Phase 4 code.
