agentHost/claude: ground Phase 3 in the production reference #313686
TylerLeonhardt wants to merge 1 commit into main
Pull request overview
Updates the Claude agent-host roadmap to record the Phase 3 SDK-integration spike results, so future implementation phases (Phase 4+) can proceed with validated assumptions about the Claude Agent SDK/proxy integration.
Changes:
- Marks Phase 3 as completed in the Claude roadmap.
- Adds a “Phase 3 — Findings (2026-05-01)” section documenting validated behaviors and deferred follow-ups.
| File | Description |
|---|---|
| src/vs/platform/agentHost/node/claude/roadmap.md | Marks Phase 3 as done and appends detailed spike findings + open items for Phase 4+. |
Copilot's findings
- Files reviewed: 1/1 changed files
- Comments generated: 2
already validated proxy ↔ real CAPI). Run with `tsx --tsconfig src/tsconfig.json`
from `extensions/copilot/node_modules/.bin/`.
The run instructions are ambiguous or incorrect: "Run with `tsx --tsconfig src/tsconfig.json` from `extensions/copilot/node_modules/.bin/`" suggests changing the working directory to `.bin`, but then `src/tsconfig.json` won't resolve. Consider rephrasing to run from the repo root while pointing at the `tsx` binary under `extensions/copilot/node_modules/.bin`, or state the intended cwd explicitly.
Suggested change:
- already validated proxy ↔ real CAPI). Run with `tsx --tsconfig src/tsconfig.json`
- from `extensions/copilot/node_modules/.bin/`.
+ already validated proxy ↔ real CAPI). Run from the repo root with
+ `extensions/copilot/node_modules/.bin/tsx --tsconfig src/tsconfig.json`.
Exit criteria: an SDK-driven session completes one turn through the proxy,
including at least one tool call round-trip, with no traffic to anthropic.com.

#### Phase 3 — Findings (2026-05-01) ✅ DONE
Status formatting is inconsistent with earlier phases (which use "✅ **DONE**"). Consider using the same bolded DONE marker here so headings render consistently when scanning the roadmap.
Suggested change:
- #### Phase 3 — Findings (2026-05-01) ✅ DONE
+ #### Phase 3 — Findings (2026-05-01) ✅ **DONE**
Force-pushed from 40fa284 to 02d543a.
The Copilot extension at extensions/copilot/src/extension/chatSessions/claude/ already ships a working integration of @anthropic-ai/claude-agent-sdk 0.2.112 with a local proxy. That implementation is higher-fidelity evidence than any ad-hoc spike could produce.

Replace the Phase 3 'spike + findings' narrative with a reference-grounded scoping for Phase 4:

1. Required-for-Phase-4 Options table: the minimum set to produce a working turn, each row citing the line in claudeCodeAgent.ts where the production code uses it and the reason it's required.
2. Deferred-concerns table: every additional knob the extension uses (mcpServers, plugins, settingSources, OTel env, ripgrep PATH, hook events, edit/settings trackers, etc.) mapped to the phase that should pull it in.
3. The one genuine open question the extension cannot answer for us: byte-equivalence between our ClaudeProxyService and the extension's ClaudeLanguageModelServer, to be closed by a Phase 4 unit test that points the SDK (with a stubbed CAPI) at our proxy.

Stance: use the extension as a reference and a guiding path, not a verbatim blueprint. The extension has accreted ~20 concerns layered on top of the core SDK <-> proxy contract; copying it whole on day one would obscure which pieces are essential and make incremental review impossible.

Also align downstream phases:

- Phase 6: the enableFileCheckpointing note now points at Phase 8 owning the validation step rather than a spike.
- Phase 8: the undo mechanism keeps the rewind path as preferred and the in-agent edit-history mechanism as the fallback if rewind misbehaves.
- Phase 9: abortSession defaults to _abortController.abort() (matching the reference), with Query.interrupt() as a follow-up only if the default path orphans the subprocess.
- Phase 15 + Open questions: drop the 'Phase 3 spike answers' framing in favor of phase-owned validation steps.
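The Phase 9 default described in the commit message can be pictured with a minimal sketch. Everything here is hypothetical illustration: `SessionSketch` and `QueryLike` are made-up names, only the `_abortController` field name and the `interrupt()` method name come from the commit message, and the real agent-host code may be shaped quite differently.

```typescript
// Hypothetical stand-in for the SDK's query handle. Only the interrupt()
// method named in the roadmap is modeled, and it is optional here.
interface QueryLike {
  interrupt?(): Promise<void>;
}

class SessionSketch {
  // Mirrors the `_abortController` name used in the commit message.
  private readonly _abortController = new AbortController();
  private readonly query: QueryLike;

  constructor(query: QueryLike = {}) {
    this.query = query;
  }

  // The signal that would be handed to the SDK alongside the controller.
  get signal(): AbortSignal {
    return this._abortController.signal;
  }

  // Default path: abort the controller and let the SDK tear down.
  abortSession(): void {
    this._abortController.abort();
  }

  // Follow-up path, relevant only if the default path turns out to orphan
  // the subprocess: ask the query to interrupt first, then abort.
  async abortSessionWithInterrupt(): Promise<void> {
    await this.query.interrupt?.();
    this._abortController.abort();
  }
}
```

Keeping both paths on one object would make it cheap to switch the default later if validation shows the plain abort is not enough.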
Force-pushed from 02d543a to 1ff0e83.
Folded into #313780: the Phase 3 roadmap commit is the first commit on that branch and is reviewable alongside the Phase 4 code.
The Copilot extension at `extensions/copilot/src/extension/chatSessions/claude/` already ships a working integration of `@anthropic-ai/claude-agent-sdk` 0.2.112 with a local proxy. That implementation is higher-fidelity evidence than any ad-hoc spike could produce.

This replaces the original "Phase 3 spike + findings" framing with a reference-grounded scoping for Phase 4. The Copilot extension is the reference and the guiding path, not a verbatim blueprint. It has accreted ~20 concerns (MCP gateway, plugins, edit tracker, settings change tracker, OTel forwarding, hook events, debug file logger, ripgrep PATH munging, runtime data caching, folder MRU, …) layered on top of the core SDK ↔ proxy contract. Copying the whole shape on day one would obscure which pieces are essential and break incremental review.

What the new Phase 3 section captures

- Required-for-Phase-4 Options table: the minimum set of `Options` fields to produce a working turn (cwd, executable, abortController, allowDangerouslySkipPermissions + canUseTool pair, model, permissionMode, systemPrompt preset, settings.env with proxy URL/auth/disable-nonessential-traffic, disallowedTools: ['WebSearch'], stderr). Each row cites the line in `claudeCodeAgent.ts` where the production code uses it.
- Deferred-concerns table: every additional knob the extension uses (`mcpServers`, `plugins`, `settingSources`, OTel env, ripgrep PATH, hook events, edit/settings trackers, etc.) mapped to the phase that should pull it in. Phase 4 starts small.
- The one genuine open question: byte-equivalence between our `ClaudeProxyService` and the extension's `ClaudeLanguageModelServer`. Closed by a Phase 4 unit test that points the SDK (with a stubbed CAPI) at our proxy and asserts the same message sequence the extension's tests assert. No throw-away spike committed.
- […] `/v1/models` and `count_tokens` handling, the `anthropic-beta` whitelist size). Phase 4 decides whether to converge.

Downstream phases aligned

- Phase 6: the `enableFileCheckpointing` note now points at Phase 8 owning the validation step.
- Phase 9: `abortSession` defaults to `_abortController.abort()` (matching the reference), with `Query.interrupt()` evaluated as a follow-up only if the default path orphans the subprocess.

What's NOT in this PR

- […] `ClaudeProxyService` or anything else under `src/vs/platform/agentHost/`.

Just a roadmap update.
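To make the required-fields list concrete, here is a minimal sketch of what assembling that set could look like. It deliberately does not import the SDK: `MinimalOptionsSketch`, `PermissionDecision`, `ProxyInfo`, and `buildMinimalOptions` are hypothetical names, the field names are taken from the list in this description, and every value (including the environment variable names, the `executable`, the `permissionMode`, and the preset name) is an assumed placeholder rather than what `claudeCodeAgent.ts` actually passes.

```typescript
// Hypothetical local mirror of the fields this PR lists as required.
// The real `Options` type lives in the SDK and may differ in shape.
type PermissionDecision =
  | { behavior: "allow"; updatedInput: Record<string, unknown> }
  | { behavior: "deny"; message: string };

interface MinimalOptionsSketch {
  cwd: string;
  executable: string;
  abortController: AbortController;
  allowDangerouslySkipPermissions: boolean;
  canUseTool: (
    toolName: string,
    input: Record<string, unknown>,
  ) => Promise<PermissionDecision>;
  model: string;
  permissionMode: string;
  systemPrompt: { type: "preset"; preset: string };
  settings: { env: Record<string, string> };
  disallowedTools: string[];
  stderr: (data: string) => void;
}

// Hypothetical description of the local proxy the session should talk to.
interface ProxyInfo {
  url: string;
  token: string;
}

function buildMinimalOptions(
  cwd: string,
  model: string,
  proxy: ProxyInfo,
): MinimalOptionsSketch {
  return {
    cwd,
    executable: "node", // assumed placeholder
    abortController: new AbortController(),
    allowDangerouslySkipPermissions: true, // assumed; paired with canUseTool
    // Placeholder policy: deny everything until a real prompt flow exists.
    canUseTool: async (toolName) => ({
      behavior: "deny",
      message: `No permission flow wired up for ${toolName}`,
    }),
    model,
    permissionMode: "default", // assumed placeholder
    systemPrompt: { type: "preset", preset: "claude_code" }, // assumed preset
    settings: {
      env: {
        // Assumed variable names for proxy URL, auth, and traffic opt-out.
        ANTHROPIC_BASE_URL: proxy.url,
        ANTHROPIC_AUTH_TOKEN: proxy.token,
        CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC: "1",
      },
    },
    disallowedTools: ["WebSearch"],
    stderr: (data) => console.error(`[claude stderr] ${data}`),
  };
}
```

Building the object in one small function would keep the Phase 4 surface easy to review, with each deferred concern arriving later as its own addition.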