feat(examples): add a reference host/server pair — cli-client + todos-server#2380
…and todos-server examples/cli-client is a complete LLM-connected MCP host: an interactive chat CLI with no built-in tools, where everything comes from the servers it connects to (a URL via --server with OAuth on 401, a spawned command line, or an mcpServers-style config). The model sits behind a small LLMProvider seam with Scripted (keyless, used by CI), Anthropic, OpenAI, and Gemini implementations that resolve the latest mid-tier model from each provider's models API. The host wires the full client feature surface: namespaced tool loop, @-mention resources as provenance-labelled context, /watch subscriptions on both protocol eras, prompts as slash commands with tab completion backed by completion/complete, approval-gated sampling, schema-driven elicitation forms, roots, per-call progress and server logs, Ctrl-C cancellation, and full OAuth (PKCE, dynamic client registration, state-checked loopback callback). examples/todos-server is the workload it pairs with: a todo board serving both protocol revisions from one codebase over stdio and Streamable HTTP, where every server feature has a real job — CRUD tools (one with structuredContent), sampling- backed prioritize, an elicitation-confirmed bulk delete, a multi-round brainstorm flow whose requestState is a step-discriminated union signed via createRequestStateCodec, paced progress with cancellation observation, request-tied logging, resources with a completable template, and per-resource subscriptions. cli-client/client.ts replays a scripted conversation against todos-server as a self-verifying run:examples story across both transports and eras, asserting the loop, sampling, the multi-round + signed-state flow, completions, cancellation, progress, logging, and subscriptions end to end; story-local vitest covers the provider mappings, routing, config parsing, forms, and OAuth helpers. 
docs/host-integration.md is the companion "Building a host" guide: who should (and should not) build a host, the provider seam and the tool loop (snippets synced from the example source), then per-feature guidance narrated against the pair.
@modelcontextprotocol/client
@modelcontextprotocol/codemod
@modelcontextprotocol/core
@modelcontextprotocol/server
@modelcontextprotocol/server-legacy
@modelcontextprotocol/express
@modelcontextprotocol/fastify
@modelcontextprotocol/hono
@modelcontextprotocol/node
…ol-consent policy
- URL-mode elicitation now applies the same https-or-loopback check as the OAuth flow before offering to open a server-supplied URL (file:, javascript:, and plain-http phishing URLs fail closed to a decline). The check is a shared isSafeBrowserUrl helper, unit-tested, and the guide's URL-mode section now states the gate.
- The host-integration guide's security section gains a tool-consent bullet: the spec expects a human in the loop able to deny tool invocations; cli-client auto-executes because an interactive user watches every call, and a one-line comment at the execution site says an unattended host must gate execution on user consent.
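A minimal sketch of the https-or-loopback gate described in the commit message above. The function name `isSafeBrowserUrlSketch` and the loopback host list are assumptions made for illustration; the PR's actual `isSafeBrowserUrl` helper may be implemented differently.

```typescript
// Hypothetical sketch of an https-or-loopback gate; not the PR's real helper.
// Hosts treated as loopback here are an assumption for illustration.
const LOOPBACK_HOSTS = new Set(['localhost', '127.0.0.1', '[::1]']);

function isSafeBrowserUrlSketch(raw: string): boolean {
    let url: URL;
    try {
        url = new URL(raw);
    } catch {
        // Unparseable input fails closed.
        return false;
    }
    // https is always acceptable.
    if (url.protocol === 'https:') return true;
    // Plain http is tolerated only for loopback hosts (for example a local callback server).
    // Every other scheme (file:, javascript:, ...) falls through to false.
    return url.protocol === 'http:' && LOOPBACK_HOSTS.has(url.hostname);
}
```

A caller would decline the elicitation whenever this returns false, rather than offering to open the URL.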
…ient
The browser-authorization flow logged strings derived from the authorization URL (origin in the consent/refusal lines, the full URL in an "opening …" status and in the could-not-open fallback). None of it is a credential (the URL is exactly what the user's browser is about to show), but log lines are the wrong channel for it:
- the consent and refusal lines now use static text (the URL adds nothing there),
- the "opening …" status no longer echoes the URL,
- the could-not-open fallback now presents the URL through the interactive prompt and waits for the user to confirm before polling the callback, instead of printing it and racing ahead.
This also clears CodeQL's js/clear-text-logging findings on the UI sinks, which taint-tracked everything read off the OAuth provider into console output.
- Tab completion now offers /watch and the /exit alias (BUILTIN_COMMANDS was missing
both, so the documented commands didn't complete).
- Declining a sampling request now answers with the spec's application-level code -1
("User rejected sampling request") instead of the reserved JSON-RPC InvalidRequest
(-32600), matching the convention the e2e suite encodes; the guide's hand-written
snippet follows.
- The /server:prompt dispatch regex (and the completer's prompt-args branch) accept
the same server-name shapes mention parsing does, so dotted config keys advertised
by /prompts actually dispatch instead of falling through to chat.
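The sampling-decline change above can be sketched as a plain JSON-RPC error response. The interface and function names below are hypothetical; only the code `-1` and the message text come from the commit message.

```typescript
// Application-level code named in the commit message; -32600 is reserved by
// JSON-RPC for InvalidRequest and is what the host used before this change.
const USER_REJECTED_SAMPLING_CODE = -1;

// Local stand-in for a JSON-RPC 2.0 error response (not an SDK type).
interface JsonRpcErrorResponseSketch {
    jsonrpc: '2.0';
    id: string | number;
    error: { code: number; message: string };
}

// Hypothetical helper: build the reply sent when the user declines a sampling request.
function declineSamplingSketch(requestId: string | number): JsonRpcErrorResponseSketch {
    return {
        jsonrpc: '2.0',
        id: requestId,
        error: { code: USER_REJECTED_SAMPLING_CODE, message: 'User rejected sampling request' }
    };
}
```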
…EADME
Replace the separate host-integration guide page with a Design notes section in the cli-client README covering the choices a copier should understand first: the example-local provider seam, error results fed back to the model, untrusted-display handling of server text, prompt role preservation, explicit fail-closed approvals, the deliberate absence of a tool-execution gate in an interactive terminal (and what an unattended host must add), and child-process env hygiene. The standalone guide needs more rounds of refinement before it earns a docs-tree slot; the example and its README stand on their own meanwhile.
    const messages: Anthropic.MessageParam[] = [];

    for (const message of request.messages) {
        if (message.role === 'tool') {
            const resultBlock: Anthropic.ToolResultBlockParam = {
                type: 'tool_result',
                tool_use_id: message.toolCallId,
                is_error: message.isError ?? false,
                content: message.content.map(part => partToBlock(part))
            };
            const previous = messages.at(-1);
🟡 The role:'tool' branch builds tool_result content with message.content.map(partToBlock) without the empty-text filter that toContentBlocks() applies elsewhere, so an MCP tool result containing { type: 'text', text: '' } produces a tool_result with an empty text block, which the Anthropic Messages API rejects with a 400 — and since the broken tool message stays in session.messages, every later turn on that session fails too. Apply the same filter (part.type !== 'text' || part.text.length > 0) to the tool_result content; an empty content array is accepted, unlike an empty text block.
The bug. toAnthropicRequest() filters out empty text parts for user and assistant messages via toContentBlocks() (part.type !== 'text' || part.text.length > 0) — that filter exists precisely because the Anthropic Messages API rejects text content blocks with an empty string (text: String should have at least 1 character). But the role: 'tool' branch builds the tool_result block with content: message.content.map(part => partToBlock(part)), with no filter, so an empty text part survives into the tool_result content array. The same min-length-1 validation applies to text blocks nested inside tool_result content, so the request 400s.
How it gets triggered. An MCP tool may legally return { content: [{ type: 'text', text: '' }] } — e.g. a shell/exec-style tool whose command produced no output, on any server connected via --server. host/content.ts passes it through unchanged: contentBlockToParts() returns [{ type: 'text', text: truncate('') }] = [{ type: 'text', text: '' }], and toolResultToParts() only substitutes the (tool returned no content) placeholder when the parts array is empty (parts.length === 0), not when it contains a single empty-string text part. The loop pushes it as a role: 'tool' message, and the next provider.generate() round emits the invalid tool_result.
Step-by-step proof.
- Model issues a tool call; the MCP server returns { content: [{ type: 'text', text: '' }] }.
- toolResultToParts() → [{ type: 'text', text: '' }] (placeholder not triggered, length is 1).
- runModelRounds() pushes { role: 'tool', toolCallId, content: [{ type: 'text', text: '' }] } into session.messages.
- The next provider.generate() calls toAnthropicRequest(); the tool branch maps the part to { type: 'text', text: '' } inside the tool_result content.
- The Messages API responds 400 invalid_request_error ("text content blocks must be non-empty"), cli.ts prints error: … and drops the turn.
- Worse: the offending tool message is already in session.messages, so every subsequent generate() re-sends the same broken history and fails again; the conversation never recovers without restarting.
Why nothing else prevents it. The empty-text filter only lives in toContentBlocks(), which the tool branch deliberately bypasses (it needs the raw blocks to wrap in tool_result); toolResultToParts() only guards the zero-parts case. The OpenAI and Gemini mappings flatten tool results to plain text (partsToText), so they are unaffected — this is specific to the Anthropic mapping. The paired todos-server never returns an empty text block, so the e2e legs and unit tests don't catch it.
Impact and fix. This is example code, but providers/anthropic.ts is explicitly advertised (README and docs/host-integration.md) as "a complete, copyable mapping" for host authors, so the gap propagates to copies. The fix is one line — apply the same filter to the tool_result content:
    content: message.content.filter(p => p.type !== 'text' || p.text.length > 0).map(part => partToBlock(part))

An empty content array on a tool_result is accepted by the API, unlike an empty text block, so no further fallback is needed.
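A self-contained sketch of that filter, for readers who want to try it outside the example. The `PartSketch` and `BlockSketch` types and the `partToBlockSketch` function are local stand-ins with assumed shapes, not the PR's definitions or the Anthropic SDK's types; only the filter predicate is taken from the review comment.

```typescript
// Assumed, simplified shapes for the host's content parts and the provider's blocks.
type PartSketch =
    | { type: 'text'; text: string }
    | { type: 'image'; data: string; mimeType: string };

type BlockSketch =
    | { type: 'text'; text: string }
    | { type: 'image'; source: { type: 'base64'; media_type: string; data: string } };

// Stand-in for the example's partToBlock(): a one-to-one mapping of part to block.
function partToBlockSketch(part: PartSketch): BlockSketch {
    return part.type === 'text'
        ? { type: 'text', text: part.text }
        : { type: 'image', source: { type: 'base64', media_type: part.mimeType, data: part.data } };
}

// Drop empty text parts first, then map; the result may legitimately be an empty array.
function toToolResultContentSketch(parts: PartSketch[]): BlockSketch[] {
    return parts
        .filter(p => p.type !== 'text' || p.text.length > 0)
        .map(p => partToBlockSketch(p));
}
```

With this in place, a tool result of `[{ type: 'text', text: '' }]` maps to an empty content array instead of an empty text block.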
    const result: CallToolResult = await server.client.callTool(
        {
            name: route.toolName,
            arguments: call.arguments,
            // On 2026-07-28 connections servers only emit log notifications for requests
            // that opt in via this _meta key; on 2025 the setLoggingLevel call covers it.
            ...(server.era === 'modern' ? { _meta: { [LOG_LEVEL_META_KEY]: 'info' } } : {})
        },
        {
            // Aborting this signal cancels the call: the SDK sends notifications/cancelled
            // and the server can stop work via its own request signal.
            signal: options?.signal,
            onprogress: progress => {
                const total = progress.total === undefined ? '' : `/${progress.total}`;
                this.ui.status(`${call.name}: ${progress.message ?? 'working'} (${progress.progress}${total})`);
            },
            resetTimeoutOnProgress: true
🟡 On a 2025-era (--legacy) connection, the interactive todos-server tools (brainstorm_tasks, prioritize, clear_done) wait on the human inside the pending tools/call, but executeToolCall passes no timeout and the legacy-arm elicitInput/requestSampling calls in todos.ts don't either — so the SDK's 60s default applies and the documented --legacy interactive tour fails if the user takes more than a minute to answer the approval prompt or the brainstorm form. Passing a generous timeout (or maxTotalTimeout) here for interactive tool calls, and on the legacy-arm elicitInput/requestSampling calls, fixes it.
The bug. executeToolCall (examples/cli-client/host/host.ts:170-186) calls server.client.callTool(...) with only { signal, onprogress, resetTimeoutOnProgress: true }. With no timeout set, the SDK default of 60 seconds applies (DEFAULT_REQUEST_TIMEOUT_MSEC = 60_000 in packages/core-internal/src/shared/protocol.ts:95, applied at protocol.ts:1509). On the server side, todos-server's legacy arms call ctx.mcpReq.elicitInput / ctx.mcpReq.requestSampling (examples/todos-server/todos.ts:452, 461, 470, 629, 681) with no options either, so the server→client request carries the same 60s default.

Why resetTimeoutOnProgress doesn't help. The only thing that resets the request timeout is a notifications/progress message (protocol.ts:1130-1132). An incoming server→client request (elicitation/create, sampling/createMessage) does not reset it, and the interactive tools emit no progress while they wait for the human. maxTotalTimeout is also unset, so even progress-based resets wouldn't cover the wait.

The code path that triggers it (legacy era only). On a 2025-11-25 connection the original tools/call stays pending while the server pushes elicitation/sampling requests back to the client and awaits them inline. The user's think time (reading the multi-line sampling-approval block the host deliberately prints in full, or filling the brainstorm theme/count form one field at a time, with retries on invalid input) happens entirely inside both 60-second timers.

Step-by-step proof. (1) Run the README's documented tour: terminal A serves todos-server, terminal B runs cli-client with --legacy (or spawn the server over stdio with --legacy). (2) Say "prioritize my open tasks". (3) The model calls mcp__todos__prioritize; todos-server's legacy arm calls ctx.mcpReq.requestSampling(...) and the host shows the full request text in an attention block, asking "Allow?". (4) The user reads the multi-line request and takes 70 seconds to answer. (5) At t=60s the host-side callTool rejects with a timeout (executeToolCall reports "Tool call failed: ... timed out" to the model) and/or the server-side requestSampling rejects, so the tool errors. The answer the user is mid-way through typing is discarded, and the readline question can be left dangling on screen while the model has already been told the call failed. The same happens with the brainstorm_tasks form and the clear_done confirmation.

Why nothing catches it. The modern (2026-07-28) era is unaffected because input_required completes the original request and the user wait happens between requests, outside any timer. The scripted e2e (ScriptedUI) answers instantly, so all four CI legs pass. The repo's own docs (docs/client.md, the timeout section) document the 60-second default and show passing a longer timeout as the remedy, but this reference host, explicitly meant to be copied, doesn't do it.

How to fix. Pass a generous timeout (and/or maxTotalTimeout) in the callTool options in executeToolCall (interactive tool calls in a host with a human in the loop should tolerate minutes, not seconds), and pass a longer timeout to the legacy-arm elicitInput / requestSampling calls in todos.ts. Both are one-line option additions.

Severity. This is example code, legacy-era only, and requires >60s of human latency, so it should not block the PR. It is nonetheless a concrete functional failure of the documented --legacy interactive tour rather than a style point, and the fix is the exact pattern the client guide already recommends.
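The suggested fix could look roughly like the sketch below. The option names (`signal`, `timeout`, `maxTotalTimeout`, `resetTimeoutOnProgress`) are the ones discussed in this comment; the `CallOptionsSketch` type is a local stand-in rather than the SDK's type, and the ten- and thirty-minute values are assumptions chosen for illustration, not values from the PR.

```typescript
// Local stand-in for the per-request options named in the review comment.
interface CallOptionsSketch {
    signal?: AbortSignal;
    timeout?: number;
    maxTotalTimeout?: number;
    resetTimeoutOnProgress?: boolean;
}

const SDK_DEFAULT_TIMEOUT_MS = 60_000; // the 60s default cited in the comment
const INTERACTIVE_TIMEOUT_MS = 10 * 60_000; // assumed: minutes, not seconds, for a human in the loop
const INTERACTIVE_MAX_TOTAL_MS = 30 * 60_000; // assumed hard ceiling for a single call

// Hypothetical helper: options for tool calls that may wait on a human answer.
function interactiveCallOptionsSketch(signal?: AbortSignal): CallOptionsSketch {
    return {
        ...(signal ? { signal } : {}),
        timeout: INTERACTIVE_TIMEOUT_MS,
        maxTotalTimeout: INTERACTIVE_MAX_TOTAL_MS,
        resetTimeoutOnProgress: true
    };
}

// Models the failure in the proof: no progress arrives while the human thinks,
// so the request expires once the wait reaches the timeout.
function expiresDuringWaitSketch(timeoutMs: number, humanWaitMs: number): boolean {
    return humanWaitMs >= timeoutMs;
}
```

Under these assumptions, the 70-second wait from the proof expires the default timer but not the interactive one.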
Adds a reference pair to examples/: cli-client/, a complete LLM-connected MCP host (interactive chat CLI with no built-in tools; everything comes from the servers it connects to), and todos-server/, the dual-era server it pairs with, where every server-side feature has a real job.

Motivation and Context
The existing examples are single-feature stories: excellent for seeing one primitive in isolation, but nothing shows how a real host composes them — the tool loop, resources as context, prompts as commands, sampling/elicitation round-tripping through a UI, OAuth, cancellation — or how a real server wires the same features across both protocol revisions. Host authors had no "start here" example.
- cli-client: provider seam (scripted keyless + Anthropic/OpenAI/Gemini mappings), namespaced tool loop, @server:uri resources with provenance, /watch subscriptions on both eras, prompts as slash commands with completion/complete tab completion, approval-gated sampling, schema-driven elicitation forms, roots, progress/logging, Ctrl-C cancellation, and OAuth (PKCE + DCR + state-checked loopback callback) for protected servers. --server <url> connects it to anything. The README's Design notes section records the choices a copier should understand first.
- todos-server: serves 2026-07-28 and 2025-11-25 from one codebase over stdio and Streamable HTTP; CRUD tools (one with structuredContent), sampling-backed prioritize, an elicitation-confirmed bulk delete, a multi-round brainstorm_tasks flow whose requestState is a step-discriminated union signed via createRequestStateCodec, paced progress with cancellation observation, request-tied logging, a completable resource template, and per-resource subscriptions.

How Has This Been Tested?
- cli-client/client.ts is a self-verifying run:examples story: it replays a scripted conversation against todos-server over stdio + Streamable HTTP × both protocol eras (4 legs), asserting the tool loop, sampling approval, the multi-round elicitation + signed-requestState flow, completions, cancellation, progress, logging, subscriptions, and the final board state. Runs in the per-PR examples (build + e2e) job; the full suite passes locally (27 stories / 69 legs).
- pnpm -r test.
- --server including a full browser OAuth (dynamic client registration, PKCE, state validation) sign-in flow.
- typecheck, lint, sync:snippets --check, and docs:check are green.

Breaking Changes
None — additive examples only; nothing is published.
Types of changes
Checklist
Additional context
- server.ts/client.ts pair, SDK calls inline): the host wiring across host/ is what the example documents. Both packages carry a shapeExempt note in package.json#example explaining this; cli-client/server.ts is a 5-line shim so the example runner's HTTP legs can spawn the paired server.
- LLMProvider seam is intentionally example-local, not SDK API: the SDK stays a protocol library; a host's message shapes belong to the host. Same for the mcpServers config parsing.
- todos-server keeps explicit era branches in its interactive tools (push-style 2025 vs input_required 2026) so the two conversation models can be read side by side; if the server-side legacy fulfilment loop lands later, the legacy arms can be deleted.
- createMcpHandler's default stateless posture there (same caveat as the sampling/ story).