feat(examples): add a reference host/server pair — cli-client + todos-server#2380

Merged
felixweinberger merged 5 commits into main from fweinberger/cli-client-todos-server
Jun 29, 2026

@felixweinberger felixweinberger commented Jun 29, 2026

Contributor

Adds a reference pair to examples/: cli-client/, a complete LLM-connected MCP host (interactive chat CLI with no built-in tools — everything comes from the servers it connects to), and todos-server/, the dual-era server it pairs with, where every server-side feature has a real job.

Motivation and Context

The existing examples are single-feature stories: excellent for seeing one primitive in isolation, but nothing shows how a real host composes them — the tool loop, resources as context, prompts as commands, sampling/elicitation round-tripping through a UI, OAuth, cancellation — or how a real server wires the same features across both protocol revisions. Host authors had no "start here" example.

  • cli-client: provider seam (scripted keyless + Anthropic/OpenAI/Gemini mappings), namespaced tool loop, @server:uri resources with provenance, /watch subscriptions on both eras, prompts as slash commands with completion/complete tab completion, approval-gated sampling, schema-driven elicitation forms, roots, progress/logging, Ctrl-C cancellation, and OAuth (PKCE + DCR + state-checked loopback callback) for protected servers — --server <url> connects it to anything. The README's Design notes section records the choices a copier should understand first.
  • todos-server: serves 2026-07-28 and 2025-11-25 from one codebase over stdio and Streamable HTTP; CRUD tools (one with structuredContent), sampling-backed prioritize, an elicitation-confirmed bulk delete, a multi-round brainstorm_tasks flow whose requestState is a step-discriminated union signed via createRequestStateCodec, paced progress with cancellation observation, request-tied logging, a completable resource template, and per-resource subscriptions.
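
The namespaced tool loop in the cli-client bullet can be pictured with a small sketch. The `mcp__<server>__<tool>` shape matches the `mcp__todos__prioritize` name quoted in the review comments on this PR; the helper names `toNamespacedName` and `parseNamespacedName` are hypothetical and are not taken from the example source, whose routing code may differ.

```typescript
// Hypothetical helpers illustrating tool-name namespacing; not the example's actual code.
const PREFIX = 'mcp__';
const SEPARATOR = '__';

interface ToolRoute {
    serverName: string;
    toolName: string;
}

/** Builds the name exposed to the model, e.g. mcp__todos__prioritize. */
function toNamespacedName(serverName: string, toolName: string): string {
    // Reject server names that would make the split point ambiguous.
    if (serverName.length === 0 || serverName.includes(SEPARATOR) || serverName.endsWith('_')) {
        throw new Error(`unsupported server name: ${serverName}`);
    }
    return `${PREFIX}${serverName}${SEPARATOR}${toolName}`;
}

/** Splits a namespaced name back into its route, or returns undefined if it is not one. */
function parseNamespacedName(name: string): ToolRoute | undefined {
    if (!name.startsWith(PREFIX)) return undefined;
    const rest = name.slice(PREFIX.length);
    // The first separator ends the server name; the tool name may itself contain "__".
    const split = rest.indexOf(SEPARATOR);
    if (split <= 0 || split + SEPARATOR.length >= rest.length) return undefined;
    return {
        serverName: rest.slice(0, split),
        toolName: rest.slice(split + SEPARATOR.length)
    };
}
```

Splitting on the first separator keeps dotted server names (such as config keys) intact and lets a tool name contain `__` without confusing the route.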

How Has This Been Tested?

  • cli-client/client.ts is a self-verifying run:examples story: it replays a scripted conversation against todos-server over stdio + Streamable HTTP × both protocol eras (4 legs), asserting the tool loop, sampling approval, the multi-round elicitation + signed-requestState flow, completions, cancellation, progress, logging, subscriptions, and the final board state. Runs in the per-PR examples (build + e2e) job; the full suite passes locally (27 stories / 69 legs).
  • 34 story-local vitest tests (provider mappings, routing, config parsing, elicitation forms, OAuth helpers including the browser-URL gate) run under the workspace pnpm -r test.
  • Tested interactively against the todos server over both transports and eras with real providers, and against third-party remote MCP servers via --server including a full browser OAuth (dynamic client registration, PKCE, state validation) sign-in flow.
  • typecheck, lint, sync:snippets --check, and docs:check are green.

Breaking Changes

None — additive examples only; nothing is published.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

  • The pair deliberately deviates from the canonical story shape (single server.ts/client.ts pair, SDK calls inline): the host wiring across host/ is what the example documents. Both packages carry a shapeExempt note in package.json#example explaining this; cli-client/server.ts is a 5-line shim so the example runner's HTTP legs can spawn the paired server.
  • The LLMProvider seam is intentionally example-local, not SDK API — the SDK stays a protocol library; a host's message shapes belong to the host. Same for the mcpServers config parsing.
  • todos-server keeps explicit era branches in its interactive tools (push-style 2025 vs input_required 2026) so the two conversation models can be read side by side; if the server-side legacy fulfilment loop lands later, the legacy arms can be deleted.
  • On the legacy-era HTTP leg the sampling/elicitation steps are skipped: push-style server→client requests need a session, and the server runs createMcpHandler's default stateless posture there (same caveat as the sampling/ story).

…and todos-server

examples/cli-client is a complete LLM-connected MCP host: an interactive chat CLI with
no built-in tools, where everything comes from the servers it connects to (a URL via
--server with OAuth on 401, a spawned command line, or an mcpServers-style config).
The model sits behind a small LLMProvider seam with Scripted (keyless, used by CI),
Anthropic, OpenAI, and Gemini implementations that resolve the latest mid-tier model
from each provider's models API. The host wires the full client feature surface:
namespaced tool loop, @-mention resources as provenance-labelled context, /watch
subscriptions on both protocol eras, prompts as slash commands with tab completion
backed by completion/complete, approval-gated sampling, schema-driven elicitation
forms, roots, per-call progress and server logs, Ctrl-C cancellation, and full OAuth
(PKCE, dynamic client registration, state-checked loopback callback).
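
As an illustration of the "state-checked loopback callback" step, the sketch below validates the redirect the browser sends back to the host. The function name `readAuthorizationCode` and the callback URL are assumptions made for this example; only the `code`, `state`, and `error` query parameters come from the OAuth authorization-code flow itself.

```typescript
/**
 * Hypothetical helper: extracts the authorization code from a loopback callback URL,
 * failing closed unless the state echoes the value the host generated for this attempt.
 */
function readAuthorizationCode(callbackUrl: string, expectedState: string): string {
    const params = new URL(callbackUrl).searchParams;

    // The authorization server reports failures through the `error` parameter.
    const error = params.get('error');
    if (error !== null) {
        throw new Error(`authorization failed: ${error}`);
    }

    // A missing or different state means this callback does not belong to our request.
    if (params.get('state') !== expectedState) {
        throw new Error('state mismatch: ignoring callback');
    }

    const code = params.get('code');
    if (code === null || code.length === 0) {
        throw new Error('callback did not include an authorization code');
    }
    return code;
}
```

The state check runs before the code is read, so a forged callback never reaches the token exchange.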

examples/todos-server is the workload it pairs with: a todo board serving both
protocol revisions from one codebase over stdio and Streamable HTTP, where every
server feature has a real job — CRUD tools (one with structuredContent), sampling-
backed prioritize, an elicitation-confirmed bulk delete, a multi-round brainstorm
flow whose requestState is a step-discriminated union signed via
createRequestStateCodec, paced progress with cancellation observation, request-tied
logging, resources with a completable template, and per-resource subscriptions.
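
To make "a step-discriminated union signed via createRequestStateCodec" more concrete, here is a minimal sketch of the idea. It does not use `createRequestStateCodec`, whose API is not shown in this PR; it substitutes a plain HMAC from `node:crypto`, and the step names and fields of `BrainstormState` are invented for illustration.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Hypothetical step shapes; the real brainstorm_tasks state may look different.
type BrainstormState =
    | { step: 'askTheme' }
    | { step: 'askCount'; theme: string }
    | { step: 'confirm'; theme: string; count: number };

function sign(payload: string, secret: string): string {
    return createHmac('sha256', secret).update(payload).digest('base64url');
}

/** Serializes the state and appends a signature so the client cannot alter it unnoticed. */
function encodeState(state: BrainstormState, secret: string): string {
    const payload = Buffer.from(JSON.stringify(state), 'utf8').toString('base64url');
    return `${payload}.${sign(payload, secret)}`;
}

/** Verifies the signature before trusting anything inside the token. */
function decodeState(token: string, secret: string): BrainstormState {
    const parts = token.split('.');
    const payload = parts[0];
    const signature = parts[1];
    if (parts.length !== 2 || payload === undefined || signature === undefined) {
        throw new Error('malformed state');
    }
    const expected = Buffer.from(sign(payload, secret));
    const actual = Buffer.from(signature);
    if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) {
        throw new Error('state signature mismatch');
    }
    return JSON.parse(Buffer.from(payload, 'base64url').toString('utf8')) as BrainstormState;
}

/** The discriminant lets each round handle exactly the fields its step carries. */
function describeNextRound(state: BrainstormState): string {
    switch (state.step) {
        case 'askTheme':
            return 'ask for a theme';
        case 'askCount':
            return `ask how many tasks to brainstorm for "${state.theme}"`;
        case 'confirm':
            return `confirm ${state.count} tasks for "${state.theme}"`;
    }
}
```

Because the state travels through the client between rounds, the server only acts on it after the signature verifies; the discriminated union then lets the compiler check that every step is handled.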

cli-client/client.ts replays a scripted conversation against todos-server as a
self-verifying run:examples story across both transports and eras, asserting the
loop, sampling, the multi-round + signed-state flow, completions, cancellation,
progress, logging, and subscriptions end to end; story-local vitest covers the
provider mappings, routing, config parsing, forms, and OAuth helpers.

docs/host-integration.md is the companion "Building a host" guide: who should (and
should not) build a host, the provider seam and the tool loop (snippets synced from
the example source), then per-feature guidance narrated against the pair.
@felixweinberger felixweinberger requested a review from a team as a code owner June 29, 2026 11:11
@changeset-bot

changeset-bot Bot commented Jun 29, 2026


⚠️ No Changeset found

Latest commit: 4ea873d

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Comment thread examples/cli-client/host/ui.ts Fixed
Comment thread examples/cli-client/host/ui.ts Fixed
Comment thread examples/cli-client/host/ui.ts Fixed
Comment thread examples/cli-client/host/ui.ts Fixed
Comment thread examples/cli-client/script/scriptedUi.ts Fixed
Comment thread examples/cli-client/script/scriptedUi.ts Fixed
Comment thread examples/cli-client/script/scriptedUi.ts Fixed
@pkg-pr-new

pkg-pr-new Bot commented Jun 29, 2026

@modelcontextprotocol/client

npm i https://pkg.pr.new/@modelcontextprotocol/client@2380

@modelcontextprotocol/codemod

npm i https://pkg.pr.new/@modelcontextprotocol/codemod@2380

@modelcontextprotocol/core

npm i https://pkg.pr.new/@modelcontextprotocol/core@2380

@modelcontextprotocol/server

npm i https://pkg.pr.new/@modelcontextprotocol/server@2380

@modelcontextprotocol/server-legacy

npm i https://pkg.pr.new/@modelcontextprotocol/server-legacy@2380

@modelcontextprotocol/express

npm i https://pkg.pr.new/@modelcontextprotocol/express@2380

@modelcontextprotocol/fastify

npm i https://pkg.pr.new/@modelcontextprotocol/fastify@2380

@modelcontextprotocol/hono

npm i https://pkg.pr.new/@modelcontextprotocol/hono@2380

@modelcontextprotocol/node

npm i https://pkg.pr.new/@modelcontextprotocol/node@2380

commit: 4ea873d

…ol-consent policy

- URL-mode elicitation now applies the same https-or-loopback check as the OAuth flow
  before offering to open a server-supplied URL (file:, javascript:, and plain-http
  phishing URLs fail closed to a decline). The check is a shared isSafeBrowserUrl
  helper, unit-tested, and the guide's URL-mode section now states the gate.
- The host-integration guide's security section gains a tool-consent bullet: the spec
  expects a human in the loop able to deny tool invocations; cli-client auto-executes
  because an interactive user watches every call, and a one-line comment at the
  execution site says an unattended host must gate execution on user consent.
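
The https-or-loopback gate described in the first bullet could look roughly like the sketch below. The name `isSafeBrowserUrl` comes from the commit message, but the body is an assumption about how such a check might be written, including the exact set of loopback hosts.

```typescript
// Assumed loopback host list; the example's helper may accept a different set.
const LOOPBACK_HOSTS = new Set(['localhost', '127.0.0.1', '[::1]']);

/**
 * Sketch of an https-or-loopback gate: returns true only for https URLs, or for
 * plain http URLs that point at the local machine. Everything else fails closed.
 */
function isSafeBrowserUrl(raw: string): boolean {
    let url: URL;
    try {
        url = new URL(raw);
    } catch {
        // Unparseable input is never opened.
        return false;
    }
    if (url.protocol === 'https:') return true;
    // The WHATWG URL parser lowercases the host and keeps brackets around IPv6 literals.
    return url.protocol === 'http:' && LOOPBACK_HOSTS.has(url.hostname);
}
```

A host would call this before offering to open any server-supplied URL and treat a `false` result as a decline.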
Comment thread examples/cli-client/host/ui.ts
Comment thread examples/cli-client/host/host.ts
Comment thread examples/cli-client/host/loop.ts Outdated
…ient

The browser-authorization flow logged strings derived from the authorization URL
(origin in the consent/refusal lines, the full URL in an "opening …" status and in
the could-not-open fallback). None of it is a credential — the URL is exactly what
the user's browser is about to show — but log lines are the wrong channel for it:

- the consent and refusal lines now use static text (the URL adds nothing there),
- the "opening …" status no longer echoes the URL,
- the could-not-open fallback now presents the URL through the interactive prompt
  and waits for the user to confirm before polling the callback, instead of
  printing it and racing ahead.

This also clears CodeQL's js/clear-text-logging findings on the UI sinks, which
taint-tracked everything read off the OAuth provider into console output.
- Tab completion now offers /watch and the /exit alias (BUILTIN_COMMANDS was missing
  both, so the documented commands didn't complete).
- Declining a sampling request now answers with the spec's application-level code -1
  ("User rejected sampling request") instead of the reserved JSON-RPC InvalidRequest
  (-32600), matching the convention the e2e suite encodes; the guide's hand-written
  snippet follows.
- The /server:prompt dispatch regex (and the completer's prompt-args branch) accept
  the same server-name shapes mention parsing does, so dotted config keys advertised
  by /prompts actually dispatch instead of falling through to chat.
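
For the sampling-decline bullet, the difference between the two codes can be shown with a small sketch that builds the JSON-RPC error response by hand. The `declineSampling` helper and the `JsonRpcErrorResponse` type are hypothetical; an SDK-based host would normally throw an error carrying this code instead of assembling the envelope itself.

```typescript
interface JsonRpcErrorResponse {
    jsonrpc: '2.0';
    id: string | number;
    error: { code: number; message: string };
}

// Application-level code used for a user decision, per the commit message above.
const USER_REJECTED_SAMPLING = -1;
// Reserved JSON-RPC code for malformed requests; not appropriate for a user decline.
const INVALID_REQUEST = -32600;

/** Hypothetical helper: answers a sampling/createMessage request the user declined. */
function declineSampling(requestId: string | number): JsonRpcErrorResponse {
    return {
        jsonrpc: '2.0',
        id: requestId,
        error: { code: USER_REJECTED_SAMPLING, message: 'User rejected sampling request' }
    };
}
```

Keeping the decline on an application-level code lets a server distinguish "the user said no" from "the request was malformed".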
…EADME

Replace the separate host-integration guide page with a Design notes section in the
cli-client README covering the choices a copier should understand first: the
example-local provider seam, error results fed back to the model, untrusted-display
handling of server text, prompt role preservation, explicit fail-closed approvals,
the deliberate absence of a tool-execution gate in an interactive terminal (and what
an unattended host must add), and child-process env hygiene. The standalone guide
needs more rounds of refinement before it earns a docs-tree slot; the example and
its README stand on their own meanwhile.
@felixweinberger felixweinberger changed the title feat(examples,docs): reference host/server pair — cli-client + todos-server, and a host-integration guide feat(examples): add a reference host/server pair — cli-client + todos-server Jun 29, 2026
@felixweinberger felixweinberger merged commit ba27d20 into main Jun 29, 2026
18 checks passed
@felixweinberger felixweinberger deleted the fweinberger/cli-client-todos-server branch June 29, 2026 12:58
Comment on lines +36 to +46
const messages: Anthropic.MessageParam[] = [];

for (const message of request.messages) {
    if (message.role === 'tool') {
        const resultBlock: Anthropic.ToolResultBlockParam = {
            type: 'tool_result',
            tool_use_id: message.toolCallId,
            is_error: message.isError ?? false,
            content: message.content.map(part => partToBlock(part))
        };
        const previous = messages.at(-1);

🟡 The role:'tool' branch builds tool_result content with message.content.map(partToBlock) without the empty-text filter that toContentBlocks() applies elsewhere, so an MCP tool result containing { type: 'text', text: '' } produces a tool_result with an empty text block, which the Anthropic Messages API rejects with a 400 — and since the broken tool message stays in session.messages, every later turn on that session fails too. Apply the same filter (part.type !== 'text' || part.text.length > 0) to the tool_result content; an empty content array is accepted, unlike an empty text block.

Extended reasoning...

The bug. toAnthropicRequest() filters out empty text parts for user and assistant messages via toContentBlocks() (part.type !== 'text' || part.text.length > 0) — that filter exists precisely because the Anthropic Messages API rejects text content blocks with an empty string (text: String should have at least 1 character). But the role: 'tool' branch builds the tool_result block with content: message.content.map(part => partToBlock(part)), with no filter, so an empty text part survives into the tool_result content array. The same min-length-1 validation applies to text blocks nested inside tool_result content, so the request 400s.

How it gets triggered. An MCP tool may legally return { content: [{ type: 'text', text: '' }] } — e.g. a shell/exec-style tool whose command produced no output, on any server connected via --server. host/content.ts passes it through unchanged: contentBlockToParts() returns [{ type: 'text', text: truncate('') }] = [{ type: 'text', text: '' }], and toolResultToParts() only substitutes the (tool returned no content) placeholder when the parts array is empty (parts.length === 0), not when it contains a single empty-string text part. The loop pushes it as a role: 'tool' message, and the next provider.generate() round emits the invalid tool_result.

Step-by-step proof.

  1. Model issues a tool call; the MCP server returns { content: [{ type: 'text', text: '' }] }.
  2. toolResultToParts() returns [{ type: 'text', text: '' }] (placeholder not triggered, length is 1).
  3. runModelRounds() pushes { role: 'tool', toolCallId, content: [{ type: 'text', text: '' }] } into session.messages.
  4. The next provider.generate() calls toAnthropicRequest(); the tool branch maps the part to { type: 'text', text: '' } inside the tool_result content.
  5. The Messages API responds 400 invalid_request_error ("text content blocks must be non-empty"), cli.ts prints error: … and drops the turn.
  6. Worse: the offending tool message is already in session.messages, so every subsequent generate() re-sends the same broken history and fails again — the conversation never recovers without restarting.

Why nothing else prevents it. The empty-text filter only lives in toContentBlocks(), which the tool branch deliberately bypasses (it needs the raw blocks to wrap in tool_result); toolResultToParts() only guards the zero-parts case. The OpenAI and Gemini mappings flatten tool results to plain text (partsToText), so they are unaffected — this is specific to the Anthropic mapping. The paired todos-server never returns an empty text block, so the e2e legs and unit tests don't catch it.

Impact and fix. This is example code, but providers/anthropic.ts is explicitly advertised (README and docs/host-integration.md) as "a complete, copyable mapping" for host authors, so the gap propagates to copies. The fix is one line — apply the same filter to the tool_result content:

content: message.content.filter(p => p.type !== 'text' || p.text.length > 0).map(part => partToBlock(part))

An empty content array on a tool_result is accepted by the API, unlike an empty text block, so no further fallback is needed.
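
A self-contained sketch of the suggested fix follows. The `Part` and `Block` types and this `partToBlock` body are simplified stand-ins written for the example (the real ones live in the cli-client source and the Anthropic SDK); only the filter predicate is taken from the comment above.

```typescript
// Simplified stand-ins for the host's part type and the provider's block type.
type Part =
    | { type: 'text'; text: string }
    | { type: 'image'; data: string; mimeType: string };

type Block =
    | { type: 'text'; text: string }
    | { type: 'image'; source: { type: 'base64'; media_type: string; data: string } };

// Stand-in mapping; the example's partToBlock may handle more part kinds.
function partToBlock(part: Part): Block {
    if (part.type === 'text') {
        return { type: 'text', text: part.text };
    }
    return { type: 'image', source: { type: 'base64', media_type: part.mimeType, data: part.data } };
}

/** Builds tool_result content, dropping empty text parts before mapping. */
function toolResultContent(parts: Part[]): Block[] {
    return parts
        .filter(part => part.type !== 'text' || part.text.length > 0)
        .map(part => partToBlock(part));
}
```

With this in place, a tool result of `[{ type: 'text', text: '' }]` maps to an empty content array rather than an empty text block.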

Comment on lines +170 to +186
const result: CallToolResult = await server.client.callTool(
    {
        name: route.toolName,
        arguments: call.arguments,
        // On 2026-07-28 connections servers only emit log notifications for requests
        // that opt in via this _meta key; on 2025 the setLoggingLevel call covers it.
        ...(server.era === 'modern' ? { _meta: { [LOG_LEVEL_META_KEY]: 'info' } } : {})
    },
    {
        // Aborting this signal cancels the call: the SDK sends notifications/cancelled
        // and the server can stop work via its own request signal.
        signal: options?.signal,
        onprogress: progress => {
            const total = progress.total === undefined ? '' : `/${progress.total}`;
            this.ui.status(`${call.name}: ${progress.message ?? 'working'} (${progress.progress}${total})`);
        },
        resetTimeoutOnProgress: true

🟡 On a 2025-era (--legacy) connection, the interactive todos-server tools (brainstorm_tasks, prioritize, clear_done) wait on the human inside the pending tools/call, but executeToolCall passes no timeout and the legacy-arm elicitInput/requestSampling calls in todos.ts don't either — so the SDK's 60s default applies and the documented --legacy interactive tour fails if the user takes more than a minute to answer the approval prompt or the brainstorm form. Passing a generous timeout (or maxTotalTimeout) here for interactive tool calls, and on the legacy-arm elicitInput/requestSampling calls, fixes it.

Extended reasoning...

The bug. executeToolCall (examples/cli-client/host/host.ts:170-186) calls server.client.callTool(...) with only { signal, onprogress, resetTimeoutOnProgress: true }. With no timeout set, the SDK default of 60 seconds applies (DEFAULT_REQUEST_TIMEOUT_MSEC = 60_000 in packages/core-internal/src/shared/protocol.ts:95, applied at protocol.ts:1509). On the server side, todos-server's legacy arms call ctx.mcpReq.elicitInput / ctx.mcpReq.requestSampling (examples/todos-server/todos.ts:452, 461, 470, 629, 681) with no options either, so the server→client request carries the same 60s default.

Why resetTimeoutOnProgress doesn't help. The only thing that resets the request timeout is a notifications/progress message (protocol.ts:1130-1132). An incoming server→client request (elicitation/create, sampling/createMessage) does not reset it, and the interactive tools emit no progress while they wait for the human. maxTotalTimeout is also unset, so even progress-based resets wouldn't cover the wait.

The code path that triggers it (legacy era only). On a 2025-11-25 connection the original tools/call stays pending while the server pushes elicitation/sampling requests back to the client and awaits them inline. The user's think time (reading the multi-line sampling-approval block the host deliberately prints in full, or filling the brainstorm theme/count form one field at a time, with retries on invalid input) happens entirely inside both 60-second timers.

Step-by-step proof.

  1. Run the README's documented tour: terminal A serves todos-server, terminal B runs cli-client with --legacy (or spawn the server over stdio with --legacy).
  2. Say "prioritize my open tasks".
  3. The model calls mcp__todos__prioritize; todos-server's legacy arm calls ctx.mcpReq.requestSampling(...) and the host shows the full request text in an attention block, asking "Allow?".
  4. The user reads the multi-line request and takes 70 seconds to answer.
  5. At t=60s the host-side callTool rejects with a timeout (executeToolCall reports "Tool call failed: ... timed out" to the model) and/or the server-side requestSampling rejects, so the tool errors. The answer the user is mid-way through typing is discarded, and the readline question can be left dangling on screen while the model has already been told the call failed. The same happens with the brainstorm_tasks form and the clear_done confirmation.

Why nothing catches it. The modern (2026-07-28) era is unaffected because input_required completes the original request and the user wait happens between requests, outside any timer. The scripted e2e (ScriptedUI) answers instantly, so all four CI legs pass. The repo's own docs (docs/client.md, the timeout section) document the 60-second default and show passing a longer timeout as the remedy, but this reference host, explicitly meant to be copied, doesn't do it.

How to fix. Pass a generous timeout (and/or maxTotalTimeout) in the callTool options in executeToolCall, since interactive tool calls in a host with a human in the loop should tolerate minutes, not seconds, and pass a longer timeout to the legacy-arm elicitInput / requestSampling calls in todos.ts. Both are one-line option additions.

Severity. This is example code, legacy-era only, and requires >60s of human latency, so it should not block the PR. It is nonetheless a concrete functional failure of the documented --legacy interactive tour rather than a style point, and the fix is the exact pattern the client guide already recommends.
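
One way to picture the suggested fix is a small helper that builds the per-call options. The option names `timeout`, `maxTotalTimeout`, `resetTimeoutOnProgress`, and `signal` are the ones named in the comment above; the `CallOptions` interface is a local stand-in for the SDK's options type, and the specific durations are assumptions chosen for illustration, not values from the PR.

```typescript
// Local stand-in for the SDK's request options; only the fields discussed above.
interface CallOptions {
    signal?: AbortSignal;
    timeout?: number;
    maxTotalTimeout?: number;
    resetTimeoutOnProgress?: boolean;
}

// Assumed budgets: a human reading an approval prompt needs minutes, not seconds.
const INTERACTIVE_TIMEOUT_MS = 10 * 60 * 1000;
const INTERACTIVE_MAX_TOTAL_MS = 30 * 60 * 1000;

/** Hypothetical helper: options for tool calls that may wait on a human. */
function interactiveCallOptions(signal?: AbortSignal): CallOptions {
    return {
        // Only include the signal when one was supplied.
        ...(signal ? { signal } : {}),
        // Replaces the 60-second default for the wait between progress notifications.
        timeout: INTERACTIVE_TIMEOUT_MS,
        // Upper bound on the whole call, even if progress keeps resetting the timer.
        maxTotalTimeout: INTERACTIVE_MAX_TOTAL_MS,
        resetTimeoutOnProgress: true
    };
}
```

The same idea applies on the server side: the legacy-arm elicitInput and requestSampling calls would pass a comparably generous timeout so neither timer fires while the user is still reading.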
