fix(wiki-explorer): prevent path-traversal SSRF bypass in URL validation#648

Open
sebastiondev wants to merge 1 commit into modelcontextprotocol:main from sebastiondev:fix/cwe918-server-wiki-ba62

Conversation

@sebastiondev

Summary

The get-first-degree-links tool in examples/wiki-explorer-server validates the user-supplied url argument with a regex applied to the raw string:

if (!url.match(/^https?:\/\/[a-z]+\.wikipedia\.org\/wiki\//)) {
  throw new Error("Not a valid Wikipedia URL");
}
// ...
const response = await fetch(url);

A URL like https://en.wikipedia.org/wiki/../../w/api.php?action=query&list=allusers matches this regex, but fetch() performs WHATWG URL normalization which collapses the ../.. segments. The request that actually leaves the server targets https://en.wikipedia.org/w/api.php?... — the MediaWiki API, which the /wiki/ filter was meant to keep out of reach.

Confirmed in a Node REPL:

> new URL("https://en.wikipedia.org/wiki/../../w/api.php?action=query&list=allusers").pathname
'/w/api.php'

  • CWE: CWE-918 (Server-Side Request Forgery) — scoped variant
  • File / function: examples/wiki-explorer-server/server.ts, get-first-degree-links tool handler
  • Data flow: tool arguments.url (attacker-controlled) → regex check (string-level only) → fetch(url) (URL-normalized) → response body returned to the caller

Impact (honest scoping)

The host regex itself is correct and unchanged, so the request still lands on *.wikipedia.org. This is not a classic SSRF that reaches 127.0.0.1 or cloud metadata. The realistic impact is that an LLM-supplied argument can cause the tool to hit MediaWiki API endpoints (e.g. action=query&list=allusers) and surface their JSON/HTML response as if it were a Wikipedia article — bypassing the namespace allowlist the tool tries to enforce, and producing unexpected data in the model's context. It's a defensive hardening fix rather than a critical SSRF, but the bypass of the explicit /wiki/ filter is real.

Fix

Two small, targeted changes in server.ts:

  1. Replace the raw-string regex with a parsed-URL check (isValidWikipediaUrl) that validates parsed.protocol, parsed.hostname, and crucially parsed.pathname.startsWith("/wiki/") after normalization. This closes the path-traversal bypass.
  2. Pass { redirect: "error" } to fetch() so a Wikipedia redirect cannot be used to send the request to a different host or path that the validator never saw.

The host allowlist (<lang>.wikipedia.org) is preserved exactly.
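A minimal sketch of the two changes, using the helper name from this PR (isValidWikipediaUrl); the exact code in server.ts may differ in detail:

```typescript
// Sketch only: validate the *parsed* URL, so "../" segments are
// collapsed by the WHATWG URL parser before any check runs.
function isValidWikipediaUrl(raw: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(raw);
  } catch {
    return false; // not parseable as a URL at all
  }
  return (
    // same schemes the old regex accepted
    (parsed.protocol === "http:" || parsed.protocol === "https:") &&
    // same host allowlist as before: <lang>.wikipedia.org
    /^[a-z]+\.wikipedia\.org$/.test(parsed.hostname) &&
    // the crucial part: pathname is checked AFTER normalization,
    // so /wiki/../../w/api.php (pathname "/w/api.php") is rejected
    parsed.pathname.startsWith("/wiki/")
  );
}

// In the tool handler, refuse redirects so the URL the validator saw
// is the only URL that can actually be fetched:
// const response = await fetch(url, { redirect: "error" });
```

With this check, the traversal payload from the summary fails on the pathname test, because its normalized pathname is /w/api.php rather than something under /wiki/.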

Tests

Added examples/wiki-explorer-server/server.test.ts with cases that:

  • reject non-Wikipedia hosts (https://evil.com/wiki/Test)
  • reject the path-traversal payload that bypassed the old regex (https://en.wikipedia.org/wiki/../../w/api.php?...)
  • reject non-/wiki/ paths on a valid host
  • accept legitimate article URLs

The traversal test fails against the old code and passes against the patched code.

Adversarial review

Before submitting we tried to talk ourselves out of this. The host check is unchanged and correct, so reachable targets are still limited to public Wikipedia mirrors — there's no path to internal services or cloud metadata. The server is an example MCP server typically run locally over stdio, so the attacker model is "a malicious tool-call argument coming from the LLM/client", not a remote network attacker. We still think it's worth fixing: the /wiki/ namespace filter is a security boundary the tool explicitly tries to enforce, the bypass is trivial to trigger, and the fix is two lines plus tests with no behaviour change for legitimate URLs.

cc @lewiswigmore

Replace raw regex matching on the URL string with parsed URL validation
using the URL constructor. This prevents path-traversal attacks where
URLs like `/wiki/../../w/api.php` pass the regex but resolve to paths
outside `/wiki/` after URL normalization.

Also set `redirect: "error"` on fetch() to prevent following redirects
to non-Wikipedia domains.

CWE-918