feat: add DocsGPT semantic search with backwards-compatible fallback #18
Merged
critesjosh merged 4 commits into main on May 1, 2026
Integrate DocsGPT as an optional semantic search backend for documentation queries and error lookup fallback, activated when API_KEY is set. Preserves the existing ripgrep-based search as a fallback when DocsGPT is unavailable or unconfigured, ensuring no public-contract regression for existing callers.

- Add DocsGPT HTTP client (src/backends/docsgpt-client.ts)
- aztec_search_docs uses semantic search when API_KEY is configured, falls back to ripgrep on DocsGPT errors if local docs are cloned
- aztec_lookup_error falls back to semantic doc search when static catalog produces no matches
- Schema always advertises section/maxResults for backwards compat; descriptions clarify semantic-mode behavior
- maxResults maps to chunks in semantic mode (chunks ?? maxResults ?? 5)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
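The `chunks ?? maxResults ?? 5` mapping from the last bullet can be sketched as follows. This is a minimal illustration, not the code in this PR: the `SearchArgs` interface and the `resolveChunks` helper are hypothetical names, and only the expression itself comes from the commit message.

```typescript
// Hypothetical argument shape; only `chunks` and `maxResults` are named in the PR.
interface SearchArgs {
  chunks?: number;
  maxResults?: number;
}

// Mirrors the expression quoted in the commit message: chunks ?? maxResults ?? 5.
// `??` falls through only on null/undefined, so an explicit `chunks` always wins.
function resolveChunks(args: SearchArgs): number {
  return args.chunks ?? args.maxResults ?? 5;
}

console.log(resolveChunks({}));                           // 5
console.log(resolveChunks({ maxResults: 8 }));            // 8
console.log(resolveChunks({ chunks: 3, maxResults: 8 })); // 3
```

Because `??` does not treat `0` or `NaN` as missing, a caller passing `chunks: 0` would get `0` back from this sketch, so range validation has to happen separately.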
…anitized error-lookup

Addresses operational footguns and missing safety rails on top of the initial DocsGPT integration in this PR:

- Default API_URL → https://aztec.adjacentpossible.dev. The previous http://localhost:7091 default sent the user's API key to whatever was on their loopback port 7091 if API_URL was forgotten, then swallowed the failure as a silent ripgrep fallback so the user never realized semantic search was off.
- Stop swallowing DocsGPT errors. searchAztecDocs now returns kind="error" with a clear remediation message when the semantic call fails. New `useLocalFallback?: boolean` opt-in (default false) preserves the previous behavior for callers that want it. When fallback is enabled AND both backends fail, surface BOTH errors so the user sees the full picture.
- Version-sync gate. New /api/version endpoint on docsgpt returns the corpus version it indexed; MCP fetches it (POST + redirect:manual for CF-Access compatibility) via DocsGPTClient.getCorpusVersion(), caches per-baseUrl with 5-min positive / 30-s negative TTLs, and compares against the local aztec-packages clone tag (or DEFAULT_AZTEC_VERSION fallback). On mismatch + no override → return kind="version-mismatch" with both versions and remediation. `allowVersionMismatch?: boolean` overrides the gate. On mismatch + useLocalFallback=true → search local docs (which match the local clone version) with an explanatory message. 404/network/shape errors degrade to "unknown" so older or temporarily-broken backends don't permanently block search.
- chunks validation. Schema gets minimum:1, maximum:20. Client clamps Math.trunc, rejects non-finite, truncates floats. Backend also clamps for defense-in-depth.
- Non-array response shape now throws DocsGPTClientError("Unexpected response shape") instead of silently returning []. Future contract drift surfaces loudly.
- lookupAztecError gets a `semanticHealth` field separating "static catalog hit"/"semantic ran"/"semantic failed"/"version mismatch" so callers can distinguish "no docs exist" from "the backend is broken". On semantic failure: 401 → sanitized "/mcp-key" hint, everything else → generic "currently unavailable"; raw upstream error strings are never echoed to the user.
- Auto-resync skip refined: only skip when useLocalFallback !== true. When the caller has opted into local fallback, we WILL ripgrep cloned docs on a semantic failure, so they need to be fresh.

Tests: 241 passing locally. Added tests/utils/version-check.test.ts, tests/tools/error-lookup.test.ts, tests/backends/docsgpt-client.test.ts. Rewrote tests/tools/search.test.ts to cover error reporting, useLocalFallback both with-and-without local docs, version-mismatch + useLocalFallback interaction, version-mismatch + override, and the version-cache "unknown" degradation paths.

Companion docsgpt change ships /api/version + /api/search global rerank + URL rewriting in critesjosh/docsgpt-aztec#63.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
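The version-sync gate described above combines a TTL cache with a small decision rule. The sketch below is one way to express that logic, not the implementation in this PR: `VersionCache`, `decideGate`, and `GateDecision` are hypothetical names. It assumes that the "negative" TTL applies to entries recorded as "unknown", and that `allowVersionMismatch` takes precedence when both flags are set. Only the TTL values, the flag names, and the three outcomes come from the commit message.

```typescript
const UNKNOWN = "unknown";
const POSITIVE_TTL_MS = 5 * 60 * 1000; // 5 minutes, per the commit message
const NEGATIVE_TTL_MS = 30 * 1000;     // 30 seconds, per the commit message

interface CacheEntry {
  version: string;
  expiresAt: number; // epoch milliseconds
}

// Hypothetical per-baseUrl cache. `now` is passed in so the logic stays testable.
class VersionCache {
  private entries: Record<string, CacheEntry> = Object.create(null);

  get(baseUrl: string, now: number): string | undefined {
    const entry = this.entries[baseUrl];
    if (!entry || entry.expiresAt <= now) return undefined; // missing or expired
    return entry.version;
  }

  set(baseUrl: string, version: string, now: number): void {
    // Assumption: "unknown" results are the ones that get the short negative TTL.
    const ttl = version === UNKNOWN ? NEGATIVE_TTL_MS : POSITIVE_TTL_MS;
    this.entries[baseUrl] = { version, expiresAt: now + ttl };
  }
}

type GateDecision = "semantic" | "version-mismatch" | "local-fallback";

interface GateOptions {
  allowVersionMismatch?: boolean;
  useLocalFallback?: boolean;
}

function decideGate(local: string, remote: string, opts: GateOptions = {}): GateDecision {
  // An unknown remote version never blocks search.
  if (remote === UNKNOWN || remote === local) return "semantic";
  // Assumption: the explicit override wins if both flags are set.
  if (opts.allowVersionMismatch) return "semantic";
  if (opts.useLocalFallback) return "local-fallback";
  return "version-mismatch";
}
```

Keeping the clock outside the cache means the 5-minute and 30-second expiry paths can be exercised without timers, which is convenient for tests like the "unknown" degradation cases mentioned above.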
- README: new "API Key (optional, recommended)" section with the Noir Discord invite (https://discord.gg/xMud5StFyA) and step-by-step for obtaining a key via `/mcp-key`. Configuration table now lists every env var (API_KEY, API_URL, REQUEST_TIMEOUT, AZTEC_DEFAULT_VERSION, AZTEC_MCP_REPOS_DIR). Added the missing `aztec_lookup_error` tool section. `aztec_search_docs` parameters expanded to cover the new semantic-only flags (chunks, useLocalFallback, allowVersionMismatch).
- Tool descriptions in src/index.ts (the text the LLM consuming the MCP actually reads): when no API_KEY is configured, the descriptions for `aztec_search_docs` and `aztec_lookup_error` now explicitly tell the model that local-only mode is active AND instruct it to suggest the user get a free API key via `/mcp-key` in the Aztec/Noir Discord if a query exceeds what local search can answer. This makes the upgrade path discoverable through normal model use, not just docs.
- Startup log when no API_KEY: was "code search only (set API_KEY for docs)"; it now names the Discord invite link so an operator running with `--debug` or watching stderr immediately sees the path forward.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
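The environment variables listed in the README bullet could be read along the following lines. This is an illustrative sketch only: `ServerConfig` and `loadConfig` are hypothetical names, the treatment of empty strings as unset is an assumption, and no defaults are invented beyond the `API_URL` default stated in the earlier commit. The unit of `REQUEST_TIMEOUT` is not stated in the PR, so the value is passed through as a plain number.

```typescript
// Hypothetical config shape built from the env vars named in the PR.
interface ServerConfig {
  apiKey?: string;
  apiUrl: string;
  requestTimeout?: number; // unit not stated in the PR
  defaultVersion?: string;
  reposDir?: string;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): ServerConfig {
  const timeout = Number(env.REQUEST_TIMEOUT);
  const hasTimeout = !!env.REQUEST_TIMEOUT && isFinite(timeout) && timeout > 0;
  return {
    // Assumption: an empty string counts as "not configured".
    apiKey: env.API_KEY || undefined,
    // Default taken from the earlier commit message in this PR.
    apiUrl: env.API_URL || "https://aztec.adjacentpossible.dev",
    requestTimeout: hasTimeout ? timeout : undefined,
    defaultVersion: env.AZTEC_DEFAULT_VERSION || undefined,
    reposDir: env.AZTEC_MCP_REPOS_DIR || undefined,
  };
}
```

Passing the environment in as an argument, rather than reading a global, keeps the function easy to exercise with different configurations.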
The README is invisible to LLMs consuming the MCP: the only surfaces the model sees are tool descriptions, parameter docs, and the server-level `instructions` field returned in InitializeResult. Tool descriptions cover per-call hints (added in the previous commit), but they're only seen when the model decides to call that specific tool. The `instructions` field is read once per session as system context by Claude Desktop, Cursor, and other MCP-aware clients, which makes it a better placement for the persistent "what is this server + how do I get more" message.

Two variants:

- LOCAL_ONLY_INSTRUCTIONS (no API_KEY): tells the model the server is in degraded mode and explicitly directs it to recommend `/mcp-key` in the Aztec/Noir Discord (https://discord.gg/xMud5StFyA) when a user query exceeds what local search can answer.
- SEMANTIC_INSTRUCTIONS (API_KEY set): tells the model semantic search is enabled, lists the 12 indexed corpora, and explains the version-sync gate behavior (so it can guide the user through a `version-mismatch` result intelligently: sync repos vs. pass `allowVersionMismatch: true`).

Verified end-to-end via stdio: a fresh `initialize` request returns the matching instructions string in result.instructions for both modes, including the Discord invite link in local-only mode and absent from semantic mode (no nag once configured).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
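Choosing between the two variants might look like the sketch below. The constant names and the Discord URL come from the commit message. The instruction wording is placeholder text (the PR does not quote the real strings), and `selectInstructions` and `buildInitializeResult` are hypothetical helpers, not MCP SDK functions.

```typescript
const DISCORD_INVITE = "https://discord.gg/xMud5StFyA";

// Placeholder wording; the actual strings are not quoted in the PR.
const LOCAL_ONLY_INSTRUCTIONS =
  "Local-only mode: semantic search is disabled. If a query exceeds what local " +
  "search can answer, suggest running /mcp-key in the Aztec/Noir Discord (" +
  DISCORD_INVITE + ").";

const SEMANTIC_INSTRUCTIONS =
  "Semantic search is enabled. On a version-mismatch result, suggest syncing " +
  "repos or passing allowVersionMismatch: true.";

// Assumption: any non-empty API_KEY switches the server to semantic mode.
function selectInstructions(env: Record<string, string | undefined>): string {
  return env.API_KEY ? SEMANTIC_INSTRUCTIONS : LOCAL_ONLY_INSTRUCTIONS;
}

// Hypothetical helper showing where the string ends up in the initialize response.
function buildInitializeResult(env: Record<string, string | undefined>): { instructions: string } {
  return { instructions: selectInstructions(env) };
}

console.log(buildInitializeResult({}).instructions);
console.log(buildInitializeResult({ API_KEY: "example-key" }).instructions);
```

In this sketch the invite link appears only in the local-only string, matching the "no nag once configured" behaviour the commit describes.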
🎉 This PR is included in version 1.20.0 🎉

The release is available on:

Your semantic-release bot 📦🚀
Summary

- DocsGPT semantic search for `aztec_search_docs` and `aztec_lookup_error`, activated when `API_KEY` is set
- The schema always advertises `section`/`maxResults`, and DocsGPT errors gracefully fall back to ripgrep over local docs
- `aztec_lookup_error` gains a semantic fallback that searches documentation when the static error catalog has no matches

Review feedback addressed

- `section` and `maxResults` are always present in the advertised schema regardless of DocsGPT config. `maxResults` maps to `chunks` in semantic mode (`chunks ?? maxResults ?? 5`). Schema descriptions clarify that `section` applies to local fallback only.

Test plan

- `tsc --noEmit` passes
- `vitest run tests/tools/search.test.ts`: 19 tests pass, including:
  - `chunks` parameter respected
  - `maxResults` used as fallback for `chunks` in semantic mode
  - `chunks` takes precedence over `maxResults` when both provided
- `vitest run tests/utils/error-lookup.test.ts`: 15 tests pass

🤖 Generated with Claude Code