
feat: add DocsGPT semantic search with backwards-compatible fallback #18

Merged: critesjosh merged 4 commits into main from feat/docsgpt-semantic-search on May 1, 2026

@critesjosh
Collaborator

Summary

  • Adds DocsGPT as an optional semantic search backend for aztec_search_docs and aztec_lookup_error, activated when API_KEY is set
  • Preserves full backwards compatibility: schema always advertises section/maxResults, and DocsGPT errors gracefully fall back to ripgrep over local docs
  • aztec_lookup_error gains a semantic fallback that searches documentation when the static error catalog has no matches

Review feedback addressed

  • [P1] Schema regression: section and maxResults are always present in the advertised schema regardless of DocsGPT config. maxResults maps to chunks in semantic mode (chunks ?? maxResults ?? 5). Schema descriptions clarify that section applies to local fallback only.
  • [P2] No fallback on DocsGPT failure: DocsGPT errors now fall through to ripgrep search instead of returning failure immediately. If local docs are cloned, the search still works; if not, the user gets the standard "run aztec_sync_repos" message.
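
The `chunks ?? maxResults ?? 5` precedence from the [P1] item can be sketched as below. This is a minimal illustration, not the code from the PR: the `SearchArgs` interface, the `resolveChunks` helper, and the `DEFAULT_CHUNKS` constant are hypothetical names; only the two field names and the default of 5 come from the description above.

```typescript
// Hypothetical argument shape; only the field names `chunks` and
// `maxResults` are taken from the PR description.
interface SearchArgs {
  chunks?: number;
  maxResults?: number;
}

// Default stated in the PR description.
const DEFAULT_CHUNKS = 5;

// Mirrors the stated precedence: chunks, then maxResults, then 5.
function resolveChunks(args: SearchArgs): number {
  return args.chunks ?? args.maxResults ?? DEFAULT_CHUNKS;
}
```

Because `??` only falls through on `null` or `undefined`, an explicit `chunks: 0` would be kept rather than replaced by `maxResults`, which is one reason separate range validation is still useful.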

Test plan

  • tsc --noEmit passes
  • vitest run tests/tools/search.test.ts — 19 tests pass, including:
    • Semantic results returned from DocsGPT client
    • chunks parameter respected
    • maxResults used as fallback for chunks in semantic mode
    • chunks takes precedence over maxResults when both provided
    • DocsGPT error falls back to ripgrep when local docs exist
    • DocsGPT error returns not-cloned message when no local docs
  • vitest run tests/utils/error-lookup.test.ts — 15 tests pass

🤖 Generated with Claude Code

critesjosh and others added 4 commits April 13, 2026 15:02
Integrate DocsGPT as an optional semantic search backend for documentation
queries and error lookup fallback, activated when API_KEY is set. Preserves
the existing ripgrep-based search as a fallback when DocsGPT is unavailable
or unconfigured, ensuring no public-contract regression for existing callers.

- Add DocsGPT HTTP client (src/backends/docsgpt-client.ts)
- aztec_search_docs uses semantic search when API_KEY is configured,
  falls back to ripgrep on DocsGPT errors if local docs are cloned
- aztec_lookup_error falls back to semantic doc search when static
  catalog produces no matches
- Schema always advertises section/maxResults for backwards compat;
  descriptions clarify semantic-mode behavior
- maxResults maps to chunks in semantic mode (chunks ?? maxResults ?? 5)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…anitized error-lookup

Addresses operational footguns and missing safety rails on top of the
initial DocsGPT integration in this PR:

- Default API_URL → https://aztec.adjacentpossible.dev. The previous
  http://localhost:7091 default sent the user's API key to whatever
  was on their loopback port 7091 if API_URL was forgotten, then
  swallowed the failure as a silent ripgrep fallback so the user
  never realized semantic search was off.

- Stop swallowing DocsGPT errors. searchAztecDocs now returns
  kind="error" with a clear remediation message when the semantic
  call fails. New `useLocalFallback?: boolean` opt-in (default false)
  preserves the previous behavior for callers that want it. When
  fallback is enabled AND both backends fail, surface BOTH errors so
  the user sees the full picture.

- Version-sync gate. New /api/version endpoint on docsgpt returns
  the corpus version it indexed; MCP fetches it (POST + redirect:manual
  for CF-Access compatibility) via DocsGPTClient.getCorpusVersion(),
  caches per-baseUrl with 5-min positive / 30-s negative TTLs, and
  compares against the local aztec-packages clone tag (or
  DEFAULT_AZTEC_VERSION fallback). On mismatch + no override → return
  kind="version-mismatch" with both versions and remediation.
  `allowVersionMismatch?: boolean` overrides the gate. On mismatch +
  useLocalFallback=true → search local docs (which match the local
  clone version) with an explanatory message. 404/network/shape
  errors degrade to "unknown" so older or temporarily-broken backends
  don't permanently block search.

- chunks validation. Schema gets minimum:1, maximum:20. Client rejects
  non-finite values, truncates floats with Math.trunc, and clamps the
  result to that range. Backend also clamps for defense-in-depth.

- Non-array response shape now throws DocsGPTClientError("Unexpected
  response shape") instead of silently returning []. Future contract
  drift surfaces loudly.

- lookupAztecError gets a `semanticHealth` field separating "static
  catalog hit"/"semantic ran"/"semantic failed"/"version mismatch" so
  callers can distinguish "no docs exist" from "the backend is
  broken". On semantic failure: 401 → sanitized "/mcp-key" hint,
  everything else → generic "currently unavailable" — never echoes
  raw upstream error strings to the user.

- Auto-resync skip refined: only skip when useLocalFallback !== true.
  When the caller has opted into local fallback, we WILL ripgrep
  cloned docs on a semantic failure, so they need to be fresh.
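
The chunks validation listed above (range 1 to 20, floats truncated, non-finite values rejected) might look roughly like the following on the client side. This is a sketch under assumptions: the `normalizeChunks` name is hypothetical, and rejecting by throwing a `RangeError` is a guess, since the commit message does not say how rejection is reported.

```typescript
// Bounds taken from the schema change described in the commit message.
const MIN_CHUNKS = 1;
const MAX_CHUNKS = 20;

// Hypothetical helper: rejects NaN/Infinity, truncates floats toward zero,
// then clamps the whole number into [MIN_CHUNKS, MAX_CHUNKS].
function normalizeChunks(value: number): number {
  if (!Number.isFinite(value)) {
    // Assumption: rejection is modeled as a thrown RangeError.
    throw new RangeError(`chunks must be a finite number, got ${value}`);
  }
  const whole = Math.trunc(value);
  return Math.min(MAX_CHUNKS, Math.max(MIN_CHUNKS, whole));
}
```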

Tests: 241 passing locally. Added tests/utils/version-check.test.ts,
tests/tools/error-lookup.test.ts, tests/backends/docsgpt-client.test.ts.
Rewrote tests/tools/search.test.ts to cover error reporting, useLocalFallback
both with-and-without local docs, version-mismatch + useLocalFallback
interaction, version-mismatch + override, and the version-cache "unknown"
degradation paths.

Companion docsgpt change ships /api/version + /api/search global rerank +
URL rewriting in critesjosh/docsgpt-aztec#63.
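
For readers following the version-sync gate that consumes /api/version, the decision it describes can be sketched as a pure function. Everything here is illustrative: `checkVersionGate`, `GateDecision`, and `GateOptions` are hypothetical names, `null` stands in for the "unknown" state, and letting `allowVersionMismatch` win when both flags are set is an assumption the commit message does not settle.

```typescript
// Hypothetical result type; the kind "version-mismatch" is the only
// value named in the commit message.
type GateDecision =
  | { kind: "proceed" }
  | { kind: "version-mismatch"; local: string; remote: string }
  | { kind: "local-fallback"; reason: string };

interface GateOptions {
  allowVersionMismatch?: boolean;
  useLocalFallback?: boolean;
}

function checkVersionGate(
  local: string,
  remote: string | null, // null models "unknown" (404, network, or shape error)
  opts: GateOptions = {},
): GateDecision {
  // Unknown or matching versions never block semantic search.
  if (remote === null || remote === local) return { kind: "proceed" };
  // Assumption: the explicit override takes priority over local fallback.
  if (opts.allowVersionMismatch) return { kind: "proceed" };
  if (opts.useLocalFallback) {
    return {
      kind: "local-fallback",
      reason: `backend indexed ${remote}, local clone is ${local}`,
    };
  }
  return { kind: "version-mismatch", local, remote };
}
```

Keeping the gate free of I/O means the cached remote version can be fetched and expired elsewhere, and each branch can be unit-tested without a network.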

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- README: new "API Key (optional, recommended)" section with the Noir
  Discord invite (https://discord.gg/xMud5StFyA) and step-by-step for
  obtaining a key via `/mcp-key`. Configuration table now lists every
  env var (API_KEY, API_URL, REQUEST_TIMEOUT, AZTEC_DEFAULT_VERSION,
  AZTEC_MCP_REPOS_DIR). Added the missing `aztec_lookup_error` tool
  section. `aztec_search_docs` parameters expanded to cover the new
  semantic-only flags (chunks, useLocalFallback, allowVersionMismatch).

- Tool descriptions in src/index.ts (the text the LLM consuming the
  MCP actually reads): when no API_KEY is configured, the descriptions
  for `aztec_search_docs` and `aztec_lookup_error` now explicitly tell
  the model that local-only mode is active AND instruct it to suggest
  the user get a free API key via `/mcp-key` in the Aztec/Noir Discord
  if a query exceeds what local search can answer. This makes the
  upgrade path discoverable through normal model use, not just docs.

- Startup log when no API_KEY: was "code search only (set API_KEY for
  docs)" — now names the Discord invite link so an operator running
  with `--debug` or watching stderr immediately sees the path forward.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The README is invisible to LLMs consuming the MCP — the only surfaces
the model sees are tool descriptions, parameter docs, and the
server-level `instructions` field returned in InitializeResult. Tool
descriptions cover per-call hints (added in the previous commit), but
they're only seen when the model decides to call that specific tool.
The `instructions` field is read once per session as system context
by Claude Desktop, Cursor, and other MCP-aware clients — better
placement for the persistent "what is this server + how do I get more"
message.

Two variants:

- LOCAL_ONLY_INSTRUCTIONS (no API_KEY): tells the model the server is
  in degraded mode and explicitly directs it to recommend `/mcp-key`
  in the Aztec/Noir Discord (https://discord.gg/xMud5StFyA) when a
  user query exceeds what local search can answer.

- SEMANTIC_INSTRUCTIONS (API_KEY set): tells the model semantic search
  is enabled, lists the 12 indexed corpora, and explains the
  version-sync gate behavior (so it can guide the user through a
  `version-mismatch` result intelligently — sync repos vs. pass
  `allowVersionMismatch: true`).

Verified end-to-end via stdio: a fresh `initialize` request returns
the matching instructions string in result.instructions for both
modes, including the Discord invite link in local-only mode and
absent from semantic mode (no nag once configured).
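
A rough sketch of how the two variants could be selected at startup follows. The constant names come from the commit message, but the string contents are placeholders rather than the real instruction text, the `selectInstructions` helper is hypothetical, and treating a whitespace-only API_KEY as unset is an assumption.

```typescript
// Placeholder text only; the actual instruction strings are not
// reproduced in this commit message.
const LOCAL_ONLY_INSTRUCTIONS =
  "Local-only mode. For semantic search, request a key via /mcp-key: https://discord.gg/xMud5StFyA";
const SEMANTIC_INSTRUCTIONS =
  "Semantic search is enabled. See the version-sync gate notes for version-mismatch results.";

// Hypothetical helper: takes the environment as a plain record so it can
// be exercised without touching the real process environment.
function selectInstructions(env: Record<string, string | undefined>): string {
  // Assumption: an empty or whitespace-only key counts as "not set".
  const hasKey = (env["API_KEY"] ?? "").trim().length > 0;
  return hasKey ? SEMANTIC_INSTRUCTIONS : LOCAL_ONLY_INSTRUCTIONS;
}
```

The returned string would then be supplied as the server-level `instructions` value, so the Discord link appears only while no key is configured.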

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@critesjosh critesjosh merged commit 977185b into main May 1, 2026
6 checks passed
@critesjosh critesjosh deleted the feat/docsgpt-semantic-search branch May 1, 2026 19:56
@github-actions

github-actions Bot commented May 1, 2026

🎉 This PR is included in version 1.20.0 🎉

The release is available on:

Your semantic-release bot 📦🚀
