feat(docs): publish OpenAPI spec, auth docs and developer portal #638

Merged: LinoGiger merged 2 commits into main from feat(docs)/publish-openapi-and-auth-docs on Jun 26, 2026

@RapidPoseidon

Context

Follow-up to #637, which took docs.rapidata.ai from 15 → 45/100 on orank. This pass addresses the remaining Access gaps — but only the ones that are real and honestly fixable in a docs repo.

Triage of the remaining gaps

| orank gap | Verdict | What I did |
|---|---|---|
| OpenAPI spec published | ✅ Real & valuable | Publish the combined spec at /openapi.json |
| OAuth 2.0 support | ⚠️ Already implemented, just undocumented | Document it; do not re-implement |
| Scoped permissions | ⚠️ Already exist (scopes in every operation) | Surfaced via the spec + auth page |
| Developer portal | ⚠️ Partly valid | Add /developers aggregator (no fake sandbox) |
| Developer discoverability | ⚠️ Mostly external (SERP indexing) | Predictable URLs + llms.txt entries |

The key realisation: OAuth 2.0, scopes and the OpenAPI specs all already exist and are live. auth.rapidata.ai/.well-known/openid-configuration returns the issuer, client_credentials grant and scopes; api.rapidata.ai/<service>/openapi/v1.json serves public specs. orank reported them missing only because nothing on docs.rapidata.ai pointed to them.
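
As a rough illustration of what an agent can read from such a discovery document, the sketch below summarizes an already-parsed OpenID Connect discovery document. The field names (`issuer`, `token_endpoint`, `grant_types_supported`, `scopes_supported`) are the standard metadata names; the sample values are invented placeholders, not the actual response from auth.rapidata.ai.

```python
def summarize_discovery(doc: dict) -> dict:
    """Pull out the fields a machine client typically needs from OIDC discovery metadata."""
    grants = doc.get("grant_types_supported", [])
    return {
        "issuer": doc.get("issuer"),
        "token_endpoint": doc.get("token_endpoint"),
        "supports_client_credentials": "client_credentials" in grants,
        "scopes": sorted(doc.get("scopes_supported", [])),
    }


# Placeholder document for illustration only (hypothetical host and scopes).
SAMPLE_DISCOVERY = {
    "issuer": "https://auth.example.com",
    "token_endpoint": "https://auth.example.com/connect/token",
    "grant_types_supported": ["client_credentials"],
    "scopes_supported": ["openid", "example.read"],
}

summary = summarize_discovery(SAMPLE_DISCOVERY)
```

A docs-side link to the real discovery URL is what lets a crawler run this kind of check without guessing where the issuer lives.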

Changes

  • /openapi.json: scripts/build_public_openapi.py takes the combined spec the SDK is generated from and rewrites the internal rabbitdata.ch host → public rapidata.ai (2 occurrences; matches what api.rapidata.ai already serves). All 264 operations already declare OpenIdConnect scopes, so the published spec advertises OAuth 2.0 and scoped permissions machine-readably. Generated at deploy time so it tracks the repo's CI-updated spec.
  • docs/authentication.md — the real OAuth 2.0 / OIDC client-credentials flow, token endpoint, scopes table, SDK + curl examples, link to the live discovery doc. Added to nav and llms.txt.
  • /developers/ — a developer-portal landing page aggregating SDK, OpenAPI, auth, quickstart, API reference and the agent skill, with WebAPI JSON-LD.
  • llms.txt / landing page — now list the OpenAPI spec, auth and developer portal.

Deliberately NOT done (to avoid chasing points dishonestly)

  • No OAuth implementation. It already exists; this is documentation only.
  • No fake /.well-known/oauth-authorization-server on the docs origin. Per RFC 8414 that metadata must be served by the issuer to be valid; serving a mismatched copy from docs would mislead agents. We link the real openid-configuration on auth.rapidata.ai instead. (Optional backend follow-up: the identity service could expose the oauth-authorization-server alias next to the existing openid-configuration.)
  • No "sandbox". The portal links real resources only.

Verification

mkdocs build passes with the new page and nav; the generated per-version llms.txt now lists Authentication; build_public_openapi.py produces a valid 229-path spec with the correct host (0 rabbitdata refs) and OpenIdConnect scheme; YAML and all JSON-LD validated; script black-formatted. Root files take effect on the next Deploy Documentation run.
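
The spec checks listed above could be automated along these lines. This is a minimal sketch, not the verification actually run for this PR: the function name `check_public_spec` is hypothetical, and it only assumes the standard OpenAPI 3 layout (`paths`, `components.securitySchemes`, scheme `type` of `openIdConnect`).

```python
import json


def check_public_spec(spec: dict, internal_host: str = "rabbitdata.ch") -> list[str]:
    """Return a list of problems found in a generated public spec (empty means OK)."""
    problems = []
    # Any remaining mention of the internal host, anywhere in the document.
    if internal_host in json.dumps(spec):
        problems.append(f"spec still references {internal_host}")
    # At least one OpenID Connect security scheme must be declared.
    schemes = spec.get("components", {}).get("securitySchemes", {})
    if not any(s.get("type") == "openIdConnect" for s in schemes.values()):
        problems.append("no openIdConnect security scheme")
    if not spec.get("paths"):
        problems.append("spec has no paths")
    return problems
```

Running something like this in the deploy job would turn a regression (for example, a leaked internal host) into a failed build rather than a published file.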

🔗 Session: https://session-b4f9bfe9.poseidon.rapidata.internal/

Second pass on agent-readiness (orank 45/100). The remaining Access gaps were
all about resources that exist but aren't surfaced on docs.rapidata.ai:

- OpenAPI: the API already serves public per-service specs at
  api.rapidata.ai/<service>/openapi/v1.json. Publish the combined spec at the
  predictable /openapi.json, rewriting the internal rabbitdata.ch host to the
  public rapidata.ai one. Every operation already declares its OpenIdConnect
  scopes, so this also advertises OAuth 2.0 + scoped permissions machine-readably.
- Auth: add an Authentication page documenting the real OAuth 2.0 / OIDC
  client-credentials flow (auth.rapidata.ai), scopes and token endpoint, and
  link the live discovery document.
- Developer portal: add a /developers landing page aggregating the SDK,
  OpenAPI spec, auth, quickstart, API reference and agent skill.

Deliberately not done: OAuth 2.0 is already implemented (not a docs change), so
nothing new is "implemented" here — only documented. No fake
/.well-known/oauth-authorization-server is served from the docs origin, since
that metadata must be served by the issuer (auth.rapidata.ai) to be valid; we
link the real openid-configuration instead.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: lino <lino@rapidata.ai>
@claude

claude Bot commented Jun 26, 2026

Code Review

This PR improves docs.rapidata.ai's developer discoverability by publishing a combined OpenAPI spec, adding an authentication guide, and adding a developer portal landing page. The changes are well-scoped and the PR description is honest about what was and wasn't done. A few issues worth addressing:


1. Hardcoded `git add` list does not cover all files copied by `cp -R`: new `site_root/` files will silently go unpublished

File: `.github/workflows/deploy_doc.yml:80`

The recursive copy brings everything from `site_root/` into the worktree, but only the hardcoded filenames get staged. Any file added to `site_root/` in the future (e.g. `sitemap.xml`, or a new portal sub-page not under `developers/`) will be copied to the worktree but never committed, and the workflow will silently report "already up to date". Consider replacing the hardcoded `git add` with `git add .` inside the `ghpages` worktree, or at minimum add a check that no untracked files remain after the explicit `git add`.
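
One way to implement the "no untracked files remain" check is to parse `git status --porcelain`, where untracked entries are prefixed with `?? `. The sketch below is an assumption about how such a guard might look, not code from the workflow; the helper names are hypothetical, and paths with special characters (which git quotes in this format) are not handled.

```python
import subprocess


def untracked_paths(porcelain: str) -> list[str]:
    """Extract untracked paths from `git status --porcelain` (v1) output."""
    return [line[3:] for line in porcelain.splitlines() if line.startswith("?? ")]


def assert_nothing_left_behind(worktree: str) -> None:
    """Fail loudly if the worktree still contains files that were copied but not staged."""
    result = subprocess.run(
        ["git", "-C", worktree, "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    )
    leftover = untracked_paths(result.stdout)
    if leftover:
        raise SystemExit(f"copied but never staged: {leftover}")
```

Called right after the explicit `git add`, this would turn a silently unpublished file into a failed deploy.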


2. `rm -f ghpages/README.md` is never staged: `README.md` cannot be removed from `gh-pages` if it lands there

File: `.github/workflows/deploy_doc.yml:72-73`

`site_root/README.md` exists in the repo. The `rm -f` prevents it from being accidentally committed on this run (since `git add` does not list it), which is fine for the initial case. However, if `README.md` ever ends up tracked in `gh-pages` (e.g. via a manual commit), this approach cannot remove it: the deletion stays unstaged, `git diff --cached --quiet` passes, and `README.md` persists in the branch indefinitely. A safer pattern: after `cd ghpages`, run `git rm --cached README.md --ignore-unmatch` so the deletion is staged when the file is in the index.


3. Minified JSON output makes `openapi.json` git history unusable

File: `scripts/build_public_openapi.py:51`

`out.write_text(json.dumps(spec), encoding="utf-8")` has no `indent` argument, so every time the spec changes the entire `openapi.json` appears as one enormous line in `git diff`. Since this file is committed to `gh-pages` and the workflow checks `git diff --cached --quiet` to skip redundant commits, reviewable history matters here. Fix: `json.dumps(spec, indent=2) + "\n"`.
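
To make the difference concrete, the sketch below renders two versions of a tiny, made-up spec with `indent=2` and counts the changed lines in a unified diff. The helper names are illustrative only.

```python
import difflib
import json


def render(spec: dict) -> str:
    """Serialize with stable indentation and a trailing newline, as suggested above."""
    return json.dumps(spec, indent=2) + "\n"


def changed_lines(old: dict, new: dict) -> list[str]:
    """Return only the added/removed lines of a unified diff between two rendered specs."""
    diff = difflib.unified_diff(
        render(old).splitlines(), render(new).splitlines(), lineterm="", n=0
    )
    # Skip the '---' / '+++' file headers and the '@@' hunk markers.
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


old_spec = {"info": {"title": "Example", "version": "1"}, "paths": {}}
new_spec = {"info": {"title": "Example", "version": "2"}, "paths": {}}
delta = changed_lines(old_spec, new_spec)
```

With indentation, the version bump shows up as one removed and one added line; with the minified form, the whole document is a single line and any change replaces all of it.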


4. Comment and string replacement are misleading: the server URL replacement is dead work

File: `scripts/build_public_openapi.py:29-34`

The comment says "rewrite both the server and the OIDC discovery URL" but the `servers` list is unconditionally overwritten two lines later by `spec["servers"] = [...]`, so the string replace does no useful work for the server URL. The code is correct (the `openIdConnectUrl` replacement in the security scheme is necessary and works), but the comment is inaccurate and the server-URL pass in the string replace is dead. Either update the comment or make the OIDC-URL replacement explicit (target `spec["components"]["securitySchemes"][...]["openIdConnectUrl"]` directly) to match the style used for the server assignment.
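
An explicit rewrite might look like the following sketch. It is not the script's actual code: the function names are hypothetical, the `auth.rabbitdata.ch` URL in the usage example is an assumed shape for the internal value, and any userinfo component in the URL is dropped for simplicity.

```python
from urllib.parse import urlsplit, urlunsplit


def swap_host_suffix(url: str, internal: str = "rabbitdata.ch", public: str = "rapidata.ai") -> str:
    """Replace the internal domain suffix in a URL's hostname; leave other URLs untouched."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host == internal or host.endswith("." + internal):
        new_host = host[: len(host) - len(internal)] + public
        netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
        return urlunsplit(parts._replace(netloc=netloc))
    return url


def rewrite_oidc_urls(spec: dict) -> dict:
    """Rewrite only the openIdConnectUrl of OpenID Connect security schemes, in place."""
    schemes = spec.get("components", {}).get("securitySchemes", {})
    for scheme in schemes.values():
        if scheme.get("type") == "openIdConnect" and "openIdConnectUrl" in scheme:
            scheme["openIdConnectUrl"] = swap_host_suffix(scheme["openIdConnectUrl"])
    return spec
```

Because only the named field is touched, free text elsewhere in the spec that happens to mention the internal domain is left as-is.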


Minor notes:

  • The `build()` function reads from a module-level `SOURCE` constant, making it impossible to call with a fixture spec without monkey-patching. Accepting an optional `source: Path = SOURCE` parameter would cost nothing at runtime and make the function independently testable.
  • The `rabbitdata.ch` global string replace is safe in practice (only 2 occurrences in the current spec, both in URL fields), but worth noting for anyone who extends the script.
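
The first note could be sketched as below. The `SOURCE` path and the title value are placeholders invented for illustration, and the real transformations of `build()` are omitted; only the optional parameter is the point.

```python
import json
from pathlib import Path

# Hypothetical default location; the real constant lives in the script.
SOURCE = Path("openapi/combined.json")


def build(source: Path = SOURCE) -> dict:
    """Load the spec from `source` and apply the public-facing transformations."""
    spec = json.loads(source.read_text(encoding="utf-8"))
    # Placeholder transformation standing in for the script's real rewrites.
    spec.setdefault("info", {})["title"] = "Example public title"
    return spec
```

A test can then write a small fixture file and call `build(fixture_path)` directly, while the deploy job keeps calling `build()` with no arguments.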

The previous cp -R copied all of site_root/ (including the repo-only README.md)
then deleted the README afterwards. Switch to an explicit allowlist so repo-only
files can never leak onto the published site and no delete is needed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: lino <lino@rapidata.ai>
@claude

claude Bot commented Jun 26, 2026

Code Review

This PR publishes the combined OpenAPI spec at `/openapi.json`, adds proper OAuth 2.0 / OIDC authentication documentation, and adds a developer portal landing page. The intent and content are solid — a few concrete issues to address before merging.


Confirmed bugs

1. `cp -R site_root/developers ghpages/developers` nests the directory on every re-deployment
.github/workflows/deploy_doc.yml line 76

`cp -R src dst` when `dst` already exists copies `src` into `dst`, producing `ghpages/developers/developers/index.html` instead of overwriting `ghpages/developers/index.html`. On the second deployment, `git add developers` stages the wrong nested path, the top-level `index.html` is never updated, and the directory grows deeper on every subsequent run.

Fix — use the trailing-dot form to always merge-in-place:

mkdir -p ghpages/developers
cp -R site_root/developers/. ghpages/developers/

Or `rm -rf ghpages/developers` before the copy.
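
For comparison, if the copy step were ever moved into a Python script, the same merge-in-place behaviour is available through `shutil.copytree(..., dirs_exist_ok=True)` (Python 3.8+). The sketch below uses temporary directories that mirror the paths above; it is an illustration of the idempotent copy, not part of the workflow.

```python
import shutil
import tempfile
from pathlib import Path


def sync_dir(src: Path, dst: Path) -> None:
    """Copy the contents of src into dst, merging with whatever is already there."""
    shutil.copytree(src, dst, dirs_exist_ok=True)


def demo() -> list[str]:
    """Deploy twice and list what ends up under the destination directory."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp, "site_root", "developers")
        dst = Path(tmp, "ghpages", "developers")
        src.mkdir(parents=True)
        (src / "index.html").write_text("<h1>v1</h1>", encoding="utf-8")
        sync_dir(src, dst)  # first deployment: dst does not exist yet
        (src / "index.html").write_text("<h1>v2</h1>", encoding="utf-8")
        sync_dir(src, dst)  # second deployment: dst exists, must not nest
        return sorted(p.relative_to(dst).as_posix() for p in dst.rglob("*"))
```

After two runs the destination still contains a single top-level `index.html`, with no nested `developers/developers/` directory.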


2. `info.setdefault("description", …)` silently keeps the internal description
scripts/build_public_openapi.py lines 37–41

The source spec already has `info.description = "The API for the Rapidata Asset service"`. `setdefault` is a no-op when the key exists, so the public `openapi.json` will publish that internal-facing string instead of the intended public description. The `title` field is correctly overwritten with direct assignment — `description` should be too:

info["description"] = (
    "Public Rapidata API. Authentication uses OAuth 2.0 (OpenID Connect) — "
    "see https://docs.rapidata.ai/latest/authentication/."
)
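
The behavioural difference is easy to reproduce in isolation. In this sketch the `info` dict mimics the source spec described above, and `PUBLIC_DESCRIPTION` is a shortened stand-in for the intended text.

```python
INTERNAL_INFO = {"title": "internal", "description": "The API for the Rapidata Asset service"}
PUBLIC_DESCRIPTION = "Public Rapidata API."


def with_setdefault(info: dict) -> dict:
    """setdefault only writes when the key is missing, so an existing value survives."""
    result = dict(info)
    result.setdefault("description", PUBLIC_DESCRIPTION)
    return result


def with_assignment(info: dict) -> dict:
    """Direct assignment always replaces the value."""
    result = dict(info)
    result["description"] = PUBLIC_DESCRIPTION
    return result
```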

3. `"EntryPoint"` is the wrong JSON-LD property casing in both HTML files
site_root/developers/index.html line 29 · site_root/index.html line 59

JSON-LD is case-sensitive. The Schema.org property is `entryPoint` (lowercase-e); `"EntryPoint"` (uppercase) is treated as an unknown term and silently dropped by every structured-data parser and search-engine crawler. The type value inside (`"@type": "EntryPoint"`) is correct — only the outer key is wrong.

// wrong
"EntryPoint": { "@type": "EntryPoint", ... }

// correct
"entryPoint": { "@type": "EntryPoint", ... }

Same fix needed in both site_root/developers/index.html and site_root/index.html.
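
A cheap lint for this class of mistake is to flag JSON-LD keys that start with an uppercase letter, since Schema.org property names are lowerCamelCase while type names are UpperCamelCase. The sketch below is a heuristic under that assumption (custom `@context` terms could trigger false positives), and the sample document is invented, not copied from the HTML files.

```python
def suspicious_keys(node, path: str = "$") -> list[str]:
    """Return paths of non-keyword keys that start with an uppercase letter."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            # '@type', '@context' etc. are JSON-LD keywords, not properties.
            if not key.startswith("@") and key[:1].isupper():
                found.append(f"{path}.{key}")
            found.extend(suspicious_keys(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            found.extend(suspicious_keys(item, f"{path}[{index}]"))
    return found


# Hypothetical JSON-LD fragment reproducing the casing mistake.
SAMPLE_JSONLD = {
    "@context": "https://schema.org",
    "@type": "WebAPI",
    "name": "Example API",
    "EntryPoint": {"@type": "EntryPoint", "urlTemplate": "https://example.com/openapi.json"},
}
```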


Notable (worth a comment, not blocking)

4. Shell-history exposure of `client_secret` in the curl example
docs/authentication.md lines 39–43

The curl snippet passes the secret as a `-d` flag value — which is safe for the HTTP request itself (sent in the POST body, not the URL) but will persist the full command — including any substituted real secret — in the user's shell history. Worth adding a one-liner note:

# Tip: use an env var to avoid recording the secret in shell history:
# export RAPIDATA_CLIENT_SECRET=your_secret
# -d "client_secret=$RAPIDATA_CLIENT_SECRET"
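
The same idea applies when the token request is built from code: read the secret from the environment instead of embedding it. The sketch below only builds the form body for a standard OAuth 2.0 client-credentials request; the function name and the `client_id` value are hypothetical, and the token endpoint is deliberately left out because it should come from the discovery document.

```python
import os
from urllib.parse import urlencode


def token_request_body(client_id: str, env_var: str = "RAPIDATA_CLIENT_SECRET", environ=None) -> str:
    """Build the x-www-form-urlencoded body for a client-credentials token request."""
    env = os.environ if environ is None else environ
    secret = env.get(env_var)
    if not secret:
        raise RuntimeError(f"{env_var} is not set")
    return urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": secret,
    })
```

Note that the variable itself has to be populated without typing the secret into an interactive shell (for example from a secrets manager or a CI secret); otherwise the `export` command ends up in shell history as well.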

5. Fragile hostname rewrite via raw string replace
scripts/build_public_openapi.py line 31

Currently safe (exactly 2 occurrences of rabbitdata.ch, both hostnames), but a description or example added to the internal spec that mentions rabbitdata.ch for non-hostname reasons would be silently mutated. An explicit rewrite of only the two known locations (the server URL and the `openIdConnectUrl`) would be more robust and self-documenting.

6. The "Overview & core concepts" page lost its root-level entry point
site_root/developers/index.html / site_root/index.html

The `starting_page/` link was removed from the root `index.html`. The page is still reachable through the versioned nav and `llms.txt`, but the new `developers/index.html` — the intended canonical hub — has no link to it either. Adding it under "Guides & examples" would close the gap for non-JS crawlers and AI agents that index from HTML.

@LinoGiger LinoGiger merged commit 670e636 into main Jun 26, 2026
2 checks passed
@LinoGiger LinoGiger deleted the feat(docs)/publish-openapi-and-auth-docs branch June 26, 2026 14:20

2 participants