26 changes: 26 additions & 0 deletions .github/workflows/deploy_doc.yml
@@ -56,3 +56,29 @@ jobs:
          else
            uv run mike deploy --push ${{ github.event.inputs.version_alias }}
          fi
      - name: Publish site-root files (llms.txt, robots.txt, landing page)
        # mike serves every version under a subdirectory, so the generated
        # llms.txt/robots.txt live under /latest/ and are invisible at the site
        # root where crawlers and AI agents look. Mirror the curated root files
        # (and the generated llms-full.txt) into the gh-pages root after mike has
        # written the version. mike rewrites the root index.html as a bare
        # redirect on set-default, so this must run last to keep our richer one.
        if: hashFiles('site_root/**') != ''
        run: |
          set -euo pipefail
          git fetch origin gh-pages
          git worktree add ghpages gh-pages
          cp site_root/robots.txt ghpages/robots.txt
          cp site_root/llms.txt ghpages/llms.txt
          cp site_root/index.html ghpages/index.html
          if [ -f ghpages/latest/llms-full.txt ]; then
            cp ghpages/latest/llms-full.txt ghpages/llms-full.txt
          fi
          cd ghpages
          git add robots.txt llms.txt index.html
          # llms-full.txt is generated per build; git add fails on a missing
          # path, so stage it only when it exists.
          if [ -f llms-full.txt ]; then
            git add llms-full.txt
          fi
          if git diff --cached --quiet; then
            echo "Site-root files already up to date."
          else
            git commit -m "chore(docs): publish root llms.txt, robots.txt and landing page"
            git push origin gh-pages
          fi
20 changes: 20 additions & 0 deletions site_root/README.md
@@ -0,0 +1,20 @@
# `site_root/`

Files published to the **root** of the documentation site (`docs.rapidata.ai/`),
outside of `mike`'s per-version subdirectories.

`mike` serves each docs version under its own path (`/latest/`, `/3.x/`, …), so
files generated by the build — including the per-version `llms.txt` and
`llms-full.txt` — are only reachable under `/latest/…`. Crawlers and AI agents
look for `llms.txt` and `robots.txt` at the site root, so the `Deploy
Documentation` workflow copies these files into the gh-pages root after `mike`
has written the version (see `.github/workflows/deploy_doc.yml`).

| File | Purpose |
|------|---------|
| `robots.txt` | Allows all crawlers (AI crawlers listed explicitly) and points to the sitemap. |
| `llms.txt` | Curated [llms.txt](https://llmstxt.org/) index of the docs and how to integrate. |
| `index.html` | Root landing page: real content + structured data for crawlers, JS redirect to `/latest/` for humans. |
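
The `robots.txt` behaviour described in the table can be sanity-checked locally. The following is a minimal sketch, not part of the deploy workflow: it parses an abbreviated copy of the rules with the standard-library `urllib.robotparser` and asks whether a named crawler and an unnamed one may fetch a docs page. `load_rules` and the `ExampleBot` user agent are hypothetical names used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated copy of the rules in site_root/robots.txt (assumption: the
# remaining named crawlers follow the same "Allow: /" pattern).
ROBOTS_TXT = """\
User-agent: *
Allow: /

User-agent: GPTBot
Allow: /

Sitemap: https://docs.rapidata.ai/sitemap.xml
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string, without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)
docs_url = "https://docs.rapidata.ai/latest/quickstart/"

# A crawler with its own block, and a hypothetical one covered only by "*".
print(rules.can_fetch("GPTBot", docs_url))
print(rules.can_fetch("ExampleBot", docs_url))
# Sitemap URLs declared in the file (site_maps() requires Python 3.8+).
print(rules.site_maps())
```

Both lookups should report that fetching is allowed, since every block contains only `Allow: /`.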

`llms-full.txt` is **not** stored here — it is generated per build and copied to
the root from `/latest/llms-full.txt` by the workflow.
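
For reference, the copy logic can be sketched in Python. This is an illustrative equivalent of the mirroring described above, not the workflow's actual implementation (which is a shell step); `mirror_root_files`, `CURATED_FILES`, and the directory arguments are hypothetical names.

```python
import shutil
from pathlib import Path

# Curated files kept in site_root/ (names taken from the table above).
CURATED_FILES = ("robots.txt", "llms.txt", "index.html")


def mirror_root_files(site_root: Path, pages_root: Path) -> list[str]:
    """Copy curated files into the gh-pages root; return the names written."""
    written = []
    for name in CURATED_FILES:
        shutil.copyfile(site_root / name, pages_root / name)
        written.append(name)

    # llms-full.txt is generated per build, so mirror it only when the
    # build for /latest/ actually produced it.
    generated = pages_root / "latest" / "llms-full.txt"
    if generated.is_file():
        shutil.copyfile(generated, pages_root / "llms-full.txt")
        written.append("llms-full.txt")
    return written
```

Returning the list of written names lets a caller stage exactly those paths, which avoids referring to `llms-full.txt` when the build did not generate it.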
88 changes: 88 additions & 0 deletions site_root/index.html
@@ -0,0 +1,88 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Rapidata Python SDK — Documentation</title>
<meta name="description" content="Documentation for the Rapidata Python SDK: request human feedback at scale — crowd-sourced labeling, model evaluation, ranking, and preference data — directly from Python.">
<link rel="canonical" href="https://docs.rapidata.ai/latest/">

<meta property="og:type" content="website">
<meta property="og:site_name" content="Rapidata Documentation">
<meta property="og:title" content="Rapidata Python SDK — Documentation">
<meta property="og:description" content="Request human feedback at scale — labeling, model evaluation, ranking, and preference data — directly from Python.">
<meta property="og:url" content="https://docs.rapidata.ai/">
<meta name="twitter:card" content="summary">
<meta name="twitter:title" content="Rapidata Python SDK — Documentation">
<meta name="twitter:description" content="Request human feedback at scale, directly from Python.">

<script type="application/ld+json">
{
"@context": "https://schema.org",
"@graph": [
{
"@type": "Organization",
"@id": "https://www.rapidata.ai/#organization",
"name": "Rapidata",
"url": "https://www.rapidata.ai",
"logo": "https://docs.rapidata.ai/media/rapidata.svg",
"sameAs": [
"https://github.com/RapidataAI",
"https://pypi.org/project/rapidata/",
"https://www.linkedin.com/company/rapidata-ai"
]
},
{
"@type": "WebSite",
"@id": "https://docs.rapidata.ai/#website",
"name": "Rapidata Python SDK Documentation",
"url": "https://docs.rapidata.ai/",
"publisher": { "@id": "https://www.rapidata.ai/#organization" }
},
{
"@type": "SoftwareApplication",
"name": "Rapidata Python SDK",
"applicationCategory": "DeveloperApplication",
"operatingSystem": "OS Independent",
"url": "https://docs.rapidata.ai/latest/",
"downloadUrl": "https://pypi.org/project/rapidata/",
"softwareHelp": "https://docs.rapidata.ai/latest/quickstart/",
"publisher": { "@id": "https://www.rapidata.ai/#organization" },
"offers": { "@type": "Offer", "price": "0", "priceCurrency": "USD" }
}
]
}
</script>

<!-- Send human visitors to the default (latest) version. Crawlers and agents
that do not execute JavaScript still get the content and links below. -->
<script>
window.location.replace("latest/" + window.location.search + window.location.hash);
</script>
<noscript><meta http-equiv="refresh" content="0; url=latest/"></noscript>
</head>
<body>
<main>
<h1>Rapidata Python SDK</h1>
<p>
Rapidata provides human feedback at scale — crowd-sourced labeling, model
evaluation, ranking, and preference data collected from real people. The
supported way to access it programmatically is the Rapidata Python SDK,
documented here.
</p>
<p>Install with <code>pip install rapidata</code>, then authenticate with a token from
<a href="https://app.rapidata.ai/settings/tokens">app.rapidata.ai/settings/tokens</a>.</p>
<ul>
<li><a href="latest/">Documentation home</a></li>
<li><a href="latest/quickstart/">Quick Start</a></li>
<li><a href="latest/starting_page/">Overview &amp; core concepts</a></li>
<li><a href="latest/api/">API reference</a></li>
<li><a href="latest/ai_agents/">Use Rapidata from your AI agent</a></li>
<li><a href="llms.txt">llms.txt</a> · <a href="llms-full.txt">llms-full.txt</a></li>
<li><a href="https://github.com/RapidataAI/rapidata-python-sdk">Source on GitHub</a></li>
<li><a href="https://pypi.org/project/rapidata/">Package on PyPI</a></li>
</ul>
<p>Redirecting to the <a href="latest/">latest documentation</a>…</p>
</main>
</body>
</html>
54 changes: 54 additions & 0 deletions site_root/llms.txt
@@ -0,0 +1,54 @@
# Rapidata

> Rapidata provides human feedback at scale: crowd-sourced labeling, model
> evaluation, ranking, and preference data (RLHF-style) collected from real
> people. The supported way to access it programmatically is the Rapidata
> Python SDK, documented on this site (docs.rapidata.ai).

Integrate via the Python SDK (`pip install rapidata`). Authenticate either with
an interactive browser login on first run, or with a client ID/secret token
created at https://app.rapidata.ai/settings/tokens.

## Guides

- [Overview](https://docs.rapidata.ai/latest/starting_page/): what Rapidata does and its core concepts
- [Quick Start](https://docs.rapidata.ai/latest/quickstart/): install, authenticate, and create your first order
- [Custom Audiences](https://docs.rapidata.ai/latest/audiences/): target responses by country, language, and qualification
- [Signals](https://docs.rapidata.ai/latest/signals/)
- [Parameter Reference](https://docs.rapidata.ai/latest/job_definition_parameters/)
- [Understanding Results](https://docs.rapidata.ai/latest/understanding_the_results/)
- [Early Stopping](https://docs.rapidata.ai/latest/confidence_stopping/)
- [Instruction Design](https://docs.rapidata.ai/latest/human_prompting/)
- [Error Handling](https://docs.rapidata.ai/latest/error_handling/)
- [Logging & Config](https://docs.rapidata.ai/latest/config/)

## Examples

- [Classification](https://docs.rapidata.ai/latest/examples/classify_job/)
- [Comparison](https://docs.rapidata.ai/latest/examples/compare_job/)
- [Locate](https://docs.rapidata.ai/latest/examples/locate_job/)
- [Draw](https://docs.rapidata.ai/latest/examples/draw_job/)
- [Select Words](https://docs.rapidata.ai/latest/examples/select_words_job/)
- [Free Text](https://docs.rapidata.ai/latest/examples/free_text_job/)
- [Ranking](https://docs.rapidata.ai/latest/examples/ranking_job/)

## Model ranking & benchmarks

- [Getting Started](https://docs.rapidata.ai/latest/mri/)
- [Advanced](https://docs.rapidata.ai/latest/mri_advanced/)

## AI agents & API

- [Use Rapidata from your AI agent](https://docs.rapidata.ai/latest/ai_agents/): an official skill that teaches coding agents (Claude Code, Cursor, Copilot, and others) to write Rapidata integrations
- [API reference](https://docs.rapidata.ai/latest/api/): the `RapidataClient` class and its managers

## Access

- Install: `pip install rapidata`
- API tokens: https://app.rapidata.ai/settings/tokens
- Source: https://github.com/RapidataAI/rapidata-python-sdk
- PyPI: https://pypi.org/project/rapidata/

## Optional

- [llms-full.txt](https://docs.rapidata.ai/llms-full.txt): the full documentation concatenated into a single file
38 changes: 38 additions & 0 deletions site_root/robots.txt
@@ -0,0 +1,38 @@
# Rapidata SDK documentation — https://docs.rapidata.ai
# All crawlers, including AI/agent crawlers, are welcome.
User-agent: *
Allow: /

# Named AI/agent crawlers, listed explicitly so operators that only honour
# their own user-agent block still see an Allow.
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Claude-User
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Applebot-Extended
Allow: /

User-agent: CCBot
Allow: /

Sitemap: https://docs.rapidata.ai/sitemap.xml