Adopt Content Signals Policy in robots.txt #1653

Open
Aliserag wants to merge 1 commit into main from seo-geo-robots-content-signals-2026-04-23

@Aliserag
Contributor

Summary

Adds an explicit Content Signals Policy declaration to static/robots.txt:

  • search=yes — appear in traditional search indexes
  • ai-input=yes — used in live AI retrieval / citation (ChatGPT, Claude, Perplexity, etc.)
  • ai-train=yes — used to train or fine-tune AI models

The Content Signals Policy, which Cloudflare has rolled out to 3.8M+ domains since Sept 2025, is an emerging mechanism for expressing AI policy in robots.txt without maintaining per-user-agent allowlists. Flow developer docs are open source and we want maximum discoverability for both humans and AI agents, so all three signals are set to yes.
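To make the shape of the declaration concrete, here is a minimal Python sketch that splits one Content-Signal line into its three signals. The helper name `parse_content_signal` is hypothetical and is not part of this PR or of any library; it only assumes the comma-separated `key=value` form shown in this description.

```python
def parse_content_signal(line: str) -> dict[str, str]:
    """Parse one 'Content-Signal: k=v, k=v' line into a {signal: value} dict.

    Hypothetical helper for illustration; assumes the comma-separated
    key=value layout used in this PR.
    """
    name, _, value = line.partition(":")
    if name.strip().lower() != "content-signal":
        raise ValueError(f"not a Content-Signal line: {line!r}")

    signals: dict[str, str] = {}
    for pair in value.split(","):
        key, sep, val = pair.partition("=")
        if not sep:
            continue  # skip fragments that are not key=value
        signals[key.strip().lower()] = val.strip().lower()
    return signals


signals = parse_content_signal(
    "Content-Signal: search=yes, ai-input=yes, ai-train=yes"
)
for key, val in signals.items():
    print(f"{key} -> {val}")
```

Lower-casing keys and values is a choice made for this sketch so that comparisons stay simple; it is not a statement about how crawlers normalize the line.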

What this is NOT

  • Not a verbose AI-bot allowlist. Every major dev-docs site we audited (Mintlify, Supabase, Vercel, Anthropic, OpenAI, Stripe, Python docs, Cloudflare, developers.google.com) uses a wildcard rule plus targeted disallows rather than enumerating bots. Per-bot allowlists drift quickly (e.g., Anthropic deprecated anthropic-ai in favor of Claude-User / Claude-SearchBot).
  • Not a functional change to who can crawl. The current file already allows everything via User-agent: * with an empty Disallow:. This PR only adds the explicit AI policy declaration.
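The "no functional change" point can be checked with Python's standard `urllib.robotparser`, which ignores directives it does not recognize. The sketch below is an illustration under assumptions: the sample file is a guess at the layout (the exact contents and line order of static/robots.txt are not shown in this PR), and the example.com URL is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# Assumed layout for illustration only; the real static/robots.txt may
# order or comment these lines differently.
ROBOTS_TXT = """\
User-agent: *
Content-Signal: search=yes, ai-input=yes, ai-train=yes
Disallow:
"""


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The wildcard group with an empty Disallow applies to every agent, and the
# Content-Signal line does not alter crawl permissions for this parser.
for agent in ("Googlebot", "Claude-User", "Claude-SearchBot"):
    print(agent, allowed(ROBOTS_TXT, agent, "https://example.com/docs/page"))
```

This only shows how one standard-library parser treats the file; individual crawlers may handle the extra line differently.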

Test plan

  • The file already serves locally and on prod (this change is a minor metadata/comment addition)
  • Verify Vercel preview serves the updated file before merge
  • Spot-check: curl https://<preview>/robots.txt returns the new content
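The spot-check could also be scripted. The following is a rough sketch, not part of this PR: `has_expected_signal` and `check_url` are hypothetical names, the expected line is taken from the declaration this PR adds, and the preview host is left as the same placeholder used above.

```python
import urllib.request

# The declaration this PR adds, used as the comparison target.
EXPECTED = "Content-Signal: search=yes, ai-input=yes, ai-train=yes"


def _normalize(text: str) -> str:
    """Lower-case and drop all whitespace so spacing differences do not matter."""
    return "".join(text.lower().split())


def has_expected_signal(robots_txt: str) -> bool:
    """Return True if any line (ignoring trailing '#' comments) matches EXPECTED."""
    target = _normalize(EXPECTED)
    return any(
        _normalize(line.split("#", 1)[0]) == target
        for line in robots_txt.splitlines()
    )


def check_url(url: str) -> bool:
    """Fetch a robots.txt URL and check it for the expected declaration."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return has_expected_signal(response.read().decode("utf-8"))
```

Calling `check_url("https://<preview>/robots.txt")` with the real preview host substituted would cover the last item in the test plan; the fetch is kept in its own function so the matching logic can be exercised without network access.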

Companion work

A parallel PR on onflow/cadence-lang.org (#292) applies the same Content Signals policy and adds llms.txt generation. These two sites should stay in lockstep on AI policy.

Adds an explicit Content Signals declaration (contentsignals.org,
2025) to the existing minimal wildcard file:

  Content-Signal: search=yes, ai-input=yes, ai-train=yes

The Content Signals Policy — rolled out by Cloudflare to 3.8M+
domains since Sept 2025 — is the emerging post-user-agent-allowlist
mechanism for expressing AI policy in robots.txt. Three signals:

  search      — appear in traditional search indexes
  ai-input    — used in live retrieval / AI answer citations
                (ChatGPT, Claude, Perplexity, etc.)
  ai-train    — used to train or fine-tune AI models

Flow developer docs are open source and we want maximum
discoverability for humans and AI agents, so all three are set to
yes. Rest of the file unchanged: still a wildcard allow-all (matches
the pattern used by Mintlify, Supabase, Vercel, Anthropic, OpenAI,
Stripe, Python docs, and developers.google.com).

Also documents the companion /llms.txt and /llms-full.txt AI-indexed
documentation paths plus the cadence-lang.org companion index.
@vercel

vercel Bot commented Apr 23, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Ready | Preview, Comment | Apr 23, 2026 10:23pm |
