antonbabenko · May 16, 2026
diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
@@ -33,6 +33,22 @@
         "terraform-ls"
       ],
       "version": "1.8.0"
+    },
+    {
+      "name": "code-intelligence",
+      "source": "./plugins/code-intelligence",
+      "description": "Use when navigating or refactoring code with a language server - choosing between semantic (LSP), exact-text (rg), and fuzzy/semantic search; anchoring LSP calls by position; gating degraded results; and disclosing tool substitutions, in any language.",
+      "category": "development",
+      "keywords": [
+        "lsp",
+        "code-intelligence",
+        "code-navigation",
+        "language-server",
+        "refactoring",
+        "search-precedence",
+        "tool-disclosure"
+      ],
+      "version": "0.1.0"
     }
   ]
 }
diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
@@ -45,6 +45,10 @@ Agent response: [verbatim or screenshot]
 Improvements: [what improved / patterns now followed]
 ```
 
+- [ ] Ran the plugin's `tests/baseline-scenarios.md` per its
+      `## Running These Tests` (plugin OFF then ON)
+- [ ] Every scenario meets its `### Success Criteria`; no scenario fails
+- [ ] Added/updated a scenario for any new or changed behavior
 - [ ] Agent references new content
 - [ ] Agent applies new patterns proactively
 - [ ] No new rationalizations introduced

diff --git a/.github/workflows/validate.yml b/.github/workflows/validate.yml
@@ -80,6 +80,50 @@ jobs:
           print(f"\n✅ {len(skills)} inline skill file(s) valid")
           EOF
 
+      - name: Validate inline plugin tests
+        run: |
+          python3 << 'EOF'
+          import sys, glob, os
+
+          skills = sorted(glob.glob('plugins/*/skills/*/SKILL.md'))
+          if not skills:
+              print("ℹ️  No inline plugins - nothing to test-gate.")
+              sys.exit(0)
+
+          errors = []
+          for skill in skills:
+              plugin_root = skill.split('/skills/')[0]   # plugins/<plugin>
+              name = plugin_root.split('/', 1)[1]
+              tf = os.path.join(plugin_root, 'tests', 'baseline-scenarios.md')
+              if not os.path.isfile(tf):
+                  errors.append(f"{name}: missing {tf} (regression scenarios "
+                                "are required for inline plugins)")
+                  continue
+              text = open(tf, encoding='utf-8').read()
+              n_scn = text.count('\n## Scenario ')
+              checks = {
+                  "a '## Scenario' section": n_scn >= 1,
+                  "a '## Running These Tests' section":
+                      '\n## Running These Tests' in text,
+                  "a '### Success Criteria' section":
+                      '\n### Success Criteria' in text,
+              }
+              missing = [why for why, ok in checks.items() if not ok]
+              if missing:
+                  errors.append(f"{name}: {tf} needs " + "; ".join(missing))
+              else:
+                  print(f"   ✅ {name}: {n_scn} scenario(s), run protocol present")
+
+          if errors:
+              print("\n".join(f"❌ {e}" for e in errors))
+              print("\nEvery inline plugin must ship "
+                    "tests/baseline-scenarios.md with at least one scenario, "
+                    "a '## Running These Tests' protocol, and "
+                    "'### Success Criteria'. See CONTRIBUTING.md > Testing.")
+              sys.exit(1)
+          print(f"\n✅ {len(skills)} inline plugin test suite(s) present")
+          EOF
+
       - name: Validate marketplace.json
         run: |
           python3 << 'EOF'
@@ -208,7 +252,6 @@ jobs:
             plugins/**/*.md
             README.md
             CONTRIBUTING.md
-        continue-on-error: true
 
       - name: Summary
         if: success()
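
The "Validate inline plugin tests" step above can be reproduced locally before pushing. What follows is a minimal sketch, not the workflow's own code: `missing_sections` and `SAMPLE` are hypothetical names, and the function covers only the three structural checks on a scenario file's text, not the file discovery. One labeled difference from the workflow: a newline is prepended so that a heading on the very first line of the file is also matched.

```python
def missing_sections(text: str) -> list[str]:
    """Return descriptions of the required sections that are absent from text."""
    # Assumption (differs from the workflow step): prepend a newline so a
    # heading on the first line of the file is matched as well.
    body = "\n" + text
    checks = {
        "a '## Scenario' section": body.count("\n## Scenario ") >= 1,
        "a '## Running These Tests' section": "\n## Running These Tests" in body,
        "a '### Success Criteria' section": "\n### Success Criteria" in body,
    }
    return [why for why, ok in checks.items() if not ok]


# Hypothetical scenario file content, shaped like the structure that
# CONTRIBUTING.md describes.
SAMPLE = """# Baseline Scenarios

## Running These Tests
Run each prompt with the plugin OFF, then ON.

## Scenario 1: rename a symbol
### Test Prompt
### Success Criteria
- [ ] Uses the server's rename provider
"""

if __name__ == "__main__":
    print(missing_sections(SAMPLE))
    print(missing_sections("# Baseline Scenarios\n"))
```

An empty list means the file would pass the structural gate. The gate checks only that the sections exist, not that the scenarios were actually run.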

diff --git a/.gitignore b/.gitignore
@@ -1,3 +1,5 @@
 .claude/settings.local.json
 docs/
 tmp-*/
+.DS_Store
+*.swp
diff --git a/.markdownlint.jsonc b/.markdownlint.jsonc
@@ -0,0 +1,15 @@
+{
+  // Repo-wide markdownlint config. markdownlint-cli2 (the CI action)
+  // auto-discovers this file.
+  "default": true,
+  // Prose and tables in this repo intentionally exceed 80 cols (decision
+  // tables, links, scenario text). Line length is not a useful signal here.
+  "MD013": false,
+  // Baseline scenario docs repeat sub-headings ("Test Prompt",
+  // "Success Criteria", ...) under different scenario parents. Allow
+  // duplicates when they are not siblings.
+  "MD024": { "siblings_only": true },
+  // Table pipe spacing/alignment is cosmetic and varies across the repo's
+  // tables; not a useful signal (newer markdownlint only).
+  "MD060": false
+}
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -76,6 +76,10 @@ newer version, bump `source.ref` and the mirrored `version` in the manifest.
 3. Add `plugins/<plugin>/CHANGELOG.md` (can be empty; CI prepends to it).
 4. The manifest `version` must equal the SKILL.md `metadata.version`. CI
    enforces this.
+5. Add `plugins/<plugin>/tests/baseline-scenarios.md` - **required**, CI
+   enforces it: at least one `## Scenario`, a `## Running These Tests`
+   protocol, and a `### Success Criteria` list. Copy the shape of
+   `plugins/code-intelligence/tests/baseline-scenarios.md`.
 
 ## Development Workflow
 
@@ -116,10 +120,15 @@ grep -oP '\[.*?\]\(references/.*?\.md.*?\)' SKILL.md references/*.md | \
 No automated suite. Manual flow:
 
 1. Edit a `SKILL.md` or `references/*.md` file.
-2. Reload the plugin in your agent host.
-3. Run real queries the skill targets.
-4. Confirm the agent applies the new patterns.
-5. Re-check that plugin's `tests/` for regressions (if it has them).
+2. Run that plugin's `tests/baseline-scenarios.md` per its
+   `## Running These Tests`: each prompt with the plugin OFF (baseline) then
+   ON (target).
+3. Every scenario must meet its `### Success Criteria` with no new
+   rationalizations; one failure blocks the change.
+4. Add or update a scenario whenever a PR adds or changes a behavior.
+5. Attach baseline and target transcripts to the PR (or keep them in `/tmp`);
+   never commit them under `plugins/`. Tests are required, not optional - CI
+   fails an inline plugin with no scenario file.
 
 ## Commit Conventions & Releases
 

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -49,6 +49,9 @@ standards, and the per-plugin release model before contributing.
 3. `plugins/<plugin>/CHANGELOG.md` (may be empty; CI prepends to it).
 4. The manifest `version` must equal the SKILL.md `metadata.version`. CI
    enforces this.
+5. `plugins/<plugin>/tests/baseline-scenarios.md` is **required** and
+   CI-enforced (see Testing). It must contain at least one `## Scenario ...`,
+   a `## Running These Tests` protocol, and a `### Success Criteria` list.
 
 See CLAUDE.md "SKILL.md Architecture" and the "LLM Consumption Rules" for
 content shape and token discipline.
@@ -71,22 +74,55 @@ the squash commit subject is what drives the release; set it deliberately.
 
 ## Testing
 
-This is documentation, not code. There is no build. Validate locally with the
-commands in [CLAUDE.md](CLAUDE.md#validation), then verify behavior:
+Tests are **required**, not optional. This is documentation, not code, so
+"tests" are behavioral regression scenarios run against a real agent host.
 
-1. Reload the plugin in your agent host.
-2. Run real queries the skill targets.
-3. Confirm the agent applies the new patterns and introduces no new
-   rationalizations.
+**Every inline plugin must ship `plugins/<plugin>/tests/baseline-scenarios.md`**
+with this structure (CI fails the PR if it is missing or incomplete):
 
-Content PRs must include baseline (without change) and improved (with change)
-agent transcripts in the PR template.
+```text
+# Baseline Scenarios
+<intro: compare WITHOUT vs WITH the skill>
+
+## Running These Tests
+<the WITHOUT -> WITH -> compare -> gate protocol>
+
+## Scenario 1: <name>
+### Test Prompt
+### Expected Baseline Behavior (WITHOUT skill)
+### Target Behavior (WITH skill)
+### Pressure Variations
+### Success Criteria        <- checkbox list, the pass/fail bar
+## Scenario 2: ...
+```
+
+`plugins/code-intelligence/tests/baseline-scenarios.md` is the canonical
+example - copy its shape.
+
+**Every content PR must:**
+
+1. First validate locally with the commands in
+   [CLAUDE.md](CLAUDE.md#validation).
+2. Run the scenarios per that file's `## Running These Tests`: capture each
+   prompt's output with the plugin OFF (baseline), then ON (target).
+3. Confirm every scenario meets its `### Success Criteria` and introduces no
+   new rationalizations. A single failing scenario blocks the PR.
+4. When a PR adds or changes a behavior, add or update a scenario so the
+   behavior stays covered.
+5. Paste the baseline and target transcripts into the PR template (or keep
+   them in `/tmp`) - never commit them under `plugins/`.
 
 ## CI
 
 `validate.yml` runs on every PR touching `plugins/**` or `.claude-plugin/**`:
-frontmatter, size, manifest validity, manifest <-> SKILL.md version sync,
-broken links, and markdown lint. Fix failures before requesting review.
+frontmatter, size, **inline plugin tests present** (baseline-scenarios.md with
+scenarios + run protocol + success criteria), manifest validity, manifest <->
+SKILL.md version sync, broken links, and markdown lint.
+
+The **Validate Skill Files** check is a **required status check** on `master`
+(branch protection): a PR cannot be merged while it is failing. Every check
+above is blocking - markdown lint included (no `continue-on-error`). Fix all
+failures; do not request review or merge with a red check.
 
 ## Reporting Issues
 

diff --git a/README.md b/README.md
@@ -13,6 +13,31 @@ Plugins are either **external** (referenced from their own repo) or **inline**
 | Plugin | Type | Description |
 |--------|------|-------------|
 | [terraform-skill](https://github.com/antonbabenko/terraform-skill) | external | Writing, reviewing, and debugging Terraform/OpenTofu modules, tests, CI, scans, and state ops. Pinned via `source.ref`. |
+| [code-intelligence](plugins/code-intelligence/skills/code-intelligence/SKILL.md) | inline | Language-agnostic code navigation discipline: when to use a language server vs exact-text vs fuzzy search, position-anchored LSP calls, a degradation gate, and first-line tool-substitution disclosure. |
+
+## Why these plugins
+
+These are not prose guides - they are executable discipline the agent loads on
+demand and applies while it works.
+
+- **Fewer wrong tools, fewer silent failures.** `code-intelligence` stops the
+  common failure modes directly: blind text-replace renames, accepting "the
+  tool is broken" without proof, presenting a keyword grep as a complete
+  answer. `terraform-skill` routes a request to its actual failure mode
+  (identity churn, secret exposure, blast radius, state corruption) before
+  generating code.
+- **Honest by construction.** Any tool substitution or skipped step is stated
+  on the first line of the response, not buried later - so you can trust what
+  the agent says it did.
+- **Token-lean.** Progressive disclosure: a short `SKILL.md` entry point routes
+  to reference files that load only when the task needs them. The agent does
+  not carry the whole guide in context.
+- **Portable.** One discipline across Claude Code, Cursor, Copilot, Gemini CLI,
+  OpenCode, and Codex - no per-host retraining.
+- **Composable and pinned.** Generic skills (`code-intelligence`) provide the
+  base discipline; domain skills (`terraform-skill`) extend it. Each plugin is
+  versioned and released independently, so an upgrade to one never moves
+  another.
 
 ## Installation
 
@@ -33,8 +58,10 @@ directory, for example:
 
 ```bash
 git clone https://github.com/antonbabenko/agent-plugins.git
-# Claude Code (manual): symlink a plugin into ~/.claude/plugins
-ln -s "$(pwd)/agent-plugins/plugins/terraform-skill" ~/.claude/plugins/terraform-skill
+# Inline plugins live under plugins/<name>/ - symlink one into ~/.claude/plugins:
+ln -s "$(pwd)/agent-plugins/plugins/code-intelligence" ~/.claude/plugins/code-intelligence
+# External plugins (e.g. terraform-skill) are not in this repo - install them
+# from their own repo / marketplace ref instead.
 ```
 
 For per-host instructions (Cursor, Copilot, Gemini CLI, OpenCode, Codex,

diff --git a/plugins/code-intelligence/CHANGELOG.md b/plugins/code-intelligence/CHANGELOG.md
@@ -0,0 +1,9 @@
+# Changelog
+
+All notable changes to the `code-intelligence` plugin are documented here.
+This file is managed by the per-plugin release pipeline; entries are prepended
+on release.
+
+## [Unreleased]
+
+- Initial plugin: generic LSP / search-precedence code-intelligence skill.
diff --git a/plugins/code-intelligence/skills/code-intelligence/SKILL.md b/plugins/code-intelligence/skills/code-intelligence/SKILL.md
@@ -0,0 +1,85 @@
+---
+name: code-intelligence
+description: Use when navigating or refactoring code with a language server - choosing between semantic (LSP), exact-text (rg), and fuzzy/semantic search; anchoring LSP calls by position; gating degraded results; and disclosing tool substitutions, in any language.
+license: Apache-2.0
+metadata:
+  author: Anton Babenko
+  version: 0.1.0
+---
+
+# Code Intelligence
+
+Pick the search tool by task, not by habit. Generic and language-agnostic;
+domain skills extend it with server capability matrices and ecosystem
+prerequisites. It is model-triggered guidance, not enforcement.
+
+## Tool Precedence
+
+| Goal | Use | Tradeoff |
+|------|-----|----------|
+| Symbol relationships: definition, references, call sites, rename safety | Language server (LSP) at a position | Needs a running server + indexed workspace |
+| Exact text, known name, exhaustive enumeration, config/value files | `rg` then Read | No semantic scope; matches strings in comments too |
+| Conceptual / fuzzy / "where might this live" / cross-repo discovery | A semantic/neural search tool, if the host provides one | Not exact; never use for counts or completeness claims |
+
+Detail: [Precedence Table](references/tool-precedence.md#precedence-table),
+[When LSP Is Wrong](references/tool-precedence.md#when-lsp-is-wrong).
+
+## Calling the LSP
+
+- DO call at a position (`file:line:character`). Anchor the position with a
+  text search for a known occurrence first.
+- DON'T pass a bare symbol name and expect resolution. A name-only call that
+  returns empty is a usage defect, not server failure.
+- DO Read the returned locations for source text; LSP returns locations and
+  symbols, not the lines.
+- DO retry once on a cold start: the first call after launch may return empty
+  while the server indexes.
+- DO prefer the server's own operation when it advertises it: use `rename` /
+  `prepareRename` for renames and call hierarchy for callers - they carry
+  language-specific semantics a manual pass misses.
+- DON'T report an unsupported operation as a finding. When the server lacks
+  one, redirect: `findReferences` (then filter to call sites) instead of call
+  hierarchy; enumerate references then hand-edit instead of a rename provider.
+
+Detail: [Position Anchoring](references/lsp-calls.md#position-anchoring),
+[Unsupported Operations](references/lsp-calls.md#unsupported-operations).
+
+## Degradation Gate
+
+Two distinct cases:
+
+- **No LSP at all** (host exposes no language-server tool, or the server fails
+  to start): that IS unavailability. Disclose it on the first line (see below)
+  and use text search. The gate does not apply - there is nothing to gate.
+- **LSP callable but a position-anchored call returns empty:** do NOT conclude
+  "unavailable" yet. Conclude it only when ALL three hold:
+  1. `documentSymbol` on an in-scope file returns symbols -> server responsive
+     (responsiveness only, NOT proof of complete reference coverage).
+  2. The failing call was position-anchored (not symbol-name-only).
+  3. That anchored call still returned empty after a cold-start retry.
+
+Only after all three checks pass is a disclosed text fallback warranted.
+
+Detail: [Degradation Gate](references/degradation-and-disclosure.md#degradation-gate).
+
+## Disclose Substitutions
+
+State any tool substitution OR omission on the FIRST line of the response, not
+in a later summary (post-hoc accounting is a rule violation):
+
+`Intended: <tool>. Actual: <tool>. Reason: <why>. Impact: <completeness/confidence>.`
+
+Detail: [Disclosure Format](references/degradation-and-disclosure.md#disclosure-format).
+
+## Do Not Invent a Missing Tool
+
+Before claiming a tool (e.g. `rg`) is shimmed, aliased, or absent, prove it:
+run `type -a <tool>`, `ls -l` the resolved path, and confirm `<tool> --version`
+shows the expected banner. An unproven "tool is missing" claim followed by a
+fallback is a verification failure, not a sanctioned substitution.
+
+If genuinely absent or aliased: prefer the LSP for semantic tasks; for exact
+text use the host-approved text search; `git grep` / `grep` only as an
+explicitly disclosed last resort, never the default substitute.
+
+Detail: [Anti-Phantom-Shim Proof](references/degradation-and-disclosure.md#anti-phantom-shim-proof).
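
The degradation gate and the first-line disclosure in the SKILL.md above amount to a three-way conjunction followed by a fixed template. The sketch below is one possible reading of them, not part of the plugin: `GateEvidence`, `fallback_warranted`, and `disclosure_line` are hypothetical names. It models only the "LSP callable but the call returned empty" case, since the "no LSP at all" case skips the gate.

```python
from dataclasses import dataclass


@dataclass
class GateEvidence:
    """Observations collected before a text fallback (hypothetical shape)."""
    document_symbol_returned_symbols: bool  # check 1: the server is responsive
    call_was_position_anchored: bool        # check 2: file:line:character, not a bare name
    empty_after_cold_start_retry: bool      # check 3: still empty on the retry


def fallback_warranted(ev: GateEvidence) -> bool:
    """Return True only when all three gate checks hold."""
    return (
        ev.document_symbol_returned_symbols
        and ev.call_was_position_anchored
        and ev.empty_after_cold_start_retry
    )


def disclosure_line(intended: str, actual: str, reason: str, impact: str) -> str:
    """Format the disclosure using the template given in the skill."""
    return f"Intended: {intended}. Actual: {actual}. Reason: {reason}. Impact: {impact}."


if __name__ == "__main__":
    ev = GateEvidence(True, True, True)
    if fallback_warranted(ev):
        print(disclosure_line(
            "LSP findReferences",
            "rg",
            "anchored call empty after retry",
            "text matches only",
        ))
```

A name-only call fails check 2, so `fallback_warranted` returns False for it. That matches the rule that an empty result from a bare symbol name is a usage defect, not a server failure.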