feat(scanner): enable SkillSpector LLM semantic pass (Anthropic Sonnet)#10

Open
DevelopmentCats wants to merge 5 commits into cat/step-1-scanner-pipeline from cat/scanner-llm-mode

@DevelopmentCats DevelopmentCats commented Jun 22, 2026

Collaborator

What this does

Flips the scheduled scan from --no-llm to SkillSpector's two-stage analyser (static rules + LLM semantic pass). Upstream's published precision rises from ~70% to ~87% because the LLM pass uses surrounding context to filter out false positives from the static rules.

Provider is Anthropic (claude-sonnet-4-6). Sonnet is roughly 5x cheaper than the upstream default Opus and is well matched to the finding-classification work the LLM pass actually does.

Stack

Stacked on PR #1. Merge that first, then this can target main.

Setup before merge

  1. Add ANTHROPIC_API_KEY as a repo secret (Settings > Secrets and variables > Actions).
  2. Apply the workflow diff to .github/workflows/scan.yaml (see collapsible below). The Coder Agents app on this repo doesn't have workflows: write, so the bot couldn't commit that file. Either grant the permission and ping me to re-push, or paste the diff yourself.
  3. After merge: Actions > scan > Run workflow.
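
Steps 1 and 3 can also be done from a terminal. The following is a minimal sketch, assuming the GitHub CLI (`gh`) is installed and authenticated against this repo, and that the workflow file is named scan.yaml as in step 2. The `run` helper and the `DRY_RUN` variable are illustrative names, not part of this PR; by default the script only prints the commands.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the command (DRY_RUN=1, the default) or execute it (DRY_RUN=0).
# Helper name and variable are hypothetical, used only for this sketch.
run() {
  if [[ "${DRY_RUN:-1}" == "1" ]]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Step 1: store the key as an Actions repository secret.
# With no value supplied, gh prompts for the secret interactively.
run gh secret set ANTHROPIC_API_KEY

# Step 3: trigger the scan workflow manually after merge.
run gh workflow run scan.yaml
```

Step 2 (applying the workflow diff) still has to be done by hand or by granting the app workflows: write.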

Graceful fallback

If the secret isn't set, the workflow emits a warning and runs --no-llm. A fresh fork keeps producing valid (lower-precision) scans without any setup.
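
The fallback can be checked locally without running Actions. This sketch mirrors the branch in the "Determine LLM mode" step from the workflow diff further down and points `GITHUB_OUTPUT` at a temporary file. The function names `determine_llm_mode` and `simulate_step` are made up for illustration. An unset secret reaches the step as an empty string, which is what the first call simulates.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Same branch as the workflow step: record extra_flags in the file named by
# GITHUB_OUTPUT, the mechanism GitHub Actions uses for step outputs.
determine_llm_mode() {
  if [[ -n "${ANTHROPIC_API_KEY:-}" ]]; then
    echo "extra_flags=" >> "$GITHUB_OUTPUT"
  else
    echo "extra_flags=--no-llm" >> "$GITHUB_OUTPUT"
  fi
}

# Run the step logic in a subshell against a throwaway output file and print
# the line it recorded. $1 is the key value to simulate (may be empty).
simulate_step() {
  local out
  out="$(mktemp)"
  ( export GITHUB_OUTPUT="$out" ANTHROPIC_API_KEY="$1"; determine_llm_mode )
  cat "$out"
  rm -f "$out"
}

simulate_step ""           # secret unset: records extra_flags=--no-llm
simulate_step "dummy-key"  # secret set: records an empty extra_flags
```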

Expected impact on the five in-tree skills

| Skill | Static-only today | With LLM on |
|---|---|---|
| coder/coder-modules | clean (0) | clean |
| coder/coder-templates | clean (0) | clean |
| coder/modules | clean (0) | clean |
| coder/templates | clean (10) | clean |
| coder/setup | malicious (100) | still high, findings list shrinks |

Bringing coder/setup below suspicious needs the Phase 3 permissions-manifest layer (different PR). This change is the precision prerequisite, not the false-positive fix.

Workflow file diff (to paste into .github/workflows/scan.yaml)

Three edits inside the scan job:

1. Add new step after Verify skill path exists:

      - name: Determine LLM mode
        id: llm
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          set -euo pipefail
          if [[ -n "${ANTHROPIC_API_KEY:-}" ]]; then
            echo "extra_flags=" >> "$GITHUB_OUTPUT"
            echo "SkillSpector LLM mode: enabled (anthropic provider)." >&2
          else
            echo "extra_flags=--no-llm" >> "$GITHUB_OUTPUT"
            echo "::warning::ANTHROPIC_API_KEY secret not set; SkillSpector will run with --no-llm. See README \"One-time setup\" to enable the LLM semantic pass."
          fi

2. Replace the SkillSpector (JSON) step:

      - name: SkillSpector (JSON)
        if: steps.path_check.outputs.drift == 'false'
        continue-on-error: true
        env:
          SKILLSPECTOR_PROVIDER: anthropic
          SKILLSPECTOR_MODEL: claude-sonnet-4-6
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          mkdir -p out
          skillspector scan "source/${{ matrix.skill_path }}" \
            ${{ steps.llm.outputs.extra_flags }} \
            --format json \
            --output "out/skillspector.json" || true

3. Same swap for SkillSpector (SARIF):

      - name: SkillSpector (SARIF)
        if: steps.path_check.outputs.drift == 'false'
        continue-on-error: true
        env:
          SKILLSPECTOR_PROVIDER: anthropic
          SKILLSPECTOR_MODEL: claude-sonnet-4-6
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          mkdir -p out
          skillspector scan "source/${{ matrix.skill_path }}" \
            ${{ steps.llm.outputs.extra_flags }} \
            --format sarif \
            --output "out/skillspector.sarif" || true
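
Because the `${{ ... }}` expression is substituted before the shell runs, an empty extra_flags leaves only a line continuation and adds no argument. The sketch below shows the two resulting command lines. It only prints them, so SkillSpector does not need to be installed. `build_scan_command` is a hypothetical helper, and `coder/setup` stands in for whatever `matrix.skill_path` holds.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the skillspector command line a scan step would run.
# $1: skill path, $2: output format (json or sarif), $3: extra flags
# (empty or "--no-llm", as produced by the "Determine LLM mode" step).
build_scan_command() {
  local skill_path="$1" format="$2" extra_flags="$3"
  local -a cmd=(skillspector scan "source/${skill_path}")
  # Append the flag only when non-empty, so an empty value adds no argument.
  if [[ -n "$extra_flags" ]]; then
    cmd+=("$extra_flags")
  fi
  cmd+=(--format "$format" --output "out/skillspector.${format}")
  printf '%s\n' "${cmd[*]}"
}

build_scan_command "coder/setup" json ""
build_scan_command "coder/setup" sarif "--no-llm"
```

The first call prints a command with no `--no-llm` (LLM pass on); the second prints the static-only form.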

Decision log

  • Anthropic over NVIDIA Build: Coder already has the billing relationship and the credential is one secret away. No second vendor signup.
  • Sonnet over Opus: SkillSpector's LLM pass is per-finding intent classification, not long-form reasoning. Sonnet at ~5x lower cost is the better cost/quality choice for periodic scans.
  • Hardcoded SKILLSPECTOR_PROVIDER in workflow: simpler than reading from config.yaml at runtime. If we ever add a second provider, it is a one-line change in each of two places (the JSON step and the SARIF step).
  • Graceful --no-llm fallback: a fresh fork should produce something useful immediately, not 404 the publish pipeline because the operator hasn't added the secret yet.
  • No CI change: ci.yaml uses inline pytest fixtures and never invokes skillspector live, so no inference cost on PR review.
  • No schema or verdict-math change: LLM mode shifts which findings reach the verdict, not how the verdict is computed. CALIBRATION.md walks through this explicitly.
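
For a rough sense of scale, the scan job calls `skillspector scan` twice per skill (once for JSON, once for SARIF). The sketch below counts LLM-enabled invocations per day for the five in-tree skills at four scheduled scans a day, the cadence mentioned in the provider-swap commit. It assumes each invocation runs the LLM pass independently, which this PR does not confirm. The per-invocation cost is a placeholder argument, not a measured figure.

```shell
#!/usr/bin/env bash
set -euo pipefail

# LLM-enabled skillspector invocations per day:
# skills x scheduled scans per day x output formats per scan.
daily_invocations() {
  local skills="$1" scans_per_day="$2" formats="$3"
  echo $(( skills * scans_per_day * formats ))
}

# Daily cost in cents for a given per-invocation cost in cents.
# The per-invocation cost is a placeholder; the PR does not state one.
daily_cost_cents() {
  local invocations="$1" cents_per_invocation="$2"
  echo $(( invocations * cents_per_invocation ))
}

n="$(daily_invocations 5 4 2)"
echo "invocations/day: $n"
echo "cents/day at a placeholder 1 cent each: $(daily_cost_cents "$n" 1)"
```

Whatever the real per-invocation figure turns out to be, the ~5x Sonnet/Opus ratio scales the daily total by the same factor.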

This PR was prepared with help from Coder Agents.

Document the new llm.provider config knob (default nv_build) and the
workflow contract: empty flags + workflow appends --no-llm dynamically
when the matching credential secret is unset. Removes --no-llm from
the static flags list now that the workflow drives it.

This commit was prepared with help from Coder Agents.

Document what flipping LLM mode on does (and does not do) to the
verdict math, the precision delta we expect, and what to expect for
the five in-tree skills. Adds "LLM provider changes" to the
"When to revisit" list.

This commit was prepared with help from Coder Agents.

Update step 3 of the architecture summary to reflect that the
scheduled scan now runs SkillSpector with the LLM semantic pass on
by default. Add a new "One-time setup on the repo" section that
lists the three repo-level configurations needed for a useful scan,
including the new LLM credential secret. Mirror the LLM secret note
into "Forking for your own catalogue".

This commit was prepared with help from Coder Agents.

Copilot AI review requested due to automatic review settings June 22, 2026 19:55

Copilot AI left a comment

Pull request overview

Updates the scanner configuration and documentation to support running NVIDIA SkillSpector with an optional LLM semantic pass (with a fallback to static-only mode), as part of the scheduled scan pipeline described in this repo.

Changes:

  • Extend config.yaml with a scanners.skillspector.llm configuration block and make scanners.skillspector.flags empty by default.
  • Document the LLM semantic pass behavior, expected precision delta, and repo one-time setup steps in README.md and docs/CALIBRATION.md.
  • Shift responsibility for driving --no-llm to the scheduled workflow (though the workflow change itself is not included in this PR’s diff).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

| File | Description |
|---|---|
| README.md | Updates architecture and adds one-time setup guidance, including LLM credential setup. |
| docs/CALIBRATION.md | Adds an “LLM semantic pass” section explaining expected effects and limitations. |
| config.yaml | Adds scanners.skillspector.llm block and removes the default --no-llm flag from config-driven flags. |

Comment thread README.md
Comment on lines +11 to +14
3. Runs [NVIDIA SkillSpector](https://github.com/NVIDIA/SkillSpector) over
the upstream content. The scheduled scan uses LLM semantic analysis
when the credential secret is configured, and falls back to
`--no-llm` static-only mode otherwise.
Comment thread docs/CALIBRATION.md
Comment on lines +117 to +121
The scheduled scan runs LLM mode when the workflow's chosen credential
secret (`NVIDIA_INFERENCE_KEY` for the default `nv_build` provider) is
configured. The fallback to `--no-llm` is automatic when the secret is
missing, so an unset secret on a fresh fork degrades the scan rather
than breaking it.
Comment thread config.yaml
Comment on lines +42 to +46
# Extra CLI flags passed to every SkillSpector invocation. Empty by
# default; the scan workflow appends --no-llm dynamically when the
# LLM credential secret is not set (see llm: block below). CI runs
# do not invoke SkillSpector live.
flags: []
Comment thread config.yaml
Comment on lines +53 to +57
# The scheduled scan reads the credential matching the provider
# below from a repository secret. When the secret is configured,
# LLM mode is on. When the secret is missing, the workflow falls
# back to --no-llm automatically so a fresh fork is never broken
# by an unset secret.
Comment thread README.md Outdated
Comment on lines +74 to +78
3. **Settings > Secrets and variables > Actions**: add the LLM
credential matching the provider in `config.yaml`'s
`scanners.skillspector.llm.provider`. For the default `nv_build`
provider this is `NVIDIA_INFERENCE_KEY` (sign up free at
[build.nvidia.com](https://build.nvidia.com)). Without the secret
Comment thread README.md Outdated
Comment on lines +79 to +83
the scan still runs, but SkillSpector falls back to
`--no-llm` static-only mode and precision drops from roughly 87%
to roughly 70%. See `docs/CALIBRATION.md` for the precision
discussion. The optional `SLACK_WEBHOOK_URL` secret enables the
`notify-slack-on-failure` job; without it that job is a no-op.
Comment thread README.md
Comment on lines +122 to +125
5. Add the LLM credential secret matching your chosen provider
(see "One-time setup on the repo" above). Optional; static-only
mode works without it.
6. Enable Actions.

Swap the default LLM provider from nv_build (free NVIDIA Build) to
anthropic with model pinned to claude-sonnet-4-6. Rationale:

- Removes the second-vendor signup. The Coder org already has an
  Anthropic billing relationship, so the credential is one secret
  away from working.
- Sonnet 4.6 is roughly 5x cheaper than the anthropic default
  (Opus 4.6) and is well matched to SkillSpector's LLM pass, which
  is finding-by-finding intent classification rather than long-form
  reasoning. Cost ballpark for 5 skills x 4 scans/day is small.
- The other provider options (anthropic_proxy via Vertex, openai
  via any OpenAI-compatible gateway, nv_build) stay documented in
  the config comments and are still a one-line swap.

This commit was prepared with help from Coder Agents.

Follow-up to the provider swap. The one-time-setup section now points
at console.anthropic.com and ANTHROPIC_API_KEY instead of
build.nvidia.com / NVIDIA_INFERENCE_KEY.

This commit was prepared with help from Coder Agents.

@DevelopmentCats changed the title from "feat(scanner): enable SkillSpector LLM semantic pass" to "feat(scanner): enable SkillSpector LLM semantic pass (Anthropic Sonnet)" on Jun 22, 2026