From f4f9e3cb47d6b1f39ef8ae33449122b8ff2afee7 Mon Sep 17 00:00:00 2001 From: jdalton Date: Tue, 28 Apr 2026 12:45:50 -0400 Subject: [PATCH 01/17] docs(claude): add programmatic-Claude lockdown rule + skill MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cascaded from socket-repo-template. CLAUDE.md gains one bullet alongside the other security 🚨 rules; the skill at .claude/skills/programmatic-claude-lockdown/SKILL.md holds the four-flag table (`tools`/`allowedTools`/`disallowedTools`/ `permissionMode: 'dontAsk'`), both recipes (read-only and Bash-needing), and the never-do list. Reference impl: socket-lib/tools/prim/src/disambiguate.mts (SDK form); socket-registry weekly-update.yml uses the Bash-needing CLI form. --- .../programmatic-claude-lockdown/SKILL.md | 84 +++++++++++++++++++ CLAUDE.md | 1 + 2 files changed, 85 insertions(+) create mode 100644 .claude/skills/programmatic-claude-lockdown/SKILL.md diff --git a/.claude/skills/programmatic-claude-lockdown/SKILL.md b/.claude/skills/programmatic-claude-lockdown/SKILL.md new file mode 100644 index 00000000..f2561013 --- /dev/null +++ b/.claude/skills/programmatic-claude-lockdown/SKILL.md @@ -0,0 +1,84 @@ +--- +name: programmatic-claude-lockdown +description: Reference for locking down programmatic Claude invocations (the `claude` CLI in workflows/scripts, the `@anthropic-ai/claude-agent-sdk` `query()` in code). Loads on demand when writing or reviewing any callsite that runs Claude programmatically. Source: https://code.claude.com/docs/en/agent-sdk/permissions. +user-invocable: false +allowed-tools: Read, Grep, Glob +--- + +# Programmatic Claude lockdown + +**Rule:** every programmatic Claude callsite sets four flags. Skip any one and a future edit silently widens the surface. + +## The four flags + +| Layer | SDK option | CLI flag | What it does | +|---|---|---|---| +| Definition | `tools` | `--tools` | Base set the model is told about. 
Tools not listed are invisible — no `tool_use` block possible. | +| Auto-approve | `allowedTools` | `--allowedTools` | Step 4. Listed tools run without invoking `canUseTool`. | +| Deny | `disallowedTools` | `--disallowedTools` | Step 2. Wins even against `bypassPermissions`. Defense-in-depth. | +| Mode | `permissionMode: 'dontAsk'` | `--permission-mode dontAsk` | Step 3. Unmatched tools denied without falling through to a missing `canUseTool`. | + +The official permission flow is (1) hooks → (2) deny rules → (3) permission mode → (4) allow rules → (5) `canUseTool`. In `dontAsk` mode step 5 is skipped — denied. The doc states verbatim: *"`allowedTools` and `disallowedTools` ... control whether a tool call is approved, not whether the tool is available."* Availability is `tools`. + +## Recipe — read-only agent (audit, classify, summarize) + +```ts +import { query } from '@anthropic-ai/claude-agent-sdk' + +query({ + prompt: '...', + options: { + tools: ['Read', 'Grep', 'Glob'], + allowedTools: ['Read', 'Grep', 'Glob'], + disallowedTools: ['Agent', 'Bash', 'Edit', 'NotebookEdit', 'Task', 'WebFetch', 'WebSearch', 'Write'], + permissionMode: 'dontAsk', + }, +}) +``` + +CLI form for workflow YAML / shell scripts: + +```yaml +claude --print \ + --tools "Read" "Grep" "Glob" \ + --allowedTools "Read" "Grep" "Glob" \ + --disallowedTools "Agent" "Bash" "Edit" "NotebookEdit" "Task" "WebFetch" "WebSearch" "Write" \ + --permission-mode dontAsk \ + --model "$MODEL" \ + --max-turns 25 \ + "" +``` + +## Recipe — agent that needs Bash (e.g. `/updating`: pnpm + git + jq) + +Narrow `Bash(...)` patterns surgically. Block dangerous Bash patterns explicitly. Fleet rules: no `npx`/`pnpm dlx`/`yarn dlx`; no `curl`/`wget` exfil; no destructive `rm -rf`; no `sudo`. 
Build the deny list as shell vars so the npx/dlx denials can carry the `# zizmor:` exemption marker (the pre-commit `scanNpxDlx` hook treats those literal strings as the prohibited tools, not as exemptions, unless the line is tagged): + +```yaml +DISALLOW_BASE='Agent Task NotebookEdit WebFetch WebSearch Bash(curl:*) Bash(wget:*) Bash(rm -rf*) Bash(sudo:*)' +DISALLOW_PKG_EXEC='Bash(npx:*) Bash(pnpm dlx:*) Bash(yarn dlx:*)' # zizmor: documentation-prohibition +claude --print \ + --tools "Bash" "Read" "Write" "Edit" "Glob" "Grep" \ + --allowedTools "Bash(pnpm:*)" "Bash(git:*)" "Bash(jq:*)" "Read" "Write" "Edit" "Glob" "Grep" \ + --disallowedTools $DISALLOW_BASE $DISALLOW_PKG_EXEC \ + --permission-mode dontAsk \ + --model "$MODEL" --max-turns 25 \ + "" +``` + +## Never + +- ❌ `permissionMode: 'default'` in headless contexts — falls through to a missing `canUseTool`. Behavior undefined. +- ❌ `permissionMode: 'bypassPermissions'` / `allowDangerouslySkipPermissions: true`. +- ❌ Omitting `tools` — SDK default is the full claude_code preset. +- ❌ `Agent` / `Task` permitted — sub-agents inherit modes and can escape per-subagent restrictions when the parent is `bypassPermissions`/`acceptEdits`/`auto`. + +## Reference implementation + +`socket-lib/tools/prim/src/disambiguate.mts` — canonical SDK-form callsite. The file header documents each flag against the eval-flow step it enforces. + +`socket-lib/tools/prim/test/disambiguate.test.mts` — source-text guards that fail the build if `BASE_TOOLS` widens, if `tools: BASE_TOOLS` is unwired, if `permissionMode` drifts from `'dontAsk'`, or if `bypassPermissions` / `allowDangerouslySkipPermissions: true` ever appears. Mirror this pattern in any new callsite. + +## Existing fleet callsites + +- `socket-registry/.github/workflows/weekly-update.yml` — two `claude --print` invocations (run `/updating` skill, fix test failures). Bash recipe above. 
+- `socket-lib/tools/prim/src/disambiguate.mts` — read-only recipe above (`query()` SDK form). diff --git a/CLAUDE.md b/CLAUDE.md index a59e46aa..793dbf85 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -44,6 +44,7 @@ The umbrella rule: never run a git command that mutates state belonging to a pat - **minimumReleaseAge**: NEVER add packages to `minimumReleaseAgeExclude` in CI. Locally, ASK before adding — the age threshold is a security control. - 🚨 **NEVER mention private repos or internal project names** in commits, PR titles/descriptions/comments, issues, release notes, or any public-surface text. Internal codenames, unreleased product names, internal tooling repo names not on the public org page, customer names, partner names — none belong in public surfaces. **Omit the reference entirely.** Don't substitute a placeholder ("an internal tool", "a downstream consumer", etc.) — the placeholder itself is a tell that something is being elided. Rewrite the sentence to not need the reference at all. - 🚨 **NEVER trigger Publish / Release / Provenance / Build-Release workflows** — no `gh workflow run`, `gh workflow dispatch`, or `gh api .../dispatches`. Workflow dispatches are irrevocable: Publish workflows push npm versions (unpublishable after 24h), Build/Release workflows pin GitHub releases by SHA, container workflows push immutable tags. Even build workflows with a `dry_run` input still treat the dispatch itself as the prod trigger. The user runs workflow_dispatch jobs manually after CI passes on the release commit + tag — Claude **never** dispatches them. If the user asks for a publish, tell them to run the command in their own terminal (or the GitHub Actions UI). 
+- 🚨 **Programmatic Claude calls** (workflows, skills, scripts that invoke `claude` CLI or `@anthropic-ai/claude-agent-sdk`) MUST set all four lockdown flags: `--tools`/`tools`, `--allowedTools`/`allowedTools`, `--disallowedTools`/`disallowedTools`, and `--permission-mode dontAsk`/`permissionMode: 'dontAsk'`. NEVER `default` mode in headless contexts (falls through to a missing `canUseTool` β†’ undefined behavior). NEVER `bypassPermissions`. See `.claude/skills/programmatic-claude-lockdown/SKILL.md` for the recipe + reference impl (`socket-lib/tools/prim/src/disambiguate.mts`). - File existence: ALWAYS `existsSync` from `node:fs`. NEVER `fs.access`, `fs.stat`-for-existence, or an async `fileExists` wrapper. Import form: `import { existsSync, promises as fs } from 'node:fs'`. - Null-prototype objects: ALWAYS use `{ __proto__: null, ...rest }` for config, return, and internal-state objects. Prevents prototype pollution and accidental inheritance. See `src/socket-sdk-class.ts` and `src/file-upload.ts` for examples. - Linear references: NEVER reference Linear issues (e.g. `SOC-123`, `ENG-456`, Linear URLs) in code, code comments, or PR titles/descriptions/review comments. Keep the codebase and PR history tool-agnostic β€” tracking lives in Linear. From 14d0c83b75b2af6d29cc7902bb720e6d44c89d72 Mon Sep 17 00:00:00 2001 From: jdalton Date: Tue, 28 Apr 2026 13:03:26 -0400 Subject: [PATCH 02/17] =?UTF-8?q?chore(deps):=20bump=20pnpm=2011.0.0-rc.5?= =?UTF-8?q?=20=E2=86=92=2011.0.0=20(GA)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit pnpm v11 is now stable: https://github.com/pnpm/pnpm/releases/tag/v11.0.0 - package.json: packageManager pin "pnpm@11.0.0-rc.5" β†’ "pnpm@11.0.0"; engines.pnpm ">=11.0.0-rc.0" β†’ ">=11.0.0". - external-tools.json: bump version + 6 platform sha256s (darwin arm64/x64, linux arm64/x64, win arm64/x64). Hashes computed locally from the v11.0.0 release tarballs. 
pnpm-workspace.yaml already on the v11 idioms (allowBuilds, pmOnFail, minimumReleaseAge); lockfile shape unchanged. --- external-tools.json | 14 +++++++------- package.json | 4 ++-- 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/external-tools.json b/external-tools.json index 06ef8a73..a3c36c59 100644 --- a/external-tools.json +++ b/external-tools.json @@ -22,7 +22,7 @@ }, "pnpm": { "description": "pnpm — the fleet's package manager.", - "version": "11.0.0-rc.5", + "version": "11.0.0", "packageManager": "pnpm", "repository": "github:pnpm/pnpm", "release": "asset", @@ -34,27 +34,27 @@ "checksums": { "darwin-arm64": { "asset": "pnpm-darwin-arm64.tar.gz", - "sha256": "32a50710ccacfdcf14e6d5995d5368298eec913b0ce3903b9e09b6555f06f4e5" + "sha256": "3620a0fcaf81ecd3aaeccd5965919d90dbc913f4d07a96e11e7cafc2c785054b" }, "darwin-x64": { "asset": "pnpm-darwin-x64.tar.gz", - "sha256": "71dca33f4275da6b43bf1eb40bdc4d876f59a116716eacbf01079c3d985ff85d" + "sha256": "1701748b75187f1333a9c616827943ff84ff46cc42becc156ff6864b9bd0f948" }, "linux-arm64": { "asset": "pnpm-linux-arm64.tar.gz", - "sha256": "2dd04127ff10b1f9dd20bae248b779c77a8ec67e3afa35e7256e5f94abddd493" + "sha256": "1e6d87ebfd7ff169966ff5b3ad71b780b883c68d3e59987df1096dfd8853df75" }, "linux-x64": { "asset": "pnpm-linux-x64.tar.gz", - "sha256": "7ebef4b616ba41fb0d54a207b36508fae3346723283a088b43fc1e038ee6fed0" + "sha256": "9b44acc77ada40fc41b665fde1d57367a5ebec31bd4b1b00598daed195da3e17" }, "win-arm64": { "asset": "pnpm-win32-arm64.zip", - "sha256": "e4a39ad4c251db5e34b18b98561ef25bab5506ad65cad2fa3602af58d1972667" + "sha256": "0746be8e98ca183078d0747559f0cbbd30a13a53eb177f67474eb3c52dc21bc8" }, "win-x64": { "asset": "pnpm-win32-x64.zip", - "sha256": "147485ae2f38c3d1ccf2f5db00d0244416bcd22b9114c02388e6a78f41538fc4" + "sha256": "581e222e622cd0cc4f0ac5f85dd0db76b65117e3b17507979d89e63fdc68edca" } } }, diff --git a/package.json b/package.json index 586274d1..f27ff047 100644 --- a/package.json +++ 
b/package.json @@ -111,7 +111,7 @@ }, "engines": { "node": ">=18.20.8", - "pnpm": ">=11.0.0-rc.0" + "pnpm": ">=11.0.0" }, - "packageManager": "pnpm@11.0.0-rc.5" + "packageManager": "pnpm@11.0.0" } From dae9d496cd47fbec90f336c87389d55ca5aad2c3 Mon Sep 17 00:00:00 2001 From: jdalton Date: Wed, 29 Apr 2026 14:16:55 -0400 Subject: [PATCH 03/17] chore(ci): cascade socket-registry pin to eeb81520 (pnpm 11.0.0 GA) --- .github/workflows/ci.yml | 2 +- .github/workflows/generate.yml | 6 +++--- .github/workflows/provenance.yml | 2 +- .github/workflows/weekly-update.yml | 2 +- 4 files changed, 6 insertions(+), 6 deletions(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index b430cd4b..535697a0 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -21,6 +21,6 @@ concurrency: jobs: ci: name: Run CI Pipeline - uses: SocketDev/socket-registry/.github/workflows/ci.yml@85a2fc0d33af6304246620365de3e7f053035a8d # main + uses: SocketDev/socket-registry/.github/workflows/ci.yml@eeb81520395947c6c4ab701b4f7f690a2296b816 # main with: test-script: 'pnpm run test --all --skip-build' diff --git a/.github/workflows/generate.yml b/.github/workflows/generate.yml index b4534c3e..f26b4e8d 100644 --- a/.github/workflows/generate.yml +++ b/.github/workflows/generate.yml @@ -46,14 +46,14 @@ jobs: echo "Sleeping for $delay seconds..." 
sleep $delay - - uses: SocketDev/socket-registry/.github/actions/setup-and-install@85a2fc0d33af6304246620365de3e7f053035a8d # main + - uses: SocketDev/socket-registry/.github/actions/setup-and-install@eeb81520395947c6c4ab701b4f7f690a2296b816 # main - name: Configure push credentials env: GH_TOKEN: ${{ github.token }} run: git remote set-url origin "https://x-access-token:${GH_TOKEN}@github.com/${{ github.repository }}.git" - - uses: SocketDev/socket-registry/.github/actions/setup-git-signing@85a2fc0d33af6304246620365de3e7f053035a8d # main + - uses: SocketDev/socket-registry/.github/actions/setup-git-signing@eeb81520395947c6c4ab701b4f7f690a2296b816 # main with: gpg-private-key: ${{ secrets.BOT_GPG_PRIVATE_KEY }} @@ -145,5 +145,5 @@ jobs: > \`\`\` EOF - - uses: SocketDev/socket-registry/.github/actions/cleanup-git-signing@85a2fc0d33af6304246620365de3e7f053035a8d # main + - uses: SocketDev/socket-registry/.github/actions/cleanup-git-signing@eeb81520395947c6c4ab701b4f7f690a2296b816 # main if: always() diff --git a/.github/workflows/provenance.yml b/.github/workflows/provenance.yml index ecf5df13..7fd8db92 100644 --- a/.github/workflows/provenance.yml +++ b/.github/workflows/provenance.yml @@ -25,7 +25,7 @@ jobs: permissions: contents: write # To create GitHub releases id-token: write # For npm trusted publishing via OIDC - uses: SocketDev/socket-registry/.github/workflows/provenance.yml@85a2fc0d33af6304246620365de3e7f053035a8d # main + uses: SocketDev/socket-registry/.github/workflows/provenance.yml@eeb81520395947c6c4ab701b4f7f690a2296b816 # main with: debug: ${{ inputs.debug }} dist-tag: ${{ inputs.dist-tag }} diff --git a/.github/workflows/weekly-update.yml b/.github/workflows/weekly-update.yml index 1112eaed..2aa344ff 100644 --- a/.github/workflows/weekly-update.yml +++ b/.github/workflows/weekly-update.yml @@ -10,7 +10,7 @@ permissions: jobs: weekly-update: - uses: 
SocketDev/socket-registry/.github/workflows/weekly-update.yml@85a2fc0d33af6304246620365de3e7f053035a8d # main + uses: SocketDev/socket-registry/.github/workflows/weekly-update.yml@eeb81520395947c6c4ab701b4f7f690a2296b816 # main with: test-setup-script: 'pnpm run build' test-script: 'pnpm test' From a53d18e2553ff0bbfa3ee5ba4a2bfcd110e78528 Mon Sep 17 00:00:00 2001 From: jdalton Date: Wed, 29 Apr 2026 19:46:38 -0400 Subject: [PATCH 04/17] chore(ci): cascade socket-registry to 0fc1abfd (pnpm 11.0.0 GA chain) --- .github/workflows/ci.yml | 2 +- .github/workflows/generate.yml | 6 +++--- .github/workflows/provenance.yml | 2 +- .github/workflows/weekly-update.yml | 2 +- 4 files changed, 6 insertions(+), 6 deletions(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 535697a0..69763f87 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -21,6 +21,6 @@ concurrency: jobs: ci: name: Run CI Pipeline - uses: SocketDev/socket-registry/.github/workflows/ci.yml@eeb81520395947c6c4ab701b4f7f690a2296b816 # main + uses: SocketDev/socket-registry/.github/workflows/ci.yml@0fc1abfd5e9ae4396f9fb507371aeea75e2f412c # main with: test-script: 'pnpm run test --all --skip-build' diff --git a/.github/workflows/generate.yml b/.github/workflows/generate.yml index f26b4e8d..8a6580ee 100644 --- a/.github/workflows/generate.yml +++ b/.github/workflows/generate.yml @@ -46,14 +46,14 @@ jobs: echo "Sleeping for $delay seconds..." 
sleep $delay - - uses: SocketDev/socket-registry/.github/actions/setup-and-install@eeb81520395947c6c4ab701b4f7f690a2296b816 # main + - uses: SocketDev/socket-registry/.github/actions/setup-and-install@0fc1abfd5e9ae4396f9fb507371aeea75e2f412c # main - name: Configure push credentials env: GH_TOKEN: ${{ github.token }} run: git remote set-url origin "https://x-access-token:${GH_TOKEN}@github.com/${{ github.repository }}.git" - - uses: SocketDev/socket-registry/.github/actions/setup-git-signing@eeb81520395947c6c4ab701b4f7f690a2296b816 # main + - uses: SocketDev/socket-registry/.github/actions/setup-git-signing@0fc1abfd5e9ae4396f9fb507371aeea75e2f412c # main with: gpg-private-key: ${{ secrets.BOT_GPG_PRIVATE_KEY }} @@ -145,5 +145,5 @@ jobs: > \`\`\` EOF - - uses: SocketDev/socket-registry/.github/actions/cleanup-git-signing@eeb81520395947c6c4ab701b4f7f690a2296b816 # main + - uses: SocketDev/socket-registry/.github/actions/cleanup-git-signing@0fc1abfd5e9ae4396f9fb507371aeea75e2f412c # main if: always() diff --git a/.github/workflows/provenance.yml b/.github/workflows/provenance.yml index 7fd8db92..9bd82b9d 100644 --- a/.github/workflows/provenance.yml +++ b/.github/workflows/provenance.yml @@ -25,7 +25,7 @@ jobs: permissions: contents: write # To create GitHub releases id-token: write # For npm trusted publishing via OIDC - uses: SocketDev/socket-registry/.github/workflows/provenance.yml@eeb81520395947c6c4ab701b4f7f690a2296b816 # main + uses: SocketDev/socket-registry/.github/workflows/provenance.yml@0fc1abfd5e9ae4396f9fb507371aeea75e2f412c # main with: debug: ${{ inputs.debug }} dist-tag: ${{ inputs.dist-tag }} diff --git a/.github/workflows/weekly-update.yml b/.github/workflows/weekly-update.yml index 2aa344ff..dc58ead3 100644 --- a/.github/workflows/weekly-update.yml +++ b/.github/workflows/weekly-update.yml @@ -10,7 +10,7 @@ permissions: jobs: weekly-update: - uses: 
SocketDev/socket-registry/.github/workflows/weekly-update.yml@eeb81520395947c6c4ab701b4f7f690a2296b816 # main + uses: SocketDev/socket-registry/.github/workflows/weekly-update.yml@0fc1abfd5e9ae4396f9fb507371aeea75e2f412c # main with: test-setup-script: 'pnpm run build' test-script: 'pnpm test' From 86a11e419da64e03f4551285fec26f8c4759bd29 Mon Sep 17 00:00:00 2001 From: jdalton Date: Thu, 30 Apr 2026 10:55:54 -0400 Subject: [PATCH 05/17] feat(.claude): add stale-process-sweeper Stop hook Reaps orphan vitest/tsgo/type-coverage/esbuild workers at turn-end so they don't pile up across turns and exhaust system memory. Only kills processes whose parent has died (true orphans); leaves running test/build trees alone. - .claude/hooks/stale-process-sweeper/ (hook + tests + README) - .claude/settings.json (Stop hook block) - CLAUDE.md (Background Bash rule) Sourced from socket-repo-template (canonical fleet hook). --- .claude/hooks/stale-process-sweeper/README.md | 74 ++++++ .claude/hooks/stale-process-sweeper/index.mts | 214 ++++++++++++++++++ .../hooks/stale-process-sweeper/package.json | 12 + .../test/stale-process-sweeper.test.mts | 84 +++++++ .../hooks/stale-process-sweeper/tsconfig.json | 15 ++ .claude/settings.json | 10 + CLAUDE.md | 4 + 7 files changed, 413 insertions(+) create mode 100644 .claude/hooks/stale-process-sweeper/README.md create mode 100644 .claude/hooks/stale-process-sweeper/index.mts create mode 100644 .claude/hooks/stale-process-sweeper/package.json create mode 100644 .claude/hooks/stale-process-sweeper/test/stale-process-sweeper.test.mts create mode 100644 .claude/hooks/stale-process-sweeper/tsconfig.json diff --git a/.claude/hooks/stale-process-sweeper/README.md b/.claude/hooks/stale-process-sweeper/README.md new file mode 100644 index 00000000..38d96674 --- /dev/null +++ b/.claude/hooks/stale-process-sweeper/README.md @@ -0,0 +1,74 @@ +# stale-process-sweeper + +Claude Code `Stop` hook that sweeps stale Node test/build worker +processes at turn-end, 
before they pile up across turns and exhaust +system memory. + +## Why + +Vitest's `forks` pool spawns one Node worker per CPU. When the parent +runner exits abnormally — `Bash` timeout, `SIGINT` from the user, +pre-commit hook crash — the workers stay alive holding 80–100 MB +each. After a few interrupted runs the host has gigabytes of +abandoned processes. + +The sweeper finds those processes (matched by command-line pattern) +that have lost their parent, and sends them `SIGTERM`. A still-living +parent means the worker is part of a real, in-progress run, and the +sweeper leaves it alone. + +## What's swept + +| Pattern | Source | +| --- | --- | +| `vitest/dist/workers/(forks\|threads)` | Vitest worker pool | +| `vitest/dist/(cli\|node).[mc]?js` | Orphaned Vitest parent runners | +| `\btsgo\b` | TypeScript Go-based type checker | +| `type-coverage/bin/type-coverage` | Type coverage tool | +| `esbuild/(bin\|lib)/.*\bservice\b` | esbuild's daemon service | + +## What's not swept + +- Anything spawned by a still-living shell (PPID alive) +- The Claude Code process itself or its parent terminal +- Anything outside the pattern list + +## Wiring + +In `.claude/settings.json`: + +```json +{ + "hooks": { + "Stop": [ + { + "hooks": [ + { + "type": "command", + "command": "node .claude/hooks/stale-process-sweeper/index.mts" + } + ] + } + ] + } +} +``` + +## Output + +Silent on the happy path (no orphans found). When something is reaped: + +``` +[stale-process-sweeper] reaped 14 stale worker(s), ~1120MB freed: +vitest-worker=29240(95MB), vitest-worker=33278(93MB), … +``` + +The line goes to stderr. Stop-hook output is shown to the user, not +the model — useful diagnostic, doesn't pollute Claude's context. 
+ +## Tests + +```bash +cd .claude/hooks/stale-process-sweeper +node --test test/*.test.mts +``` diff --git a/.claude/hooks/stale-process-sweeper/index.mts b/.claude/hooks/stale-process-sweeper/index.mts new file mode 100644 index 00000000..4e9923e5 --- /dev/null +++ b/.claude/hooks/stale-process-sweeper/index.mts @@ -0,0 +1,214 @@ +#!/usr/bin/env node +// Claude Code Stop hook — stale-process-sweeper. +// +// Fires at turn-end. Finds Node test/build worker processes that the +// session left behind (test runner crashed mid-run, hook timed out, +// user interrupted `Bash`, etc.) and kills them so they don't pile up +// across turns and exhaust system memory. +// +// What's swept: +// - vitest workers (`vitest/dist/workers/forks` and the threads pool) +// - vitest itself (orphan parent runners that survived a SIGINT) +// - tsgo / tsc type-check daemons +// - type-coverage workers +// - esbuild service processes +// +// What's NOT swept: +// - Anything spawned by a still-living shell (PPID alive) +// - Anything matching the user's editors / IDEs / terminals +// - The Claude Code process itself +// +// The hook is fast (one `ps` call + a few regex matches + a couple of +// `kill -0` probes) and silent on the happy path. It only writes to +// stderr when it actually killed something — that's a useful signal. +// +// Stop hooks receive JSON on stdin (we don't read it; the body +// shape is irrelevant to our work) and exit code is advisory. + +import { spawnSync } from 'node:child_process' +import process from 'node:process' + +// Process-name patterns that indicate a stale test/build worker. +// Must be specific enough that real user processes (a normal `node` +// invocation, an editor's language server) don't match. +const STALE_PATTERNS: Array<{ name: string; rx: RegExp }> = [ + // Vitest worker pools — both `forks` (process-per-worker) and the + // path the threads pool uses when isolation is requested. 
The + // canonical leak: Vitest spawns N workers, parent crashes/SIGINTs, + // workers stay alive holding 80–100MB each. + { + name: 'vitest-worker', + rx: /vitest\/dist\/workers\/(forks|threads)/, + }, + // Vitest parent runner that survived its own children's exit. + // Matches `node ... vitest/dist/cli ... run` etc. + { + name: 'vitest-runner', + rx: /vitest\/dist\/(cli|node)\.[mc]?js/, + }, + // tsgo / tsc daemons. `tsgo` is the new Go-based type checker; + // `tsc --watch` daemons can also linger. + { + name: 'tsgo', + rx: /\btsgo\b/, + }, + // type-coverage runs as a separate process and sometimes outlives + // its CI step. + { + name: 'type-coverage', + rx: /type-coverage\/bin\/type-coverage/, + }, + // esbuild's daemon service helper. + { + name: 'esbuild-service', + rx: /esbuild\/(bin|lib)\/.*\bservice\b/, + }, +] + +interface ProcRow { + pid: number + ppid: number + rss: number + command: string +} + +function listProcesses(): ProcRow[] { + // -A: all processes, -o: custom format, no truncation. macOS + Linux + // both support this exact form. Windows isn't supported (Stop hook + // is unix-only in practice for socket-* repos). + const result = spawnSync( + 'ps', + ['-A', '-o', 'pid=,ppid=,rss=,command='], + { encoding: 'utf8' }, + ) + if (result.status !== 0 || !result.stdout) { + return [] + } + const rows: ProcRow[] = [] + for (const line of result.stdout.split('\n')) { + if (!line.trim()) { + continue + } + // Split into [pid, ppid, rss, ...command]. `command` may contain + // arbitrary spaces, so re-join after the first three fields. 
+ const parts = line.trim().split(/\s+/) + if (parts.length < 4) { + continue + } + const pid = Number.parseInt(parts[0]!, 10) + const ppid = Number.parseInt(parts[1]!, 10) + const rss = Number.parseInt(parts[2]!, 10) + if (!Number.isFinite(pid) || !Number.isFinite(ppid)) { + continue + } + const command = parts.slice(3).join(' ') + rows.push({ pid, ppid, rss, command }) + } + return rows +} + +function isAlive(pid: number): boolean { + if (pid <= 1) { + // PID 0 / 1 are the kernel / init — if our parent is one of those, + // we're definitely an orphan, but `kill -0 1` would mislead. + return false + } + try { + process.kill(pid, 0) + return true + } catch { + return false + } +} + +function classify(row: ProcRow): string | undefined { + for (const { name, rx } of STALE_PATTERNS) { + if (rx.test(row.command)) { + return name + } + } + return undefined +} + +function sweep(): { killed: Array<{ pid: number; name: string; rssMb: number }>; skipped: number } { + const rows = listProcesses() + const myPid = process.pid + const myPpid = process.ppid + const killed: Array<{ pid: number; name: string; rssMb: number }> = [] + let skipped = 0 + + for (const row of rows) { + // Never touch ourselves or our parent (Claude Code). + if (row.pid === myPid || row.pid === myPpid) { + continue + } + const name = classify(row) + if (!name) { + continue + } + // Only sweep if the parent is gone (true orphan) or is PID 1 + // (re-parented to init after the original parent exited). A live + // parent means the worker is part of a real, in-progress run we + // should not interrupt. + const orphan = row.ppid === 1 || !isAlive(row.ppid) + if (!orphan) { + skipped += 1 + continue + } + try { + // SIGTERM first — give the worker a chance to flush. We don't + // wait for it; the next sweep (next turn) re-sends SIGTERM to anything + // that ignored it (no SIGKILL escalation here). Keeping the hook fast matters more than + // squeezing every last byte. 
+ process.kill(row.pid, 'SIGTERM') + killed.push({ + pid: row.pid, + name, + rssMb: Math.round(row.rss / 1024), + }) + } catch { + // Already gone, or we lack permission — nothing to do. + } + } + return { killed, skipped } +} + +function main() { + // Drain stdin (Stop hook delivers a JSON payload). We don't need + // the body, but Node will keep the event loop alive if we don't + // consume it. + process.stdin.resume() + process.stdin.on('data', () => {}) + process.stdin.on('end', runSweep) + // If stdin is already closed (some hook runners don't pipe input), + // run immediately. + if (process.stdin.readable === false) { + runSweep() + } +} + +function runSweep() { + let result: { killed: Array<{ pid: number; name: string; rssMb: number }>; skipped: number } + try { + result = sweep() + } catch (e) { + // Hooks must never crash a Claude turn. Log and exit clean. + process.stderr.write( + `[stale-process-sweeper] unexpected error: ${(e as Error).message}\n`, + ) + process.exit(0) + } + if (result.killed.length > 0) { + const totalMb = result.killed.reduce((sum, k) => sum + k.rssMb, 0) + const breakdown = result.killed + .map(k => `${k.name}=${k.pid}(${k.rssMb}MB)`) + .join(', ') + process.stderr.write( + `[stale-process-sweeper] reaped ${result.killed.length} stale ` + + `worker(s), ~${totalMb}MB freed: ${breakdown}\n`, + ) + } + process.exit(0) +} + +main() diff --git a/.claude/hooks/stale-process-sweeper/package.json b/.claude/hooks/stale-process-sweeper/package.json new file mode 100644 index 00000000..1a0f6de1 --- /dev/null +++ b/.claude/hooks/stale-process-sweeper/package.json @@ -0,0 +1,12 @@ +{ + "name": "hook-stale-process-sweeper", + "private": true, + "type": "module", + "main": "./index.mts", + "exports": { + ".": "./index.mts" + }, + "scripts": { + "test": "node --test test/*.test.mts" + } +} diff --git a/.claude/hooks/stale-process-sweeper/test/stale-process-sweeper.test.mts b/.claude/hooks/stale-process-sweeper/test/stale-process-sweeper.test.mts 
new file mode 100644 index 00000000..56ac3572 --- /dev/null +++ b/.claude/hooks/stale-process-sweeper/test/stale-process-sweeper.test.mts @@ -0,0 +1,84 @@ +import { spawn } from 'node:child_process' +import { fileURLToPath } from 'node:url' +import path from 'node:path' +import { test } from 'node:test' +import assert from 'node:assert/strict' + +const __dirname = path.dirname(fileURLToPath(import.meta.url)) +const HOOK = path.resolve(__dirname, '..', 'index.mts') + +// Run the hook with an empty stdin payload (Stop hook delivers JSON, +// but the body is unused). Captures stderr + exit code. +function runHook(): Promise<{ code: number; stderr: string }> { + return new Promise((resolve, reject) => { + const child = spawn(process.execPath, [HOOK], { + stdio: ['pipe', 'ignore', 'pipe'], + }) + let stderr = '' + child.stderr.on('data', d => { + stderr += d.toString() + }) + child.on('error', reject) + child.on('exit', code => { + resolve({ code: code ?? -1, stderr }) + }) + // Stop hooks receive a JSON payload on stdin. Send an empty object + // so the hook's drain logic completes. + child.stdin.end('{}\n') + }) +} + +test('stale-process-sweeper: exits 0 when nothing to sweep', async () => { + const { code, stderr } = await runHook() + assert.equal(code, 0, `hook should exit 0; stderr=${stderr}`) + // On a clean host the hook should be silent. + assert.equal( + stderr, + '', + `hook should be silent when no orphans exist; got: ${stderr}`, + ) +}) + +test('stale-process-sweeper: ignores live-parent test workers', async () => { + // Spawn a fake "vitest worker" whose parent is still alive. The + // sweeper must not touch it. We use a script path that matches the + // worker regex; the actual command runs `node -e 'setTimeout(...)'` + // long enough to outlive the hook invocation. + // + // Note: matching the regex `vitest/dist/workers/forks` requires a + // command line that contains that substring. 
We can't easily forge + // a real vitest binary, so we approximate by passing the path as an + // argv string — `ps -o command=` reflects argv, and the regex sees + // it. + const fakeWorker = spawn( + process.execPath, + [ + '-e', + 'setTimeout(() => {}, 5000)', + // This dummy arg is what `ps` will report; the sweeper's regex + // picks it up. The worker still has a live parent (this test + // process), so the sweeper should NOT kill it. + '/fake/vitest/dist/workers/forks.js', + ], + { stdio: 'ignore', detached: false }, + ) + // Give the OS a moment to register the child. + await new Promise(r => setTimeout(r, 100)) + try { + const { code, stderr } = await runHook() + assert.equal(code, 0) + // Should NOT have reaped the fake worker — its parent (us) is + // alive. If the hook killed it, the message would mention it. + assert.ok( + !stderr.includes('reaped'), + `hook reaped a live-parent worker: ${stderr}`, + ) + // Verify the worker is still alive. + assert.ok( + !fakeWorker.killed && fakeWorker.exitCode === null, + 'fake worker should still be running', + ) + } finally { + fakeWorker.kill('SIGKILL') + } +}) diff --git a/.claude/hooks/stale-process-sweeper/tsconfig.json b/.claude/hooks/stale-process-sweeper/tsconfig.json new file mode 100644 index 00000000..53c5c847 --- /dev/null +++ b/.claude/hooks/stale-process-sweeper/tsconfig.json @@ -0,0 +1,15 @@ +{ + "compilerOptions": { + "declarationMap": false, + "erasableSyntaxOnly": true, + "module": "nodenext", + "moduleResolution": "nodenext", + "noEmit": true, + "rewriteRelativeImportExtensions": true, + "skipLibCheck": true, + "sourceMap": false, + "strict": true, + "target": "esnext", + "verbatimModuleSyntax": true + } +} diff --git a/.claude/settings.json b/.claude/settings.json index 894dcf15..5c6bd51f 100644 --- a/.claude/settings.json +++ b/.claude/settings.json @@ -35,6 +35,16 @@ } ] } + ], + "Stop": [ + { + "hooks": [ + { + "type": "command", + "command": "node 
.claude/hooks/stale-process-sweeper/index.mts" + } + ] + } ] }, "permissions": { diff --git a/CLAUDE.md b/CLAUDE.md index 793dbf85..b6e292fa 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -104,6 +104,10 @@ When you encounter a legacy term during unrelated work, fix it inline — don't - **Leaky**: `Promise.race(pool)` inside a loop where `pool` persists across iterations (the classic concurrency-limiter bug) — also applies to `Promise.any` and long-lived arms like interrupt signals. - **Fix**: single-waiter "slot available" signal — each task's `.then` resolves a one-shot `promiseWithResolvers` that the loop awaits, then replaces. No persistent pool, nothing to stack. +### Background Bash + +Never use `Bash(run_in_background: true)` for test/build commands (`vitest`, `pnpm test`, `pnpm build`, `tsgo`). Backgrounded runs you don't poll get abandoned and leak Node workers. Background mode is for dev servers and long migrations whose results you'll consume. If a run hangs, kill it: `pkill -f "vitest/dist/workers"`. + --- ## EMOJI & OUTPUT STYLE From 11082c6c55bcc7227504f24bbfb57bdb2d7404dc Mon Sep 17 00:00:00 2001 From: jdalton Date: Thu, 30 Apr 2026 11:29:49 -0400 Subject: [PATCH 06/17] docs(claude): restructure CLAUDE.md to fleet-canonical + project-specific layout MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Split CLAUDE.md into two clearly-delimited sections: - `## 📚 Fleet Standards` — wrapped in BEGIN/END FLEET-CANONICAL markers, byte-identical across every socket-* repo (sync via socket-repo-template). - `## 🏗️ SDK-Specific` — repo-owned content: Architecture, Commands, Configuration Files, SDK-Specific Patterns, Testing, CI Testing, Changelog Management, Debugging, SDK Notes. Fleet block ~8.6 KB; verbose content moves to references: - `docs/references/inclusive-language.md` - `docs/references/sorting.md` - `.claude/skills/promise-race-pitfall/SKILL.md` CLAUDE.md 22.7 KB → 13.9 KB.
Joins this PR's programmatic-Claude lockdown additions; the new fleet block already references the lockdown skill this PR adds. --- .claude/skills/promise-race-pitfall/SKILL.md | 57 +++++ CLAUDE.md | 255 ++++++------------- docs/references/inclusive-language.md | 34 +++ docs/references/sorting.md | 16 ++ 4 files changed, 189 insertions(+), 173 deletions(-) create mode 100644 .claude/skills/promise-race-pitfall/SKILL.md create mode 100644 docs/references/inclusive-language.md create mode 100644 docs/references/sorting.md diff --git a/.claude/skills/promise-race-pitfall/SKILL.md b/.claude/skills/promise-race-pitfall/SKILL.md new file mode 100644 index 00000000..d38f3c2a --- /dev/null +++ b/.claude/skills/promise-race-pitfall/SKILL.md @@ -0,0 +1,57 @@ +--- +name: promise-race-pitfall +description: Reference for the `Promise.race` cross-iteration handler-leak bug. Loads on demand when writing or reviewing concurrency code that uses `Promise.race`, `Promise.any`, or hand-rolled concurrency limiters. +--- + +# Promise.race in loops — the handler-leak pitfall + +**Never re-race the same pool of promises across loop iterations.** Each call to `Promise.race([A, B, …])` attaches fresh `.then` handlers to every arm. A promise that survives N iterations accumulates N handler sets. See [nodejs/node#17469](https://github.com/nodejs/node/issues/17469) and [`@watchable/unpromise`](https://github.com/watchable/unpromise). + +## Patterns + +- **Safe** — both arms created per call: + + ```ts + const value = await Promise.race([ + fetchSomething(), + new Promise((_, r) => setTimeout(() => r(new Error('timeout')), 5000)), + ]) + ``` + +- **Leaky** — `pool` survives across iterations, accumulating handlers: + + ```ts + while (queue.length) { + const winner = await Promise.race(pool) // ← N handlers per arm by iteration N + pool = pool.filter(p => p !== winner) + } + ``` + + Same hazard for `Promise.any` and any long-lived arm such as an interrupt signal.
+ +## The fix + +Use a single-waiter "slot available" signal. Each task's `.then` resolves a one-shot `promiseWithResolvers` that the loop awaits, then replaces. No persistent pool, nothing to stack. + +```ts +let signal = Promise.withResolvers() +function startTask(task: Task) { + task.run().then(() => { + const prev = signal + signal = Promise.withResolvers() + prev.resolve(task) + }) +} +while (queue.length) { + // launch up to N tasks + while (running < N && queue.length) startTask(queue.shift()!) + const finished = await signal.promise + running -= 1 +} +``` + +The arm being awaited is *always fresh*; nothing accumulates handlers. + +## Quick check + +Before merging concurrency code, ask: *does any arm of a `Promise.race`/`Promise.any` outlive the call?* If yes, refactor to the single-waiter signal. diff --git a/CLAUDE.md b/CLAUDE.md index b6e292fa..33343c12 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -2,155 +2,132 @@ 🚨 **MANDATORY**: Act as principal-level engineer with deep expertise in TypeScript, Node.js, and SDK development. -## USER CONTEXT + -- Identify users by git credentials; use their actual name, never "the user" -- Use "you/your" when speaking directly; use names when referencing contributions +## 📚 Fleet Standards -## 🚨 PARALLEL CLAUDE SESSIONS - WORKTREE REQUIRED +### Identifying users -**This repo may have multiple Claude sessions running concurrently against the same checkout, against parallel git worktrees, or against sibling clones.** Several common git operations are hostile to that and silently destroy or hijack the other session's work. +Identify users by git credentials and use their actual name. Use "you/your" when speaking directly; use names when referencing contributions. -- **FORBIDDEN in the primary checkout** (the one another Claude may be editing): - - `git stash` — shared stash store; another session can `pop` yours. - - `git add -A` / `git add .` — sweeps files belonging to other sessions.
- - `git checkout ` / `git switch ` — yanks the working tree out from under another session. - - `git reset --hard` against a non-HEAD ref — discards another session's commits. -- **REQUIRED for branch work**: spawn a worktree instead of switching branches in place. Each worktree has its own HEAD, so branch operations inside it are safe. +### Parallel Claude sessions - ```bash - # From the primary checkout — does NOT touch the working tree here. - git worktree add -b ../- main - cd ../- - # edit, commit, push from here; the primary checkout is untouched. - cd - - git worktree remove ../- - ``` +This repo may have multiple Claude sessions running concurrently against the same checkout, against parallel git worktrees, or against sibling clones. Several common git operations are hostile to that. -- **REQUIRED for staging**: surgical `git add […]` with explicit paths. Never `-A` / `.`. -- **If you need a quick WIP save**: commit on a new branch from inside a worktree, not a stash. -- **NEVER revert files you didn't touch.** If `git status` shows files you didn't modify, those belong to another session, an upstream pull, or a hook side-effect — leave them alone. Specifically: do not run `git checkout -- ` to "clean up" the diff before committing, and do not include unrelated paths in `git add`. Stage only the explicit files you edited. +**Forbidden in the primary checkout:** -The umbrella rule: never run a git command that mutates state belonging to a path other than the file you just edited. +- `git stash` — shared store; another session can `pop` yours +- `git add -A` / `git add .` — sweeps files from other sessions +- `git checkout ` / `git switch ` — yanks the working tree out from under another session +- `git reset --hard` against a non-HEAD ref — discards another session's commits -## 📚 SHARED STANDARDS +**Required for branch work:** spawn a worktree.
-- Commits: [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) `(): ` — NO AI attribution -- **Open PRs:** when adding commits to an OPEN PR, ALWAYS update the PR title and description to match the new scope. A title like `chore: foo` after you've added security-fix and docs commits to it is now a lie. Use `gh pr edit --title "..." --body "..."` (or `--body-file`) and rewrite the body so it reflects every commit on the branch, grouped by theme. The reviewer should be able to read the PR description and know what's in it without scrolling commits. -- Scripts: Prefer `pnpm run foo --flag` over `foo:bar` scripts -- Dependencies: After `package.json` edits, run `pnpm install` -- Backward Compatibility: 🚨 FORBIDDEN to maintain — actively remove when encountered -- 🚨 **NEVER use `npx`, `pnpm dlx`, or `yarn dlx`** — use `pnpm exec ` or `pnpm run