feat(skillify): one-shot local skill mining for non-auth users (mine-local)#129
feat(skillify): one-shot local skill mining for non-auth users (mine-local)#129efenocchi wants to merge 10 commits into
mine-local)#129Conversation
New pure helpers for mining skills from local agent transcripts without talking to Deeplake — supports the upcoming `hivemind skillify mine-local` one-shot for users who haven't signed in yet. - detectInstalledAgents() walks well-known session-dir roots (~/.claude/projects/, ~/.codex/sessions/) and reports each agent's encode-cwd scheme. Claude Code maps both `/` AND `_` to `-` in the encoded dir name; verified against real ~/.claude/projects/ entries. - detectHostAgent() reads CLAUDECODE / CODEX_HOME env vars to know when we're running inside an agent (so the CLI can skip interactive prompts and default to the host's gate CLI + model). - listLocalSessions() enumerates .jsonl files across all installs and tags each with mtime + in_cwd flag for the picker. - pickSessions() implements the 3-phase ε-greedy pick: cwd-quota → global-quota → top-up, dedup-by-path throughout. Handles all-in-cwd / none-in-cwd / mixed without producing duplicates. - nativeJsonlToRows() converts Claude Code native JSONL into the SessionRow shape the existing extractPairs() consumes. Mirrors the production capture hook's `last_assistant_message` semantics: only the final text-bearing assistant entry per turn is emitted, so the gate doesn't see "Now I'll run X" mini-narration between dropped tool_use blocks. No wiring yet — the orchestrator and CLI dispatch land in follow-ups.
One-shot mining flow for users who haven't logged into Deeplake yet: pick N local sessions, run an LLM gate per session in parallel, write unique skills to ~/.claude/skills/, track results in a manifest. Design choices that came out of e2e debugging: - Parallel-per-session, NOT concatenated. Each session has its own problem domain; mixing N sessions in one prompt dilutes signal and makes the gate over-conservative. Concurrency cap=4 keeps the Anthropic side honest while finishing 8 sessions in ~90s. - stdin-piped gate runner (local runGateViaStdin), not the shared argv-bound runGate. Linux MAX_ARG_STRLEN is 128 KB per single argv arg, and a per-session prompt easily exceeds that. Doesn't touch the worker's shared gate path. - In-flight session filter: skip any session modified within the last 60 s. Without this, mining bundles the live conversation into the prompt and the gate sees meta-discussion about the feature under construction. - Per-session pair cap (30) + per-pair char cap (4 KB). The gate sees the LAST 30 pairs of each session — that's where crystallized takeaways live, not "let's explore X" session-openers. - Multi-skill output per call. The gate returns up to 3 distinct skills per session; each session contributes independently. - Overlap check, not name-dedup. Each candidate's description is compared (Jaccard on stopword-filtered tokens, threshold 0.4) against already-installed skills AND already-written-this-run skills. Overlap → skip with a "overlaps with X" line. No name collision, no semantic duplicate. - Manifest at ~/.claude/hivemind/local-mined.json doubles as a one-shot sentinel — re-runs require --force. Each entry tracks source_session_ids/paths + uploaded:false so a later `skillify push-local` (when the user signs in) knows what to send. CLI surface: hivemind skillify mine-local [--n <num|all>] [--force] [--dry-run] No tests yet — pure unit-testable bits (pickSessions, parseMultiVerdict, findOverlap) will get their own test file in a follow-up.
Plumb the orchestrator into the CLI dispatcher and rebuild the unified bundle so `hivemind skillify mine-local` is callable from the installed binary. - runSkillifyCommand now matches "mine-local" and calls runMineLocal with the remaining argv. - usage() text grows three lines documenting --n / --force / --dry-run. - bundle/cli.js rebuilt from current src state.
…here
Each of the 5 agents (claude_code, codex, cursor, hermes, pi) used to
maintain its own hand-edited list of `hivemind skillify ...` commands
in its SessionStart injection block. Adding a new subcommand meant
remembering to touch all five places, and the `mine-local` command we
just shipped was missing from every agent's injected context — the
model couldn't know it existed.
This commit:
- Introduces src/cli/skillify-spec.ts as the single source of truth:
a typed array of {cmd, desc} entries plus a renderSkillifyCommands()
helper that produces the dash-aligned bullet block. The new
mine-local entry is included.
- Refactors the 4 hook-based session-start.ts files (claude_code,
codex, cursor, hermes) to import the spec and render it inline.
This unifies a small wording divergence (claude_code's "Skill
management ..." header vs codex/cursor/hermes' "SKILLS (skillify)
..." header is preserved per-agent; only the bulleted list itself
comes from the spec).
- Mirrors the spec inline in pi/extension-source/hivemind.ts. pi's
extension is shipped as a single self-contained .ts loaded by pi's
runtime — it can't import from src/. The duplicate is clearly
flagged "MIRROR of src/cli/skillify-spec.ts" and guarded by a
drift-detection test (tests/pi/skillify-spec-drift.test.ts) that
fails the build if either side adds, removes, or rewords an entry.
- Adds tests/pi/ to vitest.config.ts include glob.
After this commit, `mine-local` appears in every agent's
SessionStart injection (verified by grep against the rebuilt
bundles), and every future subcommand only needs editing in two
places (the spec + pi's mirror) instead of five.
openclaw exposes a different command surface (slash commands +
MCP-style tools, not `hivemind skillify`) so it's intentionally
out of scope here.
Add coverage for the pure functions that mine-local relies on: - pickSessions (3 degenerate cases + dedup + ordering) - nativeJsonlToRows (last_assistant_message semantics, tool-result user arrays dropped, thinking + tool_use blocks stripped, malformed lines skipped silently) - summaryTokens / jaccard (stopword + short-token filtering, identical / disjoint / partial overlap math) - findOverlap (no-match → null, semantic overlap detection, best-match selection when multiple cross threshold, stopword-heavy descriptions not falsely matched) - parseMultiVerdict (valid array shape, empty-skills SKIP, filtering of entries missing required fields, malformed JSON → null, code-fenced / prose-wrapped JSON extraction, whitespace trimming) The orchestrator runMineLocal itself is exercised by the e2e flow (`hivemind skillify mine-local --force`), not unit-tested here — it spawns the agent CLI and writes to ~/.claude/skills/, neither of which is mock-friendly enough to be worth re-deriving here. Adds `export` to summaryTokens / jaccard / findOverlap / parseMultiVerdict / MinedSkill / MultiVerdict in mine-local.ts. No behavior change — just makes them testable from outside the module. 35 new tests, all passing. Full suite stays green at 2278/2278.
Mined skills now appear in every installed agent's native skills root, not just ~/.claude/skills/. Without this, mine-local skills were invisible to codex / hermes / pi even when those agents were installed on the same machine. Implementation reuses the existing pull infrastructure: - detectAgentSkillsRoots(skillsRoot) from src/skillify/agent-roots.ts enumerates roots present on this machine: ~/.agents/skills/ when codex OR pi is installed (agentskills.io shared layout), ~/.hermes/skills/ when hermes is installed, ~/.pi/agent/skills/ when pi is installed. Cursor has no native skill discovery and is intentionally excluded by the detector. - fanOutSymlinks(canonicalDir, dirName, roots) from src/skillify/pull.ts creates idempotent symlinks pointing back at the canonical ~/.claude/skills/<name>/ — already battle-tested by the `hivemind skillify pull` flow. Manifest entries gain a `symlinks[]` field listing every link created, so a future `push-local` / `unpull` flow can reverse the fan-out cleanly without re-detecting installs. E2E verified by running `hivemind skillify mine-local --force` after deleting one existing skill: the new skill landed at the canonical path plus three symlinks (~/.agents/skills/<name>, ~/.hermes/skills/<name>, ~/.pi/agent/skills/<name>), each pointing at the canonical directory. Console output shows "fan-out → 3 root(s)" per written skill.
mine-local's manifest helpers (LOCAL_MANIFEST_PATH, ManifestEntry, Manifest, loadManifest, saveManifest) move to a new self-contained src/skillify/local-manifest.ts so the SessionStart hooks can read the count without dragging the full orchestrator (gate runner, parallelMap, fan-out, etc.) into the hook bundle. mine-local.ts now imports the shared types and delegates read/write through `readLocalManifest` / `writeLocalManifest` aliases. Behavior is unchanged; the diff is mechanical. countLocalManifestEntries() is added on the shared side as a zero-allocation accessor for the upcoming "you have N local skills, sign in to share new ones" SessionStart message.
When a user runs `hivemind skillify mine-local` then opens a new
session without first signing in, every agent's SessionStart hook
now appends a one-liner to the "not logged in" injection:
N local skill(s) from past 'hivemind skillify mine-local' run(s)
live in ~/.claude/skills/. Run 'hivemind login' to start sharing
new mining results with your team.
This closes the loop on the bootstrap flow: a fresh user gets
useful skills from their local history immediately (no auth needed)
and is gently prompted to sign in when ready to share. The line is
silently omitted when the manifest is missing or empty, so first-
time users who haven't run mine-local don't see a vacuous "0 skills"
note.
Wiring:
- countLocalManifestEntries() now takes an optional path arg, so
tests can point at a tmpdir instead of mutating HOME.
- countLocalManifestEntries() defends against malformed manifests
where `entries` is non-array (e.g. a stray string) — would
otherwise leak that string's `.length` as the count.
- 4 hook session-start.ts files (claude_code, codex, cursor, hermes)
import countLocalManifestEntries from the shared module.
- pi's extension keeps an inline mirror (piCountLocalManifestEntries)
for the same reason the spec mirror exists — pi loads its .ts
directly and can't import from src/.
- 6 new unit tests in tests/claude-code/local-manifest.test.ts cover
the read/write round-trip + every degenerate count path (missing
file, empty entries, malformed JSON, missing field, non-array).
Full suite stays green: 117 test files, 2286 tests passing.
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis pull request introduces ChangesSkillify Mine-Local Feature Implementation
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@bundle/cli.js`:
- Around line 5852-5860: The gate stdin runner (runGateViaStdin) currently
rejects any agent other than "claude_code", which causes gateAgentFor() results
like "codex" to be unusable; change runGateViaStdin (and the same check around
opts.agent) to handle other agents by either invoking the argv-based runner
fallback (e.g., call the existing runGateViaArgv/path that expects prompt in
argv) or by supporting "codex" (and other agents returned by gateAgentFor())
instead of hard-failing; update the branch that checks opts.agent (and the
duplicate check at the other location) to route non-"claude_code" agents into
the fallback runner and preserve the existing error structure when no fallback
is available.
- Around line 6222-6226: The zero-candidate early return skips writing the local
manifest/sentinel so SessionStart lacks a local-skill count; change the branch
where totalCandidates === 0 to still persist the same manifest/sentinel/count
file and any tmpDir metadata the code writes when written.length > 0 before
returning (use the same write/save routine used for the normal path), and apply
the same fix to the duplicate block around the other occurrence (lines handling
the same flow near 6269-6286) so a sentinel is always created even when no
skills were written.
- Line 6430: The help text for the --n option is inconsistent with the
implementation: update the help message printed in the CLI (the console.log line
that documents "--n") to match the actual default used by the code (DEFAULT_N =
8), or alternatively change the DEFAULT_N constant to 3 so behavior matches the
message; locate the DEFAULT_N symbol and the console.log help line that mentions
"--n" (the mine-local option) and make both values consistent (preferably update
the help string to "default: 8" if you want behavior unchanged).
In `@src/commands/mine-local.ts`:
- Around line 418-423: After resolving gateAgent via detectHostAgent() and
gateAgentFor(...) (and computing gateBin with findAgentBin), add a fail-fast
check that validates the selected gateAgent is supported by runGateViaStdin; if
gateAgent !== "claude_code" log a clear error (include gateAgent and gateBin for
context) and terminate (e.g., process.exit(1>0) or throw) before proceeding with
mining. This ensures runGateViaStdin (and the surrounding flow in mine-local.ts)
won't continue with an unsupported agent.
- Around line 271-276: The parser currently allows skills with empty description
which breaks description-based dedupe; update the validation in the
skill-parsing block (where name, description, body, trigger are derived and
pushed to out via out.push) to require a non-empty description as well—i.e.,
after computing description = typeof s.description === "string" ?
s.description.trim() : "" change the guard from if (!name || !body) continue; to
also check description (if (!name || !description || !body) continue;) so only
skills with name, description, and body are added.
In `@src/commands/skillify.ts`:
- Line 196: Update the CLI help text to match the actual runtime default: change
the console.log message that prints the --n help (currently showing "(default:
3)") so it reflects DEFAULT_N used by runMineLocal (which is 8). Locate the help
string emitted in the CLI (the line that prints " --n <num|all>
how many sessions to mine (default: 3)") and update the default value text to
"(default: 8)" or, better, interpolate the DEFAULT_N constant used by
runMineLocal to keep help in sync with the code.
In `@tests/pi/skillify-spec-drift.test.ts`:
- Around line 24-26: The regex extraction of PI_SKILLIFY_COMMANDS into
piArrayMatch (via PI_SOURCE.match(...)) must be guarded before any dereference;
update the tests that use piArrayMatch! (references: piArrayMatch,
PI_SOURCE.match, PI_SKILLIFY_COMMANDS) to first assert/piArrayMatch truthiness
(e.g., expect(piArrayMatch).toBeTruthy(...) or if (!piArrayMatch) throw new
Error(...)) and provide a clear drift-failure message so later lines that access
piArrayMatch[0] or similar never cause a TypeError but instead produce the
explicit test failure.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 4c746266-1457-46b8-90db-d005f536ab11
📒 Files selected for processing (20)
bundle/cli.jsclaude-code/bundle/session-start.jscodex/bundle/session-start.jscursor/bundle/session-start.jshermes/bundle/session-start.jspi/extension-source/hivemind.tssrc/cli/skillify-spec.tssrc/commands/mine-local.tssrc/commands/skillify.tssrc/hooks/codex/session-start.tssrc/hooks/cursor/session-start.tssrc/hooks/hermes/session-start.tssrc/hooks/session-start.tssrc/skillify/local-manifest.tssrc/skillify/local-source.tstests/claude-code/local-manifest.test.tstests/claude-code/local-source.test.tstests/claude-code/mine-local-helpers.test.tstests/pi/skillify-spec-drift.test.tsvitest.config.ts
| return new Promise((resolve) => { | ||
| if (opts.agent !== "claude_code") { | ||
| resolve({ | ||
| stdout: "", | ||
| stderr: "", | ||
| errored: true, | ||
| errorMessage: `stdin gate runner only supports claude_code (got ${opts.agent}); for other agents the prompt must fit in argv` | ||
| }); | ||
| return; |
There was a problem hiding this comment.
Don’t select a gate agent the runner immediately rejects.
gateAgentFor() can return codex, but runGateViaStdin() hard-fails for anything except claude_code. On a Codex host — or a machine with only Codex installed — mine-local will burn through every picked session and never produce a skill.
Suggested fix
-function gateAgentFor(host, fallback) {
- return host ?? fallback;
+function gateAgentFor(host, fallback, installs) {
+ const installed = new Set(installs.map((i) => i.agent));
+ if (installed.has("claude_code"))
+ return "claude_code";
+ throw new Error("mine-local currently requires Claude Code for the local gate runner");
}- const gateAgent = gateAgentFor(host, fallback);
+ const gateAgent = gateAgentFor(host, fallback, installs);Also applies to: 6157-6160
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@bundle/cli.js` around lines 5852 - 5860, The gate stdin runner
(runGateViaStdin) currently rejects any agent other than "claude_code", which
causes gateAgentFor() results like "codex" to be unusable; change
runGateViaStdin (and the same check around opts.agent) to handle other agents by
either invoking the argv-based runner fallback (e.g., call the existing
runGateViaArgv/path that expects prompt in argv) or by supporting "codex" (and
other agents returned by gateAgentFor()) instead of hard-failing; update the
branch that checks opts.agent (and the duplicate check at the other location) to
route non-"claude_code" agents into the fallback runner and preserve the
existing error structure when no fallback is available.
| if (totalCandidates === 0) { | ||
| console.log(`No skills to write.`); | ||
| console.log(`tmp dir kept for inspection: ${tmpDir}`); | ||
| return; | ||
| } |
There was a problem hiding this comment.
Persist the local manifest even when nothing new is written.
Right now the sentinel/count file is only created when written.length > 0, and the zero-candidate path returns before any save. That breaks the “one-shot” behavior and leaves SessionStart with no local-skill count whenever mining finds only overlaps or no skills at all.
Suggested fix
+ const existing = loadManifest2();
if (totalCandidates === 0) {
+ saveManifest2({
+ created_at: existing?.created_at ?? new Date().toISOString(),
+ entries: existing?.entries ?? []
+ });
console.log(`No skills to write.`);
console.log(`tmp dir kept for inspection: ${tmpDir}`);
return;
}
...
- if (written.length > 0) {
- const existing = loadManifest2();
- const newEntries = written.map(({ skill, session, result, symlinks }) => ({
+ const newEntries = written.map(({ skill, session, result, symlinks }) => ({
skill_name: skill.name,
canonical_path: result.path,
symlinks,
source_session_ids: [session.sessionId],
source_session_paths: [session.path],
source_agent: session.agent,
gate_agent: gateAgent,
created_at: result.createdAt,
uploaded: false
- }));
- saveManifest2({
- created_at: existing?.created_at ?? (/* `@__PURE__` */ new Date()).toISOString(),
- entries: [...existing?.entries ?? [], ...newEntries]
- });
- }
+ }));
+ saveManifest2({
+ created_at: existing?.created_at ?? new Date().toISOString(),
+ entries: [...existing?.entries ?? [], ...newEntries]
+ });Also applies to: 6269-6286
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@bundle/cli.js` around lines 6222 - 6226, The zero-candidate early return
skips writing the local manifest/sentinel so SessionStart lacks a local-skill
count; change the branch where totalCandidates === 0 to still persist the same
manifest/sentinel/count file and any tmpDir metadata the code writes when
written.length > 0 before returning (use the same write/save routine used for
the normal path), and apply the same fix to the duplicate block around the other
occurrence (lines handling the same flow near 6269-6286) so a sentinel is always
created even when no skills were written.
| console.log(" hivemind skillify status show per-project state"); | ||
| console.log(" hivemind skillify mine-local [opts] one-shot: seed skills from local sessions (no auth needed)"); | ||
| console.log(" Options for mine-local:"); | ||
| console.log(" --n <num|all> how many sessions to mine (default: 3)"); |
There was a problem hiding this comment.
Fix the documented default for --n.
The help text says --n defaults to 3, but the implementation uses DEFAULT_N = 8. This will mislead users about how many sessions mine-local actually scans.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@bundle/cli.js` at line 6430, The help text for the --n option is inconsistent
with the implementation: update the help message printed in the CLI (the
console.log line that documents "--n") to match the actual default used by the
code (DEFAULT_N = 8), or alternatively change the DEFAULT_N constant to 3 so
behavior matches the message; locate the DEFAULT_N symbol and the console.log
help line that mentions "--n" (the mine-local option) and make both values
consistent (preferably update the help string to "default: 8" if you want
behavior unchanged).
| const name = typeof s.name === "string" ? s.name.trim() : ""; | ||
| const description = typeof s.description === "string" ? s.description.trim() : ""; | ||
| const body = typeof s.body === "string" ? s.body.trim() : ""; | ||
| const trigger = typeof s.trigger === "string" ? s.trigger.trim() : undefined; | ||
| if (!name || !body) continue; | ||
| out.push({ name, description, body, trigger }); |
There was a problem hiding this comment.
Enforce description as required in parsed skills.
Line 272 reads description, but Line 275 only requires name and body. Since overlap dedupe is description-based, allowing empty descriptions undermines duplicate detection and contradicts the declared required shape.
Proposed fix
- if (!name || !body) continue;
+ if (!name || !description || !body) continue;
out.push({ name, description, body, trigger });📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const name = typeof s.name === "string" ? s.name.trim() : ""; | |
| const description = typeof s.description === "string" ? s.description.trim() : ""; | |
| const body = typeof s.body === "string" ? s.body.trim() : ""; | |
| const trigger = typeof s.trigger === "string" ? s.trigger.trim() : undefined; | |
| if (!name || !body) continue; | |
| out.push({ name, description, body, trigger }); | |
| const name = typeof s.name === "string" ? s.name.trim() : ""; | |
| const description = typeof s.description === "string" ? s.description.trim() : ""; | |
| const body = typeof s.body === "string" ? s.body.trim() : ""; | |
| const trigger = typeof s.trigger === "string" ? s.trigger.trim() : undefined; | |
| if (!name || !description || !body) continue; | |
| out.push({ name, description, body, trigger }); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/commands/mine-local.ts` around lines 271 - 276, The parser currently
allows skills with empty description which breaks description-based dedupe;
update the validation in the skill-parsing block (where name, description, body,
trigger are derived and pushed to out via out.push) to require a non-empty
description as well—i.e., after computing description = typeof s.description ===
"string" ? s.description.trim() : "" change the guard from if (!name || !body)
continue; to also check description (if (!name || !description || !body)
continue;) so only skills with name, description, and body are added.
| const host = detectHostAgent(); | ||
| const fallback = installs[0].agent; | ||
| const gateAgent = gateAgentFor(host, fallback); | ||
| const gateBin = findAgentBin(gateAgent); | ||
| console.log(`Gate CLI: ${gateAgent} (${gateBin})${host ? " — host-agent detected" : ""}`); | ||
|
|
There was a problem hiding this comment.
Fail fast when selected gate agent is not supported.
At Line 420, gateAgent can resolve to non-claude_code, but runGateViaStdin rejects those agents (Line 92), causing all sessions to fail and the command to end as a no-op success path. This should exit immediately with a clear error before starting mining.
Proposed fix
const host = detectHostAgent();
const fallback = installs[0].agent;
const gateAgent = gateAgentFor(host, fallback);
const gateBin = findAgentBin(gateAgent);
+ if (gateAgent !== "claude_code") {
+ console.error(
+ `mine-local currently requires claude_code as the gate agent (resolved: ${gateAgent}). ` +
+ `Install/use Claude Code or add non-claude stdin gate support.`,
+ );
+ process.exit(1);
+ }
console.log(`Gate CLI: ${gateAgent} (${gateBin})${host ? " — host-agent detected" : ""}`);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const host = detectHostAgent(); | |
| const fallback = installs[0].agent; | |
| const gateAgent = gateAgentFor(host, fallback); | |
| const gateBin = findAgentBin(gateAgent); | |
| console.log(`Gate CLI: ${gateAgent} (${gateBin})${host ? " — host-agent detected" : ""}`); | |
| const host = detectHostAgent(); | |
| const fallback = installs[0].agent; | |
| const gateAgent = gateAgentFor(host, fallback); | |
| const gateBin = findAgentBin(gateAgent); | |
| if (gateAgent !== "claude_code") { | |
| console.error( | |
| `mine-local currently requires claude_code as the gate agent (resolved: ${gateAgent}). ` + | |
| `Install/use Claude Code or add non-claude stdin gate support.`, | |
| ); | |
| process.exit(1); | |
| } | |
| console.log(`Gate CLI: ${gateAgent} (${gateBin})${host ? " — host-agent detected" : ""}`); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/commands/mine-local.ts` around lines 418 - 423, After resolving gateAgent
via detectHostAgent() and gateAgentFor(...) (and computing gateBin with
findAgentBin), add a fail-fast check that validates the selected gateAgent is
supported by runGateViaStdin; if gateAgent !== "claude_code" log a clear error
(include gateAgent and gateBin for context) and terminate (e.g.,
process.exit(1>0) or throw) before proceeding with mining. This ensures
runGateViaStdin (and the surrounding flow in mine-local.ts) won't continue with
an unsupported agent.
| console.log(" hivemind skillify status show per-project state"); | ||
| console.log(" hivemind skillify mine-local [opts] one-shot: seed skills from local sessions (no auth needed)"); | ||
| console.log(" Options for mine-local:"); | ||
| console.log(" --n <num|all> how many sessions to mine (default: 3)"); |
There was a problem hiding this comment.
CLI help default for --n is incorrect.
Line 196 says (default: 3), but runMineLocal uses DEFAULT_N = 8. Please align the help text to the actual runtime default.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/commands/skillify.ts` at line 196, Update the CLI help text to match the
actual runtime default: change the console.log message that prints the --n help
(currently showing "(default: 3)") so it reflects DEFAULT_N used by runMineLocal
(which is 8). Locate the help string emitted in the CLI (the line that prints "
--n <num|all> how many sessions to mine (default: 3)") and update
the default value text to "(default: 8)" or, better, interpolate the DEFAULT_N
constant used by runMineLocal to keep help in sync with the code.
| const piArrayMatch = PI_SOURCE.match( | ||
| /const PI_SKILLIFY_COMMANDS[^]*?\];/, | ||
| ); |
There was a problem hiding this comment.
Guard regex extraction before dereferencing in later tests.
Line 34/46/49 uses piArrayMatch!; if extraction fails, subsequent tests can throw a TypeError instead of producing the explicit drift failure message.
Proposed fix
const piArrayMatch = PI_SOURCE.match(
/const PI_SKILLIFY_COMMANDS[^]*?\];/,
);
+const piBlock = piArrayMatch?.[0] ?? "";
describe("pi skillify spec drift", () => {
it("pi mirror block is present", () => {
expect(piArrayMatch, "PI_SKILLIFY_COMMANDS array literal not found in pi/extension-source/hivemind.ts").toBeTruthy();
});
it("pi mirror has the same number of entries as the canonical spec", () => {
- const piBlock = piArrayMatch![0];
+ expect(piArrayMatch, "PI_SKILLIFY_COMMANDS array literal not found in pi/extension-source/hivemind.ts").toBeTruthy();
const piEntryCount = (piBlock.match(/cmd:\s*"/g) ?? []).length;
expect(
piEntryCount,
`pi has ${piEntryCount} entries but src/cli/skillify-spec.ts has ${SKILLIFY_COMMANDS.length}; sync them`,
).toBe(SKILLIFY_COMMANDS.length);
});
for (const c of SKILLIFY_COMMANDS) {
it(`pi mirror contains command "${c.cmd}"`, () => {
- expect(piArrayMatch![0]).toContain(c.cmd);
+ expect(piBlock).toContain(c.cmd);
});
it(`pi mirror contains description for "${c.cmd}"`, () => {
- expect(piArrayMatch![0]).toContain(c.desc);
+ expect(piBlock).toContain(c.desc);
});
}
});Also applies to: 34-50
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/pi/skillify-spec-drift.test.ts` around lines 24 - 26, The regex
extraction of PI_SKILLIFY_COMMANDS into piArrayMatch (via PI_SOURCE.match(...))
must be guarded before any dereference; update the tests that use piArrayMatch!
(references: piArrayMatch, PI_SOURCE.match, PI_SKILLIFY_COMMANDS) to first
assert/piArrayMatch truthiness (e.g., expect(piArrayMatch).toBeTruthy(...) or if
(!piArrayMatch) throw new Error(...)) and provide a clear drift-failure message
so later lines that access piArrayMatch[0] or similar never cause a TypeError
but instead produce the explicit test failure.
…-auth users
The wow-effect flow:
1. User installs hivemind, opens a Claude Code (or codex / cursor /
hermes / pi) session for the first time. They are NOT signed in.
2. SessionStart hook detects: no credentials + no local-mined.json
manifest + ~/.claude/projects/ has at least one .jsonl + `hivemind`
binary is on PATH. All four guards green → spawn
`hivemind skillify mine-local` detached in the background.
3. THIS session continues normally and sees the standard "not logged
in to Deeplake" message — no waiting, no blocking.
4. The background worker (typical wall-clock 60-120 s) mines up to 8
sessions in parallel, writes SKILL.md files to ~/.claude/skills/
with fan-out symlinks to every detected agent skill root, and
records each in ~/.claude/hivemind/local-mined.json.
5. NEXT SessionStart fires (could be the same agent or a different
one — symlinks make the skills visible everywhere). The hook reads
the manifest count and surfaces:
"N local skill(s) from past 'hivemind skillify mine-local'
run(s) live in ~/.claude/skills/. Run 'hivemind login' to
start sharing new mining results with your team."
User opens session N+1 → sees concrete value the system already
produced for them → motivation to sign in to share.
Implementation:
- `src/skillify/spawn-mine-local-worker.ts` — maybeAutoMineLocal()
helper invoked from every SessionStart hook in the no-creds branch.
Guards: manifest-exists, lock-exists, no-claude-sessions, no-hivemind-bin.
Stale-lock recovery: a lock older than 15 min is overridden (a prior
worker presumably crashed without releasing it). Output goes to
~/.claude/hooks/mine-local.log so failures are inspectable.
- `src/skillify/local-manifest.ts` — exports LOCAL_MINE_LOCK_PATH so
both the spawner (creates the lock) and the orchestrator (releases
it on exit) agree without a circular import.
- `src/commands/mine-local.ts` — wraps runMineLocal in a `process.on('exit')`
handler that unlinks the lock. process.exit() skips finally inside
an async function but does fire 'exit' handlers, so this is the
only correct cleanup path for the existing process.exit(1) error
paths.
- 4 hook session-start.ts files (claude_code, codex, cursor, hermes)
call maybeAutoMineLocal() in the no-creds branch and log the result.
- pi/extension-source/hivemind.ts inlines the equivalent piMaybeAutoMineLocal
for the same reason the other pi mirrors exist (extension can't
import from src/). Wired into the existing on('session_start')
handler's else branch.
E2E verified in a sandboxed HOME tmpdir: hook fires → lock file
created within ms → detached worker logs to mine-local.log → on
exit, lock file removed.
Summary
hivemind skillify mine-localsubcommand: a one-shot, auth-free flow that mines reusable skills from the user's local Claude Code / Codex session transcripts on disk. Works before signing in, so a fresh install gives an immediate "this is useful" moment.~/.agents/skills/,~/.hermes/skills/,~/.pi/agent/skills/).hivemind skillify ...command list (src/cli/skillify-spec.ts) consumed by every per-agent SessionStart injection.mine-local, every agent's SessionStart now surfaces the count + sign-in CTA.Why
A fresh hivemind install currently shows the user an empty wall: no skills, no demo of what mining produces.
mine-localreads their existing local agent sessions (no Deeplake auth required), runs the same LLM gate the production worker uses (parallel-per-session, ε-greedy session pick, in-flight session filter, last-assistant-message preprocessing), and writes the resultingSKILL.mds to~/.claude/skills/with symlink fan-out so codex/hermes/pi see them too.On the next session, if the user still hasn't signed in, the SessionStart injection now includes:
— closing the loop on the "first-impression" bootstrap.
Architecture
src/skillify/local-source.ts— agent + session-file detection on disk; ε-greedy pickSessions (3-phase: cwd-quota → global-quota → top-up, dedup-by-path); native Claude Code JSONL → SessionRow conversion mirroring the production capture hook'slast_assistant_messagesemantics (drops tool noise + intermediate narration).src/commands/mine-local.ts— orchestrator. Detects host agent, picks N sessions, runs gate calls in parallel via a smallparallelMap(concurrency 4) + a localrunGateViaStdinrunner (the sharedrunGateuses argv and hits Linux's MAX_ARG_STRLEN at ~128 KB; stdin has no cap). Per-session prompts ask for up to 3 skills as a JSON array. Across sessions, overlap is detected via Jaccard on stopword-filtered description tokens (threshold 0.4) — no aggregation, no name-collision logic; each candidate writes independently unless its summary overlaps something already on disk.src/skillify/local-manifest.ts— shared manifest at~/.claude/hivemind/local-mined.json. Triple duty: one-shot sentinel (re-runs require--force), provenance index for a futurepush-local, and read-only count surface for SessionStart.src/cli/skillify-spec.ts— single source of truth for the command list injected into every agent's SessionStart block. Four hook-based agents import it directly; pi keeps an inline mirror (it can't import fromsrc/) guarded by a drift-detection test.detectAgentSkillsRoots(src/skillify/agent-roots.ts) andfanOutSymlinks(src/skillify/pull.ts) — battle-tested by theskillify pullflow.CLI surface
Commits
(Plus one merge from origin/main.)
Test plan
pickSessions(3 degenerate cases + dedup + ordering),nativeJsonlToRows(last-assistant semantics, tool-result drops, malformed lines),findOverlap(Jaccard threshold + stopword filter + best-match),parseMultiVerdict(valid/empty/missing-fields/malformed/code-fenced),countLocalManifestEntries(missing/empty/populated/malformed/non-array), pi spec drift (every spec entry mirrored inPI_SKILLIFY_COMMANDS).hivemindbinary: mined 12 skills from 8 sessions in the first verified run; subsequent run mined 5 more skills with proper overlap-skipping; symlink fan-out confirmed at~/.agents/skills/<name>,~/.hermes/skills/<name>,~/.pi/agent/skills/<name>(each pointing at the canonical~/.claude/skills/<name>dir).grep -c "skillify mine-local"returns 1 in every per-agent session-start bundle + pi inline source.Follow-ups (not in this PR)
hivemind skillify push-local: uploaduploaded:falsemanifest rows to the orgskillstable after the user signs in.hivemind_mine_localas an MCP-style tool, since openclaw doesn't use thehivemind skillifyCLI surface.Summary by CodeRabbit
Release Notes
New Features
hivemind skillify mine-localcommand to extract and catalog skills from locally installed agent sessions without requiring Deeplake authentication.Tests