feat(gsd2): add GSD 2 agent adapter and TypeScript extension#149
feat(gsd2): add GSD 2 agent adapter and TypeScript extension#149hashedone wants to merge 7 commits into
Conversation
Replace hardcoded Claude Code transcript parsing with an extensible AgentAdapter trait. Each agent gets its own adapter for event mapping, file change extraction, transcript parsing, and token/model extraction. - AgentAdapter trait with ClaudeCode, Codex, and Default adapters - Codex transcript parsing: response_item, custom_tool_call, event_msg, apply_patch file changes from transcript chunks - CLI: protocol v2, repeatable --agent flag for init/stream - tracevault init --agent codex installs .codex/hooks.json - AgentBadge component with per-agent icon on session list/detail - Server uses AgentAdapterRegistry on AppState - Removes old hardcoded extract_file_change/is_file_modifying_tool Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…hooks and CLI flag Claude Code was rewritten in #95 instead of being ported from session_detail.rs::parse_record on main, which introduced display regressions: Bash/Glob toolUseResult lost their tool_name (wrong nested-key lookup), tool_result blocks lost their text body (read `text` instead of `content`), and assistant text formatting lost the \n\n separator and `[thinking] ` prefix. The parser is now a faithful port — same fields, same fallbacks, same format strings. Token extraction now mirrors main: presence of `usage` gates the whole RecordUsage and individual missing fields default to 0, instead of aborting on missing input/output_tokens. Codex adapter: - SessionStart matcher widened from "startup|resume" to "" so the hook also fires on /clear (verified against openai/codex sources). - The user-message system-prompt filter no longer drops every message starting with `<`. It now matches only the seven known Codex injection tags from codex protocol.rs (user_instructions, environment_context, apps_instructions, skills_instructions, plugins_instructions, collaboration_mode, realtime_conversation), preserving legitimate <div>/<svg>/<T>-style user questions. - File changes extracted from transcript chunks now use the chunk's own RFC 3339 timestamp (with fallback to the hook delivery time) rather than stamping every batched patch with the hook arrival time. CLI: - `tracevault init --agent <name>` is now additive: Claude Code hooks are always installed, additional --agent values are appended and deduplicated (with `claude` aliased to `claude-code`). Previously --agent codex replaced rather than augmented the default, so users following the README ended up without Claude hooks. - The success print now reflects which agents were actually installed instead of unconditionally claiming "Claude Code hooks installed". - README CLI table reworded to match the additive behavior. Cleanup: deduplicated adapter.is_file_modifying call in service/stream.rs (the result is already in `store_response`). Tests: 16 new adapter tests cover the regressed Claude Code parser paths (Bash/Glob/tool_result/thinking/system unknown subtype/progress edge cases) plus Codex token_usage edge cases and the Codex system-prompt whitelist. 5 new init tests cover the additive --agent behavior, dedup of `claude`/`claude-code` aliases, and the Codex SessionStart match-all matcher. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…hic API
Pull adapter-specific knowledge out of `service/stream.rs`. Previously
the stream service hardcoded Codex chunk-shape lookups (`payload.name`,
`payload` cloning, RFC 3339 timestamp parsing) and used two extraction
methods with different return types (`Vec<ExtractedFileChange>` vs
`Vec<TranscriptFileChange>`), so the call site had to reach into chunk
internals to fill in tool_name / tool_input / timestamp.
The trait now exposes two symmetric methods returning the same
`FileChangeRecord` type:
fn file_changes_from_hook(&self, tool, input, ts) -> Vec<FileChangeRecord>
fn file_changes_from_transcript(&self, chunk, fallback_ts)
-> Vec<FileChangeRecord>
Each adapter overrides at most one. Defaults return empty. The
`FileChangeRecord` carries everything the persistence layer needs
(change, tool_name, tool_input, timestamp), so `stream.rs` just
iterates and inserts — no chunk shape knowledge anywhere outside the
adapter that owns that format.
Claude path is preserved bit-for-bit against main:
* `is_file_modifying` gate around the hook-extract loop is kept, so
Read/Glob/etc. skip the call entirely (matches main's
`if is_file_modifying_tool { ... }`).
* New `provides_transcript_file_changes()` capability flag (default
false) gates the per-line transcript-extract loop. Claude returns
false → the `file_changes_from_transcript` method is never invoked
for Claude transcript lines, exactly as on main where no equivalent
call existed.
* `file_changes_from_hook` for Claude wraps the same Write/Edit logic
that lived in `extract_file_change` on main; the resulting DB writes
have identical fields and timestamps (record.timestamp = req.timestamp).
CLI: replace the hardcoded `match agent.as_str() { "claude-code" => ...,
"codex" => ... }` in `main.rs` with `adapter.display_name()` and
`adapter.hooks_install_path()` from the trait, so adding a new agent
no longer requires touching the print-message code.
Codex: `file_changes_from_transcript` now resolves the chunk's RFC 3339
timestamp internally and returns it in each record, replacing the
duplicated timestamp logic that previously lived in `stream.rs`.
The `provides_transcript_file_changes` override is `true`.
Tests: 51 adapter tests (was 50), including a new fallback case
verifying that a chunk with no top-level timestamp falls back to the
hook delivery time. All hook/transcript extraction tests updated to
the new method names and return type.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ing --agent Move the "claude"/"claude-code" alias resolution and dedup off the CLI and onto the AgentAdapter::name() canonical id — the registry already maps both strings to the same adapter, so the manual match was redundant. Dedup now runs against the adapter's own id, not the user-provided string. Change semantics: --agent codex installs only Codex hooks. Claude Code is installed only when --agent is omitted entirely (default), instead of being appended unconditionally to every --agent invocation. .gitignore entries are derived from each installed adapter's hooks_install_path(), so a codex-only init no longer pins .claude/settings.json into the ignore list.
Multi-agent split caused subtle drift on the Claude code path. Restore parity with pre-multi-agent main: - wire_protocol_version() trait method (default v2); Claude overrides to v1 so request bytes match main - persists_model_without_usage() capability flag (default false); Codex sets it true. Server stream gate becomes has_tokens || (flag && model.is_some()), so Claude's update_tokens stays token-presence-only as in main - ClaudeCodeAdapter parser locks onto first tool_use block via seen_tool_use flag (matches main's arr.iter().find() semantics) - CLI stream uses adapter.wire_protocol_version() / adapter.name() for protocol_version + tool fields - init.rs installs hooks after .gitignore update (matches main order) Also: CLI init prints actually-installed gitignore entries instead of hardcoded paths, and a comment marks _event_type as unused (routing is via hook_event_name from stdin). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…E with --agent semantics
Codex Stop hook entry was missing the `matcher` field that all other
lifecycle hooks (SessionStart, PreToolUse, PostToolUse) already carry,
risking a silent no-op if Codex requires the field. README also still
described `--agent` as additive ("in addition to the Claude Code hooks")
even though the flag has been replacement-only since 6fad80f.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds first-class GSD 2 (pi/GSD-2) support on top of the PR#95 AgentAdapter abstraction. ### Rust adapter (tracevault-core) - New Gsd2Adapter registered under 'gsd2' and 'gsd-2' - File changes from transcript: write → SHA256 hash + diff, edit → old/new diff, errors skipped - Token usage from agent_end chunks (input/output/cacheRead/cacheWrite) - Model from agent_end and session_start chunks - install_hooks() is a no-op — GSD2 uses in-process extension - 16 new tests covering all code paths ### TypeScript extension (integrations/gsd2-extension/) - Hooks into session_start, tool_execution_end, agent_end, stop - POSTs directly to TraceVault HTTP API (no shell subprocess) - Config from .tracevault/config.toml or ~/.config/tracevault/config.toml - Silent no-op when config is absent - is_error natively available — must_succeed policies work correctly ### Key difference vs CC/Codex GSD2 is an in-process TypeScript extension, not a shell hook. This means: - No transcript file to parse; events arrive via the extension event system - is_error is a first-class bool on every tool result (not inferred from transcript) - Token usage comes from AssistantMessage.usage in agent_end, not transcript chunks
| cfg = loadConfig(ctx.cwd); | ||
| if (!cfg) return; // Not a TraceVault project — silent no-op | ||
|
|
||
| const state = getState(ctx.sessionManager.sessionId ?? crypto.randomUUID()); |
There was a problem hiding this comment.
Fallback session_id is inconsistent: session_start uses crypto.randomUUID() but other handlers use "unknown", causing the same run’s events to be attributed to different sessions.
| const state = getState(ctx.sessionManager.sessionId ?? crypto.randomUUID()); | |
| const state = getState(ctx.sessionManager.sessionId ?? "unknown"); |
Details
✨ AI Reasoning
The code is trying to keep all events in a single per-session stream. However, when the runtime session identifier is absent, the first handler creates state with a random ID, but subsequent handlers use a different fallback string. Those branches cannot refer to the same session state in that scenario, so events are split across different IDs and cleanup targets the wrong map key.
Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info
| join(projectRoot, ".tracevault", "config.toml"), | ||
| join(homedir(), ".config", "tracevault", "config.toml"), | ||
| ]; | ||
|
|
||
| for (const path of candidates) { | ||
| if (!existsSync(path)) continue; | ||
| try { | ||
| const raw = readFileSync(path, "utf8"); |
There was a problem hiding this comment.
Potential file inclusion attack via reading file - high severity
If an attacker can control the input leading into the ReadFile function, they might be able to read sensitive files and launch further attacks with that information.
Show fix
| join(projectRoot, ".tracevault", "config.toml"), | |
| join(homedir(), ".config", "tracevault", "config.toml"), | |
| ]; | |
| for (const path of candidates) { | |
| if (!existsSync(path)) continue; | |
| try { | |
| const raw = readFileSync(path, "utf8"); | |
| join(resolve(projectRoot), ".tracevault", "config.toml"), | |
| join(homedir(), ".config", "tracevault", "config.toml"), | |
| ]; | |
| for (const candidatePath of candidates) { | |
| const resolvedBase = resolve(projectRoot); | |
| const resolvedTarget = resolve(candidatePath); | |
| const rel = relative(resolvedBase, resolvedTarget); | |
| if (rel.startsWith('..') || isAbsolute(rel)) { | |
| continue; | |
| } | |
| if (!existsSync(resolvedTarget)) continue; | |
| try { | |
| const raw = readFileSync(resolvedTarget, "utf8"); |
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info
| } catch { | ||
| // ignore malformed config |
There was a problem hiding this comment.
Catch block silently ignores malformed config in loadConfig; log or record the parse error (e.g., console.debug or console.error) so malformed config isn't silently swallowed.
Show fix
| } catch { | |
| // ignore malformed config | |
| } catch (err) { | |
| // ignore malformed config and continue searching | |
| console.error(`[tracevault] Failed to parse config at ${path}:`, err); |
Details
✨ AI Reasoning
A try/catch was added around config parsing. The catch is a bare catch (no exception variable or type) and its body contains only a comment that says the error is ignored. This swallows parse errors silently. Silent catches on I/O or parsing operations make debugging harder and can hide configuration issues. The try block reads and parses files (state-changing I/O), so failing to report parse errors reduces observability. Adding minimal error handling (logging or structured handling) would preserve the intended silent-no-op behavior while leaving an audit trail.
Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info
| const sessions = new Map<string, SessionState>(); | ||
|
|
||
| function getState(gsdSessionId: string): SessionState { | ||
| let s = sessions.get(gsdSessionId); | ||
| if (!s) { |
There was a problem hiding this comment.
Module-level 'sessions' Map stores per-session state and can persist across requests, risking cross-session data leakage. Consider making session state request-scoped or using an explicit, documented shared cache abstraction.
Show fix
| const sessions = new Map<string, SessionState>(); | |
| function getState(gsdSessionId: string): SessionState { | |
| let s = sessions.get(gsdSessionId); | |
| if (!s) { | |
| const sessions = new Map<string, SessionState>(); | |
| const MAX_CACHE_SIZE = 128; | |
| function getState(gsdSessionId: string): SessionState { | |
| let s = sessions.get(gsdSessionId); | |
| if (!s) { | |
| if (sessions.size >= MAX_CACHE_SIZE) { | |
| sessions.delete(sessions.keys().next().value); | |
| } |
Details
✨ AI Reasoning
The extension now defines a module-level 'sessions' Map that holds SessionState objects keyed by GSD session IDs. This Map is mutated by getState and used across event handlers, making per-session data live in module/global scope. In long-running Node.js processes, module-level mutable state persists across requests and can unintentionally leak or mix data between different sessions or users. The pattern here stores request/session-specific information in a shared global that was introduced by this change, which meets the criteria for unintended global caching and therefore is worth flagging.
Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info
bb61ee2 to
d858db4
Compare
Summary
Builds on top of #95 (AgentAdapter abstraction). Adds first-class GSD 2 (pi/GSD-2) support.
Closes #148 (partial — covers GSD2; Codex already covered by #95)
What's included
Rust adapter (
tracevault-core)Gsd2Adapterregistered undergsd2andgsd-2inAgentAdapterRegistrytool_execution_endtranscript chunks (write→ SHA256 + diff,edit→ old/new diff, errors skipped)agent_endchunks using GSD2's{ input, output, cacheRead, cacheWrite }shapeagent_endandsession_startchunksinstall_hooks()is a no-op — GSD2 is in-process, no shell hooks neededTypeScript extension (
integrations/gsd2-extension/)session_start,tool_execution_end,agent_end,stopfetch.tracevault/config.toml(project) or~/.config/tracevault/config.toml(user-wide)Key difference from CC and Codex
GSD 2 is an in-process TypeScript extension, not a shell hook. This means:
is_erroris a first-classboolon everytool_execution_endevent, somust_succeedpolicies (from feat(policies): add must_succeed flag to tool call policies #147) work correctly without any transcript scanningAssistantMessage.usageinagent_end, not transcript JSONL chunksNotes
tool_is_error/tool_use_idfields to add toStreamEventRequest)