Skip to content

feat(gsd2): add GSD 2 agent adapter and TypeScript extension#149

Open
hashedone wants to merge 7 commits into
feat/agent-adapter-codexfrom
feat/gsd2-adapter
Open

feat(gsd2): add GSD 2 agent adapter and TypeScript extension#149
hashedone wants to merge 7 commits into
feat/agent-adapter-codexfrom
feat/gsd2-adapter

Conversation

@hashedone
Copy link
Copy Markdown
Contributor

Summary

Builds on top of #95 (AgentAdapter abstraction). Adds first-class GSD 2 (pi/GSD-2) support.

Closes #148 (partial — covers GSD2; Codex already covered by #95)

What's included

Rust adapter (tracevault-core)

  • Gsd2Adapter registered under gsd2 and gsd-2 in AgentAdapterRegistry
  • File changes extracted from tool_execution_end transcript chunks (write → SHA256 + diff, edit → old/new diff, errors skipped)
  • Token usage from agent_end chunks using GSD2's { input, output, cacheRead, cacheWrite } shape
  • Model from agent_end and session_start chunks
  • install_hooks() is a no-op — GSD2 is in-process, no shell hooks needed
  • 16 new adapter tests

TypeScript extension (integrations/gsd2-extension/)

  • In-process extension (not a shell hook process)
  • Hooks: session_start, tool_execution_end, agent_end, stop
  • POSTs directly to TraceVault HTTP API via fetch
  • Config from .tracevault/config.toml (project) or ~/.config/tracevault/config.toml (user-wide)
  • Silent no-op if config absent — won't interfere with normal GSD usage

Key difference from CC and Codex

GSD 2 is an in-process TypeScript extension, not a shell hook. This means:

  • No transcript file to parse — events arrive via the GSD extension event system
  • is_error is a first-class bool on every tool_execution_end event, so must_succeed policies (from feat(policies): add must_succeed flag to tool call policies #147) work correctly without any transcript scanning
  • Token usage comes from AssistantMessage.usage in agent_end, not transcript JSONL chunks

Notes

Michał Grabowski and others added 7 commits April 29, 2026 09:41
Replace hardcoded Claude Code transcript parsing with an extensible
AgentAdapter trait. Each agent gets its own adapter for event mapping,
file change extraction, transcript parsing, and token/model extraction.

- AgentAdapter trait with ClaudeCode, Codex, and Default adapters
- Codex transcript parsing: response_item, custom_tool_call, event_msg,
  apply_patch file changes from transcript chunks
- CLI: protocol v2, repeatable --agent flag for init/stream
- tracevault init --agent codex installs .codex/hooks.json
- AgentBadge component with per-agent icon on session list/detail
- Server uses AgentAdapterRegistry on AppState
- Removes old hardcoded extract_file_change/is_file_modifying_tool

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…hooks and CLI flag

Claude Code was rewritten in #95 instead of being ported from
session_detail.rs::parse_record on main, which introduced display
regressions: Bash/Glob toolUseResult lost their tool_name (wrong
nested-key lookup), tool_result blocks lost their text body (read
`text` instead of `content`), and assistant text formatting lost the
\n\n separator and `[thinking] ` prefix. The parser is now a faithful
port — same fields, same fallbacks, same format strings.

Token extraction now mirrors main: presence of `usage` gates the whole
RecordUsage and individual missing fields default to 0, instead of
aborting on missing input/output_tokens.

Codex adapter:
- SessionStart matcher widened from "startup|resume" to "" so the hook
  also fires on /clear (verified against openai/codex sources).
- The user-message system-prompt filter no longer drops every message
  starting with `<`. It now matches only the seven known Codex
  injection tags from codex protocol.rs (user_instructions,
  environment_context, apps_instructions, skills_instructions,
  plugins_instructions, collaboration_mode, realtime_conversation),
  preserving legitimate <div>/<svg>/<T>-style user questions.
- File changes extracted from transcript chunks now use the chunk's
  own RFC 3339 timestamp (with fallback to the hook delivery time)
  rather than stamping every batched patch with the hook arrival time.

CLI:
- `tracevault init --agent <name>` is now additive: Claude Code hooks
  are always installed, additional --agent values are appended and
  deduplicated (with `claude` aliased to `claude-code`). Previously
  --agent codex replaced rather than augmented the default, so users
  following the README ended up without Claude hooks.
- The success print now reflects which agents were actually installed
  instead of unconditionally claiming "Claude Code hooks installed".
- README CLI table reworded to match the additive behavior.

Cleanup: deduplicated adapter.is_file_modifying call in
service/stream.rs (the result is already in `store_response`).

Tests: 16 new adapter tests cover the regressed Claude Code parser
paths (Bash/Glob/tool_result/thinking/system unknown subtype/progress
edge cases) plus Codex token_usage edge cases and the Codex
system-prompt whitelist. 5 new init tests cover the additive --agent
behavior, dedup of `claude`/`claude-code` aliases, and the Codex
SessionStart match-all matcher.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…hic API

Pull adapter-specific knowledge out of `service/stream.rs`. Previously
the stream service hardcoded Codex chunk-shape lookups (`payload.name`,
`payload` cloning, RFC 3339 timestamp parsing) and used two extraction
methods with different return types (`Vec<ExtractedFileChange>` vs
`Vec<TranscriptFileChange>`), so the call site had to reach into chunk
internals to fill in tool_name / tool_input / timestamp.

The trait now exposes two symmetric methods returning the same
`FileChangeRecord` type:

  fn file_changes_from_hook(&self, tool, input, ts) -> Vec<FileChangeRecord>
  fn file_changes_from_transcript(&self, chunk, fallback_ts)
      -> Vec<FileChangeRecord>

Each adapter overrides at most one. Defaults return empty. The
`FileChangeRecord` carries everything the persistence layer needs
(change, tool_name, tool_input, timestamp), so `stream.rs` just
iterates and inserts — no chunk shape knowledge anywhere outside the
adapter that owns that format.

Claude path is preserved bit-for-bit against main:

* `is_file_modifying` gate around the hook-extract loop is kept, so
  Read/Glob/etc. skip the call entirely (matches main's
  `if is_file_modifying_tool { ... }`).
* New `provides_transcript_file_changes()` capability flag (default
  false) gates the per-line transcript-extract loop. Claude returns
  false → the `file_changes_from_transcript` method is never invoked
  for Claude transcript lines, exactly as on main where no equivalent
  call existed.
* `file_changes_from_hook` for Claude wraps the same Write/Edit logic
  that lived in `extract_file_change` on main; the resulting DB writes
  have identical fields and timestamps (record.timestamp = req.timestamp).

CLI: replace the hardcoded `match agent.as_str() { "claude-code" => ...,
"codex" => ... }` in `main.rs` with `adapter.display_name()` and
`adapter.hooks_install_path()` from the trait, so adding a new agent
no longer requires touching the print-message code.

Codex: `file_changes_from_transcript` now resolves the chunk's RFC 3339
timestamp internally and returns it in each record, replacing the
duplicated timestamp logic that previously lived in `stream.rs`.
The `provides_transcript_file_changes` override is `true`.

Tests: 51 adapter tests (was 50), including a new fallback case
verifying that a chunk with no top-level timestamp falls back to the
hook delivery time. All hook/transcript extraction tests updated to
the new method names and return type.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ing --agent

Move the "claude"/"claude-code" alias resolution and dedup off the CLI and
onto the AgentAdapter::name() canonical id — the registry already maps both
strings to the same adapter, so the manual match was redundant. Dedup now
runs against the adapter's own id, not the user-provided string.

Change semantics: --agent codex installs only Codex hooks. Claude Code is
installed only when --agent is omitted entirely (default), instead of being
appended unconditionally to every --agent invocation. .gitignore entries are
derived from each installed adapter's hooks_install_path(), so a codex-only
init no longer pins .claude/settings.json into the ignore list.
Multi-agent split caused subtle drift on the Claude code path. Restore
parity with pre-multi-agent main:

- wire_protocol_version() trait method (default v2); Claude overrides
  to v1 so request bytes match main
- persists_model_without_usage() capability flag (default false); Codex
  sets it true. Server stream gate becomes
  has_tokens || (flag && model.is_some()), so Claude's update_tokens
  stays token-presence-only as in main
- ClaudeCodeAdapter parser locks onto first tool_use block via
  seen_tool_use flag (matches main's arr.iter().find() semantics)
- CLI stream uses adapter.wire_protocol_version() / adapter.name() for
  protocol_version + tool fields
- init.rs installs hooks after .gitignore update (matches main order)

Also: CLI init prints actually-installed gitignore entries instead of
hardcoded paths, and a comment marks _event_type as unused (routing is
via hook_event_name from stdin).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…E with --agent semantics

Codex Stop hook entry was missing the `matcher` field that all other
lifecycle hooks (SessionStart, PreToolUse, PostToolUse) already carry,
risking a silent no-op if Codex requires the field. README also still
described `--agent` as additive ("in addition to the Claude Code hooks")
even though the flag has been replacement-only since 6fad80f.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds first-class GSD 2 (pi/GSD-2) support on top of the PR#95
AgentAdapter abstraction.

### Rust adapter (tracevault-core)
- New Gsd2Adapter registered under 'gsd2' and 'gsd-2'
- File changes from transcript: write → SHA256 hash + diff,
  edit → old/new diff, errors skipped
- Token usage from agent_end chunks (input/output/cacheRead/cacheWrite)
- Model from agent_end and session_start chunks
- install_hooks() is a no-op — GSD2 uses in-process extension
- 16 new tests covering all code paths

### TypeScript extension (integrations/gsd2-extension/)
- Hooks into session_start, tool_execution_end, agent_end, stop
- POSTs directly to TraceVault HTTP API (no shell subprocess)
- Config from .tracevault/config.toml or ~/.config/tracevault/config.toml
- Silent no-op when config is absent
- is_error natively available — must_succeed policies work correctly

### Key difference vs CC/Codex
GSD2 is an in-process TypeScript extension, not a shell hook. This means:
- No transcript file to parse; events arrive via the extension event system
- is_error is a first-class bool on every tool result (not inferred from transcript)
- Token usage comes from AssistantMessage.usage in agent_end, not transcript chunks
cfg = loadConfig(ctx.cwd);
if (!cfg) return; // Not a TraceVault project — silent no-op

const state = getState(ctx.sessionManager.sessionId ?? crypto.randomUUID());
Copy link
Copy Markdown

@aikido-pr-checks aikido-pr-checks Bot May 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fallback session_id is inconsistent: session_start uses crypto.randomUUID() but other handlers use "unknown", causing the same run’s events to be attributed to different sessions.

Suggested change
const state = getState(ctx.sessionManager.sessionId ?? crypto.randomUUID());
const state = getState(ctx.sessionManager.sessionId ?? "unknown");
Details

✨ AI Reasoning
​The code is trying to keep all events in a single per-session stream. However, when the runtime session identifier is absent, the first handler creates state with a random ID, but subsequent handlers use a different fallback string. Those branches cannot refer to the same session state in that scenario, so events are split across different IDs and cleanup targets the wrong map key.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

Comment on lines +37 to +44
join(projectRoot, ".tracevault", "config.toml"),
join(homedir(), ".config", "tracevault", "config.toml"),
];

for (const path of candidates) {
if (!existsSync(path)) continue;
try {
const raw = readFileSync(path, "utf8");
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential file inclusion attack via reading file - high severity
If an attacker can control the input leading into the ReadFile function, they might be able to read sensitive files and launch further attacks with that information.

Show fix
Suggested change
join(projectRoot, ".tracevault", "config.toml"),
join(homedir(), ".config", "tracevault", "config.toml"),
];
for (const path of candidates) {
if (!existsSync(path)) continue;
try {
const raw = readFileSync(path, "utf8");
join(resolve(projectRoot), ".tracevault", "config.toml"),
join(homedir(), ".config", "tracevault", "config.toml"),
];
for (const candidatePath of candidates) {
const resolvedBase = resolve(projectRoot);
const resolvedTarget = resolve(candidatePath);
const rel = relative(resolvedBase, resolvedTarget);
if (rel.startsWith('..') || isAbsolute(rel)) {
continue;
}
if (!existsSync(resolvedTarget)) continue;
try {
const raw = readFileSync(resolvedTarget, "utf8");

Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

Comment on lines +55 to +56
} catch {
// ignore malformed config
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Catch block silently ignores malformed config in loadConfig; log or record the parse error (e.g., console.debug or console.error) so malformed config isn't silently swallowed.

Show fix
Suggested change
} catch {
// ignore malformed config
} catch (err) {
// ignore malformed config and continue searching
console.error(`[tracevault] Failed to parse config at ${path}:`, err);
Details

✨ AI Reasoning
​A try/catch was added around config parsing. The catch is a bare catch (no exception variable or type) and its body contains only a comment that says the error is ignored. This swallows parse errors silently. Silent catches on I/O or parsing operations make debugging harder and can hide configuration issues. The try block reads and parses files (state-changing I/O), so failing to report parse errors reduces observability. Adding minimal error handling (logging or structured handling) would preserve the intended silent-no-op behavior while leaving an audit trail.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

Comment on lines +121 to +125
const sessions = new Map<string, SessionState>();

function getState(gsdSessionId: string): SessionState {
let s = sessions.get(gsdSessionId);
if (!s) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Module-level 'sessions' Map stores per-session state and can persist across requests, risking cross-session data leakage. Consider making session state request-scoped or using an explicit, documented shared cache abstraction.

Show fix
Suggested change
const sessions = new Map<string, SessionState>();
function getState(gsdSessionId: string): SessionState {
let s = sessions.get(gsdSessionId);
if (!s) {
const sessions = new Map<string, SessionState>();
const MAX_CACHE_SIZE = 128;
function getState(gsdSessionId: string): SessionState {
let s = sessions.get(gsdSessionId);
if (!s) {
if (sessions.size >= MAX_CACHE_SIZE) {
sessions.delete(sessions.keys().next().value);
}
Details

✨ AI Reasoning
​The extension now defines a module-level 'sessions' Map that holds SessionState objects keyed by GSD session IDs. This Map is mutated by getState and used across event handlers, making per-session data live in module/global scope. In long-running Node.js processes, module-level mutable state persists across requests and can unintentionally leak or mix data between different sessions or users. The pattern here stores request/session-specific information in a shared global that was introduced by this change, which meets the criteria for unintended global caching and therefore is worth flagging.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

@ayeo ayeo force-pushed the feat/agent-adapter-codex branch from bb61ee2 to d858db4 Compare May 29, 2026 14:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant