Skip to content

fix(agent): guard large tool outputs#1333

Merged
zerob13 merged 2 commits intodevfrom
bugfix/large-tool-call-2
Mar 7, 2026
Merged

fix(agent): guard large tool outputs#1333
zerob13 merged 2 commits intodevfrom
bugfix/large-tool-call-2

Conversation

@zerob13
Copy link
Collaborator

@zerob13 zerob13 commented Mar 7, 2026

Summary by CodeRabbit

  • New Features

    • Tool outputs are now guarded and can be offloaded to files when large; guarded content and context budgeting control how tool results are presented.
  • Bug Fixes

    • Better handling and normalization of context-window errors and terminal tool errors.
    • Offload failures and oversized outputs now surface clear error messages and prevent broken states.
  • Tests

    • Added tests covering offloading, budgeting/resume behavior, and terminal-error scenarios.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 7, 2026

📝 Walkthrough

Walkthrough

Adds a ToolOutputGuard for offloading and context-budgeting of tool outputs, integrates it into tool execution and presenter flows, surfaces terminal errors from oversized tool results, and updates tests to cover offload, budgeting, and terminal-error behaviors.

Changes

Cohort / File(s) Summary
ToolOutputGuard Implementation
src/main/presenter/deepchatAgentPresenter/toolOutputGuard.ts
New class and types implementing offload to per-session files, context-budget validation, prepared output shaping, and guarded result semantics (ok
Dispatch: tool execution
src/main/presenter/deepchatAgentPresenter/dispatch.ts
executeTools signature extended to accept toolOutputGuard, contextLength, maxTokens; guards tool outputs, handles terminal errors, updates tool-call blocks and state, and returns terminalError when applicable.
Presenter wiring & resume logic
src/main/presenter/deepchatAgentPresenter/index.ts
Adds toolOutputGuard field, new methods loadToolDefinitionsForSession and fitResumeBudgetForToolCall, extends runStreamForMessage, resumeAssistantMessage, and deferred execution types to propagate tools, budgeted resumes, and terminal errors.
Processing & error handling
src/main/presenter/deepchatAgentPresenter/process.ts
Adds context-window error detection utilities, passes toolOutputGuard and budget params into executeTools, normalizes context-window failures, and treats terminal errors as immediate finalization.
Types
src/main/presenter/deepchatAgentPresenter/types.ts
Adds terminalError?: string to ProcessResult and requires toolOutputGuard: ToolOutputGuard in ProcessParams.
Tests: offload & budget scenarios
test/main/presenter/deepchatAgentPresenter/*.test.ts
Introduces temp home/path mocking, updates tests to create ToolOutputGuard, passes new budget params, and adds cases for offload success/failure, context-window errors, terminal-error propagation, and cleanup of offload files.

Sequence Diagram

sequenceDiagram
    participant Executor as Tool Executor
    participant Guard as ToolOutputGuard
    participant FS as File System
    participant Presenter as Presenter/Stream
    participant Provider as LLM Provider

    Executor->>Guard: prepareToolOutput(rawOutput, tool, contextLength)
    alt output too large / requires offload
        Guard->>FS: write offload file (session, tool)
        alt write succeeds
            Guard-->>Executor: {ok: offload_stub}
        else write fails
            Guard-->>Executor: {tool_error: "offload failed"}
        end
    else fits in-memory
        Guard-->>Executor: {ok: guarded_content}
    end

    Executor->>Guard: guardToolOutput(prepResult, contextLength, maxTokens)
    alt fits in conversation context
        Guard-->>Executor: {ok: tool_message_content}
        Executor->>Presenter: inject tool_message_content
        Presenter->>Provider: resumeWithToolResult(tool_message_content)
    else exceeds context
        Guard-->>Executor: {terminal_error: "context overflow"}
        Executor->>Presenter: surface terminal_error (finalize as error)
    end
Loading

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

🐰 I dug a burrow for output wide,

Offloaded bytes where secrets hide.
I counted tokens, kept the heap neat,
Swapped big blobs for a stub so sweet.
Now resumes hop on tidy feet. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: introducing guarding for large tool outputs to prevent context overflow and enable offloading.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch bugfix/large-tool-call-2

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/main/presenter/deepchatAgentPresenter/dispatch.ts`:
- Around line 577-589: The current code uses guardedResult.kind === 'tool_error'
to mark tool_call blocks, which loses the original tool failure if callTool()
returned rawData.isError === true but the guard returned ok/offloaded; compute a
preserved error flag (e.g., isToolError = guardedResult.kind === 'tool_error' ||
rawData?.isError === true) after callTool()/guarding and use that flag when
pushing the conversation entry and when calling updateToolCallBlock so the
UI/persisted block reflects the tool's original failure; update references in
this section (guardedResult, rawData, updateToolCallBlock, conversation push) to
use that combined flag.

In `@src/main/presenter/deepchatAgentPresenter/toolOutputGuard.ts`:
- Around line 15-25: The current PreparedToolOutputResult 'ok' variant drops the
offload file handle by only returning {content, offloaded}; change the 'ok'
variant to include the offload resource (e.g., offloadHandle or offloadPath and
a cleanup/close token) so callers can manage the file lifecycle, then propagate
that handle through the code paths that convert results to
tool_error/terminal_error (the locations around the earlier offload checks and
the downgrade in the later block) instead of discarding it; update any consumers
that construct or transform PreparedToolOutputResult so they close/delete the
offload when the final result is an error or when no longer needed.
- Around line 237-245: The buildOffloadStub function currently injects the raw
filesystem path (filePath) into tool output which leaks local paths; change it
to avoid returning the absolute path by using an opaque offload identifier or at
most the basename instead of the full filePath (e.g., accept or compute an
offloadId and/or use path.basename(filePath) inside buildOffloadStub) and ensure
the real filePath remains only in server-side metadata/storage (not in the
string returned by buildOffloadStub).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1093e40e-3ec9-488b-8270-b7429236f0a4

📥 Commits

Reviewing files that changed from the base of the PR and between f4165c7 and 7093e98.

📒 Files selected for processing (8)
  • src/main/presenter/deepchatAgentPresenter/dispatch.ts
  • src/main/presenter/deepchatAgentPresenter/index.ts
  • src/main/presenter/deepchatAgentPresenter/process.ts
  • src/main/presenter/deepchatAgentPresenter/toolOutputGuard.ts
  • src/main/presenter/deepchatAgentPresenter/types.ts
  • test/main/presenter/deepchatAgentPresenter/deepchatAgentPresenter.test.ts
  • test/main/presenter/deepchatAgentPresenter/dispatch.test.ts
  • test/main/presenter/deepchatAgentPresenter/process.test.ts

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/main/presenter/deepchatAgentPresenter/index.ts (1)

1848-1919: ⚠️ Potential issue | 🟠 Major

Clean up offloaded tool artifacts when a session is cleared or destroyed.

This path now persists successful tool outputs under the session directory, but clearMessages() and destroySession() still only remove DB/runtime state. Deleting a chat will leave its offloaded tool payloads on disk, which is a storage and privacy leak. Please remove the session offload directory as part of those teardown paths.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/presenter/deepchatAgentPresenter/index.ts` around lines 1848 - 1919,
The teardown paths (clearMessages and destroySession) currently only clear
DB/runtime state but leave offloaded tool artifacts on disk (created via
prepareToolOutput and returned in prepared.offloadPath), causing storage/privacy
leaks; update clearMessages() and destroySession() to remove the session's
offload directory by resolving the session project dir (use
resolveProjectDir(sessionId)) and deleting the offload folder (or the specific
prepared.offloadPath children) asynchronously, handling and logging errors
without throwing (so teardown continues), and ensure any helper cleanup on
toolOutputGuard (if available) is invoked to avoid orphaned files.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/main/presenter/deepchatAgentPresenter/toolOutputGuard.ts`:
- Around line 84-107: The current offload filename (from
resolveToolOffloadPath(sessionId, toolCallId)) can be overwritten when the same
toolCallId is reused; change the offload flow to produce a unique filename per
execution (e.g., append a monotonic timestamp or UUID) before creating/writing
the file and return that unique path in offloadPath and buildOffloadStub.
Concretely, generate a uniqueSuffix (timestamp or uuid) when preparing filePath,
either by extending resolveToolOffloadPath to accept a suffix or by computing
filePath = resolveToolOffloadPath(sessionId, toolCallId) + `.${uniqueSuffix}`;
then use that filePath for mkdir/writeFile and in the returned object and
buildOffloadStub so prior offloads are never overwritten.

---

Outside diff comments:
In `@src/main/presenter/deepchatAgentPresenter/index.ts`:
- Around line 1848-1919: The teardown paths (clearMessages and destroySession)
currently only clear DB/runtime state but leave offloaded tool artifacts on disk
(created via prepareToolOutput and returned in prepared.offloadPath), causing
storage/privacy leaks; update clearMessages() and destroySession() to remove the
session's offload directory by resolving the session project dir (use
resolveProjectDir(sessionId)) and deleting the offload folder (or the specific
prepared.offloadPath children) asynchronously, handling and logging errors
without throwing (so teardown continues), and ensure any helper cleanup on
toolOutputGuard (if available) is invoked to avoid orphaned files.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 78d55bb1-10ea-4564-aec4-0ee6f77c6cba

📥 Commits

Reviewing files that changed from the base of the PR and between 7093e98 and b6ac16e.

📒 Files selected for processing (6)
  • src/main/presenter/deepchatAgentPresenter/dispatch.ts
  • src/main/presenter/deepchatAgentPresenter/index.ts
  • src/main/presenter/deepchatAgentPresenter/toolOutputGuard.ts
  • test/main/presenter/deepchatAgentPresenter/deepchatAgentPresenter.test.ts
  • test/main/presenter/deepchatAgentPresenter/dispatch.test.ts
  • test/main/presenter/deepchatAgentPresenter/process.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/main/presenter/deepchatAgentPresenter/process.test.ts

@zerob13 zerob13 merged commit 27508c8 into dev Mar 7, 2026
2 checks passed
This was referenced Mar 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant