Intelligently compress verbose tool outputs during request preparation to conserve context window space while preserving essential information for LLM reasoning.
Dynamic Tool Output Compression is an advanced feature that applies strategy-based compression to role="tool" outputs during backend request preparation. Unlike simple truncation, it uses content-aware strategies to reduce token usage while maintaining the semantic information the LLM needs for effective reasoning.
The compression pipeline runs after history compaction and before backend translation, ensuring:
- Stale content is already removed by compaction
- Remaining tool outputs are optimized for the specific backend
- Token savings are maximized while keeping the information the LLM needs
Key Characteristics:
- Disabled by default - Must be explicitly enabled
- Deterministic precedence - CLI > Environment > YAML > Defaults
- Content-aware strategies - Different compression for different output types
- Observable and recoverable - Full telemetry and optional artifact recovery
- Extensible - Supports declarative rule customization
Use dynamic compression when:
- Working with large codebases - File reads, directory listings, and searches generate verbose output
- Running extensive test suites - Pytest and build outputs can consume thousands of tokens
- Processing structured data - JSON logs, XML configs, and similar formats are often repetitive
- Debugging complex issues - Diagnostic outputs from multiple tools accumulate quickly
- Managing long sessions - Conversations spanning many turns with repeated tool usage
Comparison with Related Features:
| Feature | Stage | Scope | Purpose |
|---|---|---|---|
| Context Compaction | Before request prep | Entire history | Remove stale tool results |
| Dynamic Compression | During request prep | Individual tool outputs | Compress content while preserving meaning |
| Backend Truncation | Backend-specific | Per-backend | Legacy truncation (deprecated) |
Configuration is resolved in the following order (highest to lowest priority):
- CLI Arguments - Command-line flags override everything
- Environment Variables - Shell environment variables
- YAML Configuration - Config file settings
- Built-in Defaults - Conservative defaults if nothing else is specified
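The precedence order above can be sketched as a simple lookup chain. This is a minimal illustration, not the project's actual resolver: the function name `resolve_setting`, the `DYNAMIC_COMPRESSION_` environment prefix, and the dictionary-based inputs are all assumptions made for the example.

```python
# Hypothetical built-in defaults; only three keys are shown for brevity.
DEFAULTS = {"enabled": False, "level": "conservative", "min_bytes": 1024}


def resolve_setting(key, cli_args, env, yaml_config,
                    env_prefix="DYNAMIC_COMPRESSION_"):
    """Return the first value found: CLI, then environment, then YAML, then defaults."""
    if cli_args.get(key) is not None:
        return cli_args[key]
    env_name = env_prefix + key.upper()  # assumed naming scheme, not confirmed
    if env_name in env:
        return env[env_name]  # note: environment values arrive as strings
    if key in yaml_config:
        return yaml_config[key]
    return DEFAULTS[key]


cli = {"level": "aggressive"}
env = {"DYNAMIC_COMPRESSION_LEVEL": "balanced",
       "DYNAMIC_COMPRESSION_MIN_BYTES": "2048"}
yaml_cfg = {"level": "conservative", "enabled": True}

for name in ("level", "min_bytes", "enabled"):
    print(name, "=", resolve_setting(name, cli, env, yaml_cfg))
```

Here `level` comes from the CLI, `min_bytes` from the environment, and `enabled` from YAML, since each higher-priority source is silent for the latter two.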
Enable with minimal configuration:
dynamic_compression:
  enabled: true
Or via CLI:
python -m src.core.cli --enable-dynamic-compression
Complete configuration reference:
dynamic_compression:
  # Master switch
  enabled: false
  # Compression levels
  level: "conservative" # Base level: conservative | balanced | aggressive
  max_level: "aggressive" # Escalation ceiling during budget pressure
  # Size thresholds
  min_bytes: 1024 # Skip outputs smaller than this (bytes)
  # Telemetry and observability
  telemetry_include_content_hashes: true # Include correlation-safe hashes (no raw content)
  # Alerting on compression issues
  alerts:
    enabled: true
    failure_threshold: 5 # Warn after N method failures within window
    fallback_threshold: 8 # Warn after N fallbacks/skips within window
    window_seconds: 300 # Observation window for alerts
    cooldown_seconds: 300 # Rate-limit alert emissions
  # Recovery artifacts for debugging
  recovery:
    mode: "never" # When to save originals: never | failures | always
    min_original_bytes: 4096 # Minimum original size to qualify for recovery
    min_saved_bytes: 2048 # Minimum bytes saved to qualify for recovery
    max_artifact_bytes: 262144 # Maximum artifact size (256 KB)
    max_artifacts: 128 # Maximum artifacts to retain
    retention_seconds: 86400 # Artifact retention (24 hours)
    storage_dir: "var/compression_recovery"
    hint_in_text: false # Append recovery handle hint to plain text
  # Tool category exclusions (skip compression for these categories)
  disable_categories: [] # e.g., ["search", "command_execution"]
  # Compression method exclusions
  disable_methods: [] # e.g., ["line_dedupe", "ansi_normalization"]
  # Specific tool exclusions
  disable_tools: [] # e.g., ["shell", "bash", "run_command"]
  # Tool-name substring exclusions (match anywhere, case-insensitive)
  disable_tool_name_substrings: [] # e.g., ["fff", "mcp_"]
  # Command prefix exclusions
  disable_command_prefixes: [] # e.g., ["git diff --stat", "pytest -v"]
  # Listing/search/read strategy tuning
  noise_directories: ["node_modules", ".git", "target", "__pycache__", ".venv", "vendor"]
  search_context_lines: 2
  search_max_matches_per_file: 8
  search_max_total_groups: 100
  search_max_line_length: 240
  # File detail mode configuration
  file_detail_mode: "auto" # auto | full | structure | signatures
  file_detail_fallback_mode: "full" # full | structure | signatures
  file_detail_auto_full_max_lines: 120
  file_detail_auto_structure_max_lines: 280
  file_detail_include_line_numbers: false
  file_detail_max_lines: null # Head-like cap when set
  file_detail_last_n_lines: null # Tail-like cap when set
  # Pattern-based rules (8-stage text pipeline)
  output_pattern_rules: []
  output_pattern_regex_timeout_ms: 25
  # Declarative operator-defined rules
  declarative_rules: []
  declarative_rule_files: [] # Extra YAML/JSON files with rules
  declarative_regex_timeout_ms: 25 # Guard timeout for regex evaluation
  # Diff output controls
  diff_max_lines_per_hunk: 100
  diff_max_total_lines: 500

| Flag | Description |
|---|---|
| `--enable-dynamic-compression` | Enable dynamic compression |
| `--dynamic-compression-level LEVEL` | Set base level: conservative, balanced, aggressive |
| `--dynamic-compression-max-level LEVEL` | Set escalation ceiling |
| `--dynamic-compression-min-bytes BYTES` | Minimum output size for compression |
| `--dynamic-compression-file-detail-include-line-numbers` | Include line numbers in file detail output |
| `--dynamic-compression-file-detail-exclude-line-numbers` | Exclude line numbers |
| `--dynamic-compression-disable-categories CSV` | Comma-separated categories to bypass |
| `--dynamic-compression-disable-methods CSV` | Comma-separated methods to disable |
| `--dynamic-compression-disable-tools CSV` | Comma-separated tool names to bypass |
| `--dynamic-compression-disable-tool-name-substrings CSV` | Comma-separated substrings matched anywhere in tool names |
| `--dynamic-compression-disable-command-prefixes CSV` | Comma-separated command prefixes to bypass |

If a tool family already produces compact output, skip dynamic compression for it using disable_tool_name_substrings.
dynamic_compression:
  enabled: true
  disable_tool_name_substrings: ["fff"]
This matches tool names case-insensitively and anywhere in the name, including:
- fff_grep
- fff_find_files
- fff_multi_grep
- turbo_fff_grep
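The matching rule described above (case-insensitive, anywhere in the name) can be sketched in a few lines. The helper name `is_tool_excluded` is hypothetical and only illustrates the rule; it is not taken from the project.

```python
def is_tool_excluded(tool_name, substrings):
    """True if any configured substring occurs anywhere in the tool name, ignoring case."""
    lowered = tool_name.lower()
    # Empty strings are skipped so a stray "" cannot match every tool.
    return any(s and s.lower() in lowered for s in substrings)


for name in ("fff_grep", "turbo_fff_grep", "FFF_Find_Files", "shell"):
    print(name, is_tool_excluded(name, ["fff"]))
```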
The level setting controls compression aggressiveness:
| Level | Description | Best For |
|---|---|---|
| conservative | Minimal compression, maximum preservation | Debugging, critical outputs |
| balanced | Moderate compression, good token savings | General development work |
| aggressive | Maximum compression, may reduce detail | Large outputs, routine operations |
The max_level setting allows escalation during budget pressure. When context limits are tight, the system can temporarily increase compression up to max_level.
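One way to picture escalation is as stepping up an ordered list of levels, capped at `max_level`. The sketch below is an assumption-laden illustration: the function `effective_level`, the `budget_used_ratio` input, and the 0.75/0.90 pressure thresholds are invented for the example and are not documented behavior.

```python
LEVELS = ["conservative", "balanced", "aggressive"]


def effective_level(base, max_level, budget_used_ratio):
    """Escalate one level per pressure threshold crossed, never beyond max_level."""
    base_idx = LEVELS.index(base)
    # A ceiling below the base level is treated as "no escalation".
    max_idx = max(LEVELS.index(max_level), base_idx)
    # Hypothetical thresholds: fraction of the context budget already used.
    steps = sum(budget_used_ratio >= t for t in (0.75, 0.90))
    return LEVELS[min(base_idx + steps, max_idx)]


for ratio in (0.50, 0.80, 0.95):
    print(ratio, effective_level("conservative", "aggressive", ratio))
```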
Dynamic compression applies different strategies based on content type:
Converts ANSI color codes to a normalized form, reducing variation while preserving visual structure.
Applies to: Terminal output with color codes
Removes adjacent duplicate lines while preserving count information.
Before:
Running test_foo...
PASS
Running test_bar...
PASS
Running test_baz...
PASS
After:
Running test_foo... PASS
Running test_bar... PASS
Running test_baz... PASS
[3 similar results grouped]
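As a rough sketch of the core idea (collapsing runs of identical adjacent lines while keeping the count), the following uses `itertools.groupby`. The function name and the `[repeated Nx]` marker are assumptions; the real strategy evidently groups similar results differently, as the example above shows.

```python
from itertools import groupby


def dedupe_adjacent(text):
    """Collapse runs of identical adjacent lines into one line plus a repeat count."""
    out = []
    for line, group in groupby(text.splitlines()):
        count = sum(1 for _ in group)
        out.append(line if count == 1 else f"{line} [repeated {count}x]")
    return "\n".join(out)


print(dedupe_adjacent("retrying...\nretrying...\nretrying...\nconnected\nretrying..."))
```

Only adjacent duplicates are merged, so the final `retrying...` line stays separate and the original ordering is preserved.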
Compresses git-style diff output by:
- Removing context lines beyond configured limits
- Summarizing large hunks
- Preserving all actual changes
Converts verbose directory listings to structured summaries:
Before:
drwxr-xr-x 5 user group 4096 Jan 15 10:23 src
drwxr-xr-x 3 user group 4096 Jan 15 10:23 tests
-rw-r--r-- 1 user group 2341 Jan 15 10:23 README.md
-rw-r--r-- 1 user group 892 Jan 15 10:23 setup.py
After:
Directories: src/, tests/
Files: README.md (2.3K), setup.py (892B)
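The transformation above can be approximated by parsing `ls -l` style lines. This is a simplified sketch under stated assumptions: it expects the nine-column layout shown in the example, the helper names are hypothetical, and sizes are only formatted as bytes or kilobytes.

```python
def human_size(num_bytes):
    """Format a byte count the way the example above does (bytes or one-decimal K)."""
    if num_bytes < 1024:
        return f"{num_bytes}B"
    return f"{num_bytes / 1024:.1f}K"


def summarize_listing(text):
    """Reduce `ls -l` style lines to a directory list and a file list with sizes."""
    dirs, files = [], []
    for line in text.splitlines():
        parts = line.split(None, 8)  # mode, links, user, group, size, month, day, time, name
        if len(parts) < 9 or not parts[4].isdigit():
            continue  # skip headers such as "total 16" and malformed lines
        mode, size, name = parts[0], int(parts[4]), parts[8]
        if mode.startswith("d"):
            dirs.append(name + "/")
        else:
            files.append(f"{name} ({human_size(size)})")
    return f"Directories: {', '.join(dirs)}\nFiles: {', '.join(files)}"


listing = """drwxr-xr-x 5 user group 4096 Jan 15 10:23 src
drwxr-xr-x 3 user group 4096 Jan 15 10:23 tests
-rw-r--r-- 1 user group 2341 Jan 15 10:23 README.md
-rw-r--r-- 1 user group 892 Jan 15 10:23 setup.py"""
print(summarize_listing(listing))
```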
Groups search results by file and limits matches per file:
Configuration:
- search_max_matches_per_file: 8 - Max matches shown per file
- search_max_total_groups: 100 - Max file groups overall
- search_context_lines: 2 - Context lines per match
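A minimal sketch of grouping and capping, assuming grep-style `path:line:text` input. The function name, the omission markers, and the input format are assumptions for illustration; context-line handling is left out.

```python
def group_search_results(lines, max_matches_per_file=8,
                         max_total_groups=100, max_line_length=240):
    """Group `path:line:text` results by file and cap matches, groups, and line length."""
    groups = {}  # dicts keep insertion order, so files stay in first-seen order
    for line in lines:
        path, _, rest = line.partition(":")
        groups.setdefault(path, []).append(rest[:max_line_length])

    out = []
    for index, (path, matches) in enumerate(groups.items()):
        if index >= max_total_groups:
            out.append(f"[{len(groups) - max_total_groups} more files omitted]")
            break
        out.append(path)
        out.extend(f"  {match}" for match in matches[:max_matches_per_file])
        hidden = len(matches) - max_matches_per_file
        if hidden > 0:
            out.append(f"  [{hidden} more matches omitted]")
    return out


results = ["a.py:1:foo", "a.py:2:foo", "a.py:3:foo", "b.py:9:bar"]
print("\n".join(group_search_results(results, max_matches_per_file=2)))
```

Paths containing a colon (for example Windows drive letters) would need a more careful split than `partition(":")`.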
Compresses file content based on file_detail_mode:
| Mode | Behavior |
|---|---|
| full | Entire file content (respects max_lines) |
| structure | Class/function signatures only |
| signatures | Function signatures only |
| auto | Chooses based on file size and context |
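One plausible reading of `auto` is a line-count ladder driven by `file_detail_auto_full_max_lines` and `file_detail_auto_structure_max_lines`. The sketch below encodes that reading only; the mapping of the largest files to `signatures` and the function name are assumptions, and the real selection may also weigh context.

```python
def choose_detail_mode(line_count, auto_full_max_lines=120,
                       auto_structure_max_lines=280):
    """Pick a detail mode from file length alone (assumed interpretation of `auto`)."""
    if line_count <= auto_full_max_lines:
        return "full"
    if line_count <= auto_structure_max_lines:
        return "structure"
    return "signatures"  # assumption: the most compact mode for the largest files


for lines in (80, 200, 1500):
    print(lines, choose_detail_mode(lines))
```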
For pytest/build output, keeps only failures and errors:
Before:
test_example.py::test_one PASSED
test_example.py::test_two PASSED
... 50 more PASSED ...
test_example.py::test_fail FAILED
AssertionError: expected 5, got 3
test_example.py::test_error ERROR
NameError: name 'undefined' is not defined
After:
[50 passed tests omitted]
test_example.py::test_fail FAILED
AssertionError: expected 5, got 3
test_example.py::test_error ERROR
NameError: name 'undefined' is not defined
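The filtering shown above can be sketched as dropping lines that end in a `PASSED` status and reporting how many were dropped. The regular expression and the function name are assumptions; real pytest output has more shapes (summary lines, captured logs) than this handles.

```python
import re

# Matches verbose pytest result lines ending in PASSED, with an optional "[ 50%]" suffix.
PASSED_RE = re.compile(r"\sPASSED(\s+\[\s*\d+%\])?\s*$")


def keep_failures(text):
    """Drop PASSED result lines and prepend a count of what was omitted."""
    kept, passed = [], 0
    for line in text.splitlines():
        if PASSED_RE.search(line):
            passed += 1
        else:
            kept.append(line)
    header = [f"[{passed} passed tests omitted]"] if passed else []
    return "\n".join(header + kept)


report = """test_example.py::test_one PASSED
test_example.py::test_two PASSED [ 50%]
test_example.py::test_fail FAILED
AssertionError: expected 5, got 3"""
print(keep_failures(report))
```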
Compresses JSON/NDJSON by:
- Truncating large arrays (showing first/last N items)
- Summarizing repetitive structures
- Preserving all keys and unique values
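Array truncation can be illustrated with a small recursive walk that keeps the first and last items of long lists. The function name, the `keep` parameter, and the marker string are assumptions for this sketch; all keys are preserved and the result remains serializable as JSON.

```python
import json


def truncate_arrays(value, keep=2):
    """Recursively keep the first and last `keep` items of any list longer than 2 * keep."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    if isinstance(value, dict):
        return {key: truncate_arrays(item, keep) for key, item in value.items()}
    if isinstance(value, list):
        items = [truncate_arrays(item, keep) for item in value]
        if len(items) > 2 * keep:
            omitted = len(items) - 2 * keep
            return items[:keep] + [f"... {omitted} items omitted ..."] + items[-keep:]
        return items
    return value


payload = {"name": "run-1", "ids": list(range(10)), "meta": {"tags": ["a", "b"]}}
print(json.dumps(truncate_arrays(payload)))
```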
Parseability-preserving compression for XML:
- Maintains valid XML structure
- Truncates large text nodes
- Preserves attribute values
For log output with volatile fields (timestamps, PIDs):
- Normalizes variable fields
- Deduplicates similar log lines
- Preserves unique messages
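The steps above can be sketched as normalizing volatile fields first and then counting identical results. The timestamp and PID patterns, the `<TS>`/`<PID>` placeholders, and the function names are assumptions chosen for the example; real logs will need patterns matched to their format.

```python
import re
from collections import Counter

# Assumed formats: ISO-like timestamps and "pid=123" / "pid: 123" fields.
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?Z?")
PID_RE = re.compile(r"\bpid[=:]\s*\d+", re.IGNORECASE)


def normalize(line):
    """Replace volatile fields with stable placeholders."""
    line = TIMESTAMP_RE.sub("<TS>", line)
    return PID_RE.sub("pid=<PID>", line)


def compress_logs(text):
    """Normalize each line, then merge identical lines while keeping first-seen order."""
    counts = Counter(normalize(line) for line in text.splitlines())
    return "\n".join(line if n == 1 else f"{line} [x{n}]" for line, n in counts.items())


logs = """2024-01-15 10:23:01 pid=101 connection reset
2024-01-15 10:23:02 pid=102 connection reset
2024-01-15 10:23:03 pid=101 worker started"""
print(compress_logs(logs))
```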
For env/cloud-style outputs:
- Redacts sensitive values (API keys, passwords)
- Preserves key names and structure
- Maintains non-sensitive values
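A minimal sketch of key-based redaction for `KEY=value` output follows. The list of sensitive key fragments, the `<redacted>` marker, and the function name are assumptions; a production redactor would likely also inspect values.

```python
import re

# Assumed heuristic: a key is sensitive if its name contains one of these fragments.
SENSITIVE_KEY_RE = re.compile(r"(KEY|SECRET|TOKEN|PASSWORD|PASSWD)", re.IGNORECASE)


def redact_env(text):
    """Replace values of sensitive-looking keys; leave other lines untouched."""
    out = []
    for line in text.splitlines():
        key, sep, _value = line.partition("=")
        if sep and SENSITIVE_KEY_RE.search(key):
            out.append(f"{key}=<redacted>")
        else:
            out.append(line)
    return "\n".join(out)


print(redact_env("API_KEY=abc123\nDB_PASSWORD=hunter2\nREGION=us-east-1"))
```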
Scenario: Agent is exploring a large codebase with many file reads.
Configuration:
dynamic_compression:
  enabled: true
  level: "balanced"
  file_detail_mode: "auto"
  file_detail_auto_full_max_lines: 120
  file_detail_auto_structure_max_lines: 280
Result: File reads adapt automatically: small files are shown in full, large files show structure/skeleton.
Scenario: Running large test suites frequently during development.
Configuration:
dynamic_compression:
  enabled: true
  level: "aggressive"
  disable_command_prefixes: [] # Allow compression on all test commands
Result: Only failing tests and errors are shown, saving thousands of tokens per run.
Scenario: Analyzing application logs with repetitive patterns.
Configuration:
dynamic_compression:
  enabled: true
  level: "aggressive"
  disable_categories: [] # Allow on all outputs
Result: Log deduplication and volatile-field normalization dramatically reduce log volume.
Scenario: Compress most outputs but preserve full search results.
Configuration:
dynamic_compression:
  enabled: true
  level: "balanced"
  disable_categories:
    - "search" # Skip compression for search outputs
Result: Search results remain complete; other outputs are compressed.
Scenario: Need to debug compression behavior without losing original data.
Configuration:
dynamic_compression:
  enabled: true
  level: "balanced"
  recovery:
    mode: "always"
    storage_dir: "var/compression_debug"
    hint_in_text: true
Result: All compressed outputs have recovery handles; originals are stored for inspection.
For advanced use cases, you can define custom compression rules:
dynamic_compression:
  declarative_rules:
    - name: "custom_log_compression"
      priority: 100
      when:
        tool_name: "shell"
        command_prefix: "tail -f"
        min_bytes: 2048
      pipeline:
        - "ansi_normalize"
        - "line_dedupe"
        - "volatile_field_normalize"
    - name: "preserve_important_json"
      priority: 50
      when:
        content_types: ["application/json"]
        command_prefix: "cat config.json"
      pipeline:
        - "identity" # No compression
Rules support an 8-stage filter pipeline and precedence override.
Each compression produces a telemetry record with:
- Original and compressed sizes
- Compression ratio
- Methods applied
- Content type
- Tool/command information
The system can alert on:
- Method failures - When a compression method repeatedly fails
- Fallbacks - When compression falls back to less effective methods
Configure thresholds and windows in the alerts section.
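The interaction of a threshold, an observation window, and a cooldown can be sketched with a sliding window of event times. The class below is a hypothetical illustration of how `failure_threshold`, `window_seconds`, and `cooldown_seconds` might combine; it is not the project's alerting code.

```python
from collections import deque


class SlidingWindowAlert:
    """Fire when `threshold` events fall inside the window, at most once per cooldown."""

    def __init__(self, threshold, window_seconds=300, cooldown_seconds=300):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.cooldown_seconds = cooldown_seconds
        self.events = deque()
        self.last_alert = None

    def record(self, now):
        """Record one event at time `now` (seconds); return True if an alert should fire."""
        self.events.append(now)
        # Drop events that have aged out of the observation window.
        while self.events and now - self.events[0] > self.window_seconds:
            self.events.popleft()
        if len(self.events) < self.threshold:
            return False
        if self.last_alert is not None and now - self.last_alert < self.cooldown_seconds:
            return False  # still inside the cooldown period
        self.last_alert = now
        return True


alert = SlidingWindowAlert(threshold=3, window_seconds=10, cooldown_seconds=20)
fired = [alert.record(t) for t in (0, 1, 2, 3, 30, 31, 32)]
print(fired)
```

Passing timestamps in explicitly, rather than reading a clock inside `record`, keeps the sketch deterministic and easy to test.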
When recovery.mode is set to failures or always, qualifying original outputs are stored as artifacts:
# List recovery artifacts
ls var/compression_recovery/
# Inspect a specific artifact
cat var/compression_recovery/abc123.original.json
If outputs are not being compressed:
- Verify the feature is enabled: check `dynamic_compression.enabled`
- Check size thresholds: output may be below `min_bytes`
- Review exclusions: tool may be in `disable_tools` or `disable_categories`
- Check logs for compression telemetry
If compression removes too much detail:
- Reduce compression level: `level: "conservative"`
- Exclude specific tools: add to `disable_tools`
- Enable recovery: `recovery.mode: "always"`
- Review and adjust `file_detail_mode` settings
If compression affects performance:
- Compression adds minimal overhead (<10ms per output)
- Disable for specific high-frequency tools
- Adjust `min_bytes` to skip small outputs
- Use `disable_methods` to skip expensive methods
Legacy Gemini connector truncation controls (GEMINI_TOOL_OUTPUT_TRUNCATE_* and backend tool_output_truncate_* extras) are deprecated but still functional.
Related documentation:
- Context Compaction - Remove stale tool results before compression
- Pytest Context Saving - Rewrite pytest commands at the tool-call level to inject `-r fE` and `-q` flags for compact output (complementary, works before request preparation)
- Token Saving - How compaction and dynamic compression combine for lower context usage
- Context Window Enforcement - Enforce per-model context limits
- Session Management - Session handling with compression awareness
- CLI Parameters Reference - Complete CLI flag documentation
- Configuration Guide - General configuration patterns