Add size limits to prevent context overflow in large repos by JordanCoin · Pull Request #19 · JordanCoin/codemap

JordanCoin · 2026-01-29T06:49:12Z

Summary

This PR keeps codemap hooks/MCP useful while making them safe on large repos.

Existing fix (kept)

- Session-start hook uses adaptive depth based on repo size:
  - >5000 files -> depth 2
  - >2000 files -> depth 3
  - otherwise depth 4
- Hook + MCP get_structure enforce 60KB max output (~15k tokens, <10% of context)
- Truncation is clean at line boundaries with a helpful message

Additional hardening in this PR

- Added shared limits package (limits/) so hook + MCP use one source of truth for:
  - output budget (60KB)
  - adaptive depth thresholds
- Daemon now skips dependency graph build on very large repos (>5000 files) to avoid startup CPU/memory spikes
- Daemon now always writes .codemap/state.json (even when dep graph is unavailable), so hooks still get file_count + session event context
- ReadState now accepts stale state if daemon PID is still alive (avoids unnecessary expensive rescans after idle periods)
- Hook hub lookup no longer falls back to heavy fresh graph builds when daemon is already running but dep data is unavailable
- Session-start now:
  - waits briefly for daemon state (small warmup window)
  - uses conservative depth when file count is unknown
  - uses lightweight git diff output for large/unknown repos instead of full codemap --diff
- MCP get_structure now:
  - accepts optional depth
  - defaults to adaptive depth when omitted
  - reuses daemon hub data when available
  - skips expensive hub analysis on very large repos and prints a note

Problem

Large repos (10k+ files) could produce massive startup output and expensive repeated analysis:

- Hook output goes into Claude “Messages” context (competes directly with conversation history)
- Large tree output could consume context immediately
- Repeated fallback graph scans could be expensive when daemon state was missing/stale

Why this matters

Hooks need to be context-aware and resource-aware:

- bounded output
- graceful degradation on large repos
- no repeated heavy computation in background hook flows

Test results

- go test ./... passes
- Stress-tested on synthetic ~10k-file repo:
  - session-start output remained small (no runaway output)
  - daemon state remained usable after idle
  - pre-edit hook significantly faster with daemon vs no daemon

Behavioral note

On very large repos, hub/dependency analysis is intentionally deferred in hook/session-start flows unless explicitly requested via dedicated tooling (get_hubs, etc.).

- Session-start hook now uses adaptive depth based on repo size:
  - >5000 files: depth 2
  - >2000 files: depth 3
  - Otherwise: depth 4
- Both hook and MCP get_structure enforce 60KB max output (~15k tokens)
- Truncates cleanly at line boundaries with helpful message
- Prevents consuming >10% of LLM context window

Fixes issue where 10k+ file repos (like Rails monoliths) would output 1.3MB+ of tree structure, overwhelming Claude Code's context.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Hook output goes directly into Claude's "Messages" context, not system prompt. This means hook output competes with conversation history for the ~200k token limit. A 1.3MB output (like a full tree of a 10k file repo) equals ~500k tokens, causing instant context overflow. The size limits (adaptive depth + 60KB cap) are critical safeguards.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
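As a rough check of the budget arithmetic, the sketch below estimates what share of a ~200k-token context the 60KB cap can occupy. The 4-bytes-per-token ratio is an assumption implied by "60KB ≈ 15k tokens"; real tokenization varies with content, and the helper names are hypothetical.

```go
package main

import "fmt"

const (
	maxOutputBytes = 60 * 1024 // 60KB cap from this PR
	bytesPerToken  = 4         // assumed heuristic, not a tokenizer measurement
	contextTokens  = 200_000   // approximate context limit cited above
)

// EstimateTokens converts a byte count to an approximate token count.
func EstimateTokens(nBytes int) int {
	return nBytes / bytesPerToken
}

// ContextShare returns the estimated fraction of the context window used.
func ContextShare(nBytes int) float64 {
	return float64(EstimateTokens(nBytes)) / float64(contextTokens)
}

func main() {
	fmt.Printf("%d tokens, %.1f%% of context\n",
		EstimateTokens(maxOutputBytes), 100*ContextShare(maxOutputBytes))
	// prints "15360 tokens, 7.7% of context"
}
```

Under that assumption the cap stays below the 10% target with some headroom.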

JordanCoin and others added 4 commits January 29, 2026 01:49

Merge branch 'main' into fix/context-size-limits

afd2407

Harden hooks and daemon behavior for large repositories

779e97d

JordanCoin merged commit 4e18220 into main Feb 19, 2026
12 checks passed

JordanCoin deleted the fix/context-size-limits branch February 19, 2026 22:44
