Description
Describe the feature or problem you'd like to solve
Allow setting the context compaction threshold, which is currently fixed.
Proposed solution
Feature Request: Configurable Auto-Compaction Threshold
Summary
Allow users to configure the context window percentage at which auto-compaction triggers, via config.json or a CLI flag. The current fixed threshold of 95% is too late — research shows LLM quality degrades well before that point.
Motivation
Recent peer-reviewed research demonstrates that LLM performance degrades significantly as context window utilization increases, and that compacting earlier (around 50–60%) preserves substantially better output quality:
1. Positional Biases Shift Beyond 50% Context Fill
Paper: "Positional Biases Shift as Inputs Approach Context Window Limits" — Veseli et al., COLM 2025 (arXiv:2508.07479)
Key findings:
- The "Lost in the Middle" (LiM) effect is strongest when inputs occupy up to 50% of a model's context window
- Beyond 50%, primacy bias weakens — the model progressively loses the ability to reference information from earlier in the context
- At high utilization, only recency bias remains, meaning the model effectively ignores earlier conversation history
- This shift is consistent across models and is measured relative to each model's context window size
2. The "Lost in the Middle" Problem
Paper: "Lost in the Middle: How Language Models Use Long Contexts" — Liu et al., TACL 2023 (arXiv:2307.03172)
Key findings:
- Performance degrades significantly when relevant information is positioned in the middle of long contexts
- Even models explicitly designed for long contexts exhibit this degradation
- Performance is often highest when relevant information occurs at the beginning or end of the input
Why 60% Is Better Than 95%
| Aspect | Compact at 60% | Compact at 95% |
|---|---|---|
| Primacy bias | Still intact — model can reference early context | Weakened — model struggles with early context |
| Lost-in-the-middle effect | At its peak but manageable | Replaced by pure recency bias |
| Information from early turns | Preserved in high-quality summary while model can still "see" it well | Summarized after model was already degraded for ~35% of context |
| Quality of compaction summary | Higher — model has full positional access during summarization | Lower — model may miss important early details during summarization |
| User experience | Proactive — avoids degradation before it's noticed | Reactive — user may already experience worse responses |
Proposed Implementation
Option A: Config setting (preferred)
via ~/.copilot/config.json
```json
{
  "compactionThreshold": 0.60
}
```
Option B: CLI flag
```
copilot --compaction-threshold 0.60
```
Option C: Slash command
```
/compact --auto-at 60
```
Defaults
- Current default: 95% (preserve for backward compatibility)
- Recommended default: 60% (based on research)
- Valid range: 30–95%
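The option precedence and valid range above could be combined into a single resolution step. A minimal sketch (the `resolveThreshold` function, `Config` shape, and clamping behavior are illustrative assumptions, not the CLI's actual internals):

```typescript
// Hypothetical sketch: resolve the effective compaction threshold with
// precedence CLI flag > config.json > default, clamped to the proposed
// valid range of 30–95%.
const DEFAULT_THRESHOLD = 0.95; // current behavior, kept for back-compat
const MIN_THRESHOLD = 0.30;
const MAX_THRESHOLD = 0.95;

interface Config {
  compactionThreshold?: number; // as read from ~/.copilot/config.json
}

function resolveThreshold(cliFlag: number | undefined, config: Config): number {
  const raw = cliFlag ?? config.compactionThreshold ?? DEFAULT_THRESHOLD;
  // Clamp out-of-range values rather than erroring, so a typo in
  // config.json never prevents the CLI from starting.
  return Math.min(MAX_THRESHOLD, Math.max(MIN_THRESHOLD, raw));
}
```

For example, `resolveThreshold(0.60, { compactionThreshold: 0.50 })` would return `0.60`, since an explicit flag overrides the persisted setting.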
Expected Behavior
- When context usage reaches the configured threshold, auto-compaction triggers (same as current 95% behavior)
- The `/context` command should display the configured threshold alongside current usage
- The setting persists in `config.json` across sessions
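The trigger check itself reduces to a simple ratio comparison. A sketch under the same assumptions (`shouldCompact` and its token-count parameters are hypothetical names, not existing CLI code):

```typescript
// Hypothetical sketch of the auto-compaction trigger: compare current
// context usage against the configured threshold on each turn.
function shouldCompact(
  tokensUsed: number,
  contextWindowSize: number,
  threshold: number // e.g. 0.60 from resolveThreshold()
): boolean {
  if (contextWindowSize <= 0) return false; // guard against bad input
  return tokensUsed / contextWindowSize >= threshold;
}
```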
Additional Context
Users working on complex, multi-file coding tasks (e.g., large refactors, architecture changes) are most impacted by late compaction. By the time context reaches 95%, earlier instructions, file contents, and plan details may already be effectively "invisible" to the model due to positional bias shifts. Compacting at 60% ensures the summary is generated while the model still has strong access to all parts of the conversation.