
Allow setting context compaction threshold #1761

@chetanmsft

Description

Describe the feature or problem you'd like to solve

Allow setting the context compaction threshold, which is currently fixed.

Proposed solution

Feature Request: Configurable Auto-Compaction Threshold

Summary

Allow users to configure the context window percentage at which auto-compaction triggers, via config.json or a CLI flag. The current fixed threshold of 95% is too late — research shows LLM quality degrades well before that point.

Motivation

Recent peer-reviewed research demonstrates that LLM performance degrades significantly as context window utilization increases, and that compacting earlier (around 50–60%) preserves substantially better output quality:

1. Positional Biases Shift Beyond 50% Context Fill

Paper: "Positional Biases Shift as Inputs Approach Context Window Limits" — Veseli et al., COLM 2025 (arXiv:2508.07479)

Key findings:

  • The "Lost in the Middle" (LiM) effect is strongest when inputs occupy up to 50% of a model's context window
  • Beyond 50%, primacy bias weakens — the model progressively loses the ability to reference information from earlier in the context
  • At high utilization, only recency bias remains, meaning the model effectively ignores earlier conversation history
  • This shift is consistent across models and is measured relative to each model's context window size

2. The "Lost in the Middle" Problem

Paper: "Lost in the Middle: How Language Models Use Long Contexts" — Liu et al., TACL 2023 (arXiv:2307.03172)

Key findings:

  • Performance degrades significantly when relevant information is positioned in the middle of long contexts
  • Even models explicitly designed for long contexts exhibit this degradation
  • Performance is often highest when relevant information occurs at the beginning or end of the input

Why 60% Is Better Than 95%

| Aspect | Compact at 60% | Compact at 95% |
| --- | --- | --- |
| Primacy bias | Still intact: model can reference early context | Weakened: model struggles with early context |
| Lost-in-the-middle effect | At its peak but manageable | Replaced by pure recency bias |
| Information from early turns | Preserved in a high-quality summary while the model can still "see" them well | Summarized after the model was already degraded for ~35% of context |
| Quality of compaction summary | Higher: model has full positional access during summarization | Lower: model may miss important early details during summarization |
| User experience | Proactive: avoids degradation before it's noticed | Reactive: user may already experience worse responses |

Proposed Implementation

Option A: Config setting (preferred)

via ~/.copilot/config.json

{
  "compactionThreshold": 0.60
}

Option B: CLI flag

copilot --compaction-threshold 0.60

Option C: Slash command

/compact --auto-at 60

Defaults

  • Current default: 95% (preserve for backward compatibility)
  • Recommended default: 60% (based on research)
  • Valid range: 30–95%
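The precedence and clamping described above could be sketched as follows. This is a minimal TypeScript illustration, not the actual Copilot CLI implementation; the names `resolveThreshold` and `CopilotConfig` are hypothetical, and only `compactionThreshold` comes from Option A above.

```typescript
// Hypothetical sketch: resolve the effective compaction threshold with
// CLI flag > config.json > built-in default precedence, clamped to the
// proposed valid range of 30-95%.
const DEFAULT_THRESHOLD = 0.95; // current behavior, kept for back-compat
const MIN_THRESHOLD = 0.30;
const MAX_THRESHOLD = 0.95;

interface CopilotConfig {
  compactionThreshold?: number; // key name proposed in Option A
}

function resolveThreshold(
  cliFlag: number | undefined,
  config: CopilotConfig
): number {
  // CLI flag wins over config.json, which wins over the default.
  const raw = cliFlag ?? config.compactionThreshold ?? DEFAULT_THRESHOLD;
  // Clamp out-of-range values into the valid range instead of erroring.
  return Math.min(MAX_THRESHOLD, Math.max(MIN_THRESHOLD, raw));
}
```

With this precedence, `copilot --compaction-threshold 0.60` would override a config.json value, and an unset threshold falls back to the backward-compatible 95% default.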

Expected Behavior

  1. When context usage reaches the configured threshold, auto-compaction triggers (same as current 95% behavior)
  2. The /context command should display the configured threshold alongside current usage
  3. The setting persists in config.json across sessions
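The trigger condition in step 1 reduces to a simple utilization check. A sketch in TypeScript, assuming hypothetical names (`shouldCompact`, `tokensUsed`, `contextWindow`) that are illustrative rather than actual Copilot CLI internals:

```typescript
// Hypothetical sketch: auto-compaction fires once the fraction of the
// context window in use reaches the configured threshold.
function shouldCompact(
  tokensUsed: number,
  contextWindow: number,
  threshold: number
): boolean {
  return tokensUsed / contextWindow >= threshold;
}

// Example: with a 200,000-token window and a 0.60 threshold, compaction
// triggers once 120,000 tokens are in use, rather than at 190,000 under
// the current 95% behavior.
```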

Additional Context

Users working on complex, multi-file coding tasks (e.g., large refactors, architecture changes) are most impacted by late compaction. By the time context reaches 95%, earlier instructions, file contents, and plan details may already be effectively "invisible" to the model due to positional bias shifts. Compacting at 60% ensures the summary is generated while the model still has strong access to all parts of the conversation.

