Feature request: expose per-model token usage in the status line payload
Context
The configurable status line (statusLine.command) receives a JSON payload on stdin. Token usage is currently exposed two ways:
context_window: {
  total_input_tokens,        // ← SUM across ALL models used in the session
  total_output_tokens,       // ← same
  total_cache_read_tokens,   // ← same
  total_cache_write_tokens,  // ← same
  total_reasoning_tokens,    // ← same
  ...
  current_usage: {           // ← ONLY for the currently selected model
    input_tokens,
    output_tokens,
    cache_creation_input_tokens,
    cache_read_input_tokens
  }
}
The runtime tracks usage per model in its internal modelMetrics map (this is visible in session.shutdown events and surfaced by /usage), but only the cross-model sum and the currently-selected model's slice are exposed to the status line script.
Problem
This makes it impossible for a status line script to compute an accurate session cost when the user switches models mid-session (e.g. starts in GPT-5.5, then uses /model to switch to Claude Opus 4.7). The script can only apply the currently selected model's rates to all cumulative tokens, which under- or overestimates the cost depending on which model is pricier.
Concrete example: spend 1M output tokens on GPT-5.5 ($30/M output), then switch to Opus 4.7 ($25/M output). True cost is $30. A status line script can only see "current model = Opus, total_output = 1M" and reports $25, which is $5 (about 17%) below the true cost.
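The arithmetic in this example can be checked with a minimal sketch. The two output rates are the ones quoted in the example and the model ids follow the proposed payload. The helper name cost_usd is made up for illustration.

```python
# Output-token rates in USD per million tokens, as quoted in the example.
OUTPUT_RATE_PER_M = {"gpt-5.5": 30.0, "claude-opus-4.7": 25.0}


def cost_usd(tokens: int, rate_per_million: float) -> float:
    """Cost of `tokens` tokens at a per-million-token rate."""
    return tokens / 1_000_000 * rate_per_million


# What actually happened: all 1M output tokens were produced by GPT-5.5.
output_by_model = {"gpt-5.5": 1_000_000, "claude-opus-4.7": 0}
true_cost = sum(
    cost_usd(tokens, OUTPUT_RATE_PER_M[model])
    for model, tokens in output_by_model.items()
)

# What a script can compute today: the cross-model total priced at the
# currently selected model's rate (Opus, after the switch).
total_output = sum(output_by_model.values())
single_rate_estimate = cost_usd(total_output, OUTPUT_RATE_PER_M["claude-opus-4.7"])

print(f"true ${true_cost:.2f}, estimated ${single_rate_estimate:.2f}")
```

The gap grows with the share of tokens spent on models other than the currently selected one, and the sign flips when the switch goes from a cheaper model to a pricier one.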
Proposal
Expose the per-model breakdown alongside the existing totals:
context_window: {
  total_input_tokens,
  total_output_tokens,
  total_cache_read_tokens,
  total_cache_write_tokens,
  total_reasoning_tokens,
  ...
  // NEW: array of per-model usage slices
  model_usage: [
    {
      model_id: "gpt-5.5",
      model_display_name: "GPT-5.5",
      input_tokens: 700000,
      output_tokens: 700000,
      cache_read_tokens: 700000,
      cache_write_tokens: 35000,
      reasoning_tokens: 0,
      requests: 12
    },
    {
      model_id: "claude-opus-4.7",
      model_display_name: "Claude Opus 4.7",
      input_tokens: 300000,
      output_tokens: 300000,
      cache_read_tokens: 300000,
      cache_write_tokens: 15000,
      reasoning_tokens: 5000,
      requests: 4
    }
  ]
}
This data already exists internally as modelMetrics and is exactly what /usage displays today, so exposing it should be a small payload change.
Use case
Any status line script that surfaces session cost in USD (using the published model pricing) would become correct for multi-model sessions. The script just iterates model_usage, looks up each model's rates, and sums.
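A status line script built on the proposed field might look roughly like the sketch below. It assumes the model_usage shape from the proposal above, which does not exist in the payload today. The rate table is hypothetical: only the two output rates come from the example in this issue, and the other numbers are placeholders to replace with published pricing. Reasoning tokens are left unpriced because this issue does not say whether they are already counted in output_tokens.

```python
import json

# Hypothetical rate table in USD per million tokens. Only the two output rates
# come from the example in this issue; every other number is a placeholder.
RATES_PER_M = {
    "gpt-5.5": {"input": 5.0, "output": 30.0, "cache_read": 0.5, "cache_write": 5.0},
    "claude-opus-4.7": {"input": 5.0, "output": 25.0, "cache_read": 0.5, "cache_write": 6.0},
}

# Rate key -> field name in a proposed model_usage entry.
TOKEN_FIELDS = {
    "input": "input_tokens",
    "output": "output_tokens",
    "cache_read": "cache_read_tokens",
    "cache_write": "cache_write_tokens",
}


def session_cost_usd(payload: dict) -> float:
    """Sum the cost of every model_usage entry; unknown models are skipped."""
    total = 0.0
    for entry in payload.get("context_window", {}).get("model_usage", []):
        rates = RATES_PER_M.get(entry.get("model_id"))
        if rates is None:
            continue  # no rates known for this model: skip rather than guess
        for rate_key, field in TOKEN_FIELDS.items():
            total += entry.get(field, 0) / 1_000_000 * rates[rate_key]
    return total


# A real status line script would call json.load(sys.stdin) here instead.
sample_payload = json.loads("""
{"context_window": {"model_usage": [
  {"model_id": "gpt-5.5", "output_tokens": 700000},
  {"model_id": "claude-opus-4.7", "output_tokens": 300000}
]}}
""")
print(f"${session_cost_usd(sample_payload):.2f}")  # prints $28.50
```

Skipping unknown models keeps the script from crashing when a new model appears, at the price of silently undercounting. A script could instead flag the total as partial.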
Current workaround (and why it's bad)
Parsing ~/.copilot/session-state/<id>/events.jsonl for assistant.message.outputTokens + model per turn. This is fragile (race conditions with the writer), incomplete (input/cache tokens are NOT logged per call, only aggregated at session.shutdown), and forces every status line author to reinvent the same incremental-parsing + caching machinery.
Related
Companion request for total_files_modified: #3404