Context
The CLI currently relies on provider/model streaming APIs to keep producing events or eventually fail. If a provider connection stalls mid-stream without closing or raising an error, the session can remain busy indefinitely and automation wrappers may appear hung.
This came up while reviewing a local exploratory patch that added an AICTRL_MODEL_STREAM_IDLE_TIMEOUT_MS guard in SessionProcessor, but we are not shipping that patch yet because timeout behavior should be designed at the execution/runtime boundary.
Problem / Goal
Add a deliberate model-stream timeout strategy so stalled provider streams are surfaced as explicit failures instead of leaving sessions running forever.
Success means the CLI has a clear, configurable timeout/cancellation policy for model stream reads, emits useful failure information, and does not accidentally abort legitimate long-running reasoning/model calls that are still making progress.
Proposed Approach
Investigate the right layer for timeout handling, likely around LLM.stream / SessionProcessor, and define whether the guard should be:
- idle-time based: abort when no stream events arrive for a configurable interval
- wall-clock based: abort after a maximum total model turn duration
- provider-specific: only enabled for known problematic providers/transports
- surfaced through config/env and JSON error events
Avoid baking in a hidden global timeout until the policy is explicit.
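To make the idle-time and wall-clock options above concrete, here is a minimal sketch of a guard that wraps any async iterable of stream events. It is not the exploratory patch and not a proposed final design: `withStreamGuard`, `StreamGuardOptions`, and `StreamTimeoutError` are hypothetical names, and the sketch assumes the model stream can be consumed as an `AsyncIterable` (as reads from stream.fullStream suggest).

```typescript
// Hypothetical names throughout; nothing here is existing CLI API.
type TimeoutKind = "idle" | "wall_clock"

class StreamTimeoutError extends Error {
  kind: TimeoutKind
  limitMs: number
  constructor(kind: TimeoutKind, limitMs: number) {
    super(`model stream ${kind} timeout after ${limitMs}ms`)
    this.name = "StreamTimeoutError"
    this.kind = kind
    this.limitMs = limitMs
  }
}

interface StreamGuardOptions {
  idleMs?: number // max gap between consecutive events
  wallClockMs?: number // max total duration of the model turn
}

async function* withStreamGuard<T>(
  source: AsyncIterable<T>,
  opts: StreamGuardOptions,
): AsyncGenerator<T> {
  const idleMs = opts.idleMs ?? Infinity
  const wallClockMs = opts.wallClockMs ?? Infinity
  const iterator = source[Symbol.asyncIterator]()
  const startedAt = Date.now()
  try {
    while (true) {
      const wallLeft = wallClockMs - (Date.now() - startedAt)
      // A stream that is still producing events can only trip the wall-clock limit.
      if (wallLeft <= 0) throw new StreamTimeoutError("wall_clock", wallClockMs)

      const kind: TimeoutKind = idleMs <= wallLeft ? "idle" : "wall_clock"
      const waitMs = Math.min(idleMs, wallLeft)
      const next = iterator.next()

      let result: IteratorResult<T>
      if (waitMs === Infinity) {
        // No limit configured: behave exactly like the unguarded stream.
        result = await next
      } else {
        let timer: ReturnType<typeof setTimeout> | undefined
        const timeout = new Promise<never>((_, reject) => {
          const limit = kind === "idle" ? idleMs : wallClockMs
          timer = setTimeout(() => reject(new StreamTimeoutError(kind, limit)), waitMs)
        })
        try {
          // The idle timer restarts on every event because it is re-armed per read.
          result = await Promise.race([next, timeout])
        } finally {
          clearTimeout(timer)
        }
      }
      if (result.done) return
      yield result.value
    }
  } finally {
    // Best-effort cleanup; not awaited, because a stalled source may never settle.
    iterator.return?.()?.catch(() => {})
  }
}
```

A caller would iterate `withStreamGuard(events, { idleMs })` in place of the raw stream and map a caught `StreamTimeoutError` to whatever error event the design settles on. The idle limit could be read from AICTRL_MODEL_STREAM_IDLE_TIMEOUT_MS or from config. With neither limit set the wrapper is a pass-through, which matches the point above about not introducing a hidden global timeout. Note that this sketch only stops reading; whether the underlying provider request is also aborted is a separate design question.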
User Story
As a developer running aictrl run in CI or automation,
I want stalled model streams to fail with a clear timeout error,
So that jobs do not hang forever and can retry or alert correctly.
Acceptance Criteria
Out of Scope
- The separate structured session_error parity change for existing session.error events.
- Provider-specific retry/fallback policy after a timeout.
- Shipping the current exploratory local patch as-is without a design pass.
Roadmap Alignment
- Pillar: EXEC
- Quarter: Q2 2026
- Theme fit: Supports executor observability and operational reliability for automation workflows.
- Decision gate impact: indirect — improves reliability and debuggability of headless execution.
References
- Closest existing milestone: Enterprise Observability
- Related code areas: packages/cli/src/session/processor.ts, packages/cli/src/session/llm.ts, packages/cli/src/cli/cmd/run.ts
- Local exploratory patch considered: AICTRL_MODEL_STREAM_IDLE_TIMEOUT_MS wrapping reads from stream.fullStream