Skip to content

Python: Fix RUN_FINISHED.interrupt to accumulate all interrupts when multiple tools need approval#4717

Open
moonbox3 wants to merge 3 commits intomicrosoft:mainfrom
moonbox3:agent/fix-4590-1
Open

Python: Fix RUN_FINISHED.interrupt to accumulate all interrupts when multiple tools need approval#4717
moonbox3 wants to merge 3 commits intomicrosoft:mainfrom
moonbox3:agent/fix-4590-1

Conversation

@moonbox3
Copy link
Contributor

Motivation and Context

When multiple tool calls required approval in the same turn, flow.interrupts was being reassigned with a single-element list for each approval request, causing only the last interrupt to appear in the RUN_FINISHED event. This meant clients could not see or approve earlier tool calls.

Fixes #4590

Description

The root cause was that both _emit_approval_request in _run_common.py and the confirmation handling in _agent_run.py used flow.interrupts = [...] (list reassignment) instead of appending to the existing list. The fix changes these assignments to flow.interrupts.append(...) so that each approval request is accumulated. A new test verifies that three consecutive approval requests result in three distinct interrupts in flow.interrupts.

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Note: PR autogenerated by moonbox3's agent

…osoft#4590)

Change flow.interrupts assignment to append so that all interrupt entries
accumulate when multiple tools require approval in a single turn.

Both _run_common.py and _agent_run.py used assignment (=) which caused
each new interrupt to overwrite the previous one. Switching to append()
ensures RUN_FINISHED.interrupt contains all pending approvals.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@moonbox3 moonbox3 self-assigned this Mar 16, 2026
Copilot AI review requested due to automatic review settings March 16, 2026 03:52
@markwallace-microsoft
Copy link
Member

markwallace-microsoft commented Mar 16, 2026

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
packages/ag-ui/agent_framework_ag_ui
   _agent_run.py4754989%148–155, 194–195, 202, 299, 311, 315, 317–318, 334, 361–362, 380–384, 492–494, 506–508, 612, 620, 729, 731–732, 785, 800–801, 808, 873, 896, 904, 906, 914, 967, 970, 980–981, 988
   _run_common.py2031393%263–265, 290–291, 297–302, 405–406
TOTAL23998264288% 

Python Unit Test Overview

Tests Skipped Failures Errors Time
5223 20 💤 0 ❌ 0 🔥 1m 24s ⏱️

Copy link
Contributor Author

@moonbox3 moonbox3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 4 | Confidence: 83%

✗ Correctness

The diff changes flow.interrupts = [...] (list assignment) to flow.interrupts.append(...) in two locations, enabling accumulation of multiple interrupts per turn. The logic is sound and the test correctly validates the new behavior. However, the diff does not show a corresponding change to the FlowState dataclass to add an interrupts field with default_factory=list. The installed version of FlowState has no interrupts field — the old assignment (= [...]) worked because it dynamically created the attribute, but .append() will raise AttributeError if the attribute doesn't already exist. If this field was added in another commit in the same PR, the change is correct; otherwise this is a blocking bug.

✓ Security Reliability

The diff changes flow.interrupts from list replacement (= [...]) to list append (.append(...)) in two locations, enabling accumulation of multiple interrupt entries per turn. A corresponding test validates the new behavior. The change is straightforward and correct, with no security issues. The only reliability consideration is that flow.interrupts must be pre-initialized as a list (e.g., via field(default_factory=list)) on FlowState for .append() to succeed on the first call — the old assignment-based code would have worked even without prior initialization. Since the FlowState definition change is not included in this diff, this is worth confirming but the new test exercising FlowState(message_id="msg-1") implies the field is properly defined.

✓ Test Coverage

The diff changes flow.interrupts from assignment (= [...]) to append (.append(...)) in two locations: _run_common.py (_emit_approval_request) and _agent_run.py (run_agent_stream). A new test covers the accumulation behavior for _emit_approval_request, but no test covers the identical change in _agent_run.py's streaming path. The added test is well-structured with meaningful assertions (checks count and interrupt IDs), but only exercises one of the two modified code paths.

✓ Design Approach

The change fixes a real bug: every call to _emit_approval_request was overwriting flow.interrupts with a new single-element list instead of accumulating entries, meaning only the last approval request in a turn was ever surfaced. The fix (.append()) is the correct approach and aligns with the list-accumulation pattern implied by flow.interrupts being a list. The accompanying test directly validates the multi-interrupt accumulation scenario. No design issues are present — this is a straightforward and correct targeted fix with no leaky abstractions, fragile assumptions, or symptom-level masking.

Flagged Issues

  • FlowState dataclass does not appear to define an interrupts field in this diff. The old flow.interrupts = [...] dynamically created the attribute via assignment, but .append() requires it to already exist. If FlowState was not updated elsewhere in this PR to include interrupts: list[dict[str, Any]] = field(default_factory=list), both call sites will raise AttributeError on first invocation.

Suggestions

  • Ensure FlowState declares interrupts: list[dict[str, Any]] = field(default_factory=list) so that .append() works on a freshly constructed instance. If this was added in a separate commit not shown in this diff, confirm it is present.
  • Consider whether flow.interrupts should be cleared between turns or agent runs to prevent stale interrupt data from carrying over. Since FlowState represents state for a single AG-UI run this is likely safe, but worth verifying the lifecycle.
  • Add a test exercising the run_agent_stream code path in _agent_run.py (line 924) to verify that multiple approval requests in a streaming run also accumulate in flow.interrupts. Currently only the _emit_approval_request helper in _run_common.py is covered, so a regression in the streaming path would go undetected.

Automated review by moonbox3's agents

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes AG-UI’s interrupt metadata so that when multiple tool calls require approval in the same agent turn, all approval interrupts are preserved and surfaced in RUN_FINISHED.interrupt (instead of only the last one), enabling clients to see/approve every pending tool call.

Changes:

  • Update _emit_approval_request to append to flow.interrupts rather than overwriting it.
  • Update the predictive-tool confirmation path in run_agent_stream to append additional interrupts rather than overwriting.
  • Add a unit test asserting that multiple approval requests accumulate into multiple flow.interrupts entries.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
python/packages/ag-ui/agent_framework_ag_ui/_run_common.py Accumulates approval interrupts by appending to FlowState.interrupts.
python/packages/ag-ui/agent_framework_ag_ui/_agent_run.py Accumulates multiple confirmation interrupts during agent streaming runs.
python/packages/ag-ui/tests/ag_ui/test_run.py Adds regression test ensuring multiple approval requests produce multiple interrupts.

You can also share your feedback on Copilot code review. Take the survey.

Copilot and others added 2 commits March 16, 2026 03:57
…icrosoft#4590)

Add integration test exercising run_agent_stream with multiple predictive
tool calls requiring confirmation. Verifies that flow.interrupts.append()
correctly accumulates all interrupt entries and they appear in the
RUN_FINISHED event.

Also confirms FlowState already declares interrupts field with
default_factory=list, addressing the AttributeError concern from review.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Contributor Author

@moonbox3 moonbox3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 4 | Confidence: 97% | Result: All clear

Reviewed: Correctness, Security Reliability, Test Coverage, Design Approach


Automated review by moonbox3's agents

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Python: [Bug]: RUN_FINISHED.interrupt Only Contains the Last Interrupt When Multiple Tools Need Approval

5 participants