Python: stabilize executor lifecycle ordering in run results #4654
davidahmann wants to merge 1 commit into microsoft:main from
This PR stabilizes non-streaming workflow receipts by canonicalizing executor lifecycle event order within superstep windows, so repeated equivalent runs produce reproducible traces. The code change is narrowly scoped to workflow result finalization plus one regression test that repeats a parallel fan-out run and asserts stable lifecycle signatures. Validation:
Inspired by research context: CAISI publishes independent, reproducible AI agent governance research: https://caisi.dev
) -> list[WorkflowEvent]:
    """Normalize executor lifecycle ordering in non-streaming results.
    Concurrent fan-out paths can enqueue executor lifecycle events in scheduler-dependent
Hi @davidahmann,
Could you explain what you mean by scheduler-dependent order?
The order in which the executors run within a superstep is undetermined. If we artificially change the event order that is coming out, it won't reflect the true sequence of execution.
By scheduler-dependent order I mean the queue arrival order of executor_invoked / executor_completed events from concurrent executor coroutines inside one superstep, not a meaningful causal order of executor execution.
You are right that there is no single "true" total order within a superstep. The reason I narrowed this PR to non-streaming run() results is that today those receipts already reflect whichever coroutine happened to reach ctx.add_event() / ctx.next_event() first, which is an implementation artifact rather than a stable semantic signal. This patch does not change stream=True; it only canonicalizes the finalized non-streaming receipt after collection so repeated equivalent runs are comparable.
Evidence packet: current branch c2e571ff, Python workflow path at python/packages/core/agent_framework/_workflows/_workflow.py; repeated fan-out runs over the same topology can swap the executor_b / executor_c lifecycle slots without any change in outputs or superstep boundaries. If preserving observed arrival order in non-streaming run() is the intended contract, I should drop this instead of normalizing it.
Yes, even in non-streaming run, the order in which the executors are run in a superstep is still non-deterministic.
7777d88 to c2e571f
Motivation and Context
Concurrent fan-out paths can emit executor lifecycle events in scheduler-dependent order, which makes repeated non-streaming workflow traces harder to compare. This PR makes non-streaming run results deterministic for lifecycle ordering and adds regression coverage.
Refs #4653.
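The scheduler-dependent arrival order described here can be reproduced in isolation. The following is a minimal sketch using plain asyncio, not the framework's code: `_executor`, `run_fanout`, and the tuple-shaped events are hypothetical stand-ins, and the per-executor delays are an assumption used to force the two interleavings that a real scheduler might produce on its own.

```python
import asyncio


async def _executor(executor_id: str, delay: float, events: list) -> None:
    # Hypothetical stand-in for an executor: record lifecycle events in a shared list.
    events.append(("executor_invoked", executor_id))
    await asyncio.sleep(delay)  # simulated work; completion order depends on this
    events.append(("executor_completed", executor_id))


async def run_fanout(delays: dict) -> list:
    # Run all executors concurrently, as one fan-out superstep would.
    events: list = []
    await asyncio.gather(
        *(_executor(executor_id, delay, events) for executor_id, delay in delays.items())
    )
    return events


if __name__ == "__main__":
    # Same topology, different timing: the completion slots swap.
    print(asyncio.run(run_fanout({"executor_b": 0.05, "executor_c": 0.0})))
    print(asyncio.run(run_fanout({"executor_b": 0.0, "executor_c": 0.05})))
```

Both runs contain the same set of events; only the position of the `executor_completed` entries differs, which is the kind of difference the reviewers are debating whether to normalize.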
Description
Update Workflow._finalize_events to canonicalize executor lifecycle ordering (executor_invoked, executor_completed, executor_failed) by executor_id within each superstep window.
Add test_parallel_executor_lifecycle_order_is_deterministic_in_run_results to verify repeated fan-out runs produce stable lifecycle signatures.
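The canonicalization described above could look roughly like the sketch below. This is not the PR's actual `_finalize_events` implementation, which is not shown here. The `Event` dataclass, the `canonicalize` function, the `superstep_started` / `superstep_completed` boundary names, and the tie-break that places `executor_invoked` before `executor_completed` / `executor_failed` for the same executor are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Lifecycle event types named in the description; the rank is an assumed tie-break.
LIFECYCLE_RANK = {"executor_invoked": 0, "executor_completed": 1, "executor_failed": 1}
# Assumed names for the events that delimit a superstep window.
BOUNDARIES = ("superstep_started", "superstep_completed")


@dataclass(frozen=True)
class Event:
    # Hypothetical minimal event shape, not the framework's WorkflowEvent.
    type: str
    executor_id: Optional[str] = None


def canonicalize(events: list[Event]) -> list[Event]:
    """Sort lifecycle events by executor_id inside each superstep window."""
    result = list(events)
    window: list[int] = []  # indices of lifecycle events in the current window

    def flush() -> None:
        # Reorder only the lifecycle slots; other events keep their positions.
        ordered = sorted(
            (result[i] for i in window),
            key=lambda e: (e.executor_id or "", LIFECYCLE_RANK[e.type]),
        )
        for slot, event in zip(window, ordered):
            result[slot] = event
        window.clear()

    for index, event in enumerate(events):
        if event.type in LIFECYCLE_RANK:
            window.append(index)
        elif event.type in BOUNDARIES:
            flush()
    flush()  # handle a trailing window with no closing boundary
    return result
```

Under these assumptions, two runs that differ only in the arrival order of `executor_b` and `executor_c` events within one superstep produce the same canonical list. As the review thread points out, the result no longer records which executor actually finished first.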