Skip to content

fix: auto-start workers in context manager; warn on RUNNING+failed tasks#391

Open
nthmost-orkes wants to merge 4 commits intomainfrom
fix/critical-sdk-behaviors
Open

fix: auto-start workers in context manager; warn on RUNNING+failed tasks#391
nthmost-orkes wants to merge 4 commits intomainfrom
fix/critical-sdk-behaviors

Conversation

@nthmost-orkes
Copy link
Copy Markdown
Contributor

Summary

Test plan

  • TestTaskHandlerContextManager::test_context_manager_enter — verifies with handler as h: h is handler
  • TestTaskHandlerContextManager::test_context_manager_exit — verifies stop_processes() is called on exit (updated to patch Process since __enter__ now starts workers)
  • Full unit suite: python3 -m pytest tests/unit/ — 440 passed

🤖 Generated with Claude Code

nthmost-orkes and others added 2 commits March 26, 2026 16:25
… tasks

- TaskHandler.__enter__ now calls start_processes() so `with TaskHandler(...)
  as h:` works out of the box without a separate h.start_processes() call.
  start_processes() is idempotent via _processes_started guard, so existing
  code that calls it explicitly inside the with-block is unaffected.
  Fixes conductor-oss/getting-started#42.

- WorkflowExecutor.execute() and execute_workflow() now log a WARNING when
  the workflow returns RUNNING after the wait timeout but a task is already
  in FAILED/FAILED_WITH_TERMINAL_ERROR state, surfacing the failure reason
  that would otherwise be invisible to the caller.
  Fixes conductor-oss/getting-started#41.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The conductor_server.sh one-liner downloaded a 606 MB JAR to the
current directory with no progress indicator and ran in the foreground.
Replace it with the `conductor server start` CLI approach, which handles
download location, caching, and UX correctly.

Fixes conductor-oss/getting-started#47

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
logger.info("TaskHandler initialized")

def __enter__(self):
self.start_processes()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are the implications of auto starting the process?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question -- mostly pros in my reckoning but i'll take a second look.

  1. Fixes the context manager contract. with-as-no-op is a Python anti-pattern. File handles, locks, threads, subprocess all acquire on enter. TaskHandler was the odd one.
  2. Fixes a real user bug — Python SDK: TaskHandler context manager silently starts 0 workers if start_processes() is not called explicitly getting-started#42. Users wrote with TaskHandler(...) as h: expecting it to run and got silence.
  3. Idempotency guard preserves every existing caller. Anyone doing with h: h.start_processes() is unaffected — second call is a no-op.
  4. stop_processes() resets the flag, so stop → restart works. (Onboard repo to PyPI (Python Package Index) #3 from last turn is handled.)

I'll go look at some things that come to mind... will update

If start_processes() raised partway through spawning workers, the flag
was already True, causing a subsequent retry call to be a silent no-op.
Moving the assignment to the end means a failed start remains retryable.
Python skips __exit__ when __enter__ raises, so an exception mid-startup
would leak any workers already spawned. Wrap the call so stop_processes()
runs before the exception propagates.
@nthmost-orkes
Copy link
Copy Markdown
Contributor Author

Changes prompted by this review:

Thinking through the implications surfaced two edge cases worth hardening, both now pushed as follow-up commits:

  • Partial-failure cleanup in __enter__ — Python skips __exit__ when __enter__ raises, so if start_processes() bailed mid-spawn, any already-started workers would leak. __enter__ now calls stop_processes() before re-raising.
  • Flag-set ordering_processes_started = True now runs after a successful start rather than before, so a failed start remains retryable instead of silently no-op'ing on the next attempt.

For the main behavior change itself:

  • Backward compat is preserved — the idempotency guard means existing code that calls start_processes() explicitly inside a with block is unaffected.
  • stop_processes() resets the flag, so stop → restart cycles still work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants