Skip to content

fix: singleton WorkflowService + mark task failed on load error#153

Open
devteamaegis wants to merge 1 commit into
browser-use:mainfrom
devteamaegis:fix/workflow-service-singleton-and-load-failure-status
Open

fix: singleton WorkflowService + mark task failed on load error#153
devteamaegis wants to merge 1 commit into
browser-use:mainfrom
devteamaegis:fix/workflow-service-singleton-and-load-failure-status

Conversation

@devteamaegis
Copy link
Copy Markdown

@devteamaegis devteamaegis commented May 26, 2026

What

Two independent logic bugs in workflows/backend/:

Bug 1 — get_service() creates a new instance on every request

# before
def get_service() -> WorkflowService:
    return WorkflowService()   # fresh empty state every call!

active_tasks, workflow_tasks, and cancel_events are stored on the instance.
A POST /execute request builds service A and stores the task there.
Any follow-up GET /tasks/{id}/status or POST /tasks/{id}/cancel builds service B (empty) and always returns 404 / "Task not found" — the task status and cancel endpoints are effectively broken.

Fix: promote to a module-level singleton.

Bug 2 — load failure leaves task stuck as 'running'

# before
except Exception as e:
    print(f'Error loading workflow: {e}')
    return   # active_tasks[task_id].status stays 'running' forever

If Workflow.load_from_file raises (corrupt file, missing deps, schema error), the task is never marked 'failed'. Callers poll forever seeing status: running with no error details. The message also went to stdout instead of the structured log.

Fix: set status = 'failed', populate error, write to the log file.

Tests

workflows/backend/tests/test_service_singleton_and_load_failure.py — 6 unit tests, no browser/LLM required:

Test What it checks
test_returns_workflow_service_instance get_service() returns a WorkflowService
test_same_instance_on_repeated_calls second call returns the exact same object
test_task_state_survives_across_calls task written in call 1 visible in call 2
test_singleton_reset_creates_new_instance sanity — reset gives a new object
test_load_failure_sets_status_to_failed status is 'failed' after load raises
test_load_failure_logs_error error message appears in the log
PASSED  test_returns_workflow_service_instance
PASSED  test_same_instance_on_repeated_calls
PASSED  test_task_state_survives_across_calls
PASSED  test_singleton_reset_creates_new_instance
PASSED  test_load_failure_sets_status_to_failed
PASSED  test_load_failure_logs_error
6 passed in 0.XX s

Summary by cubic

Fixes two backend bugs: make WorkflowService a module-level singleton and mark tasks as failed (with logged errors) when workflow loading fails. This restores persistent task state across requests and prevents tasks from getting stuck as running.

  • Bug Fixes
    • get_service() now returns a shared WorkflowService instance, so active_tasks, workflow_tasks, and cancel_events persist and status/cancel/log endpoints find tasks instead of 404.
    • If Workflow.load_from_file throws, set task status to 'failed', populate error, and write the message to the task log (no stdout), so clients stop polling and see the failure reason.

Written for commit 10fc4e1. Summary will update on new commits. Review in cubic

Two logic bugs in the workflow backend:

1. get_service() created a *new* WorkflowService instance on every HTTP
   request.  Because active_tasks / workflow_tasks / cancel_events are
   stored on the instance, any follow-up request (status poll, cancel,
   log fetch) hit a fresh empty service and got 404 / "Task not found".
   Fixed by promoting the instance to a module-level singleton.

2. When Workflow.load_from_file raised an exception the handler called
   print() and returned without updating active_tasks[task_id].status,
   leaving the task permanently stuck as "running". The caller could
   never know the workflow failed to start. Fixed by setting status to
   "failed", populating the error field, and routing the message through
   the structured log instead of stdout.
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 3 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="workflows/backend/tests/test_service_singleton_and_load_failure.py">

<violation number="1" location="workflows/backend/tests/test_service_singleton_and_load_failure.py:64">
P2: Module-level mocking of `aiofiles` and `yaml` leaks stubbed modules into the shared test process and can break unrelated tests.</violation>
</file>

Reply with feedback, questions, or to request a fix.

Fix all with cubic | Re-trigger cubic

@@ -0,0 +1,222 @@
"""
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot May 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Module-level mocking of aiofiles and yaml leaks stubbed modules into the shared test process and can break unrelated tests.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At workflows/backend/tests/test_service_singleton_and_load_failure.py, line 64:

<comment>Module-level mocking of `aiofiles` and `yaml` leaks stubbed modules into the shared test process and can break unrelated tests.</comment>

<file context>
@@ -0,0 +1,222 @@
+_aiofiles_open_cm.__aenter__ = AsyncMock(return_value=_aiofiles_file)
+_aiofiles_open_cm.__aexit__ = AsyncMock(return_value=False)
+_aiofiles_mock.open = MagicMock(return_value=_aiofiles_open_cm)
+sys.modules["aiofiles"] = _aiofiles_mock
+
+sys.modules["yaml"].safe_load = MagicMock(return_value={})
</file context>
Fix with Cubic

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant