Cooperative timeout context manager--the deadline-scheduler approach #45

Open
blhsing wants to merge 7 commits into master from feature/cooperative-timeout

blhsing commented Jun 24, 2026

In response to the discourse thread:
https://discuss.python.org/t/a-lightweight-cooperative-timeout-mechanism-for-synchronous-code-with-timeout-seconds/107811

I tried a different implementation strategy that avoids keeping the eval breaker hot for the whole duration of an active timeout block. I am calling it the deadline-scheduler approach below, because the main difference is that a scheduler thread watches deadlines and only notifies the eval loop once a deadline has actually expired.

The original prototype from that thread (the "OP approach" in the table below) sets a timeout bit while inside with timeout(...), so the eval loop enters pending handling at every eval-breaker check while the timeout is active. The actual clock read can be amortized with a skip counter, but _Py_HandlePending() is still reached repeatedly. That appears to be the source of the active-timeout overhead discussed in that thread.
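
To make the cost structure concrete, here is a rough pure-Python model of that behaviour. It is only a sketch: `PollingTimeoutModel`, its `checkpoint()` method, and the skip period of 100 are hypothetical names and values chosen for illustration, not anything taken from the prototype.

```python
class PollingTimeoutModel:
    """Toy model of a timeout that keeps the eval-breaker bit set while active."""

    def __init__(self, skip=100):
        self.skip = skip            # hypothetical skip-counter period
        self.handler_entries = 0    # how often the pending handler is reached
        self.clock_reads = 0        # how often the clock is actually read
        self._countdown = skip

    def checkpoint(self, now, deadline):
        # The bit stays set for the whole block, so every checkpoint
        # enters the handler, whether or not the deadline is near.
        self.handler_entries += 1
        self._countdown -= 1
        if self._countdown > 0:
            return                  # the skip counter avoids the clock read
        self._countdown = self.skip
        self.clock_reads += 1
        if now() >= deadline:
            raise TimeoutError("deadline expired")


model = PollingTimeoutModel(skip=100)
for _ in range(10_000):
    model.checkpoint(now=lambda: 0.0, deadline=float("inf"))
print(model.handler_entries, model.clock_reads)
```

In this model the skip counter reduces clock reads by a factor of `skip`, but the handler entry count still equals the number of checkpoints, which is the part the deadline-scheduler approach tries to remove.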

The deadline-scheduler prototype keeps the eval loop on its normal fast path until the timeout actually expires:

  • with timeout(seconds) pushes a deadline block onto the current PyThreadState.
  • A runtime scheduler thread tracks active deadline blocks.
  • While the timeout has not expired, no timeout eval-breaker bit is set.
  • When a deadline expires, the scheduler sets _PY_TIMEOUT_EXPIRED_BIT on the target thread state.
  • The target thread raises TimeoutError at the next normal eval-breaker check.

So pure Python code is still interrupted cooperatively at bytecode/eval-breaker boundaries, but an active, non-expired timeout does not force pending handling on every bytecode/check point.
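
The flow above can be modelled in pure Python. This is a sketch under loud assumptions: `DeadlineScheduler`, `ModelTimeout`, and `checkpoint()` are made-up names, a `threading.Event` stands in for _PY_TIMEOUT_EXPIRED_BIT, and the explicit `checkpoint()` call stands in for the eval-breaker check that the real prototype gets from the interpreter. It does not model the per-thread stack of nested deadline blocks.

```python
import heapq
import threading
import time


class DeadlineScheduler:
    """Toy model: one watcher thread sleeping until the earliest deadline."""

    def __init__(self):
        self._cond = threading.Condition()
        self._heap = []   # entries: (deadline, sequence number, block)
        self._seq = 0     # tie-breaker so blocks themselves are never compared
        threading.Thread(target=self._run, daemon=True).start()

    def register(self, block):
        with self._cond:
            self._seq += 1
            heapq.heappush(self._heap, (block.deadline, self._seq, block))
            self._cond.notify()   # the earliest deadline may have changed

    def unregister(self, block):
        with self._cond:
            block.cancelled = True   # removed lazily by the watcher
            self._cond.notify()

    def _run(self):
        with self._cond:
            while True:
                # Drop blocks that exited before their deadline.
                while self._heap and self._heap[0][2].cancelled:
                    heapq.heappop(self._heap)
                if not self._heap:
                    self._cond.wait()
                    continue
                deadline, _, block = self._heap[0]
                remaining = deadline - time.monotonic()
                if remaining > 0:
                    self._cond.wait(remaining)
                    continue
                heapq.heappop(self._heap)
                block.expired.set()   # stands in for setting the expired bit


SCHEDULER = DeadlineScheduler()


class ModelTimeout:
    """Hypothetical stand-in for the timeout(seconds) context manager."""

    def __init__(self, seconds):
        self.seconds = seconds
        self.cancelled = False
        self.expired = threading.Event()

    def __enter__(self):
        self.deadline = time.monotonic() + self.seconds
        SCHEDULER.register(self)
        return self

    def __exit__(self, exc_type, exc, tb):
        SCHEDULER.unregister(self)
        return False

    def checkpoint(self):
        # Cheap flag test; nothing else happens until the watcher sets it.
        if self.expired.is_set():
            raise TimeoutError("deadline expired")
```

A worker would use it as `with ModelTimeout(0.05) as block:` and call `block.checkpoint()` inside its loop. The point of the design is that the worker never reads the clock itself: all deadline bookkeeping lives in the watcher thread, and the worker only tests a flag.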

CI passed here: https://github.com/blhsing/cpython/actions/runs/28073335858

Benchmark environment:

  • Linux 6.14, x86_64
  • Intel Xeon Silver 4410Y
  • GCC 13.3.0, -O3
  • CPython 3.16.0a0
  • pyperf
  • timeout_seconds=3600.0, so the timeout never expires during the benchmark
  • inner_loops=1000

Mean ± std dev:

| Benchmark | Baseline | OP approach | Deadline-scheduler approach |
|---|---|---|---|
| pass_loop | 22.9 us ± 0.1 us | 33.3 us ± 0.2 us, 1.45x slower | 22.2 us ± 0.1 us, not meaningfully different |
| arithmetic_loop | 28.5 us ± 0.2 us | 37.8 us ± 0.2 us, 1.33x slower | 25.5 us ± 0.2 us, not meaningfully different |
| listcomp_work | 35.5 us ± 0.7 us | 46.5 us ± 0.4 us, 1.31x slower | 35.8 us ± 0.7 us, not significant |
| empty timeout context enter/exit | 342 ns ± 2 ns, nullcontext baseline | 1.06 us ± 0.01 us | 1.73 us ± 0.05 us |

The small "faster than baseline" results for the deadline-scheduler approach are just build/run noise, not a claimed speedup. The same-binary no-timeout control for the deadline-scheduler build measured pass_loop at 22.1 us ± 0.1 us, arithmetic_loop at 25.4 us ± 0.1 us, and listcomp_work at 35.9 us ± 0.7 us, so the active timeout overhead is effectively lost in noise for these workloads.
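
As a quick sanity check of the "lost in noise" reading, the snippet below compares the active-timeout means against the same-binary control using the figures quoted in this comment. The comparison rule (difference of means no larger than the combined standard deviation) is a crude heuristic assumed here for illustration; it is not the significance test pyperf applies.

```python
import math

# (mean, std dev) in microseconds, copied from the figures quoted above.
active = {
    "pass_loop": (22.2, 0.1),
    "arithmetic_loop": (25.5, 0.2),
    "listcomp_work": (35.8, 0.7),
}
control = {
    "pass_loop": (22.1, 0.1),
    "arithmetic_loop": (25.4, 0.1),
    "listcomp_work": (35.9, 0.7),
}


def within_noise(a, b):
    """True if the means differ by no more than the combined std dev."""
    (mean_a, sd_a), (mean_b, sd_b) = a, b
    return abs(mean_a - mean_b) <= math.hypot(sd_a, sd_b)


results = {name: within_noise(active[name], control[name]) for name in active}
for name, ok in results.items():
    print(f"{name}: within noise = {ok}")
```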

The tradeoff is visible in the empty enter/exit benchmark: the scheduler-based approach pays more to register and unregister a timeout block. For very tiny timeout scopes, the OP approach is cheaper. For timeout scopes around non-trivial Python work, avoiding repeated pending handling while the timeout is active seems to dominate.
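
A back-of-the-envelope estimate of where the two approaches cross over, using only the numbers reported here. It assumes, purely for illustration, that the OP approach's slowdown scales linearly with the amount of work inside the block and that the deadline-scheduler approach adds no per-work overhead; neither assumption was measured directly.

```python
def break_even_work_us(baseline_us, op_us,
                       op_enter_exit_us=1.06, sched_enter_exit_us=1.73):
    """Baseline work (in us) at which both approaches cost the same.

    Model: OP cost        = op_enter_exit + work * (op_us / baseline_us)
           scheduler cost = sched_enter_exit + work
    """
    op_overhead_per_us = op_us / baseline_us - 1.0
    extra_setup_us = sched_enter_exit_us - op_enter_exit_us
    return extra_setup_us / op_overhead_per_us


for name, baseline_us, op_us in [
    ("pass_loop", 22.9, 33.3),
    ("arithmetic_loop", 28.5, 37.8),
    ("listcomp_work", 35.5, 46.5),
]:
    print(f"{name}: break-even near {break_even_work_us(baseline_us, op_us):.1f} us of work")
```

Under this simplified model the crossover lands at a few microseconds of baseline work, which is consistent with the claim that only very small timeout scopes favor the OP approach.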

For _sre, I made the engine cooperative by reusing its existing periodic signal-check hook. _sre already has a MAYBE_CHECK_SIGNALS path that runs every 4096 iterations of the matching engine and calls PyErr_CheckSignals(). The prototype adds:

if (_PyTimeout_CheckNow(_PyThreadState_GET())) {
    RETURN_ERROR(SRE_ERROR_INTERRUPTED);
}

at the same point. _PyTimeout_CheckNow() only raises when the current thread has an expired timeout block. If no timeout is active, or if the active timeout has not expired, it returns immediately. If it has expired, it sets TimeoutError, _sre returns through its existing interrupted-error path, and the Python-level regex call propagates the timeout exception.

That means catastrophic backtracking can be interrupted without adding a public re-specific timeout parameter and without adding a separate polling mechanism to the regex engine. It also keeps the polling cadence tied to _sre's existing signal-check cadence instead of checking on every regex VM transition.
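
The cadence described above can be illustrated with a small Python model. It is not the _sre code: `run_engine` and its `expired` callback are hypothetical, with `expired()` standing in for _PyTimeout_CheckNow() and the loop body standing in for one iteration of the matching engine.

```python
CHECK_INTERVAL = 4096  # cadence of the existing signal-check hook, per the text


def run_engine(steps, expired):
    """Run `steps` engine iterations, polling `expired()` once per interval.

    Returns the number of polls performed, or raises TimeoutError as soon
    as a poll reports that the timeout has expired.
    """
    polls = 0
    for i in range(1, steps + 1):
        # ... one iteration of matching work would happen here ...
        if i % CHECK_INTERVAL == 0:
            polls += 1
            if expired():
                raise TimeoutError(f"interrupted after {i} steps")
    return polls


# With no expiry, the hook costs one cheap call per 4096 iterations.
print(run_engine(10 * CHECK_INTERVAL + 5, expired=lambda: False))
```

The same loop shows the latency side of the tradeoff: an expired timeout is noticed only at the next multiple of the interval, so in this model the worst-case delay is one interval's worth of engine iterations.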

This still has the usual async-exception caveat: a timeout in pure Python is delivered as a normal Python exception at an eval-breaker boundary, so finally blocks and context-manager exits run, but arbitrary Python code can still be interrupted between bytecodes just like with KeyboardInterrupt.
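
The cleanup behaviour can be observed today with an existing mechanism. The sketch below does not use the prototype; it swaps in the C API function PyThreadState_SetAsyncExc (called through ctypes) to deliver a TimeoutError to the current thread at an eval-breaker check, which is assumed here to be a close enough analogue. `Resource`, `inject_timeout_into_current_thread`, and `run` are illustrative names.

```python
import ctypes
import threading

events = []


class Resource:
    """Context manager that records when it is entered and exited."""

    def __enter__(self):
        events.append("enter")
        return self

    def __exit__(self, exc_type, exc, tb):
        events.append(f"exit:{exc_type.__name__ if exc_type else None}")
        return False   # do not swallow the exception


def inject_timeout_into_current_thread():
    # Schedules TimeoutError for this thread; the interpreter raises it
    # at a later eval-breaker check, not at a point the code chooses.
    ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(threading.get_ident()),
        ctypes.py_object(TimeoutError),
    )


def run():
    try:
        with Resource():
            try:
                inject_timeout_into_current_thread()
                for _ in range(10_000_000):   # interrupted between bytecodes
                    pass
                events.append("completed")
            finally:
                events.append("finally")
    except TimeoutError:
        events.append("caught")


run()
print(events)
```

The finally block and `__exit__` both run, but the loop has no say in which iteration is cut short. Code that must not be interrupted halfway through a multi-step update needs the same care it would need with KeyboardInterrupt.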


with self.assertRaises(TimeoutError):
    with timeout(0.01):
        re.fullmatch(r"(a+)+\Z", "a" * 100_000 + "!")