Cooperative timeout context manager--the deadline-scheduler approach #45
Open
blhsing wants to merge 7 commits into
```python
with self.assertRaises(TimeoutError):
    with timeout(0.01):
        re.fullmatch(r"(a+)+\Z", "a" * 100_000 + "!")
```
In response to the discourse thread:
https://discuss.python.org/t/a-lightweight-cooperative-timeout-mechanism-for-synchronous-code-with-timeout-seconds/107811
I tried a different implementation strategy that avoids keeping the eval breaker hot for the whole duration of an active timeout block. I am calling it the deadline-scheduler approach below, because the main difference is that a scheduler thread watches deadlines and only notifies the eval loop once a deadline has actually expired.
The current PR/prototype sets a timeout bit while inside `with timeout(...)`, so the eval loop enters pending-handling frequently while the timeout is active. The actual clock read can be amortized with a skip counter, but `_Py_HandlePending()` is still reached repeatedly. That appears to be the source of the active-timeout overhead discussed above.

The deadline-scheduler prototype keeps the eval loop on its normal fast path until the timeout actually expires:

- `with timeout(seconds)` pushes a deadline block onto the current `PyThreadState`.
- When a deadline expires, the scheduler thread sets `_PY_TIMEOUT_EXPIRED_BIT` on the target thread state.
- The target thread raises `TimeoutError` at the next normal eval-breaker check.

So pure Python code is still interrupted cooperatively at bytecode/eval-breaker boundaries, but an active, non-expired timeout does not force pending handling on every bytecode/check point.
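The scheduler model described above can be approximated in pure Python. This is only a toy sketch, not the prototype's C implementation: `ToyDeadlineScheduler`, `toy_timeout`, `checkpoint`, and `spin` are hypothetical names, a `threading.Event` stands in for the expired bit, and an explicit `checkpoint()` call stands in for the interpreter's eval-breaker check.

```python
import contextlib
import heapq
import itertools
import threading
import time


class ToyDeadlineScheduler:
    """Hypothetical model: one thread watches deadlines and flags expiry."""

    def __init__(self):
        self._heap = []  # entries: [deadline, seq, expired_event, active]
        self._counter = itertools.count()
        self._cond = threading.Condition()
        threading.Thread(target=self._run, daemon=True).start()

    def register(self, seconds):
        entry = [time.monotonic() + seconds, next(self._counter),
                 threading.Event(), True]
        with self._cond:
            heapq.heappush(self._heap, entry)
            self._cond.notify()  # wake the scheduler to re-read the earliest deadline
        return entry

    def unregister(self, entry):
        with self._cond:
            entry[3] = False  # dropped lazily once it reaches the heap top
            self._cond.notify()

    def _run(self):
        with self._cond:
            while True:
                # Discard entries whose timeout block has already exited.
                while self._heap and not self._heap[0][3]:
                    heapq.heappop(self._heap)
                if not self._heap:
                    self._cond.wait()
                    continue
                deadline, _, expired, _ = self._heap[0]
                remaining = deadline - time.monotonic()
                if remaining > 0:
                    self._cond.wait(remaining)
                    continue
                heapq.heappop(self._heap)
                expired.set()  # stand-in for setting the expired bit


_SCHEDULER = ToyDeadlineScheduler()


@contextlib.contextmanager
def toy_timeout(seconds):
    entry = _SCHEDULER.register(seconds)
    try:
        yield entry[2]  # the expiry flag for this block
    finally:
        _SCHEDULER.unregister(entry)


def checkpoint(expired):
    """Stand-in for the eval-breaker check: cheap unless the flag is set."""
    if expired.is_set():
        raise TimeoutError("deadline expired")


def spin(expired, limit=None):
    """Busy loop that reaches a checkpoint on every iteration."""
    count = 0
    while limit is None or count < limit:
        checkpoint(expired)
        count += 1
    return count
```

With this model, `with toy_timeout(0.05) as expired: spin(expired)` ends in `TimeoutError`, while a block with a far-off deadline only ever pays for the flag read at each checkpoint.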
CI passed here: https://github.com/blhsing/cpython/actions/runs/28073335858
Benchmark environment:

- `-O3`
- `pyperf`
- `timeout_seconds=3600.0`, so the timeout never expires during the benchmark
- `inner_loops=1000`

Mean ± std dev:

(Results table not preserved; its surviving labels are `pass_loop`, `arithmetic_loop`, `listcomp_work`, and a `nullcontext` baseline.)

The small "faster than baseline" results for the deadline-scheduler approach are just build/run noise, not a claimed speedup. The same-binary no-timeout control for the deadline-scheduler build measured `pass_loop` at 22.1 us ± 0.1 us, `arithmetic_loop` at 25.4 us ± 0.1 us, and `listcomp_work` at 35.9 us ± 0.7 us, so the active timeout overhead is effectively lost in noise for these workloads.

The tradeoff is visible in the empty enter/exit benchmark: the scheduler-based approach pays more to register and unregister a timeout block. For very tiny timeout scopes, the OP approach is cheaper. For timeout scopes around non-trivial Python work, avoiding repeated pending handling while the timeout is active seems to dominate.
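For anyone who wants to approximate the shape of this comparison, here is a rough sketch using the standard-library `timeit` module rather than `pyperf`. The workload names come from the description, but their bodies, the `ITERATIONS` constant, and the `run_inside` and `measure` helpers are assumptions made for illustration; the actual benchmark code may differ.

```python
import contextlib
import timeit

ITERATIONS = 1000  # assumed loop size, not taken from the benchmark


def pass_loop():
    for _ in range(ITERATIONS):
        pass


def arithmetic_loop():
    total = 0
    for i in range(ITERATIONS):
        total += i * 2
    return total


def listcomp_work():
    return [i * i for i in range(ITERATIONS)]


def run_inside(cm_factory, workload):
    # Enter a fresh context manager, then run the workload inside it.
    with cm_factory():
        return workload()


def measure(cm_factory, workload, number=2000, repeat=5):
    """Return the best observed time per call, in microseconds."""
    timings = timeit.repeat(lambda: run_inside(cm_factory, workload),
                            number=number, repeat=repeat)
    return min(timings) / number * 1e6
```

`measure(contextlib.nullcontext, pass_loop)` gives a no-timeout baseline. On a build that provides the prototype's `timeout`, passing `lambda: timeout(3600.0)` as `cm_factory` would measure the active, never-expiring case.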
For `_sre`, I made the engine cooperative by reusing its existing periodic signal-check hook. `_sre` already has a `MAYBE_CHECK_SIGNALS` path that runs every 4096 iterations of the matching engine and calls `PyErr_CheckSignals()`. The prototype adds a call to `_PyTimeout_CheckNow()` at the same point.

`_PyTimeout_CheckNow()` only raises when the current thread has an expired timeout block. If no timeout is active, or if the active timeout has not expired, it returns immediately. If it has expired, it sets `TimeoutError`, `_sre` returns through its existing interrupted-error path, and the Python-level regex call propagates the timeout exception.

That means catastrophic backtracking can be interrupted without adding a public `re`-specific timeout parameter and without adding a separate polling mechanism to the regex engine. It also keeps the polling cadence tied to `_sre`'s existing signal-check cadence instead of checking on every regex VM transition.

This still has the usual async-exception caveat: a timeout in pure Python is delivered as a normal Python exception at an eval-breaker boundary, so `finally` blocks and context-manager exits run, but arbitrary Python code can still be interrupted between bytecodes just like with `KeyboardInterrupt`.
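The polling cadence and the cleanup guarantee can be illustrated with a small pure-Python model. This is a sketch under stated assumptions, not the `_sre` code: `ToyMatcher` and its `_check_now` method are hypothetical stand-ins for the matching engine and for `_PyTimeout_CheckNow()`, and the `is_expired` callable stands in for "the current thread has an expired timeout block".

```python
CHECK_INTERVAL = 4096  # the cadence the description gives for _sre's signal check


class ToyMatcher:
    """Hypothetical engine that polls for expiry once every CHECK_INTERVAL steps."""

    def __init__(self, is_expired):
        self.is_expired = is_expired  # callable returning True once expired
        self.checks = 0
        self.cleaned_up = False

    def _check_now(self):
        # Returns immediately unless the timeout has expired.
        self.checks += 1
        if self.is_expired():
            raise TimeoutError("timeout expired inside matching loop")

    def run(self, steps):
        try:
            for step in range(1, steps + 1):
                if step % CHECK_INTERVAL == 0:
                    self._check_now()
            return steps
        finally:
            # Cleanup still runs when the timeout propagates as an exception.
            self.cleaned_up = True
```

Running 10,000 steps with `is_expired=lambda: False` performs only two checks (at steps 4096 and 8192). With an already-expired flag the loop stops at the first check, and the `finally` block still runs before `TimeoutError` propagates.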