fix(sdk): raise asyncio StreamReader buffer in Python AsyncHostTransport #2760
Open
michaelreavant wants to merge 1 commit into superdoc-dev:main
The Python async transport spawned the host CLI without passing a `limit=`
to `asyncio.create_subprocess_exec`, so its stdout `StreamReader` inherited
asyncio's default 64 KiB buffer. Every host response is written as a single
newline-delimited JSON line, so any `cli.invoke` whose serialized result
exceeds 64 KiB (e.g. `superdoc_get_content` on larger documents) caused
`readline()` to raise `ValueError: Separator is not found, and chunk
exceed the limit` inside `_reader_loop`. The exception was caught by the
generic reader-loop handler and pending requests were rejected with the
misleading `HOST_DISCONNECTED` error, even though the host process was
still alive and healthy.
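
The overflow described above can be reproduced without any host process. The following is a minimal sketch, assuming only the standard library: it feeds an in-memory `asyncio.StreamReader` a chunk that has no newline and is larger than the reader's limit. The helper name `overflow_message` is hypothetical and is not part of the SDK.

```python
import asyncio


async def overflow_message(limit: int, payload_bytes: int) -> str:
    """Return the text of the ValueError raised by readline(), or "" if none."""
    reader = asyncio.StreamReader(limit=limit)
    # One long chunk with no newline, like a large JSON line still in flight.
    reader.feed_data(b"x" * payload_bytes)
    # EOF keeps readline() from waiting forever when the payload fits the limit.
    reader.feed_eof()
    try:
        await reader.readline()
    except ValueError as exc:
        return str(exc)
    return ""


if __name__ == "__main__":
    # 200,000 bytes against asyncio's 64 KiB default limit.
    print(asyncio.run(overflow_message(64 * 1024, 200_000)))
```

A payload smaller than the limit makes the helper return an empty string, which is why only responses above the limit surfaced the bug.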
Pass `limit=` to `create_subprocess_exec` and expose it as a new
`stdout_buffer_limit_bytes` constructor option on `AsyncHostTransport`,
threaded through `SuperDocAsyncRuntime` and `AsyncSuperDocClient`. The
default of 64 MiB safely covers the host's own 32 MiB
`DEFAULT_MAX_STDIN_BYTES` input cap with room for ~2x JSON expansion.
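
As a rough illustration of the wiring described above, the sketch below threads a constructor option through to `asyncio.create_subprocess_exec(limit=...)`. `SketchTransport` and `DEFAULT_STDOUT_BUFFER_LIMIT_BYTES` are hypothetical names invented for this example, and the "host" is a stand-in child Python process that prints one long line. The real `AsyncHostTransport` may be structured differently.

```python
import asyncio
import sys

# Hypothetical constant; 64 MiB matches the default stated in the description.
DEFAULT_STDOUT_BUFFER_LIMIT_BYTES = 64 * 1024 * 1024


class SketchTransport:
    """Illustrative stand-in, not the SDK's AsyncHostTransport."""

    def __init__(self, argv, stdout_buffer_limit_bytes=DEFAULT_STDOUT_BUFFER_LIMIT_BYTES):
        self._argv = list(argv)
        self._limit = stdout_buffer_limit_bytes
        self._proc = None

    async def start(self):
        # limit= sizes the StreamReader that asyncio attaches to the stdout pipe.
        self._proc = await asyncio.create_subprocess_exec(
            *self._argv,
            stdout=asyncio.subprocess.PIPE,
            limit=self._limit,
        )

    async def read_line(self) -> bytes:
        return await self._proc.stdout.readline()

    async def close(self):
        await self._proc.wait()


async def demo() -> int:
    # Stand-in host: prints a single 200,000-character line, then exits.
    child = [sys.executable, "-c", "print('x' * 200_000)"]
    transport = SketchTransport(child)
    await transport.start()
    line = await transport.read_line()
    await transport.close()
    return len(line.rstrip(b"\r\n"))


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The limit is a ceiling rather than a preallocation, so a large default does not by itself increase memory use for small responses.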
`SyncHostTransport` is unaffected: it uses raw blocking `subprocess.Popen`,
which has no asyncio buffer limit.
Adds a `TestAsyncLargeResponse` regression suite that:
1. Round-trips a 200 KB response through the default-configured transport.
2. Pins that an explicitly tightened `stdout_buffer_limit_bytes` still
reproduces the original failure mode, guaranteeing the option is
wired through to `create_subprocess_exec`.
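
The shape of these two checks can be sketched with the standard library alone. This is an assumption-laden stand-in and not the actual `TestAsyncLargeResponse` suite: `read_host_line` is a hypothetical helper, and a child Python process plays the role of the host by printing one 200,000-character line.

```python
import asyncio
import sys

PAYLOAD_CHARS = 200_000  # roughly the 200 KB response used by the first check


async def read_host_line(limit: int) -> str:
    """Return "ok" if the long line is read whole, "overflow" if readline() raises."""
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", f"print('x' * {PAYLOAD_CHARS})",
        stdout=asyncio.subprocess.PIPE,
        limit=limit,
    )
    try:
        line = await proc.stdout.readline()
        outcome = "ok" if len(line.rstrip(b"\r\n")) == PAYLOAD_CHARS else "truncated"
    except ValueError:
        outcome = "overflow"
    # Drain remaining output so the child can exit before we wait on it.
    await proc.stdout.read()
    await proc.wait()
    return outcome


if __name__ == "__main__":
    # 1. A generous limit round-trips the large line.
    assert asyncio.run(read_host_line(64 * 1024 * 1024)) == "ok"
    # 2. A deliberately tight limit reproduces the overflow, showing limit= is honored.
    assert asyncio.run(read_host_line(1024)) == "overflow"
```

The second check only fails in the expected way if the configured value actually reaches the subprocess spawn, which is what makes it useful as a wiring test.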
Summary
`superdoc_get_content` on larger documents fails in the Python async SDK with a misleading `HOST_DISCONNECTED` error. The root cause is a missing `limit=` on the subprocess spawn: the stdout `StreamReader` inherits asyncio's 64 KiB default buffer, which any response line longer than 64 KiB overflows. This PR raises the buffer ceiling and exposes it as a configurable option.

Problem
The Python async transport spawned the host CLI without passing a `limit=` to `asyncio.create_subprocess_exec`, so its stdout `StreamReader` inherited asyncio's default 64 KiB buffer. Every host response is written as a single newline-delimited JSON line, so any `cli.invoke` whose serialized result exceeds 64 KiB (e.g. `superdoc_get_content` on larger documents) caused `readline()` to raise `ValueError: Separator is not found, and chunk exceed the limit` inside `_reader_loop`. The exception was caught by the generic reader-loop handler and pending requests were rejected with the misleading `HOST_DISCONNECTED` error, even though the host process was still alive and healthy.

Fix
Pass `limit=` to `create_subprocess_exec` and expose it as a new `stdout_buffer_limit_bytes` constructor option on `AsyncHostTransport`, threaded through `SuperDocAsyncRuntime` and `AsyncSuperDocClient`. The default of 64 MiB safely covers the host's own 32 MiB `DEFAULT_MAX_STDIN_BYTES` input cap with room for ~2x JSON expansion.

Scope
`SyncHostTransport` is unaffected: it uses raw blocking `subprocess.Popen`, which has no asyncio buffer limit. No changes to the Node SDK or the host server.

Tests
Adds a `TestAsyncLargeResponse` regression suite that:
1. Round-trips a 200 KB response through the default-configured transport.
2. Pins that an explicitly tightened `stdout_buffer_limit_bytes` still reproduces the original failure mode, guaranteeing the option is wired through to `create_subprocess_exec`.
Bug reproduction was verified by stashing the fix and running the new test against the unmodified `transport.py`: it raised the exact `SuperDocError: Host process disconnected.` seen in production. With the fix in place, the full Python SDK test suite passes (90 tests, including 26 transport tests).

Test plan
- `pytest packages/sdk/langs/python/tests/`: 90 passed
- New regression test reproduces the `HOST_DISCONNECTED` failure when the fix is reverted
- `pnpm run generate:all` clean (no codegen drift in this branch)