feat: add Blob serialization support over RPC #155
G4brym wants to merge 1 commit into cloudflare:main from
🦋 Changeset detected
Latest commit: 880a7fe
The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
All contributors have signed the CLA ✍️ ✅
I have read the CLA Document and I hereby sign the CLA
539e59a to a08d1a8
Hmm, it looks like you went straight for the ReadableStream approach instead of the Uint8Array approach I recommended.
This approach presents additional complexities. On the receiving end, the RPC system will wait for the blob bytes to arrive before delivering anything to the app. This means the call is delivered out-of-order, technically a violation of e-order. Of course, this is the norm when delivering payloads containing promises, so we could just say "blobs behave like promises, they may delay delivery". But I'm not totally sure if we have to make that compromise. Reading from a
Either way, I do think that for small blobs, it is not worth the overhead for streaming, and we really should still encode it as
Hey @kentonv, I've updated the PR to encode small blobs as Uint8Array, because encoding calls
Hi. Don't want to sound like a killjoy, but wouldn't it make more sense to tackle custom serialization as an API first and then implement whatever missing defaults, like That would also allow the developers to implement whichever serialization they are missing instead of waiting for individual PRs to land for those things. It's #32, basically. |
|
Apologies, my inbox has been so full that I haven't gotten to Cap'n Web in a few weeks. I think I can get back to this tomorrow.
No, absolutely not. Letting apps add custom serializers is an enormous security footgun that has been a disaster for e.g. Java serialization.
kentonv left a comment
This is too much complexity to support a relatively obscure type.
I think we should simplify:
- Don't optimize small blobs after all. It doesn't actually accomplish what I wanted anyway (maintaining e-order) since you have to read the blob content asynchronously on the sending side. So it's not worth it.
- Don't add special handling of blob promises, reuse RpcPromise infrastructure.
I think when simplified this implementation should be almost entirely in serialize.ts, with only small changes (adding some new switch cases) in core.ts, and probably no changes at all in rpc.ts.
0ebae8c to 4c063f7
Blobs can now be passed as RPC call arguments and return values, with MIME type preserved across the wire.

Wire format: ["blob", type, ["readable", pipeId]].

Bytes always stream through a pipe. Reading a Blob's bytes is inherently async, so there's no way to preserve send-side e-order regardless of encoding; the uniform pipe path keeps the encoder synchronous and matches the wire semantics of a payload containing a promise.

Implementation is almost entirely in serialize.ts:
- serialize.ts: Devaluator encode case creates a pipe from blob.stream(). Evaluator decode case wraps the incoming ReadableStream in an RpcPromise via a new streamToBlobPromise() helper that mirrors fixBrokenRequestBody(): the promise is pushed into the Evaluator's existing promises list and the payload-delivery machinery substitutes the real Blob before user code sees it. No dedicated blob-promise plumbing.
- core.ts: add "blob" to TypeForRpc, BLOB_PROTOTYPE constant, the typeForRpc case, and immutable case arms in deepCopy / disposeReturn / deliverTo / followPath. No other changes.
- rpc.ts: unchanged.
- types.d.ts: Blob added to BaseType.
- README.md: Blob added to pass-by-value types list.
- __tests__/test-util.ts: echoBlob() on TestTarget.
- __tests__/index.test.ts: decode rejection tests, round-trip coverage, wire-format verification.

Not supported: File (Blob subclass with different prototype), and serialize()/deserialize() of Blobs outside an RPC session (same limitation as streams and stubs: createPipe() requires a session).
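To make the wire format above concrete, here is a minimal sketch of how a Blob might be turned into that tuple. It is not the PR's Devaluator code: `encodeBlob`, `AllocatePipe`, and the toy allocator are hypothetical names introduced for illustration, standing in for the session's real `createPipe()`.

```typescript
// Use whatever stream type Blob.stream() returns in the current type environment.
type ByteStream = ReturnType<Blob["stream"]>;

// Shape of the wire expression quoted in the commit message.
type BlobWireExpr = ["blob", string, ["readable", number]];

// Hypothetical stand-in for the session's pipe allocator (an assumption of this sketch).
type AllocatePipe = (stream: ByteStream) => number;

function encodeBlob(blob: Blob, allocatePipe: AllocatePipe): BlobWireExpr {
  // Only the MIME type travels inline; the bytes are referenced by pipe ID.
  const pipeId = allocatePipe(blob.stream());
  return ["blob", blob.type, ["readable", pipeId]];
}

// Toy allocator: hands out increasing IDs and remembers each stream.
const pipes = new Map<number, ByteStream>();
let nextPipeId = 1;
const allocate: AllocatePipe = (stream) => {
  const id = nextPipeId++;
  pipes.set(id, stream);
  return id;
};

const expr = encodeBlob(new Blob(["hello"], { type: "text/plain" }), allocate);
console.log(JSON.stringify(expr)); // ["blob","text/plain",["readable",1]]
```

The tuple itself never contains bytes, which is why the message size stays small regardless of how large the Blob is.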
4c063f7 to 880a7fe
Hey @kentonv, I just simplified the PR, can you take a look again?
typeof Buffer !== "undefined" ? Buffer.prototype : undefined;

// Blob is available in every runtime we support (Node >=18, browsers, workerd).
const BLOB_PROTOTYPE = Blob.prototype;
Let's inline this constant into the switch below, consistent with all the other types.
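As an illustration of what inlining the constant could look like, here is a small sketch of a prototype-based switch. It assumes the type check dispatches on the value's exact prototype, which the PR text implies but does not show; `kindOf`, `Kind`, and `TaggedBlob` are hypothetical names, not the library's `typeForRpc`.

```typescript
type Kind = "object" | "bytes" | "blob" | "unsupported";

// Hypothetical classifier that dispatches on the exact prototype of a value.
function kindOf(value: object): Kind {
  switch (Object.getPrototypeOf(value)) {
    case Object.prototype:
      return "object";
    case Uint8Array.prototype:
      return "bytes";
    case Blob.prototype: // inlined, no module-level constant needed
      return "blob";
    default:
      return "unsupported";
  }
}

// A subclass has a different prototype, so an exact-prototype switch does not match it.
class TaggedBlob extends Blob {}

console.log(kindOf(new Blob([])), kindOf(new TaggedBlob([])));
```

An exact-prototype check of this kind would also explain why a Blob subclass is treated as a separate, unsupported type.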
// Blobs are always streamed through a pipe. Reading a Blob's bytes is inherently async,
// so there's no way to preserve send-side e-order regardless of encoding; always using
// the pipe keeps the encoder synchronous (matching the wire semantics of a payload
// containing a promise).
//
// When devaluating via NULL_EXPORTER (i.e. `serialize()`), createPipe() throws
// "Cannot create pipes without an RPC session." -- same behaviour as streams and stubs.
Let's replace this AI slop comment.
Suggested change:
- // Blobs are always streamed through a pipe. Reading a Blob's bytes is inherently async,
- // so there's no way to preserve send-side e-order regardless of encoding; always using
- // the pipe keeps the encoder synchronous (matching the wire semantics of a payload
- // containing a promise).
- //
- // When devaluating via NULL_EXPORTER (i.e. `serialize()`), createPipe() throws
- // "Cannot create pipes without an RPC session." -- same behaviour as streams and stubs.
+ // Blobs are streamed through a pipe. This allows very large blobs to be sent without
+ // causing excessively large individual messages nor blocking other messages in the
+ // meantime.
+ //
+ // Ideally, small Blobs would be inlined. But, there is no way to read a blob
+ // synchronously, and we MUST serialize the message synchronously. Hence, we have no choice
+ // but to use streaming even for small blobs.
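The constraint in the suggested comment can be seen directly in the standard Blob API. The sketch below uses only standard Blob members and is meant to illustrate the sync/async split, not to reproduce any code from the PR.

```typescript
const sample = new Blob(["hi"], { type: "text/plain" });

// Metadata is available synchronously.
const mimeType: string = sample.type;
const byteLength: number = sample.size;

// The bytes are only reachable through promise-returning or streaming APIs.
const pendingBytes: Promise<ArrayBuffer> = sample.arrayBuffer();
const byteStream = sample.stream(); // handle returned immediately, content read later

console.log(mimeType, byteLength, pendingBytes instanceof Promise);
```

Because `stream()` hands back a handle without waiting for any bytes, a synchronous encoder can reference the stream even though it cannot inline the content.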
// Assemble a Blob from a pipe ReadableStream asynchronously, wrapped in an RpcPromise so it plugs
// into the existing payload-delivery substitution machinery. Same pattern as
// fixBrokenRequestBody() above: the caller pushes the returned RpcPromise into the Evaluator's
// `promises` list, and deliverTo() replaces it with the resolved Blob before user code runs.
Suggested change:
- // Assemble a Blob from a pipe ReadableStream asynchronously, wrapped in an RpcPromise so it plugs
- // into the existing payload-delivery substitution machinery. Same pattern as
- // fixBrokenRequestBody() above: the caller pushes the returned RpcPromise into the Evaluator's
- // `promises` list, and deliverTo() replaces it with the resolved Blob before user code runs.
+ // Unfortunately, even though Blobs can only be read asynchronously, there is no way to create
+ // a blob backed by an asynchronous source; the bytes MUST all be provided upfront. This
+ // effectively makes it impossible to maintain e-order when sending Blobs.
+ //
+ // As a compromise, we deliver a message as if it contained an RpcPromise that resolves to the
+ // Blob. This has the effect that the RPC system will wait for the whole Blob to stream in before
+ // delivering the message -- reusing the existing machinery for handling promises.
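For readers unfamiliar with why the whole stream has to arrive first, here is a minimal sketch of collecting a stream into a Blob. It uses a plain `Promise` and the standard `Response` body reader rather than the PR's `streamToBlobPromise()` and `RpcPromise`; `streamToBlob` is a hypothetical name, and the real helper may be implemented differently.

```typescript
// Use whatever stream type Blob.stream() returns in the current type environment.
type ByteStream = ReturnType<Blob["stream"]>;

// Hypothetical helper: drain the stream fully, then build a Blob with the given MIME type.
async function streamToBlob(stream: ByteStream, type: string): Promise<Blob> {
  // Response buffers the entire body before producing a Blob.
  const untyped = await new Response(stream).blob();
  // Re-wrap so the MIME type carried on the wire is restored.
  return new Blob([untyped], { type });
}

// Round trip: a Blob's own stream stands in for the incoming pipe.
const source = new Blob(["abc"], { type: "text/plain" });
const rebuilt = streamToBlob(source.stream(), source.type);
rebuilt.then((blob) => console.log(blob.type, blob.size));
```

The promise only settles after the last chunk has been read, which is the delay the review comment describes.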
Summary
Adds `Blob` as a first-class serializable type: `Blob` objects can now be passed as RPC call arguments and return values, with MIME type preserved across the wire.
Wire format
["blob", type, ["readable", pipeId]]
Bytes always stream through a pipe. Reading a Blob's bytes is inherently asynchronous, so the uniform pipe path keeps the encoder synchronous and matches the wire semantics of a payload containing a promise.
E-order
The call message is sent synchronously (send-side e-order preserved). On the receiving end, the Evaluator wraps the pipe stream in an `RpcPromise` that resolves to the Blob; delivery to user code waits for all promises in the payload to resolve, same as any other payload containing a promise.
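The receive-side behaviour can be illustrated with a toy model. This sketch is not the library's delivery code: `deliverWhenReady`, `PendingSlot`, and the payload shapes are hypothetical, and it only shows how waiting on a payload's promises lets a later call reach the handler before an earlier one.

```typescript
type Payload = Record<string, unknown>;

// One placeholder in a payload that is filled in once its promise settles.
interface PendingSlot {
  parent: Payload;
  key: string;
  promise: Promise<unknown>;
}

// Hypothetical miniature of "resolve every promise in the payload, then deliver".
async function deliverWhenReady(
  payload: Payload,
  pending: PendingSlot[],
  handler: (payload: Payload) => void,
): Promise<void> {
  for (const slot of pending) {
    slot.parent[slot.key] = await slot.promise;
  }
  handler(payload);
}

const deliveryOrder: string[] = [];
const record = (payload: Payload): void => {
  deliveryOrder.push(String(payload["label"]));
};

// First call carries a Blob whose bytes have not arrived yet.
let finishBlob: (blob: Blob) => void = () => {};
const blobArrives = new Promise<Blob>((resolve) => {
  finishBlob = resolve;
});
const firstCall: Payload = { label: "first", file: undefined };
const firstDone = deliverWhenReady(
  firstCall,
  [{ parent: firstCall, key: "file", promise: blobArrives }],
  record,
);

// Second call has nothing to wait for, so it reaches the handler immediately.
const secondDone = deliverWhenReady({ label: "second" }, [], record);

finishBlob(new Blob(["x"]));
const allDelivered = Promise.all([firstDone, secondDone]).then(() => {
  console.log(deliveryOrder.join(", ")); // second, first
});
```

In this model the first call was issued first but is handled second, which is the delayed-delivery trade-off discussed in the review.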