fix: Add backpressure on S3 stream upload to prevent unbounded buffering #3405

Open
guille-moe wants to merge 2 commits into aws:version-3 from guille-moe:fix/streaming_upload_backpressure

@guille-moe

TL;DR

The S3 stream uploader does not currently apply backpressure between reading from the input stream and uploading multipart parts.
When the input is written faster than parts can be uploaded, the uploader can read ahead and queue buffered parts indefinitely, causing unbounded memory usage or disk usage (with tempfile: true).

Context

MultipartStreamUploader reads stream data into part buffers (StringIO or Tempfile) on a dedicated thread and posts each part to DefaultExecutor, which uploads it once an executor thread becomes available.

Previously, the reader loop could read ahead without waiting for the executor, so parts could accumulate in the executor queue. This caused unbounded memory use with StringIO, or unbounded disk use with Tempfile (tempfile: true).

MultipartFileUploader does not have this issue because part offsets are computed ahead of time and each part is read inside the executor thread.
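To illustrate why the file-based path stays bounded, here is a minimal sketch of the idea. It is not the SDK's implementation: `part_ranges` and `read_part` are hypothetical helpers introduced only for this example. Only small offset/size descriptors are created up front, and the bytes are read at the moment a worker handles the part.

```ruby
# Hypothetical helper: list the byte ranges of a file without reading any data.
def part_ranges(file_size, part_size)
  (0...file_size).step(part_size).map do |offset|
    { offset: offset, size: [part_size, file_size - offset].min }
  end
end

# Hypothetical helper: read one part lazily, as an executor thread would do
# right before uploading it.
def read_part(path, range)
  File.binread(path, range[:size], range[:offset])
end
```

With this shape, the amount of data held in memory is tied to the number of worker threads rather than to how fast the ranges are enumerated.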

This change tracks in-flight read slots and defers further reads until a posted part completes (or fails), keeping buffered data bounded by executor concurrency.
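The slot tracking described above can be sketched roughly as a counting gate. This is an illustration under assumptions, not the code in this PR: the class name `ReadSlots` and its `acquire`/`release` methods are hypothetical.

```ruby
# Hypothetical illustration of "in-flight read slots": a counting gate that
# blocks the reader once `limit` parts are buffered or uploading.
class ReadSlots
  attr_reader :in_flight

  def initialize(limit)
    @limit = limit
    @in_flight = 0
    @mutex = Mutex.new
    @slot_freed = ConditionVariable.new
  end

  # Called by the reader before it reads the next part.
  # Blocks until fewer than `limit` parts are in flight.
  def acquire
    @mutex.synchronize do
      @slot_freed.wait(@mutex) while @in_flight >= @limit
      @in_flight += 1
    end
  end

  # Called when a posted part completes, whether it succeeded or failed.
  def release
    @mutex.synchronize do
      @in_flight -= 1
      @slot_freed.signal
    end
  end
end
```

In this sketch the reader would call `acquire` before each read, and the upload task would call `release` in an `ensure` block so that a failed part also frees its slot.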

Testing

Added an example reproducing unbounded read in gems/aws-sdk-s3/spec/multipart_stream_uploader_spec.rb.

bundle exec rspec gems/aws-sdk-s3/spec/multipart_stream_uploader_spec.rb

Real-world

This issue caused memory growth close to the full file size during a stream upload. To investigate, we enabled the tempfile option, which buffers parts in a Tempfile, so we could follow what happens on disk. This is what we saw:

Screencast.from.2026-07-02.11-06-15.webm

For context, this run simply reads a large local file and writes its chunks to the stream upload.

Changes

  • Limit concurrent reads in MultipartStreamUploader to @executor.max_threads, so the reader does not pull data from IO.pipe faster than the executor can upload parts
  • Expose max_threads on DefaultExecutor so the uploader can align read concurrency with executor capacity
  • Add a spec with a single-threaded blocking executor to verify the executor queue stays empty when the writer is faster than uploads
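As a rough end-to-end illustration of the changes listed above, the sketch below bounds reads from a pipe by an executor's thread count. Everything here is an assumption made for the example: `TinyExecutor` is a hypothetical stand-in that models only `post` and `max_threads`, and `drain_pipe` is a hypothetical reader loop that uses Ruby's `SizedQueue` as the slot limiter. It is not the code in this PR.

```ruby
# Hypothetical stand-in for an executor; only `post`, `max_threads`,
# and `shutdown` are modeled.
class TinyExecutor
  attr_reader :max_threads

  def initialize(max_threads:)
    @max_threads = max_threads
    @tasks = Queue.new
    @workers = Array.new(max_threads) do
      Thread.new do
        while (task = @tasks.pop)
          task.call
        end
      end
    end
  end

  def post(&task)
    @tasks << task
  end

  # Stops each worker with a nil sentinel after the queued tasks are done.
  def shutdown
    @max_threads.times { @tasks << nil }
    @workers.each(&:join)
  end
end

# Hypothetical reader loop: reads parts from `reader` and posts them for
# upload. Returns the peak number of parts held at once.
def drain_pipe(reader, executor, part_size:, &upload)
  slots = SizedQueue.new(executor.max_threads)
  lock = Mutex.new
  buffered = 0
  peak = 0

  loop do
    slots.push(true) # blocks while every slot is taken
    part = reader.read(part_size)
    break if part.nil? # end of stream

    lock.synchronize do
      buffered += 1
      peak = [peak, buffered].max
    end

    executor.post do
      begin
        upload.call(part)
      ensure
        lock.synchronize { buffered -= 1 }
        slots.pop # free the slot on success or failure
      end
    end
  end

  executor.shutdown
  peak
end
```

Feeding `drain_pipe` from an `IO.pipe` whose writer is faster than the upload block should keep the returned peak at or below `executor.max_threads`, which is the property the new spec is described as checking.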

@guille-moe guille-moe requested a review from a team as a code owner July 3, 2026 08:51
@guille-moe guille-moe force-pushed the fix/streaming_upload_backpressure branch from 969b9f3 to 1621e7f Compare July 3, 2026 08:55
@guille-moe guille-moe force-pushed the fix/streaming_upload_backpressure branch from 1621e7f to 644899a Compare July 3, 2026 08:55
@guille-moe guille-moe changed the title from "Fix/streaming upload backpressure" to "fix: Add backpressure on S3 stream upload to prevent unbounded buffering" Jul 3, 2026