
io.netty.handler.timeout.ReadTimeoutException not getting retried for S3AsyncClient::getObject #6866

@bgruber

Describe the bug

This is a re-report of #6581, as we're seeing the exact same behavior. When calls to S3AsyncClient::getObject time out, the ReadTimeoutExceptions thrown are not wrapped in IOExceptions, so the default retry policy does not retry them.
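
To make the failure mode concrete, here is a minimal sketch of a retry condition that only recognizes IOExceptions in the cause chain. It is a simplified model, not the SDK's actual retry code: `isRetryable` and `FakeReadTimeoutException` are hypothetical names, and the fake exception only stands in for io.netty.handler.timeout.ReadTimeoutException, which is an unchecked exception and not an IOException.

```java
import java.io.IOException;

public class RetryClassificationSketch {

    /** Hypothetical stand-in for Netty's ReadTimeoutException (unchecked, not an IOException). */
    static class FakeReadTimeoutException extends RuntimeException {}

    /** Simplified retry condition: retry only if an IOException appears in the cause chain. */
    static boolean isRetryable(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof IOException) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Throwable raw = new FakeReadTimeoutException();
        Throwable wrapped = new IOException("Read timed out", raw);

        // The bare timeout is rejected; the wrapped one is accepted.
        System.out.println("raw retryable: " + isRetryable(raw));
        System.out.println("wrapped retryable: " + isRetryable(wrapped));
    }
}
```

Under this model the same timeout is retried or not depending only on whether something wrapped it before it reached the retry check.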

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

Client timeouts for getObject should be retried.

Current Behavior

Joining the returned future throws an exception with this stack trace:

java.util.concurrent.CompletionException: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: null (SDK Attempt Count: 1)
	at software.amazon.awssdk.utils.CompletableFutureUtils.errorAsCompletionException(CompletableFutureUtils.java:64)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncExecutionFailureExceptionReportingStage.lambda$execute$0(AsyncExecutionFailureExceptionReportingStage.java:51)
	at java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:955)
	at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:932)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:531)
	at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2221)
	at software.amazon.awssdk.utils.CompletableFutureUtils.lambda$forwardExceptionTo$0(CompletableFutureUtils.java:78)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:884)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:862)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:531)
	at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2221)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.maybeAttemptExecute(AsyncRetryableStage.java:135)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.maybeRetryExecute(AsyncRetryableStage.java:152)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.lambda$attemptExecute$1(AsyncRetryableStage.java:113)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:884)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:862)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:531)
	at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2221)
	at software.amazon.awssdk.utils.CompletableFutureUtils.lambda$forwardExceptionTo$0(CompletableFutureUtils.java:78)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:884)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:862)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:531)
	at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2221)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.lambda$execute$0(MakeAsyncHttpRequestStage.java:108)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:884)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:862)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:531)
	at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2221)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.completeResponseFuture(MakeAsyncHttpRequestStage.java:255)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.lambda$executeHttpRequest$3(MakeAsyncHttpRequestStage.java:167)
	at java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:955)
	at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:932)
	at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:503)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1090)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:614)
	at java.base/java.lang.Thread.run(Thread.java:1474)
Caused by: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: null (SDK Attempt Count: 1)
	at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:130)
	at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:95)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.retryPolicyDisallowedRetryException(RetryableStageHelper.java:168)
	... 25 common frames omitted
Caused by: io.netty.handler.timeout.ReadTimeoutException
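
For anyone checking whether they are hitting the same problem, the sketch below shows one way to walk a joined future's exception down to its root cause, using only the JDK. The exception chain built in `main` is a stand-in for the one in the trace above (CompletionException, then SdkClientException, then ReadTimeoutException); `rootCause` is a hypothetical helper name.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class RootCauseSketch {

    /** Follows getCause() until the innermost throwable is reached. */
    static Throwable rootCause(Throwable t) {
        Throwable cur = t;
        while (cur.getCause() != null) {
            cur = cur.getCause();
        }
        return cur;
    }

    public static void main(String[] args) {
        CompletableFuture<String> future = new CompletableFuture<>();

        // Stand-in chain: the RuntimeException plays the role of SdkClientException,
        // the IllegalStateException plays the role of ReadTimeoutException.
        future.completeExceptionally(
            new RuntimeException("Unable to execute HTTP request",
                new IllegalStateException("stand-in for ReadTimeoutException")));

        try {
            future.join();
        } catch (CompletionException e) {
            // join() wraps the failure in a CompletionException; inspect the innermost cause.
            System.out.println("root cause type: " + rootCause(e).getClass().getName());
        }
    }
}
```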

Reproduction Steps

https://gist.github.com/bgruber/e5cd1abb1a4e634f0dc21022dd5e0ff9

To reproduce the bug, the client must time out only after the server has sent part of its response; that's why simply setting a 1 ms timeout doesn't show the bug. This part is a little murkier to me, but I think it is because the HandlerPublisher that invokes ResponseHandler.PublisherActor::onError is not added to the handler chain until some data has been seen.
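
The condition described above (some response data first, then silence) can be illustrated with a JDK-only sketch. This is not the gist's reproduction and does not speak the S3 protocol: the server writes HTTP-looking headers and a fragment of the body without reading any request, and a plain socket's SocketTimeoutException stands in for Netty's ReadTimeoutException. All class and method names here are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

public class StallingServerSketch {

    /** Accepts one connection, sends headers plus a fragment of the body, then stalls. */
    static ServerSocket startStallingServer(long stallMillis) throws IOException {
        ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
        Thread worker = new Thread(() -> {
            try (Socket socket = server.accept()) {
                OutputStream out = socket.getOutputStream();
                out.write("HTTP/1.1 200 OK\r\nContent-Length: 1024\r\n\r\n"
                    .getBytes(StandardCharsets.US_ASCII));
                out.write(new byte[16]); // only 16 of the 1024 promised body bytes
                out.flush();
                Thread.sleep(stallMillis); // keep the connection open but silent
            } catch (IOException | InterruptedException e) {
                // Sketch only: a closed socket or an interrupt just ends the worker.
            }
        });
        worker.setDaemon(true);
        worker.start();
        return server;
    }

    /** Returns true if a plain socket client receives some bytes and then hits a read timeout. */
    static boolean timesOutAfterPartialData(int readTimeoutMillis) {
        try (ServerSocket server = startStallingServer(readTimeoutMillis * 10L);
             Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort())) {
            client.setSoTimeout(readTimeoutMillis);
            InputStream in = client.getInputStream();
            byte[] buffer = new byte[2048];
            int received = 0;
            try {
                int n;
                while ((n = in.read(buffer)) != -1) {
                    received += n;
                }
                return false; // the server closed the stream instead of stalling
            } catch (SocketTimeoutException e) {
                return received > 0; // the timeout arrived only after data was seen
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out after partial data: " + timesOutAfterPartialData(300));
    }
}
```

A timeout that fires before any bytes arrive takes a different path, which matches the observation that a very short timeout alone does not trigger the bug.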

Possible Solution

The stack trace confirms that the request is not retried because the ReadTimeoutException isn't wrapped in an IOException as expected.

I believe the cause is that for streaming responses (like those from getObject), the future is completed exceptionally by ResponseHandler.PublisherActor::onError, which doesn't decorate the exception first.
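
A rough sketch of the kind of change this suggests, using only JDK types: decorate the error before completing the future exceptionally. `decorate`, `onError`, and `FakeReadTimeoutException` are hypothetical names for illustration; this models the idea and is not the SDK's ResponseHandler code or a proposed patch.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class DecorateBeforeCompleteSketch {

    /** Hypothetical stand-in for Netty's unchecked ReadTimeoutException. */
    static class FakeReadTimeoutException extends RuntimeException {}

    /** Wraps the stand-in timeout in an IOException; leaves other throwables unchanged. */
    static Throwable decorate(Throwable t) {
        if (t instanceof FakeReadTimeoutException) {
            return new IOException("Read timed out", t);
        }
        return t;
    }

    /** Models an onError callback that decorates the error before failing the future. */
    static void onError(CompletableFuture<?> future, Throwable error) {
        future.completeExceptionally(decorate(error));
    }

    public static void main(String[] args) {
        CompletableFuture<Void> future = new CompletableFuture<>();
        onError(future, new FakeReadTimeoutException());
        try {
            future.join();
        } catch (CompletionException e) {
            // The direct cause is now an IOException that carries the original timeout.
            System.out.println("cause is IOException: " + (e.getCause() instanceof IOException));
        }
    }
}
```

With the wrapping applied at the point where the future is failed, a retry check that looks for IOException would see the streaming-response timeout the same way it sees other I/O failures.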

Additional Information/Context

No response

AWS Java SDK version used

2.42.34

JDK version used

openjdk version "11.0.30" 2026-01-20
OpenJDK Runtime Environment (build 11.0.30+7-post-Ubuntu-1ubuntu122.04)
OpenJDK 64-Bit Server VM (build 11.0.30+7-post-Ubuntu-1ubuntu122.04, mixed mode, sharing)

Operating System and version

Ubuntu 22.04.5

Labels

bug: This issue is a bug.
p2: This is a standard priority issue.