netty: Propagate initial handshake failure before close by becomeStar · Pull Request #12626 · grpc/grpc-java

becomeStar · 2026-01-25T12:58:44Z

Handshake failures that occur before any writes are buffered can currently be lost to downstream inbound handlers. In this case, the failure is surfaced via the write / promise path, but exceptionCaught is never observed by handlers placed after WriteBufferingAndExceptionHandler.

This makes the original handshake error difficult to diagnose and inconsistent with failures that occur after buffering has started.

This change propagates the exception via fireExceptionCaught before closing the channel when handling the first failure on an active channel. Doing so preserves the original failure while the pipeline is still intact and avoids losing the exception due to close-triggered teardown or reentrancy.

Fixes #8495

When a handshake failure occurs before any writes are buffered on the server side, WriteBufferingAndExceptionHandler can record the failure internally but never surface it to downstream inbound handlers. This makes the original handshake error unobservable and complicates debugging and instrumentation. Propagate only the first failure via exceptionCaught, gated on the absence of a previous failure, so that the canonical error becomes observable while avoiding duplicate propagation and preserving existing close semantics.
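The gating described above can be sketched in plain Java as a toy model. This is not the code from this PR and does not use Netty: `DownstreamListener`, `FirstFailureGate`, and `FirstFailureGateDemo` are hypothetical names, and the assumption that every failure still ends in a close is made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a downstream inbound handler; not a Netty type. */
interface DownstreamListener {
  void exceptionCaught(Throwable cause);
}

/** Toy model of first-failure gating; all names here are hypothetical. */
final class FirstFailureGate {
  private final DownstreamListener downstream;
  private Throwable failCause; // null until the first failure is recorded
  private boolean closed;

  FirstFailureGate(DownstreamListener downstream) {
    this.downstream = downstream;
  }

  void fail(Throwable cause) {
    if (failCause == null) {
      failCause = cause;
      // Propagate before closing, while the listener is still reachable.
      downstream.exceptionCaught(cause);
    }
    // Assumption: closing happens for every failure, first or not.
    closed = true;
  }

  Throwable failCause() {
    return failCause;
  }

  boolean isClosed() {
    return closed;
  }
}

final class FirstFailureGateDemo {
  public static void main(String[] args) {
    List<Throwable> seen = new ArrayList<>();
    FirstFailureGate gate = new FirstFailureGate(seen::add);

    gate.fail(new IllegalStateException("handshake failed"));
    gate.fail(new RuntimeException("channel closed")); // recorded failure exists, not propagated

    // prints "1 handshake failed"
    System.out.println(seen.size() + " " + seen.get(0).getMessage());
  }
}
```

In this model the second failure still closes the gate but is never delivered downstream, which mirrors the "first failure only" behaviour the description aims for.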

kannanjgithub · 2026-02-09T10:47:58Z

I replied with my thoughts on issue #8495.

becomeStar · 2026-02-10T14:43:21Z

@kannanjgithub

Thank you very much for your detailed analysis and for taking the time to simulate the failure. Your observation about the object handle changing is incredibly helpful and provides a clear clue as to why the original root cause may be getting lost.

It does seem that failCause can effectively be reset when a new instance of WriteBufferingAndExceptionHandler is introduced into the pipeline, which explains why a secondary exception ends up being surfaced instead of the original handshake failure.

I’ll dig further into where and why the handler instance is being replaced and look for a way to ensure the first meaningful exception is preserved across instances.
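A toy illustration of the state-loss effect discussed here, in plain Java without Netty. `PerInstanceRecorder` models a handler that keeps its first failure in an instance field. `SharedRecorder` is a purely hypothetical variant that keeps the failure in a holder outliving any single instance. It is one possible direction, not the fix proposed in this PR or anything grpc-java is known to do.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Toy handler with per-instance failure state; not grpc-java's class. */
final class PerInstanceRecorder {
  private Throwable failCause;

  /** Records the cause if none is stored yet and returns the stored cause. */
  Throwable record(Throwable cause) {
    if (failCause == null) {
      failCause = cause;
    }
    return failCause;
  }
}

/** Hypothetical variant whose failure state lives outside the instance. */
final class SharedRecorder {
  private final AtomicReference<Throwable> failCause;

  SharedRecorder(AtomicReference<Throwable> shared) {
    this.failCause = shared;
  }

  Throwable record(Throwable cause) {
    failCause.compareAndSet(null, cause); // only the first cause is stored
    return failCause.get();
  }
}

final class StateLossDemo {
  public static void main(String[] args) {
    Throwable handshake = new IllegalStateException("handshake failed");
    Throwable secondary = new RuntimeException("connection closed");

    // A fresh instance starts with empty state, so the secondary error wins.
    new PerInstanceRecorder().record(handshake);
    Throwable surfaced = new PerInstanceRecorder().record(secondary);
    System.out.println(surfaced.getMessage()); // prints "connection closed"

    // With shared state, the replacement instance still sees the first cause.
    AtomicReference<Throwable> shared = new AtomicReference<>();
    new SharedRecorder(shared).record(handshake);
    Throwable kept = new SharedRecorder(shared).record(secondary);
    System.out.println(kept.getMessage()); // prints "handshake failed"
  }
}
```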

Based on your feedback, I’ll work toward a refined solution that addresses this state-loss issue directly. Once I have a clearer fix, I can either update this PR or follow up with a new one, depending on what you think makes the most sense.

Thanks again for the detailed investigation and guidance — it’s been extremely helpful.

becomeStar wants to merge 1 commit into grpc:master from becomeStar:netty/propagate-handshake-failure
