Skip to content

Use local flag instead of $ERROR_INFO to gate ensure-block close#71

Open
danmayer wants to merge 1 commit intomainfrom
better_fiber_fix
Open

Use local flag instead of $ERROR_INFO to gate ensure-block close#71
danmayer wants to merge 1 commit intomainfrom
better_fiber_fix

Conversation

@danmayer
Copy link
Copy Markdown

Summary

Follow-up to #68 and #69. Fixes a real production-symptom-matching bug where dc.get_multi(...) { … } (and any pipelined_get) called from inside a rescue clause silently returns {} instead of the expected results, because Base#request's ensure was reading $ERROR_INFO (a.k.a. $!) — which is preserved into nested method calls when the call is made from a rescue handler.

The bug

Old ensure:

ensure
  close if $ERROR_INFO && @connection_manager.request_in_progress?
end

$! is set during a rescue clause and stays set across method calls made from inside that clause. So when application code does the very common cache-fallback pattern:

begin
  primary_lookup
rescue StandardError
  cache.get_multi(keys) { |k, v|  }   # passes through Base#request(:pipelined_get, …)
end

…inside Base#request's ensure:

  • $ERROR_INFO is the outer rescued exception (truthy)
  • request_in_progress? is true because :pipelined_get intentionally skips finish_request! (the caller still has to drain responses)

Both conditions met → close fires, socket torn down mid-pipeline. make_getkq_requests swallows the resulting DalliError (pipelined_getter.rb:60), the drain finds nothing on the wire, and get_multi returns {}. The application sees an empty result and either treats those keys as cache misses (extra DB load + refill churn) or — worse — interprets the empty hash as a definitive "not present" signal.

The fix

Replace the global-state read with a local flag set at the very end of the begin block:

begin
  request_local_completed = false
  @connection_manager.start_request!
  response = send(opkey, *args)
  @connection_manager.finish_request! unless opkey == :pipelined_get
  request_local_completed = true
  response
rescue 
ensure
  close unless request_local_completed
end

The local flag is scoped to this method invocation; nothing outside this frame can affect it. The same close-on-abort guarantee from #68 is preserved for non-StandardError exceptions like Async::Stop.

Also dropped require 'English' from base.rb since the file no longer references any of those globals.

Test

test/integration/test_fiber_concurrency.rb grows a third case that:

  1. Warms two keys
  2. From inside a rescue StandardError, calls dc.get_multi(%w[a b]) { … } (block form is required to force :pipelined_get rather than the single-server-optimized :read_multi_req)
  3. Asserts the result hash is correct AND close was never called

Verified pre/post-fix on main head:

  • Pre-fix (revert just lib/dalli/protocol/base.rb):
    Failure: expected {"a"=>"val_a","b"=>"val_b"}, got {}
    
  • Post-fix: passes.

Test plan

The ensure clause in Base#request previously read $ERROR_INFO to decide
whether the begin block was unwinding from an exception:

    close if $ERROR_INFO && @connection_manager.request_in_progress?

$! / $ERROR_INFO is preserved when a method is called from inside a
rescue clause — the called method's ensure sees the *outer* rescued
exception even when its own begin block completed cleanly.  On the
:pipelined_get happy path, request_in_progress? is intentionally left
true (the caller still has to drain the responses), so this check
fired close() and tore the socket out from under any pipelined_get
issued from inside a rescue handler — a common cache-fallback pattern.

The new test reproduces the failure on single-server get_multi(keys) {…}
(block form forces :pipelined_get): pre-fix the call returns {} because
make_getkq_requests swallows the resulting DalliError; post-fix it
returns the expected hash.

Replaces the global-state check with a local request_local_completed
flag set true at the end of the begin block, so the ensure decision
is scoped to this method only.
verify_state(opkey)

begin
request_local_completed = false
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's okay to be explicit, but this line is unnecessary.

# pipelined_get emit query but doesn't read the response(s)
@connection_manager.finish_request! unless opkey == :pipelined_get

request_local_completed = true
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also slightly prefer to call this success but it's fine either way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants