Skip to content

client-api: stop logging suspended-database errors as 500s#4918

Open
jdetter wants to merge 1 commit intomasterfrom
jdetter/leader-suspended-no-500-log
Open

client-api: stop logging suspended-database errors as 500s#4918
jdetter wants to merge 1 commit intomasterfrom
jdetter/leader-suspended-no-500-log

Conversation

@jdetter
Copy link
Copy Markdown
Collaborator

@jdetter jdetter commented Apr 30, 2026

Description of Changes

Disclaimer: This description was written by claude:

The two .leader() call sites in client-api/src/routes/database.rs and client-api/src/routes/subscribe.rs pipe GetLeaderHostError through log_and_500, which:

  1. Logs every error variant at error level — including Suspended, Bootstrapping, NoLeader, etc., which are normal operational states. This produces noisy log lines like:
    ERROR /app/.../client-api/src/lib.rs:623: internal error: database is suspended
    
  2. Forces every such error into a 500 Internal Server Error response, even when the appropriate status code is something else (e.g. 503 Service Unavailable for a suspended database).

GetLeaderHostError already implements Into<axum::response::ErrorResponse> with the correct per-variant mapping:

Variant Status
NoSuchDatabase 404 Not Found
LaunchError, Misdirected 500 Internal Server Error
NoNodeId, NoLeader, ControlConnection, Suspended, Bootstrapping 503 Service Unavailable

The standalone implementation already uses ?-propagation directly. This PR makes the two client-api call sites match that pattern.

Result

  • Suspended / bootstrapping / no-leader databases now return 503 instead of 500.
  • These expected states no longer produce error-level log spam in the request path. Genuinely unexpected internal errors elsewhere in the codebase continue to log via log_and_500 unchanged.

API and ABI breaking changes

None

Expected complexity level and risk

1

Testing

  • Deploy to staging to see if we still see this error when trying to access a suspended database

@jdetter jdetter requested a review from kim April 30, 2026 02:59
The two leader() call sites in database.rs and subscribe.rs were piping
GetLeaderHostError through log_and_500, which logs every variant at error
level and forces a 500 response. GetLeaderHostError already implements
Into<axum::response::ErrorResponse> with proper status codes (503 for
Suspended/Bootstrapping/NoLeader/NoNodeId/ControlConnection, 404 for
NoSuchDatabase, 500 for LaunchError/Misdirected), and standalone already
uses ?-propagation directly. Match that pattern so suspended databases
no longer log at error level and return the correct 503 status.
@jdetter jdetter force-pushed the jdetter/leader-suspended-no-500-log branch from b0bf9c2 to 548b28d Compare April 30, 2026 03:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants