Skip to content

fix(outbound): implement drain grace period#4775

Open
MasterPtato wants to merge 1 commit into04-24-fix_pb_fix_pool_config_defaultsfrom
04-24-fix_outbound_implement_drain_grace_period
Open

fix(outbound): implement drain grace period#4775
MasterPtato wants to merge 1 commit into04-24-fix_pb_fix_pool_config_defaultsfrom
04-24-fix_outbound_implement_drain_grace_period

Conversation

@MasterPtato
Copy link
Copy Markdown
Contributor

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@claude
Copy link
Copy Markdown

claude Bot commented Apr 25, 2026

PR Review: fix(outbound): implement drain grace period

Overview

This PR changes pegboard-outbound to read drain_grace_period from the per-pool runner config instead of the global config.pegboard.serverless_drain_grace_period() setting. This is the right direction. Per-pool config is more flexible than a global fallback.


Critical: Migration default is wrong (pre-existing bug now exposed)

In engine/sdks/rust/data/src/versioned/namespace_runner_config.rs, the v4 to v5 migration sets drain_grace_period: 10_000. The deprecated config defaults to 10,000 ms (10 seconds). But drain_grace_period is documented as seconds (see runner_configs.rs where unwrap_or(10) = 10 seconds).

Before this PR, the outbound code ignored this field and always read from the global config. Now that pegboard-outbound actually consumes drain_grace_period, actors whose configs went through the v4 to v5 migration path will get a 10,000-second (~2.7-hour) drain grace period instead of the intended 10 seconds. The migration default should be 10, not 10_000.


Parity gap: conn.rs still uses global config

engine/packages/pegboard/src/workflows/serverless/conn.rs still reads the old global config for both drain usages (lines 444 and 549). After this PR, the per-pool drain_grace_period is respected in pegboard-outbound but silently ignored in conn.rs. These two code paths should behave consistently, or the inconsistency should be explicitly documented.


Minor: integer saturating_sub

The arithmetic change from Duration subtraction to integer saturating_sub before conversion is fine. Both fields are whole-second values so the result is equivalent.


Test coverage

No tests verify the new behavior. The existing test in runner_config_refresh_metadata.rs sets drain_grace_period: 5 but does not assert on drain timing. A test confirming the per-pool value is used rather than the global fallback would be valuable.


Summary

Issue Severity
Migration default 10_000 treated as seconds (should be 10) High
conn.rs still uses global config instead of per-pool field Medium
No test coverage for the new behavior Low

The direction is right, but the migration default bug could cause severely inflated grace periods for configs that passed through the v4 to v5 migration path.

@MasterPtato MasterPtato force-pushed the 04-24-fix_pb_fix_pool_config_defaults branch from a37d543 to ac730ee Compare April 25, 2026 01:14
@MasterPtato MasterPtato force-pushed the 04-24-fix_outbound_implement_drain_grace_period branch from e50b3e3 to 5f322c0 Compare April 25, 2026 01:14
@MasterPtato MasterPtato force-pushed the 04-24-fix_pb_fix_pool_config_defaults branch from ac730ee to 3167ce3 Compare April 27, 2026 18:04
@MasterPtato MasterPtato force-pushed the 04-24-fix_outbound_implement_drain_grace_period branch from 5f322c0 to 08e2e50 Compare April 27, 2026 18:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants