[SYSTEMDS-2651] Replace fixed-sleep federated worker startup with Poll #2468

Closed
Baunsgaard wants to merge 1 commit into apache:main from Baunsgaard:FederatedWorkerReadyPolling

@Baunsgaard Baunsgaard commented May 15, 2026

Contributor

This PR replaces the fixed Thread.sleep with a poll-based startup of federated workers in testing. The change helps our test suites avoid timeouts and failures caused by inconsistent launch times of federated workers.

@github-project-automation github-project-automation Bot moved this to In Progress in SystemDS PR Queue May 15, 2026
@Baunsgaard Baunsgaard changed the title [SYSTEMDS-2651][] Replace fixed-sleep federated worker startup with Poll [SYSTEMDS-2651] Replace fixed-sleep federated worker startup with Poll May 15, 2026
…rtup

Replace fixed Thread.sleep after each federated worker start with TCP
port polling that returns as soon as the worker accepts a connection.
Add bulk helpers that spawn N workers in parallel and wait once for the
slowest to become ready, instead of summing per-worker waits.
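
A minimal sketch of the polling idea described above, not the code from this PR: the class and method names (`PortPolling`, `waitForPort`, `waitForAll`) and the 200 ms / 50 ms intervals are hypothetical. It assumes the workers were already spawned in parallel, so checking their ports one after another against a single shared deadline costs roughly the startup time of the slowest worker rather than the sum of per-worker waits.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortPolling {
	/**
	 * Poll host:port until it accepts a TCP connection or timeoutMs elapses.
	 * Returns true as soon as one connection attempt succeeds.
	 */
	public static boolean waitForPort(String host, int port, long timeoutMs) throws InterruptedException {
		final long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
		// Overflow-safe comparison for nanoTime values.
		while (deadline - System.nanoTime() > 0) {
			try (Socket s = new Socket()) {
				// Assumed 200 ms connect timeout; a single attempt may overshoot the deadline by this much.
				s.connect(new InetSocketAddress(host, port), 200);
				return true; // the worker is accepting connections
			}
			catch (IOException e) {
				Thread.sleep(50); // not ready yet, back off briefly (assumed interval)
			}
		}
		return false;
	}

	/**
	 * Wait for all ports against one shared deadline. Because the workers
	 * start concurrently, ports checked later are usually ready already.
	 */
	public static boolean waitForAll(String host, int[] ports, long timeoutMs) throws InterruptedException {
		final long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
		for (int port : ports) {
			long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
			if (remainingMs <= 0 || !waitForPort(host, port, remainingMs))
				return false;
		}
		return true;
	}
}
```

A test harness would start every worker process first and then call `PortPolling.waitForAll("localhost", ports, timeoutMs)` once, failing the test if it returns false.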

Cuts the federated CI total by ~7 min (-5%) vs main, with the biggest
wins in setup-heavy suites such as transform+fedplanner (-66%) and
codegen (-25%).

Closes apache#2468.
@Baunsgaard Baunsgaard force-pushed the FederatedWorkerReadyPolling branch from 8804921 to 0c830d4 Compare May 18, 2026 16:01
codecov Bot commented May 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.38%. Comparing base (3f7b17b) to head (0c830d4).
⚠️ Report is 49 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2468      +/-   ##
============================================
- Coverage     71.55%   71.38%   -0.17%     
- Complexity    47461    48707    +1246     
============================================
  Files          1539     1570      +31     
  Lines        182631   188757    +6126     
  Branches      35919    37039    +1120     
============================================
+ Hits         130677   134744    +4067     
- Misses        41944    43571    +1627     
- Partials      10010    10442     +432     

@github-project-automation github-project-automation Bot moved this from In Progress to Done in SystemDS PR Queue May 20, 2026
@Baunsgaard Baunsgaard deleted the FederatedWorkerReadyPolling branch June 10, 2026 15:43