[MINOR] Fix CholeskyTest crash when residual is exactly zero by Baunsgaard · Pull Request #2487 · apache/systemds

Baunsgaard · 2026-06-09T11:21:37Z

CholeskyTest reconstructs A from its Cholesky factor and asserts that the 1x1 residual D = sum(A-B) is approximately zero. The output was read back with dmlOut.keySet().iterator().next(), which assumes at least one cell is present. When the residual is exactly 0.0, the sparse text writer omits the cell entirely, so the result map comes back empty and the iterator throws NoSuchElementException. A perfect reconstruction therefore caused the test to error out instead of passing.
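The failure mode can be reproduced outside SystemDS with a plain map. The sketch below is only an illustration: the class name, the `readScalarOrZero` helper, and the `String` cell keys are assumptions made for this example and are not the code in the PR. It shows that calling `keySet().iterator().next()` on an empty map throws, and one defensive way to read a 1x1 result whose zero cell may have been omitted.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

class EmptyResidualSketch {
    // Hypothetical helper: return the single cell of a 1x1 result, or 0.0 when
    // the map is empty because an exact zero was not written out.
    static <K> double readScalarOrZero(Map<K, Double> cells) {
        return cells.isEmpty() ? 0.0 : cells.values().iterator().next();
    }

    // Mirrors the original access pattern and reports whether it throws.
    static <K> boolean unsafeReadThrows(Map<K, Double> cells) {
        try {
            cells.get(cells.keySet().iterator().next());
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Double> empty = new HashMap<>(); // residual was exactly 0.0
        Map<String, Double> tiny = new HashMap<>();  // residual was tiny but non-zero
        tiny.put("1,1", 1e-14);                      // stand-in key for a cell index

        System.out.println("unsafe read throws on empty map: " + unsafeReadThrows(empty));
        System.out.println("safe read, empty map: " + readScalarOrZero(empty));
        System.out.println("safe read, one cell:  " + readScalarOrZero(tiny));
    }
}
```

With a read like this, an approximate-zero assertion sees 0.0 for the empty case rather than an exception.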

This is not data-dependent flakiness: the input matrix is already seeded, so A is identical on every run. The variability comes from the reduction order of sum(A-B), which differs across Spark partitions and CP threads. Because floating-point addition is not associative, the residual lands on either an exact 0.0 (empty output) or a tiny non-zero value depending on execution, which is why only some runs (notably testLargeCholeskyDenseSP) failed. The fix treats an empty output as 0.0, making the assertion robust to both outcomes, and drops the now-unused MatrixValue import.
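To make the reduction-order argument concrete, the following sketch sums the same three values with two different partition boundaries. The values and the `sumWithSplit` helper are made up for illustration and only loosely model how partial sums from partitions or threads are combined; they are not taken from SystemDS. One split yields an exact 0.0, the other a tiny non-zero residual.

```java
class ReductionOrderSketch {
    // Sum values[0..split) and values[split..n) separately, then add the two
    // partial sums, loosely imitating a two-partition reduction.
    static double sumWithSplit(double[] values, int split) {
        double left = 0.0;
        for (int i = 0; i < split; i++) {
            left += values[i];
        }
        double right = 0.0;
        for (int i = split; i < values.length; i++) {
            right += values[i];
        }
        return left + right;
    }

    public static void main(String[] args) {
        // Identical input on every run; only the split point changes.
        double[] values = {1.0, 1e-16, -1.0};

        // (1.0 + 1e-16) rounds to 1.0, so adding -1.0 gives exactly 0.0.
        System.out.println("split=2: " + sumWithSplit(values, 2));

        // (1e-16 + -1.0) rounds to the double just above -1.0, so adding 1.0
        // leaves a tiny non-zero residual.
        System.out.println("split=1: " + sumWithSplit(values, 1));
    }
}
```

Both results are well within any reasonable tolerance for "approximately zero", which is why treating the empty output as 0.0 is enough to make the assertion hold in either case.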

Error: https://github.com/apache/systemds/actions/runs/27172521960/job/80214522399

github-project-automation Bot added this to SystemDS PR Queue Jun 9, 2026

github-project-automation Bot moved this to In Progress in SystemDS PR Queue Jun 9, 2026

Baunsgaard merged commit d90ecf6 into apache:main Jun 9, 2026
43 of 44 checks passed

github-project-automation Bot moved this from In Progress to Done in SystemDS PR Queue Jun 9, 2026

Baunsgaard merged 1 commit into apache:main from Baunsgaard:fix-cholesky-empty-output