Skip to content

hand_mfd dask path materializes all tiles eagerly #1416

@brendancol

Description

@brendancol

Describe the bug

hand_mfd._hand_mfd_dask (xrspatial/hydro/hand_mfd.py L600-628) builds the
final result by computing every tile via blocks[iy, ix].compute() into a
list-of-lists of numpy arrays, then calls da.block(rows) to assemble. Peak
driver memory holds the full output array, not one chunk at a time.

The sister implementation hand_dinf._hand_dinf_dask (xrspatial/hydro/hand_dinf.py
L605-624) handles the same step correctly: it uses da.map_blocks so each
tile is recomputed lazily from the converged boundary state.

The boundary-propagation phase before assembly (L577-596) is fine — it iterates
per-tile via _process_tile_hand_mfd which holds at most one tile in RAM.
The bug is purely in the assembly phase.

Expected behavior

hand_mfd on a dask DataArray returns a lazy dask result whose memory cost
scales with chunk size, not with the full grid — matching hand_dinf and the
rest of the hydro subpackage.

Affected backends

  • dask+numpy: directly affected.
  • dask+cupy: affected via _hand_mfd_dask_cupy (L631-647), which converts
    to numpy dask, calls _hand_mfd_dask, then converts back.

Severity

HIGH. On a 30 TB grid this materializes 30 TB on the driver before output.
Tests use 3x3 chunks so the issue does not surface in CI.

Fix sketch

Replace the assembly loop with da.map_blocks over (fa_da, elev_da), slicing
fractions_da[:, y_start:y_end, x_start:x_end] per tile inside the closure
(or rechunk fractions once up front to align with fa_da chunks).

Additional context

Found during the 2026-05-01 hydro performance sweep. CSV updated:
hydro row now reads oom_verdict=WILL OOM, bottleneck=memory-bound,
high_count=1.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workinggpuCuPy / CUDA GPU supportoomOut-of-memory risk with large datasetsperformancePR touches performance-sensitive code

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions