Skip morphology memory guard for lazy dask inputs (#3401) #3410

Open: brendancol wants to merge 4 commits into main from deep-sweep-performance-morphology-2026-06-20

@brendancol

Contributor

Closes #3401

What changed

  • _dispatch() in xrspatial/morphology.py no longer runs the full-shape memory guard when the input is dask-backed. The guard budgets a full padded float64 copy of the input, which only the eager numpy/cupy backends allocate. Dask processes the raster chunk-by-chunk via map_overlap, so peak memory scales with chunk size, not the full shape.
  • Before this change, a large lazy dask raster (e.g. 200000x200000) raised a false MemoryError and rejected work that would have run fine.
  • The guard still runs for the eager numpy and cupy backends, so their behavior is unchanged.
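
To make the size gap concrete, here is a rough back-of-the-envelope calculation for the 200000x200000 example. The one-cell pad width (as for a 3x3 kernel) and the 2048x2048 chunk size are assumptions chosen for illustration; they are not taken from the PR.

```python
# Illustrative arithmetic only. The pad width and the chunk size are
# assumptions, not values taken from xrspatial.
FLOAT64_BYTES = 8


def padded_bytes(rows, cols, pad):
    """Bytes for a float64 array of this shape, padded by `pad` cells per side."""
    return (rows + 2 * pad) * (cols + 2 * pad) * FLOAT64_BYTES


# What the full-shape guard budgets for: one padded copy of the whole raster.
full = padded_bytes(200_000, 200_000, pad=1)

# What a chunked run needs at a time: one padded block.
chunk = padded_bytes(2_048, 2_048, pad=1)

print(f"full padded copy: {full / 1e9:.1f} GB")
print(f"one padded chunk: {chunk / 1e6:.1f} MB")
```

Under these assumptions the full padded copy is about 320 GB while a single padded chunk is about 34 MB, which is why a full-shape budget rejects work that a chunked run can handle.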

Backend coverage

  • dask+numpy: guard skipped (fixed)
  • dask+cupy: guard skipped (fixed)
  • numpy: guard preserved
  • cupy: guard preserved
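
The gate behind this matrix can be sketched roughly as follows. This is a minimal stand-in, not the code in xrspatial/morphology.py: the names `is_dask_backed`, `check_full_shape_budget`, and `guard_then_route` are hypothetical, and the budget is passed in explicitly instead of being read from the system.

```python
from math import prod

try:
    import dask.array as da
except ImportError:  # dask is optional; without it nothing is dask-backed
    da = None


def is_dask_backed(data):
    """True when `data` is a dask array (hypothetical helper)."""
    return da is not None and isinstance(data, da.Array)


def check_full_shape_budget(shape, pad, budget_bytes, itemsize=8):
    """Raise MemoryError if a padded float64 copy of `shape` exceeds the budget."""
    needed = prod(dim + 2 * pad for dim in shape) * itemsize
    if needed > budget_bytes:
        raise MemoryError(
            f"padded copy needs {needed} bytes, budget is {budget_bytes}"
        )


def guard_then_route(data, pad, budget_bytes):
    """Run the full-shape guard for eager inputs only; skip it for dask."""
    if is_dask_backed(data):
        # Lazy input: work happens block by block, so the full-shape
        # budget does not describe peak memory. This assumes reasonable
        # chunking.
        return "lazy"
    check_full_shape_budget(data.shape, pad, budget_bytes)
    return "eager"
```

With a 1 MB budget, an eager 4096x4096 input raises `MemoryError` in this sketch, while a dask array of the same shape is routed to the lazy path without the check.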

Test plan

  • New regression test builds the graph for a 4096x4096 lazy dask raster under a 1 MB memory budget without raising (graph construction only, no .compute()).
  • Existing eager-backend guard tests still pass (oversized kernel raises, normal use allowed, all public APIs covered).
  • Full test_morphology.py suite passes (48 tests).

_check_kernel_memory budgets a full padded float64 copy of the input,
which only the eager numpy/cupy backends allocate. Dask processes the
raster chunk-by-chunk via map_overlap, so peak memory scales with chunk
size, not the full shape. Running the full-shape guard on a lazy dask
array raised a false MemoryError that rejected large workloads which
would have run fine.

Skip the guard when the input is dask-backed; keep it for numpy/cupy.
Add a regression test that a large lazy dask raster builds the graph
under a tiny memory budget without raising.

@brendancol brendancol left a comment

Contributor Author

PR Review: Skip morphology memory guard for lazy dask inputs (#3401)

Blockers

None.

Suggestions

None.

Nits

  • xrspatial/morphology.py:451 -- The skip is keyed on the input being any dask.Array, regardless of chunking. A user who passes a single-chunk dask array (chunks=(-1, -1)) now bypasses the guard even though that chunk function materializes a full padded copy per block. Unusual chunking choice, and the eager backends still guard, so not worth blocking on. A one-line note in the comment that the skip assumes reasonable chunking would help the next reader.
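
For a sense of scale on the single-chunk case, the largest padded block can be estimated from a dask-style `chunks` tuple (one tuple of block sizes per axis). This is only an illustration of the concern, not something the PR implements; the helper name and the overlap depth of 1 are assumptions.

```python
def largest_padded_chunk_bytes(chunks, depth, itemsize=8):
    """Estimate bytes for the largest block after adding `depth` cells per side.

    `chunks` follows the layout of dask's `Array.chunks`: one tuple of
    block sizes per axis. Blocks form a grid, so the largest block has
    the largest size along every axis.
    """
    total = itemsize
    for axis_chunks in chunks:
        total *= max(axis_chunks) + 2 * depth
    return total


# A 4096x4096 raster held as one chunk: the "block" is the whole raster.
single = largest_padded_chunk_bytes(((4096,), (4096,)), depth=1)

# The same raster split into 512x512 chunks.
tiled = largest_padded_chunk_bytes(((512,) * 8, (512,) * 8), depth=1)

print(single, tiled)
```

Here the single-chunk layout needs roughly 134 MB for its one padded block, against roughly 2 MB per block for the 512x512 layout, so the skip is only as safe as the chunking behind it.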

What looks good

  • The fix matches the cause in the issue: the guard budgets a full padded copy that only the eager backends allocate, so skipping it for dask is correct.
  • The condition reuses has_dask_array() and isinstance(agg.data, da.Array), the same pattern ArrayTypeFunctionMapping uses to detect the dask backends, so it stays consistent with dispatch.
  • The eager numpy and cupy paths are untouched, so the existing guard tests still apply.
  • The regression test builds the graph only and never calls .compute(), and uses a tiny monkeypatched memory budget so it actually exercises the skip rather than passing by luck.

Checklist

  • Algorithm unchanged; only the guard gate changed: ok
  • Backends: numpy/cupy guard preserved, dask+numpy/dask+cupy guard skipped: ok
  • NaN handling: unchanged
  • Edge cases: new dask test added; eager guard tests intact
  • Dask chunk boundaries: unchanged (map_overlap untouched)
  • No premature materialization: confirmed, graph-only test
  • Benchmark: not needed (gate change, not a hot path)
  • Docstrings: comment added explaining the skip rationale

@brendancol brendancol left a comment

Contributor Author

Follow-up review (after b1bafaa)

The one nit from the prior review is resolved: the guard-skip comment now notes that the skip assumes reasonable chunking and that a single giant chunk would still materialize a full padded copy per block. No code-path change, so the test results stand (16 memory/guard/dask tests pass; full file 48 pass).

No remaining blockers, suggestions, or nits.

Development

Successfully merging this pull request may close these issues.

morphology: memory guard raises false MemoryError on lazy dask rasters

1 participant