Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .claude/sweep-metadata-state.csv
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ focal,2026-06-10,3217,MEDIUM,4;5,"Re-audited 2026-06-10 (agent-ad0d55a894c6abc60
geotiff,2026-06-09,3116,HIGH,2;3,"Re-audited 2026-06-09 (agent-ae89ff94a64e3ee8f worktree, branch deep-sweep-metadata-geotiff-2026-06-09). CUDA available; all 4 backends (numpy/cupy/dask+numpy/dask+cupy) run live. Focus: surfaces changed since the 2026-05-18 audit (unpack rename + GPU/dask+GPU support #3075, pack=True #3065/#3079, masked int->float promotion #2994, bbox= reads, rioxarray param alignment #2963, no-georef VRT coord synthesis #2824, GeoTransform omission #2971). Live probes: unpack attrs (scale_factor/add_offset/mask_and_scale_dtype/nodata/masked_nodata), masked=True promotion, default masked=False, bbox window+transform shift, multi-band band=N, dims/name/coords (incl. coord dtype) all identical across the 4 backends; nodata_pixels_present absent on dask paths is the documented lazy contract, not a bug. pack->unpack round trips verified on numpy/dask/gpu-write; pack of a cupy-backed read raises via the known cupy+xarray xp.astype incompat (see memory cupy_where_astype_incompat; dependency-pin fix, raises loudly, not a metadata bug). VRT reads (full/masked/window/bbox) and no-georef TIFF reads agree across the 4 backends. NEW HIGH finding #3116 (Cat 2+3): to_geotiff(non_georef_da, out.vrt, tile_size=N) wrote a corrupt index for arrays spanning >1 tile -- write_vrt derives placement from each source GeoTransform and non-georef tiles all carry the identity transform, so rasterX/YSize collapsed to one tile and every DstRect landed at the origin; reads silently returned a single tile (24x32 in -> 16x16 out). Gap left by #2966/#2971 (tests only covered one non-georef source). Fix: _write_vrt_tiled threads per-tile pixel offsets through _build_vrt -> write_vrt via internal dst_offsets kwarg; write_vrt refuses >1 all-non-georef sources without explicit placement and rejects dst_offsets alongside georeferenced sources. 18 new tests in tests/vrt/test_non_georef_placement_3116.py incl. 4-backend round trip, dask-backed and plain-ndarray writes, XML DstRect assertions, georef placement regression, and the write_vrt error contract. Full vrt suite 520 passed; write+round-trip suites 1292 passed."
interpolate,2026-06-12,3288,MEDIUM,5,kriging K_inv-None fallback was numpy-backed on all backends and misnamed the variance raster; fixed via #3288. All 4 backends verified end-to-end on GPU host. LOW (documented only): template nodatavals/_FillValue copied verbatim while fill_value is the actual output sentinel; tests codify attrs==template.attrs
mcda,2026-06-10,3147,HIGH,1,"constrain() dropped all attrs (res/crs/nodatavals) whenever exclude non-empty (xr.where takes attrs from scalar fill); fixed via attrs restore, tests for numpy/dask/dask+cupy. All other mcda funcs keep attrs/coords/dims on all 4 backends. Out-of-scope crashes noted for backend-parity: owa broken on cupy (numpy order-weights x cupy) and on dask (da.sort does not exist); sensitivity monte_carlo crashes on cupy/dask+cupy (.values on cupy); xr.where compute on cupy/dask+cupy hits known cupy13.6/xarray2025.12 incompat."
multispectral,2026-06-20,3429,MEDIUM,2;3,"true_color() hardcoded y/x dims + dropped extra coords; fixed PR #3434 (all 4 backends verified, CUDA available)"
polygonize,2026-06-12,3293,MEDIUM,1,"Audited 2026-06-12 (agent-a86d90abea41b04cf worktree, branch deep-sweep-metadata-polygonize-2026-06-12). CUDA available; all 4 backends (numpy/cupy/dask+numpy/dask+cupy) run live for int, float+NaN, and no-georef rasters. polygonize returns vector output (numpy/awkward/geopandas/spatialpandas/geojson), not a DataArray, so Cats 2-4 reinterpreted as transform/CRS/value-dtype propagation. Transform auto-detect (attrs['transform'] -> rio.transform() -> x/y coords, #2536/#2607) and CRS resolution run in public polygonize() before dispatch, so all 4 backends emit identical columns, bounds, and CRS (verified live). Column value dtype follows input dtype on every backend. NEW MEDIUM finding #3293 (Cat 1): _detect_raster_crs ignored the _xrspatial_no_georef marker that _detect_raster_transform honours, so a geotiff-reader crs_only raster (attrs carry both crs and the marker; metadata_to_attrs writes crs independent of has_georef) produced a GeoDataFrame claiming EPSG:#### over pixel-space geometries -- the #2536 metadata-lies-about-the-data mismatch through the marker channel. contour.py imports the same helper and inherits the fix. Fix: early return None in _detect_raster_crs on the marker + docstring note; 2 new tests in TestPolygonizeCRSPropagation. polygonize+contour suites 274 passed; all 9 auxiliary polygonize test files 303 passed. rotated-read path unaffected (reader drops CRS there). No CRITICAL/HIGH/LOW findings."
proximity,2026-05-29,2723,MEDIUM,4;5,"Audited 2026-05-29 (agent-a61dbadc2452a2003 worktree, branch deep-sweep-metadata-proximity-2026-05-29). CUDA+cupy available; all 4 backends (numpy/cupy/dask+numpy/dask+cupy) run live end-to-end for proximity/allocation/direction, both bounded (finite max_distance) and unbounded. Cat 1 (attrs res/crs/transform/nodatavals/_FillValue), Cat 2 (coords + coord dtype), and Cat 3 (dims) all preserved and identical across the 4 backends -- public funcs wrap with xr.DataArray(coords=raster.coords, dims=raster.dims, attrs=raster.attrs). NEW MEDIUM finding #2723 (Cat 4 + Cat 5): (a) bounded dask+numpy path (_process_dask -> da.map_overlap with meta=np.array(())) declared output dtype float64 while the chunk fn returns float32 and numpy/cupy/dask+cupy + the unbounded KDTree path all declare float32; docstrings show dtype=float32. Fix: meta=np.array((), dtype=np.float32). (b) dask backends leaked an internal dask op name (_trim-<hash>, _kdtree_chunk_fn-<hash>, asarray-<hash>) into result.name while numpy/cupy return None. Fix: assign result.name=None after construction in all 3 public funcs (xarray ignores a name=None kwarg for named dask arrays, so the reset must happen post-construction). Same .name-leak class as zonal #2611. PR #2728 off child branch deep-sweep-metadata-proximity-2026-05-29-01. New parametrized regression test test_output_metadata_consistent_across_backends asserts declared dtype float32 + name None across all 4 backends x 3 funcs x bounded/unbounded; full test_proximity.py suite 93 passed. No other CRITICAL/HIGH/MEDIUM/LOW findings."
rasterize,2026-06-09,3087,MEDIUM,1,GeoDataFrame .crs dropped on no-like path (Cat 1); fixed via #3087 emitting attrs crs/crs_wkt when output has no CRS. like-path attrs/coords/dims/nodata verified live on all 4 backends (CUDA available); Cats 2-5 clean.
Expand Down
12 changes: 6 additions & 6 deletions xrspatial/multispectral.py
Original file line number Diff line number Diff line change
Expand Up @@ -1821,19 +1821,19 @@ def true_color(r, g, b, nodata=1, c=10.0, th=0.125, name='true_color'):
warnings.simplefilter('ignore')
out = mapper(r)(r, g, b, nodata, c, th)

# TODO: output metadata: coords, dims, attrs
_dims = ['y', 'x', 'band']
_attrs = r.attrs
_coords = {'y': r['y'],
'x': r['x'],
# Preserve the input's spatial dims/coords instead of hardcoding y/x,
# then append the band dim. Hardcoding raised KeyError on lat/lon
# rasters and dropped extra coords like spatial_ref (issue #3429).
_dims = [*r.dims, 'band']
_coords = {**{name: r[name] for name in r.coords},
'band': [0, 1, 2, 3]}

return DataArray(
out,
name=name,
dims=_dims,
coords=_coords,
attrs=_attrs,
attrs=r.attrs,
)


Expand Down
36 changes: 36 additions & 0 deletions xrspatial/tests/test_multispectral.py
Original file line number Diff line number Diff line change
Expand Up @@ -884,6 +884,42 @@ def test_true_color_mismatched_backends_raises():
true_color(red, green, blue)


# true_color metadata propagation (issue #3429) ----------
def _tc_band(backend):
data = np.random.default_rng(3429).random((6, 6)).astype(np.float32)
return create_test_raster(
data, backend=backend, dims=['lat', 'lon'],
attrs={'res': (0.5, 0.5), 'crs': 'EPSG: 5070'},
)


@pytest.mark.parametrize(
"backend",
["numpy",
pytest.param("dask+numpy", marks=dask_array_available),
pytest.param("cupy", marks=cuda_and_cupy_available),
pytest.param("dask+cupy", marks=cuda_and_cupy_available)],
)
def test_true_color_preserves_non_yx_dims(backend):
# true_color used to hardcode y/x and raised KeyError on lat/lon input.
r = _tc_band(backend)
g = _tc_band(backend)
b = _tc_band(backend)
out = true_color(r, g, b)
assert out.dims == ('lat', 'lon', 'band')
np.testing.assert_allclose(out['lat'].data, r['lat'].data)
np.testing.assert_allclose(out['lon'].data, r['lon'].data)
assert out.attrs == r.attrs


def test_true_color_preserves_extra_coords():
# A non-spatial coord (e.g. rioxarray's spatial_ref) must pass through.
r = _tc_band('numpy').assign_coords(spatial_ref=0)
out = true_color(r, r.copy(), r.copy())
assert 'spatial_ref' in out.coords
assert int(out['spatial_ref']) == 0


# NDSI ----------
@pytest.fixture
def expected_ndsi():
Expand Down
Loading