26 changes: 26 additions & 0 deletions README.md
@@ -118,6 +118,32 @@ import logging
logging.basicConfig(level=logging.DEBUG)
```

#### Returning xarray Datasets

The Water Data time-series, peaks, field-measurement, statistics, and samples
getters have [xarray](https://docs.xarray.dev/) counterparts in
`dataretrieval.waterdata.xarray` that return a CF-conventions
`xarray.Dataset` instead of a DataFrame, ready to write to netCDF or hand to
the CF-aware scientific Python stack. Install the optional dependency with
`pip install dataretrieval[xarray]`, then call the counterpart getter:

```python
from dataretrieval.waterdata import xarray as wdx

# Same arguments as waterdata.get_daily, but returns a CF Dataset
ds = wdx.get_daily(
    monitoring_location_id='USGS-01646500',
    parameter_code='00060',  # Discharge
    time='2024-10-01/2025-09-30',
)
```

See the
[xarray demo notebook](https://github.com/DOI-USGS/dataretrieval-python/blob/main/demos/waterdata_xarray_demo.ipynb)
for a walkthrough of the default dense `(site, time)` grid, the
`dense=False` contiguous-ragged layout for large multi-site pulls, the CF
metadata, and writing to netCDF.
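
As a rough illustration of how the two layouts differ, the sketch below builds both from the same toy observations in plain Python. It is not the library's implementation: the site IDs, dates, values, and variable names are all hypothetical, and the ragged form simply follows the CF contiguous ragged array idea of flat values plus a per-site count.

```python
import math

# Hypothetical observations, keyed by site and then by date.
obs = {
    "USGS-A": {"2024-10-01": 1.0, "2024-10-02": 2.0, "2024-10-03": 3.0},
    "USGS-B": {"2024-10-02": 5.0},
}

sites = sorted(obs)

# Dense layout: one row per site, one column per date in the union of
# all dates; missing combinations are padded with NaN.
times = sorted({t for series in obs.values() for t in series})
dense = [[obs[s].get(t, math.nan) for t in times] for s in sites]

# Contiguous ragged layout: observations concatenated site by site,
# with row_size recording how many values belong to each site.
ragged_times = [t for s in sites for t in sorted(obs[s])]
ragged_values = [obs[s][t] for s in sites for t in sorted(obs[s])]
row_size = [len(obs[s]) for s in sites]
```

The dense grid stores `len(sites) * len(times)` cells regardless of how sparse the records are, while the ragged form stores only the observations that exist, which is why it suits large multi-site pulls with uneven records.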

### Water Quality Portal (WQP)

Access water quality data from multiple agencies:
59 changes: 59 additions & 0 deletions dataretrieval/waterdata/types.py
@@ -74,3 +74,62 @@
        "count",
    ],
}


# --- CF / xarray vocabulary mappings ---------------------------------------
# Lookup tables used by :mod:`dataretrieval.waterdata.xarray` to translate
# USGS terms into CF-conventions metadata. Each is intentionally partial:
# anything not listed falls back to a sensible default (raw unit string kept
# verbatim; no standard_name emitted) rather than guessing a wrong CF term.
# They are plain data, so they live here rather than in the (xarray-optional)
# converter module and can be extended without importing xarray.

# USGS unit strings -> UDUNITS / CF-canonical form.
CF_UNIT_MAP = {
    "ft^3/s": "ft3 s-1",
    "ft3/s": "ft3 s-1",
    "ft": "ft",
    "in": "in",
    "degC": "degC",
    "deg C": "degC",
    "uS/cm": "uS/cm",
    "mg/l": "mg L-1",
    "mg/L": "mg L-1",
    # UDUNITS 'ton' is the US short ton; 'short_ton' is not a valid UDUNITS name.
    "tons/day": "ton day-1",
    "%": "percent",
}

# USGS statistic_id -> the operator in a CF ``cell_methods`` string.
CF_CELL_METHODS = {
    "00001": "maximum",
    "00002": "minimum",
    "00003": "mean",
    "00006": "sum",
    "00008": "median",
    "00011": "point",  # instantaneous
}

# USGS 5-digit parameter code -> CF standard_name. Deliberately conservative;
# codes without a confident match are left without a standard_name.
CF_STANDARD_NAMES = {
    "00060": "water_volume_transport_in_river_channel",
    # 00010 (water temperature) is intentionally omitted: ``water_temperature``
    # is NOT a CF standard name, and the only valid CF water-temperature name,
    # ``sea_water_temperature``, is wrong-domain for USGS freshwater/groundwater.
    # Leaving it unmapped keeps the variable's ``long_name`` without emitting an
    # invalid or misleading ``standard_name``.
    "00065": "water_surface_height_above_reference_datum",
    "63160": "water_surface_height_above_reference_datum",
    "00045": "lwe_thickness_of_precipitation_amount",
}

# USGS parameter code -> vertical reference datum, attached as a
# ``vertical_datum`` attribute. The two water-surface-height parameters share
# the CF standard_name water_surface_height_above_reference_datum, so the datum
# distinguishes them: gage height (00065) is measured from a local site (gage)
# datum, while stream water level (63160) is referenced to NAVD88.
CF_VERTICAL_DATUM = {
    "00065": "local site datum",
    "63160": "NAVD88",
}
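
The fallback behaviour described in the comments above (unknown units kept verbatim, no `standard_name` emitted for unmapped codes) could be sketched roughly as follows. This is a minimal illustration, not the converter in `dataretrieval.waterdata.xarray`: the helper `cf_attrs` and its signature are hypothetical, the tables are abbreviated copies of the ones in this diff, and the `time:` prefix in `cell_methods` assumes the statistic is taken over the time axis.

```python
# Abbreviated copies of the lookup tables above, so the sketch is self-contained.
CF_UNIT_MAP = {"ft^3/s": "ft3 s-1", "mg/l": "mg L-1"}
CF_CELL_METHODS = {"00003": "mean", "00011": "point"}
CF_STANDARD_NAMES = {
    "00060": "water_volume_transport_in_river_channel",
    "00065": "water_surface_height_above_reference_datum",
}
CF_VERTICAL_DATUM = {"00065": "local site datum"}


def cf_attrs(parameter_code, unit, statistic_id=None):
    """Hypothetical helper: build CF attributes for one variable."""
    # Unknown unit strings pass through unchanged instead of being guessed.
    attrs = {"units": CF_UNIT_MAP.get(unit, unit)}

    # Only emit standard_name when the parameter code has a confident match.
    standard_name = CF_STANDARD_NAMES.get(parameter_code)
    if standard_name is not None:
        attrs["standard_name"] = standard_name

    # Assumption: the statistic applies along the time axis.
    operator = CF_CELL_METHODS.get(statistic_id)
    if operator is not None:
        attrs["cell_methods"] = f"time: {operator}"

    # The datum distinguishes parameters that share a standard_name.
    datum = CF_VERTICAL_DATUM.get(parameter_code)
    if datum is not None:
        attrs["vertical_datum"] = datum
    return attrs


discharge = cf_attrs("00060", "ft^3/s", "00003")
temperature = cf_attrs("00010", "ng/L")  # unmapped code and unmapped unit
```

Using `dict.get` with the raw value as the default keeps each table safely partial: adding an entry changes the output for that key only, and a missing entry never raises.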