Add LA-level household land value calibration targets #371
vahid-ahmadi wants to merge 3 commits into main
Generalises targets/sources/mhclg_regional_land.py to local-authority level. Each LA's share of national household land is proportional to households x avg_house_price, scaled to the ONS National Balance Sheet household-land series.

Inputs (all already used elsewhere in the repo):
- storage/la_land_values.csv: 360 LAs with households (from the existing local_authority_weights.h5 matrix) and avg_house_price (HM Land Registry UK HPI Dec 2025).
- _land.HOUSEHOLD_LAND_VALUES for the national anchor.

Tests cover CSV data quality, share/target aggregation, sensible ordering (K&C > Blackpool by >3x, London boroughs in top quintile), and registry integration.

Updates test_regional_land_value_targets.py to filter by GeographicLevel.REGION now that LA targets share the same name prefix.

Closes #370

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
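The proportional allocation described in this commit message can be sketched as below. This is a minimal illustration under stated assumptions, not the code in la_land.py: the function name `allocate_national_total`, the LA codes and every number are hypothetical, and the national total is a placeholder rather than a value from the ONS series.

```python
from typing import Dict, Tuple


def allocate_national_total(
    rows: Dict[str, Tuple[float, float]], national_total: float
) -> Dict[str, float]:
    """Split national_total across LAs in proportion to households * avg_house_price.

    rows maps an LA code to (households, avg_house_price). Hypothetical helper.
    """
    # Total property wealth proxy per LA.
    wealth = {code: hh * price for code, (hh, price) in rows.items()}
    total_wealth = sum(wealth.values())
    if total_wealth <= 0:
        raise ValueError("total property wealth must be positive")
    # Each LA receives its wealth share of the national anchor.
    return {code: national_total * w / total_wealth for code, w in wealth.items()}


# Made-up inputs, purely for illustration.
example_rows = {
    "LA_A": (100_000, 300_000.0),
    "LA_B": (50_000, 200_000.0),
    "LA_C": (1_000, 400_000.0),
}
targets = allocate_national_total(example_rows, national_total=1_000.0)

# By construction the LA targets sum back to the national anchor.
assert abs(sum(targets.values()) - 1_000.0) < 1e-6
```

Because the shares are normalised by total wealth, a single inflated household count shrinks every other LA's target, which is why row-level data quality checks matter for this method.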
Note for whoever picks up #357: this PR mirrors |
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Blocker: data bug in
Impact: IoS alone absorbs 8.6% of the national household share (
Quick verification:
Looks like a UK-HPI 'national-total-as-fallback' path leaked into one LA row. Likely two lines to fix:
Happy to approve once that's in. The methodology itself is sound — mirrors |
The E06000053 row carried households=2,492,115, roughly the South West region total, from an upstream fallback that fired during CSV generation. Real IoS has ~1,115 households per ONS mid-2023. With the bug, IoS absorbed 7.85% of the national property-wealth share, understating every other LA's 2024 target by ~8.5% (e.g. K&C moved from £42.6bn to £46.2bn after the fix).

Two new tests prevent the regression:
- test_households_within_plausible_range: bounds every LA to [500, 500_000] so any future 10x+ outlier fails immediately.
- test_isles_of_scilly_households_are_thousands_not_millions: tight [500, 5_000] bound on the specific row that leaked.

Methodology unchanged; LA targets still sum to the ONS national household-land series within 1e-6.
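A rough sketch of the kind of range check the two tests describe, plus the arithmetic behind the ~8.5% figure. The helper names `households_out_of_range` and `uplift_after_removing_share` and the `LA_X` row are hypothetical; the E06000053 values and the bounds are the ones quoted in this commit message. The real tests may be structured differently.

```python
from typing import Dict, List


def households_out_of_range(
    households: Dict[str, float], low: float = 500, high: float = 500_000
) -> List[str]:
    """Return LA codes whose household count falls outside [low, high]."""
    return sorted(
        code for code, n in households.items() if not (low <= n <= high)
    )


def uplift_after_removing_share(spurious_share: float) -> float:
    """Approximate relative rise in every other LA's target once a spurious
    share is removed (ignores the small genuine share of the fixed row)."""
    return 1.0 / (1.0 - spurious_share) - 1.0


# Buggy row as described above, plus one hypothetical ordinary LA.
buggy = {"E06000053": 2_492_115, "LA_X": 120_000}
fixed = {"E06000053": 1_115, "LA_X": 120_000}

flagged_before = households_out_of_range(buggy)
flagged_after = households_out_of_range(fixed)

# Tighter bound for the specific row that leaked.
scilly_ok = households_out_of_range(
    {"E06000053": fixed["E06000053"]}, low=500, high=5_000
) == []

# A 7.85% spurious share implies roughly an 8.5% rise elsewhere after the fix.
uplift = uplift_after_removing_share(0.0785)
```

The uplift is consistent with the K&C example: £46.2bn / £42.6bn is about 1.085.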
@MaxGhenis thanks, fixed in 3ed729c.

Data fix
Quantified impact of the fix
Tests added
Full suite: 20/20 pass locally via

Generation-path note: the 2,492,115 figure matches the South West regional household total, so the fallback that fired during CSV generation was a regional sum, not "national-avg" as the PR body suggested. I'll correct the PR description; worth flagging for whoever regenerates the CSV next.
Summary
- `ons/household_land_value/{code}` calibration targets
- `mhclg_regional_land.py` methodology to local-authority granularity

Closes #370.
What this PR does
Extends the regional methodology to LA level
Each LA's share of national household land value is proportional to its total property wealth (`households × avg_house_price`), scaled so the LA totals match the ONS national household-land series. Exactly the same formula as `mhclg_regional_land.py::_compute_regional_shares`, one geography deeper.

Files
New
- `policyengine_uk_data/storage/la_land_values.csv`: 360 rows, columns `code, name, households, avg_house_price`.
  - `households` from the existing `local_authority_weights.h5` (sum of each LA's 2025 weight row); keeps household-count semantics aligned with the rest of the LA calibration.
  - `avg_house_price` from HM Land Registry UK HPI (Dec 2025). Primary match on ONS code, name-based fallback for LAs with re-allocated codes (e.g. Sheffield E08000019 → E08000039 in HPI), NI country-level HPI fallback for missing NI LGD months, national-avg fallback for Isles of Scilly.
- `policyengine_uk_data/targets/sources/la_land.py`: `_compute_la_shares()`, `_compute_la_targets()`, `get_targets()` returning 360 `Target` objects with `geographic_level=LOCAL_AUTHORITY`.
- `policyengine_uk_data/tests/test_la_land_value_targets.py`: 18 unit tests.
- `changelog.d/370.md`.

Modified
- `policyengine_uk_data/tests/test_regional_land_value_targets.py`: `test_target_registry_includes_regional` now filters by `GeographicLevel.REGION` (the regional and LA targets share the `ons/household_land_value/` name prefix, so filtering by prefix alone now pulls both).

Tests
All unit tests, no baseline fixture needed:
CSV data quality
- `local_authorities_2021.csv` (360)
- `[£50k, £2m]`, households positive

Share / target aggregation
Registry integration
- `get_targets()` returns exactly 360
- `ons/household_land_value/{code}`; `geo_code == code`
- `GeographicLevel.LOCAL_AUTHORITY`
- `HOUSEHOLD_LAND_VALUES`
- `get_all_targets(year=2024, geographic_level=LOCAL_AUTHORITY)` returns 360 LA land targets

Results of running the new tests plus adjacent suites (regional land, land targets, target DB, release manifest): 47 passed, 8 skipped.
Out of scope
Wiring these targets into `datasets/local_areas/local_authorities/loss.py` so the LA reweighting actually calibrates on them. Planned follow-up PR.

Sanity check: top 10 LAs by avg household land value (2024)
Bottom 10 are all post-industrial / deprived areas (Inverclyde, East Ayrshire, West Dunbartonshire, Hull, Burnley, Hartlepool, Aberdeen, North Ayrshire, Hyndburn, Blackpool — all at £60–72k).
Sources
Related