Skip to content

Add LA-level council tax calibration targets (Band D + band distribution)#374

Open
vahid-ahmadi wants to merge 3 commits intomainfrom
feat/la-council-tax-targets
Open

Add LA-level council tax calibration targets (Band D + band distribution)#374
vahid-ahmadi wants to merge 3 commits intomainfrom
feat/la-council-tax-targets

Conversation

@vahid-ahmadi
Copy link
Copy Markdown
Collaborator

Summary

Adds 350 LA-level ons/council_tax_band_d/{code} targets (Band D amount per billing authority) and 2,541 ons/council_tax_band_count/{code}/{band} targets (dwellings per band A-H per LA) built from four public sources, covering all 296 English + 22 Welsh + 32 Scottish LAs in local_authorities_2021.csv. Follows the pattern of the regional land-value targets, one geography deeper.

Like #371 this is plumbing + data only — wiring these into datasets/local_areas/local_authorities/loss.py is a deliberate follow-up.

What this PR does

Four public sources, one canonical CSV

storage/la_council_tax.csv (360 rows, 31 KB) joins:

Column Source Coverage
band_d_amount (England) MHCLG Council Tax levels set by local authorities in England 2026-27, Table 10, column 17 296/296 English LAs
band_d_amount (Wales) Welsh Government Council Tax levels April 2026 to March 2027, Table 1 "Overall average band D" 22/22 Welsh LAs
band_d_amount (Scotland) Scottish Government Council Tax Assumptions 2025, "CT by Band 2025-26" Band D column 32/32 Scottish LAs
band_Aband_H + total_dwellings VOA Council Tax: Stock of Properties, 2025, CTSOP1.0 summary table 295 English + 22 Welsh LAs

Column 17 in DLUHC Table 10 is the right one: "Average (Band D 2 adult equivalent) council tax for area of the billing authority including both local and major precepts" — i.e. the full amount households pay, inclusive of county, police, fire and parish precepts.

Code joins reconciled

  • Post-2023 South Yorkshire E-codes (E08000038 Barnsley, E08000039 Sheffield) are remapped to the pre-2023 codes used in the reference LA list (E08000016, E08000019).
  • Scottish source name quirks normalised: Argyll & ButeArgyll and Bute, Dumfries & GallowayDumfries and Galloway, Perth & KinrossPerth and Kinross, Shetland Islands (double space) → Shetland Islands, Edinburgh, City ofCity of Edinburgh.

New module

targets/sources/la_council_tax.py:

  • get_targets() returns Target objects at geographic_level=LOCAL_AUTHORITY.
  • Band D targets tagged with year 2026 (England, Wales) or 2025 (Scotland) based on the source's latest release.
  • Per-country reference URLs so downstream consumers can audit each value to its publication.
  • Missing values (NI has no council tax, Scottish band counts) are filtered out of values rather than emitted as NaN so calibrators cleanly skip them.

Documented coverage gaps

  • Northern Ireland (10 LAs): NI uses domestic rates, not council tax. has_council_tax=False flag set; no targets emitted for NI.
  • Scotland band counts (32 LAs): the VOA summary doesn't cover Scotland. Scottish Assessors publishes per-LA chargeable-dwellings separately; follow-up PR.
  • City of London Band A: VOA suppresses this cell ([c]) for disclosure control; other bands populated.

Testing (the #371 lesson)

22 hermetic tests, all green locally:

  • CSV structure (4): row count matches the reference LA list, expected columns present, four UK countries represented, every code matches the reference.
  • Value plausibility (4): Band D amount in [£900, £3,500], total dwellings in [200, 800,000], explicit Isles of Scilly regression (total in [500, 5,000], guarding against the 2.49M outlier that slipped into Add LA-level household land value calibration targets #371), band totals sum to total dwellings within 20-property slack.
  • Coverage expectations (4): every English / Welsh / Scottish LA has a Band D value; Northern Ireland is explicitly flagged as no-council-tax.
  • Spot-checks (2): Wandsworth and Westminster are the two lowest-Band-D LAs (catches row-swap bugs); Scottish average Band D is £500+ below English average.
  • Target-API invariants (8): get_targets() returns non-empty without network, Band D target count matches CSV, band count target count matches Σ non-null band columns, every target carries LOCAL_AUTHORITY geo level + geo_code, correct units and is_count=True on count targets, every target has ≥1 year of values, every band count ≤ 500k.

Sanity check — top 10 LAs by Band D amount (2026-27 where available)

LA Country Band D
Dorset ENGLAND £2,765
Lewes ENGLAND £2,756
Nottingham ENGLAND £2,755
Rutland ENGLAND £2,738
Wealden ENGLAND £2,728
Merthyr Tydfil WALES £2,594
Bridgend WALES £2,555
Hartlepool ENGLAND £2,560
Newport WALES £2,443
Blaenau Gwent WALES £2,452

Bottom of the table is dominated by the central-London LAs with the lowest precepts: Wandsworth (£1,028), Westminster (£1,050), City of London (£1,330).

Out of scope for this PR (follow-ups)

  • Wiring these targets into datasets/local_areas/local_authorities/loss.py so the LA reweighting actually calibrates on them.
  • Scottish Assessors per-LA chargeable-dwellings to fill the Scotland band-count gap.
  • Council Tax Support caseload per LA (DWP StatXplore) — caseload-style, different target shape.
  • Single Person Discount rate per LA (CIPFA Council Tax Base) — same.

Related

  • #371 — Add LA-level household land value calibration targets — the regional methodology this PR parallels; also the origin of the disclosure-control bound-check test lesson.
  • targets/sources/voa_council_tax.py already provides band targets at the REGION level; this PR adds the LA level alongside it without modifying the regional path.

Two families of LA-level targets, covering all 360 LAs in
local_authorities_2021.csv, built from four public sources:

- `ons/council_tax_band_d/{code}` (350 targets): average Band D
  council tax inclusive of all precepts per billing authority.
  Sources: MHCLG *Council Tax levels set by local authorities in
  England 2026-27*, Welsh Government *Council Tax levels April 2026
  to March 2027*, Scottish Government *Council Tax Assumptions 2025*.
  All 296 English + 22 Welsh + 32 Scottish LAs covered.
- `ons/council_tax_band_count/{code}/{band}` (2,541 targets): number
  of dwellings per band A-H per LA. Source: VOA *Council Tax: Stock
  of Properties, 2025*. Covers England + Wales (318 LAs × ~8 bands,
  minus City of London Band A which is VOA-suppressed).

NI is excluded: domestic rates, not council tax. Scotland band
counts are not in VOA; Scottish Assessors publishes them separately
and is a follow-up.

Files
-----

- `storage/la_council_tax.csv` (31 KB, 360 rows): canonical CSV
  joining DLUHC Table 10 column 17, Welsh Table 1 "Overall average
  band D", Scottish Gov "CT by Band 2025-26" Band D column, and VOA
  CTSOP1.0 bands A-H onto the reference LA list.
  - Post-2023 South Yorkshire E-codes (E08000038/39) re-mapped to
    pre-2023 codes (E08000016/19) to match the reference list.
  - Scottish ampersand/double-space naming normalised
    ("Argyll & Bute" → "Argyll and Bute", etc.).
- `targets/sources/la_council_tax.py`: reads the CSV, emits Target
  objects at geographic_level=LOCAL_AUTHORITY with per-country year
  tagging and per-country reference URL.

Testing
-------

22 hermetic tests (no network access, no baseline fixture needed):

Structure
- Row count matches local_authorities_2021.csv.
- Every expected column present.
- Four UK country codes represented.
- Every LA code matches the reference list.

Value plausibility (the #371 lesson)
- Band D amount in [£900, £3,500] for every row with a value.
- Total dwellings in [200, 800,000] for every row with a value.
- Explicit Isles of Scilly regression test: total dwellings in
  [500, 5,000], not the 2.49M outlier that slipped into #371.
- Band A-H counts sum to total dwellings within 20-property slack
  (VOA 10-property suppression allowance).
- Every band-count target value ≤ 500k (largest LA stock).

Coverage expectations
- Every English, Welsh and Scottish LA has a Band D value.
- Northern Ireland has no council tax flagged (has_council_tax=False).

Spot-checks of published facts
- Wandsworth (E09000032) and Westminster (E09000033) are the two
  lowest-Band-D English LAs (catches row-swap bugs).
- Scottish average Band D is £500+ below English average.

Target-API invariants
- get_targets() returns a non-empty list without network access.
- Band D target count matches the CSV's non-null Band D count.
- Band count target count matches Σ non-null band columns.
- Every target carries geographic_level=LOCAL_AUTHORITY and a
  geo_code.
- Band D targets use Unit.GBP; band count targets use Unit.COUNT
  with is_count=True.
- Every target has at least one year of values.

Sources
-------

- MHCLG (England 2026-27):
  https://www.gov.uk/government/statistics/council-tax-levels-set-by-local-authorities-in-england-2026-to-2027
- Welsh Government (Wales 2026-27):
  https://www.gov.wales/council-tax-levels-april-2026-march-2027-html
- Scottish Government (Scotland 2025-26):
  https://www.gov.scot/publications/council-tax-datasets/
- VOA (England + Wales 2025):
  https://www.gov.uk/government/statistics/council-tax-stock-of-properties-2025

Out of scope for this PR (follow-ups)
-------------------------------------

- Wiring these targets into
  datasets/local_areas/local_authorities/loss.py so the LA
  reweighting actually calibrates on them. Planned follow-up PR.
- Scottish Assessors per-LA chargeable-dwellings to fill the Scotland
  band-count gap.
- Council Tax Support caseload per LA (DWP StatXplore).
- Single Person Discount rate per LA (CIPFA).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vahid-ahmadi vahid-ahmadi self-assigned this Apr 21, 2026
@vahid-ahmadi vahid-ahmadi requested a review from MaxGhenis April 21, 2026 11:42
@vahid-ahmadi
Copy link
Copy Markdown
Collaborator Author

Review — items to address before merge

Blocking

1. Welsh Band I is dropped. Wales has had 9 council tax bands (A–I) since its 2005 revaluation. _BAND_COUNT_COLUMNS in la_council_tax.py only maps A–H, and the CSV has no band_I column, so every Welsh Band I dwelling is silently missing from the emitted targets. If a calibrator consumes these to set the band distribution, Welsh Band I is implicitly targeted to zero.

Fix: add band_I column to la_council_tax.csv (populated for Welsh rows, NaN elsewhere), add "I": "band_I" to _BAND_COUNT_COLUMNS, update the module docstring, and add a test asserting non-null Band I for the larger Welsh LAs (Cardiff, Swansea, Vale of Glamorgan, Monmouthshire).

2. total_dwellings is Σ(A..H) for Welsh rows, not VOA "All properties". Verified from the CSV:

  • Cardiff (W06000015): 5320 + 20420 + 34920 + 37370 + 31540 + 22250 + 10590 + 2850 = 165,260 — matches CSV exactly.
  • Monmouthshire (W06000021): 520 + 3450 + 7140 + 9460 + 7660 + 8100 + 5520 + 1790 = 43,640 — matches CSV exactly.

Because both sides of test_band_counts_sum_to_total are derived from the same A–H sum, the test passes trivially and does not validate fidelity to the published total. Fix: re-source total_dwellings from the VOA "All properties" column. Once done, the test becomes a real fidelity check with the existing 20-property slack.

Coherence

3. variable field diverges from voa_council_tax.py. The existing regional source uses variable="council_tax_band" with no breakdown_variable; this PR uses variable="council_tax_band_count" with breakdown_variable=f"council_tax_band_{band}". Downstream loss.py wiring will then need to know two names for the same conceptual quantity. Prefer aligning — either update voa_council_tax.py to the new scheme or match its convention here.

4. CSV column asymmetry. band_A, band_B, band_C, band_D_count, band_E, band_F, band_G, band_H — the _count suffix only on D to avoid clashing with band_d_amount. Cleaner to use a count_band_X prefix scheme, or rename band_d_amountamount_band_d so the count columns stay symmetric. This also simplifies _BAND_COUNT_COLUMNS to a plain prefix map.

Nits

5. _load_table() re-reads the CSV on every get_targets() call. @lru_cache matches the pattern in voa_council_tax.py and is free.

6. has_council_tax column is derivable from country == "NORTHERN_IRELAND". Arguably self-documenting, so leave if preferred.

7. Module docstring opens with "band A–H" — update once Band I is added.

PR description

With Band I added, the "2,541 band-count targets" figure will rise by the number of non-null Welsh Band I cells — worth regenerating the description from the final CSV before merging. My earlier arithmetic of 318 × 8 = 2,544 minus City of London band A = 2,543 gave 2 more suppressions than the description claimed; Band I coverage may resolve part of that gap too.

Review points addressed:

- Add count_band_I column to la_council_tax.csv, populated for all 22
  Welsh LAs (Wales revalued in 2005 and introduced a 9th band). Cardiff
  1480, Monmouthshire 670, Vale of Glamorgan 1060, etc. English rows
  keep Band I null; VOA marks it [z] (not applicable).
- Re-source total_dwellings from VOA "All properties" column instead
  of deriving it as the sum of A-H. Previously Σ(A..H) was used for
  both sides of test_band_counts_sum_to_total, making the test
  self-referential; now it validates against the published total with
  a 20-property slack for VOA rounding.
- Rename count columns symmetrically: band_A..band_H + band_D_count →
  count_band_A..count_band_I. Removes the lopsided band_D_count name
  that existed only to avoid clashing with band_d_amount.
- Align band-count target names with voa_council_tax.py:
  voa/council_tax/{code}/{band} (was ons/council_tax_band_count/...);
  variable="council_tax_band" (was council_tax_band_count, which is
  not a real PolicyEngine-UK variable); drop breakdown_variable to
  match the regional VOA module.
- Cache the CSV read with @lru_cache(maxsize=1), matching voa_council_tax.
- Update module docstring: "A-H in England/Scotland, A-I in Wales".

Tests:
- New: test_welsh_las_have_band_i (all 22 Welsh LAs populated).
- New: test_english_las_have_no_band_i (guard against spurious fills).
- New: test_cardiff_band_i_matches_published_figure (~1,480 per VOA 2025).

Final target counts:
- 350 Band D amount targets (unchanged).
- 2,563 band-count targets, up from 2,541: +22 Welsh Band I plus two
  band-H rows that were null due to the earlier truncation.
@vahid-ahmadi
Copy link
Copy Markdown
Collaborator Author

Pushed d2d2acf addressing the review items:

Blocking fixes

  • Welsh Band I now sourced from VOA CTSOP1.0 for all 22 Welsh LAs (Cardiff 1480, Monmouthshire 670, Vale of Glamorgan 1060, etc.); emitted as voa/council_tax/{code}/I targets. New tests: test_welsh_las_have_band_i, test_english_las_have_no_band_i, test_cardiff_band_i_matches_published_figure.
  • total_dwellings now sourced from VOA "All properties" for every English + Welsh row rather than derived from Σ(A..H). test_band_counts_sum_to_total is now a real fidelity check (max discrepancy 10, VOA rounding).

Coherence fixes

  • Band-count targets renamed to voa/council_tax/{code}/{band} with variable="council_tax_band" (aligned with the existing regional voa_council_tax.py). breakdown_variable dropped to match. council_tax_band is the actual PolicyEngine-UK enum variable and has Band I in its possible values.
  • CSV count columns renamed to the symmetric count_band_A..count_band_I scheme (old band_D_count outlier removed).

Nits

  • _load_table() now @lru_cache-ed.
  • Module docstring clarified: "A–H in England/Scotland, A–I in Wales".

Final target counts: 350 Band D amount + 2,563 band counts (up from 2,541: +22 Welsh Band I, +2 resolved [c] suppressions). 25/25 tests pass locally via uv run pytest.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant