
Vend PolicyEngine bundle 4.16.0: populace-us as the certified US default #399

Open
MaxGhenis wants to merge 2 commits into main from automation/policyengine-bundle-4.16.0

MaxGhenis commented Jun 11, 2026

Contributor

Vends bundle 4.16.0 from policyengine-bundles: populace-us becomes the certified US data release (populace-data 0.1.0, default dataset populace_us_2024, pinned by immutable tag + sha256, 571 artifacts including inherited long-term/sub-national datasets).
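As a rough illustration of what "pinned by immutable tag + sha256" implies for a consumer, the sketch below checks downloaded bytes against a recorded digest. `PinnedArtifact` and `verify_pin` are hypothetical names for this example, not the bundle's actual schema or API.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PinnedArtifact:
    """Hypothetical pin record: a file path, an immutable tag, and a digest."""

    path: str
    revision: str  # immutable tag the artifact was published under
    sha256: str  # hex digest recorded when the bundle was built


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_pin(pin: PinnedArtifact, data: bytes) -> None:
    """Raise ValueError if the bytes do not match the pinned digest."""
    actual = sha256_hex(data)
    if actual != pin.sha256:
        raise ValueError(
            f"sha256 mismatch for {pin.path}@{pin.revision}: "
            f"expected {pin.sha256}, got {actual}"
        )
```

The tag fixes which revision is fetched; the digest then confirms the fetched bytes are the ones that were certified.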

The dataset is built entirely from primary sources (CPS ASEC, IRS PUF, Fed SCF 2022, SIPP, CPS-ORG, MEPS-IC, ACS 2022); the enhanced CPS is used only as a benchmark. On the sound comparison (symmetric refit at a matched 41,314 households), it beats the enhanced CPS on loss: train 0.176 vs 1.089, holdout 0.037 vs 0.317, full 0.213 vs 1.406, with parity 0 gaps and zero super-weighted records. Full evidence: https://populace.dev

Supporting changes:

  • ArtifactPathReference gains optional repo_id/repo_type and the bundle importer + dataset resolver honor them, so inherited artifacts (e.g. long-term CRFB datasets pinned to policyengine-us-data@crfb-longrun-20260517) keep resolving from their original repos while populace-data is the data package.
  • https_release_manifest_uri is repo-type-aware (datasets/ URL prefix for HF dataset repos).
  • Build provenance (data_build_id, built_with_model_version) flows from the release manifest through bundle compatibility metadata into the vendored country manifest (bundles #31).
  • policyengine-us pinned to 1.723.0 (uv.lock); household calculator snapshots refreshed for engine 1.722.4 → 1.723.0.
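The first two bullets can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ArtifactRef`, `resolve_repo`, and `hf_resolve_url` are hypothetical stand-ins for `ArtifactPathReference`, the dataset resolver, and `https_release_manifest_uri`, and the fallback rule (use the data package repo when no pin is present) is an assumption. The URL layout is the standard Hugging Face Hub one, where dataset repos sit under a `datasets/` prefix.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

HF_BASE = "https://huggingface.co"

# Hub URL prefixes by repo type; model repos have no prefix.
_REPO_TYPE_PREFIX = {"model": "", "dataset": "datasets/", "space": "spaces/"}


@dataclass(frozen=True)
class ArtifactRef:
    """Hypothetical stand-in for ArtifactPathReference."""

    path: str
    revision: str
    repo_id: Optional[str] = None  # optional pin to the artifact's original repo
    repo_type: Optional[str] = None  # optional pin: "model", "dataset", or "space"


def resolve_repo(
    ref: ArtifactRef, default_repo_id: str, default_repo_type: str
) -> Tuple[str, str]:
    """Prefer the per-artifact pins; otherwise use the data package repo."""
    return (ref.repo_id or default_repo_id, ref.repo_type or default_repo_type)


def hf_resolve_url(repo_id: str, repo_type: str, revision: str, path: str) -> str:
    """Build a repo-type-aware download URL for a file at a given revision."""
    prefix = _REPO_TYPE_PREFIX[repo_type]
    return f"{HF_BASE}/{prefix}{repo_id}/resolve/{revision}/{path}"


def artifact_url(
    ref: ArtifactRef, default_repo_id: str, default_repo_type: str
) -> str:
    """Resolve the repo for a reference, then build its download URL."""
    repo_id, repo_type = resolve_repo(ref, default_repo_id, default_repo_type)
    return hf_resolve_url(repo_id, repo_type, ref.revision, ref.path)
```

Under these assumptions, a reference with no pins resolves against the data package repo, while an inherited artifact that carries its own `repo_id`/`repo_type` keeps pointing at the repo it was originally published in.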

Generated with the bundles repo's own packaging + import scripts (validate exit 0 end to end).

🤖 Generated with Claude Code

MaxGhenis and others added 2 commits June 11, 2026 18:28
… release

The eCPS-free populace dataset (populace-us-2024-5da5a95-20260611,
populace-data 0.1.0) becomes the certified default for country 'us' —
it beats the enhanced CPS on train (0.176 vs 1.089), holdout (0.037 vs
0.317), and full-surface (0.213 vs 1.406) loss in the matched
symmetric-refit comparison, with parity 0 and bounded weights.
Artifact path references now carry optional repo_id/repo_type pins so
inherited datasets (long-term, state, district) keep resolving from
their original repos while populace-data is the data package; dataset
and region resolution prefer those pins. Release manifest URLs are
repo-type-aware (datasets/ prefix for HF dataset repos). The vendored
bundle is re-imported with build provenance, region dataset templates,
and policyengine-core 3.27.1 (3.27.0's computation-mode check crashed
update_variable reforms; fixed upstream in core#501).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>