diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md index 2bc52164..d7760a01 100644 --- a/.github/CONTRIBUTING.md +++ b/.github/CONTRIBUTING.md @@ -1,14 +1,63 @@ -## Updating the versioning +# Contributing -Please add to `changelog_entry.yaml` an entry in the format: +Thanks for contributing to PolicyEngine.py. -```yaml -- bump: minor - changes: - added: - - New feature. - fixed: - - Bug fix. - changed: - - Change. +## Ways to contribute + +- Open an issue for bugs, documentation gaps, or feature requests. +- Submit a pull request for code, tests, documentation, or examples. +- If you are unsure where to start, open an issue describing the workflow or analysis problem you want to improve. + +## Development setup + +```bash +git clone https://github.com/PolicyEngine/policyengine.py.git +cd policyengine.py +uv pip install -e ".[dev]" +``` + +This installs the package, both country models, and the development tools used in CI. + +## Running checks + +Before opening a pull request, run the checks relevant to your change: + +```bash +make format +ruff check . +mypy src/policyengine +make test +``` + +Documentation changes can be checked with: + +```bash +make docs ``` + +Tests that download representative datasets require a `HUGGING_FACE_TOKEN`: + +```bash +export HUGGING_FACE_TOKEN=hf_... +``` + +## Changelog fragments + +Pull requests that change user-facing behaviour should include a changelog fragment in `changelog.d/`: + +```bash +echo "Describe the change." > changelog.d/my-change.fixed +``` + +Valid fragment types are `breaking`, `added`, `changed`, `fixed`, and `removed`. + +## Pull requests + +- Keep pull requests focused and explain the user-facing impact. +- Add or update tests when behaviour changes. +- Update documentation and examples when the public workflow changes. + +## Getting help + +- Use GitHub issues for bugs, regressions, and feature requests. +- For questions that do not fit a public issue, contact `hello@policyengine.org`. 
diff --git a/.github/workflows/draft-pdf.yml b/.github/workflows/draft-pdf.yml new file mode 100644 index 00000000..566f1232 --- /dev/null +++ b/.github/workflows/draft-pdf.yml @@ -0,0 +1,25 @@ +on: + push: + paths: + - paper.md + - paper.bib + - architecture.png + pull_request: + paths: + - paper.md + - paper.bib + - architecture.png + +jobs: + paper: + runs-on: ubuntu-latest + name: Draft PDF + steps: + - uses: actions/checkout@v4 + - uses: openjournals/openjournals-draft-action@master + with: + journal: joss + - uses: actions/upload-artifact@v4 + with: + name: paper + path: paper.pdf diff --git a/CITATION.cff b/CITATION.cff new file mode 100644 index 00000000..254f35f3 --- /dev/null +++ b/CITATION.cff @@ -0,0 +1,33 @@ +cff-version: 1.2.0 +message: "If you use this software, please cite it as below." +type: software +title: policyengine +version: 3.4.4 +date-released: "2026-04-13" +url: "https://github.com/PolicyEngine/policyengine.py" +abstract: "An open-source Python package that provides a common analysis layer for tax-benefit microsimulation across the US and the UK." 
+license: AGPL-3.0 +authors: + - family-names: Ghenis + given-names: Max + orcid: "https://orcid.org/0000-0002-1335-8277" + affiliation: PolicyEngine + - family-names: Ahmadi + given-names: Vahid + orcid: "https://orcid.org/0009-0004-1093-6272" + affiliation: PolicyEngine + - family-names: Woodruff + given-names: Nikhil + orcid: "https://orcid.org/0009-0009-5004-4910" + affiliation: PolicyEngine + - family-names: Makarchuk + given-names: Pavel + orcid: "https://orcid.org/0009-0003-4869-7409" + affiliation: PolicyEngine +keywords: + - microsimulation + - tax + - benefit + - public policy + - Python +repository-code: "https://github.com/PolicyEngine/policyengine.py" diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md new file mode 100644 index 00000000..56cdab0f --- /dev/null +++ b/CODE_OF_CONDUCT.md @@ -0,0 +1,64 @@ +# Contributor covenant code of conduct + +## Our pledge + +We as members, contributors, and leaders pledge to make participation in our +community a harassment-free experience for everyone, regardless of age, body +size, visible or invisible disability, ethnicity, sex characteristics, gender +identity and expression, level of experience, education, socio-economic status, +nationality, personal appearance, race, religion, or sexual identity +and orientation. + +We pledge to act and interact in ways that contribute to an open, welcoming, +diverse, inclusive, and healthy community. 
+ +## Our standards + +Examples of behavior that contributes to a positive environment for our +community include: + +* Demonstrating empathy and kindness toward other people +* Being respectful of differing opinions, viewpoints, and experiences +* Giving and gracefully accepting constructive feedback +* Accepting responsibility and apologizing to those affected by our mistakes, + and learning from the experience +* Focusing on what is best not just for us as individuals, but for the + overall community + +Examples of unacceptable behavior include: + +* The use of sexualized language or imagery, and sexual attention or + advances of any kind +* Trolling, insulting or derogatory comments, and personal or political attacks +* Public or private harassment +* Publishing others' private information, such as a physical or email + address, without their explicit permission +* Other conduct which could reasonably be considered inappropriate in a + professional setting + +## Enforcement responsibilities + +Community leaders are responsible for clarifying and enforcing our standards of +acceptable behavior and will take appropriate and fair corrective action in +response to any behavior that they deem inappropriate, threatening, offensive, +or harmful. + +## Scope + +This Code of Conduct applies within all community spaces, and also applies when +an individual is officially representing the community in public spaces. + +## Enforcement + +Instances of abusive, harassing, or otherwise unacceptable behavior may be +reported to the community leaders responsible for enforcement at +hello@policyengine.org. All complaints will be reviewed and investigated +promptly and fairly. + +## Attribution + +This Code of Conduct is adapted from the [Contributor Covenant][homepage], +version 2.0, available at +https://www.contributor-covenant.org/version/2/0/code_of_conduct.html. 
+ +[homepage]: https://www.contributor-covenant.org diff --git a/Makefile b/Makefile index f62643e1..bb380401 100644 --- a/Makefile +++ b/Makefile @@ -12,7 +12,7 @@ docs-serve: cd docs && $(MYST_CMD) start install: - uv pip install -e .[dev] + uv pip install -e ".[dev]" format: ruff format . diff --git a/README.md b/README.md index 7fc607d5..eb8ebb8b 100644 --- a/README.md +++ b/README.md @@ -4,24 +4,31 @@ A Python package for tax-benefit microsimulation analysis. Run policy simulation ## Quick start +Install the UK country model first: + +```bash +pip install "policyengine[uk]" +``` + ```python from policyengine.core import Simulation -from policyengine.tax_benefit_models.uk import PolicyEngineUKDataset, uk_latest from policyengine.outputs.aggregate import Aggregate, AggregateType +from policyengine.tax_benefit_models.uk import ensure_datasets, uk_latest -# Load representative microdata -dataset = PolicyEngineUKDataset( - name="FRS 2023-24", - filepath="./data/frs_2023_24_year_2026.h5", - year=2026, +# First run downloads representative microdata to ./data; later runs reuse it +datasets = ensure_datasets( + datasets=["hf://policyengine/policyengine-uk-data/enhanced_frs_2023_24.h5"], + years=[2026], + data_folder="./data", ) +dataset = datasets["enhanced_frs_2023_24_2026"] # Run simulation simulation = Simulation( dataset=dataset, tax_benefit_model_version=uk_latest, ) -simulation.run() +simulation.ensure() # Calculate total universal credit spending agg = Aggregate( @@ -34,6 +41,15 @@ agg.run() print(f"Total UC spending: £{agg.result / 1e9:.1f}bn") ``` +## Smoke test + +To verify a fresh install without downloading representative datasets: + +```bash +pip install "policyengine[uk,us]" +python examples/household_impact_example.py +``` + ## Documentation **Core concepts:** @@ -55,11 +71,13 @@ print(f"Total UC spending: £{agg.result / 1e9:.1f}bn") pip install policyengine ``` -This installs both UK and US country models. 
To install only one: +This installs the shared analysis layer only. Add country model extras for the +systems you want to analyze: ```bash -pip install policyengine[uk] # UK model only -pip install policyengine[us] # US model only +pip install "policyengine[uk]" # shared layer + UK model +pip install "policyengine[us]" # shared layer + US model +pip install "policyengine[uk,us]" # shared layer + both country models ``` ### For development @@ -67,7 +85,7 @@ pip install policyengine[us] # US model only ```bash git clone https://github.com/PolicyEngine/policyengine.py.git cd policyengine.py -uv pip install -e .[dev] # install with dev dependencies (pytest, ruff, mypy, etc.) +uv pip install -e ".[dev]" # install with dev dependencies (pytest, ruff, mypy, etc.) ``` ## Development @@ -76,10 +94,11 @@ uv pip install -e .[dev] # install with dev dependencies (pytest, ruff, m | Configuration | Install | Use case | |---------------|---------|----------| -| **Library user** | `pip install policyengine` | Using the package in your own code | -| **UK only** | `pip install policyengine[uk]` | Only need UK simulations | -| **US only** | `pip install policyengine[us]` | Only need US simulations | -| **Developer** | `uv pip install -e .[dev]` | Contributing to the package | +| **Library user** | `pip install policyengine` | Shared analysis layer only | +| **UK only** | `pip install "policyengine[uk]"` | Shared layer plus UK simulations | +| **US only** | `pip install "policyengine[us]"` | Shared layer plus US simulations | +| **Both countries** | `pip install "policyengine[uk,us]"` | Shared layer plus UK and US simulations | +| **Developer** | `uv pip install -e ".[dev]"` | Contributing to the package | ### Common commands @@ -125,7 +144,7 @@ PRs trigger the following checks: | Tests (Python 3.13) | Required | `make test` | | Tests (Python 3.14) | Required | `make test` | | Mypy | Informational | `mypy src/policyengine` | -| Docs build | Required | Jupyter Book build | +| Docs 
build | Required | `make docs` | ### Versioning and releases @@ -164,14 +183,14 @@ On first run this will create `./data/enhanced_frs_2023_24_year_2026.h5`. Datasets contain microdata at entity level (person, household, tax unit). Load representative data or create custom scenarios: ```python -from policyengine.tax_benefit_models.uk import PolicyEngineUKDataset +from policyengine.tax_benefit_models.uk import ensure_datasets -dataset = PolicyEngineUKDataset( - name="Representative data", - filepath="./data/frs_2023_24_year_2026.h5", - year=2026, +datasets = ensure_datasets( + datasets=["hf://policyengine/policyengine-uk-data/enhanced_frs_2023_24.h5"], + years=[2026], + data_folder="./data", ) -dataset.load() +dataset = datasets["enhanced_frs_2023_24_2026"] ``` ### Simulations @@ -275,7 +294,7 @@ Key taxes: Federal income tax, payroll tax ## Contributing -See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup and guidelines. +See [.github/CONTRIBUTING.md](.github/CONTRIBUTING.md) for development setup and guidelines. 
## License diff --git a/architecture.png b/architecture.png new file mode 100644 index 00000000..62eeea3d Binary files /dev/null and b/architecture.png differ diff --git a/architecture.svg b/architecture.svg new file mode 100644 index 00000000..1df8d390 --- /dev/null +++ b/architecture.svg @@ -0,0 +1,48 @@ + + + + + + + + + + + + + + Policies + Tax-benefit rules + & parameters + (US & UK country packages) + + + + Households + Survey microdata + with calibrated weights + (CPS, Family Resources Survey) + + + + Dynamics + Behavioral responses + to policy changes + (in country packages) + + + + + + + + + Simulation + + + + + + + Distributional impacts · Fiscal impacts · Regional breakdowns · Poverty rates · Inequality metrics + diff --git a/changelog.d/joss-paper.added.md b/changelog.d/joss-paper.added.md new file mode 100644 index 00000000..95a2b9b2 --- /dev/null +++ b/changelog.d/joss-paper.added.md @@ -0,0 +1 @@ +Added JOSS paper (paper.md and paper.bib) for submission to the Journal of Open Source Software. diff --git a/examples/paper_repro_uk.py b/examples/paper_repro_uk.py index 0c753e0e..f6d27a17 100644 --- a/examples/paper_repro_uk.py +++ b/examples/paper_repro_uk.py @@ -1,7 +1,8 @@ -"""Reproduce the UK policy-reform analysis used in the JOSS paper draft. +"""Reproduce a UK policy-reform analysis for the JOSS paper. -This script uses the same reform shown in `paper.md`, but adds the missing -dataset setup so it can run end-to-end from a fresh checkout. +This script demonstrates the population-level workflow described in the paper, +using a UK reform (raising the personal allowance). The paper's inline code +example uses a US reform; this script complements it with the UK equivalent. 
Run: uv run --python 3.14 --extra uk python examples/paper_repro_uk.py diff --git a/paper-preview.html b/paper-preview.html new file mode 100644 index 00000000..7b44931b --- /dev/null +++ b/paper-preview.html @@ -0,0 +1,915 @@ + + + + + + policyengine: A Microsimulation Tool for Tax-Benefit Policy Analysis + + + + + +
+
+
JOSS Paper Preview
+

policyengine: A Microsimulation Tool for Tax-Benefit Policy Analysis

+
+ Vahid Ahmadi¹ * + Max Ghenis¹ + Nikhil Woodruff¹ + Pavel Makarchuk¹ +
+
¹ PolicyEngine, Washington, DC, United States
+ +
+ Python + microsimulation + tax + benefit + public policy + economic analysis +
+
+
+ + +
+ + +
+

Summary

+

The policyengine Python package (PolicyEngine Contributors 2026) is +open-source software for analyzing how tax and benefit policies affect +household incomes and government budgets in the US and UK. It gives +analysts a common workflow for running simulations on representative +microdata or custom households and for comparing current law with +proposed reforms. Country-specific rules live in dedicated packages +(policyengine-us and policyengine-uk), while +policyengine provides shared tools for datasets, reforms, +outputs, and reproducible release bundles. The package also powers the +interactive web application at policyengine.org.

+

Statement of Need

+

Tax-benefit microsimulation models are standard tools for evaluating +the distributional impacts of fiscal policy. Governments, think tanks, +and researchers use them to estimate how policy reforms affect household +incomes, poverty rates, and government budgets. In practice, however, +analysts work across separate components: statutory rules, +representative microdata, reform definitions, and distributional outputs +live in different tools and interfaces. Reproducing a +baseline-versus-reform workflow, or carrying the same analysis pattern +from one country model to another, therefore often requires bespoke +scripts and project-specific conventions. Historical replication is +especially difficult when policy rules, analysis tooling, and +representative microdata are versioned independently and the analyst +must reconstruct which combination produced a published estimate.

+

The policyengine package provides a consistent Python +API for tax-benefit analysis across multiple country models. Users can +supply their own microdata or use companion representative datasets, +then compute the impact of current law or hypothetical reforms, +including parametric changes to existing policy parameters and +structural modifications to the tax-benefit system, on any household or +a national population. The calculate_household_impact +function computes results for a single household, while the +Simulation class runs population-level analysis on +representative survey datasets with calibrated weights. Optional +behavioral-response assumptions, such as labor supply elasticities, are +applied after the static reform. Version-pinned releases reduce the +bookkeeping needed for replication.
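The population-level calculation described above can be illustrated with a minimal sketch in plain Python. It does not use the `policyengine` API: the `Household` record, the flat 20% tax, and the allowance values are hypothetical, chosen only to show how calibrated weights turn per-record results into a national baseline-versus-reform total.

```python
from dataclasses import dataclass


@dataclass
class Household:
    income: float  # annual gross income of the sampled household
    weight: float  # number of real households this record represents


def tax(income: float, allowance: float, rate: float = 0.20) -> float:
    """Hypothetical flat tax on income above an allowance."""
    return max(income - allowance, 0.0) * rate


def weighted_total(households, allowance: float) -> float:
    """Weighted sum of tax liabilities across the sample."""
    return sum(h.weight * tax(h.income, allowance) for h in households)


# Three illustrative survey records with calibrated weights.
sample = [
    Household(income=10_000, weight=1_000),
    Household(income=30_000, weight=2_000),
    Household(income=80_000, weight=500),
]

# Baseline law versus a reform that raises the allowance.
baseline_revenue = weighted_total(sample, allowance=12_000)
reform_revenue = weighted_total(sample, allowance=15_000)
revenue_change = reform_revenue - baseline_revenue
```

Because the reform raises the allowance, `revenue_change` is negative; the weights, not the number of sampled records, determine its size.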

+

State of the Field

+

Microsimulation, which Orcutt (1957) pioneered and Bourguignon and +Spadaro (2006) surveyed for redistribution +analysis, underpins much of modern fiscal policy evaluation. In the US, +TAXSIM (Feenberg and Coutts +1993) at the National Bureau of Economic Research provides tax +calculations, while the Congressional Budget Office and Tax Policy +Center maintain microsimulation tax models (Congressional Budget +Office 2018; Tax Policy Center 2022). In the UK, the primary +microsimulation models are UKMOD, which the Institute for Social and +Economic Research (ISER) at the University of Essex maintains as part of +the EUROMOD family (Sutherland and +Figari 2013; EUROMOD 2026); HM Treasury’s Intra-Governmental Tax +and Benefit Microsimulation model (IGOTM) (HM Treasury 2025); and TAXBEN, which +the Institute for Fiscal Studies maintains (Waters 2017).

+

OpenFisca (OpenFisca +Contributors 2024) initiated the open-source approach to +tax-benefit microsimulation in France. Other open-source efforts include +the Policy Simulation Library, a collection of policy models and +data-preparation routines (Policy Simulation Library 2026), and The +Budget Lab’s public US tax-model codebases, including Tax-Simulator and +Cost-Recovery-Simulator (The +Budget Lab at Yale 2024, 2025). The PolicyEngine developers +originally forked the codebase from OpenFisca and built it on the +policyengine-core framework (Woodruff et al. 2024).

+

Existing tools serve complementary needs — country-specific +microsimulation (TAXSIM, TAXBEN, IGOTM) or simulation infrastructure +without a shared cross-country analyst API +(policyengine-core, OpenFisca) — leaving a gap in the +open-source ecosystem. The policyengine package fills this +gap by adding shared dataset management, a stable baseline-versus-reform +pattern, structured output types for distributional and regional +analysis, and interfaces for downstream dashboards and reports. +Concretely, the package provides reusable outputs such as +Aggregate, ChangeAggregate, and +IntraDecileImpact, together with bundled analyses such as +economic_impact_analysis(). This separation lets +country-model packages focus on statutory rules while shared analysis +methods evolve independently.

+

Table 1: Comparison of policyengine with selected tax-benefit +microsimulation tools. Entries refer to capabilities documented for +external users at the time of submission.

| Dimension | policyengine | TAXSIM | UKMOD | OpenFisca |
|-----------|--------------|--------|-------|-----------|
| Open source | Yes | Partial | Yes | Yes |
| Country coverage | US and UK | US | UK | US, UK, and ~10 other jurisdictions |
| Tax and benefit analysis | Yes | Tax only | Yes | Yes |
| Python-native implementation | Yes | No | No | Yes |
| Shared reform and output API across countries | Yes | No | No | Shared core, country-specific parameters |
+

Software Design

+

The PolicyEngine software stack has four components. +policyengine-core provides reusable simulation +abstractions, versioned parameters, and dataset interfaces that country +packages share (Woodruff et al. 2024). The +policyengine-us and policyengine-uk packages +contain statutory logic, variables, and entity structures specific to +each tax-benefit system. The policyengine package is the +analyst-facing component: it defines shared simulation orchestration, +structured output types, and canonical baseline-versus-reform workflows +such as economic_impact_analysis(). Companion data +repositories hold enhanced survey microdata derived from the Current +Population Survey (CPS) (Woodruff and Ghenis 2024) +and Family Resources Survey (Department for Work and Pensions et al. +2021). The package does not include a macroeconomic model and +does not capture general equilibrium effects.

+

This architecture reflects two deliberate trade-offs. Keeping country +statutory rules in separate packages, rather than bundling them into a +monolithic tool, lets each country model release independently; the cost +is that policyengine must track and certify compatible +combinations. Modeling reforms statically, with optional post-hoc +behavioral responses, gives fast and deterministic baselines at the +expense of general equilibrium effects, which are better suited to +dedicated macroeconomic models.
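The second trade-off, a static reform followed by an optional behavioral adjustment, can be sketched as two separate steps. This is a textbook constant-elasticity formulation under stated assumptions (a single marginal tax rate, and earnings responding to the net-of-tax rate). It is not necessarily how the country packages implement labor supply responses, and all function names are hypothetical.

```python
def static_net_income(earnings: float, tax_rate: float) -> float:
    """Net income with earnings held fixed (the static step)."""
    return earnings * (1.0 - tax_rate)


def behavioral_earnings(
    earnings: float,
    baseline_rate: float,
    reform_rate: float,
    elasticity: float,
) -> float:
    """Adjust earnings after the static reform.

    Assumes: percent change in earnings equals the elasticity times the
    percent change in the net-of-tax rate (1 - tax rate).
    """
    pct_change_net_of_tax = (1.0 - reform_rate) / (1.0 - baseline_rate) - 1.0
    return earnings * (1.0 + elasticity * pct_change_net_of_tax)


earnings = 50_000.0
baseline_rate, reform_rate = 0.30, 0.25

# Step 1: static impact, earnings unchanged.
static_net = static_net_income(earnings, reform_rate)

# Step 2: post-hoc behavioral response with an assumed elasticity of 0.2.
adjusted = behavioral_earnings(earnings, baseline_rate, reform_rate, 0.2)
behavioral_net = static_net_income(adjusted, reform_rate)
```

Setting the elasticity to zero reproduces the static result exactly, which is why the static baseline stays deterministic regardless of the behavioral assumptions layered on top.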

+
+ + +
+

For reproducibility, the top-level package acts as a certification +boundary across these components. Country data repositories build +immutable microdata artifacts and publish release manifests with +checksums and the country-model version used during data construction. +Bundled country manifests in policyengine then certify the +runtime bundle: the runtime country-model version, the microdata-package +release, the dataset artifact, and the compatibility basis linking that +runtime model to the build-time data provenance. Analysts can request a +dataset such as enhanced_frs_2023_24, while the runtime +resolves it to a specific versioned artifact and records both runtime +and build-time provenance. The same certification record can be emitted +as a TRACE Transparent Research Object declaration (TRACE Project 2024), so +the internal bundle and data manifests remain the operational source of +truth while a standardized signed provenance document is available for +external exchange.
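The checksum part of this scheme can be illustrated with a short, generic sketch. The manifest layout, field names, and file name below are hypothetical and are not the format used by the PolicyEngine data repositories. The example only shows how a recorded SHA-256 digest lets a runtime detect that a dataset artifact differs from the one the manifest certifies.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(manifest: dict, name: str, path: Path) -> bool:
    """Check a local artifact against the digest recorded in the manifest."""
    return sha256_of(path) == manifest["artifacts"][name]["sha256"]


with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "example_dataset.h5"  # hypothetical artifact
    artifact.write_bytes(b"placeholder microdata")

    # Hypothetical manifest written at data-build time.
    manifest = {
        "build_model_version": "0.0.0",  # placeholder, not a real release
        "artifacts": {"example_dataset": {"sha256": sha256_of(artifact)}},
    }

    intact = verify_artifact(manifest, "example_dataset", artifact)

    # Any change to the bytes invalidates the certification.
    artifact.write_bytes(b"tampered microdata")
    tampered = verify_artifact(manifest, "example_dataset", artifact)
```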

+

At runtime, a simulation combines a country-model version, household +microdata, and an optional reform; country packages can also apply +behavioral responses, such as labor supply elasticities, after the +static policy reform. The policyengine package then +produces reusable outputs for decile changes, program statistics, +poverty, inequality, and regional impacts. Because the runtime exposes +the resolved certified bundle and compatibility basis, results can be +traced to a specific policyengine release, runtime +country-model release, microdata release, versioned dataset artifact, +and build-time country-model version.
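As a rough illustration of what a decile-change output involves, the sketch below assigns weighted deciles by baseline income and averages the reform-minus-baseline change within each. It is plain Python under simplifying assumptions (deciles assigned by each record's weighted midpoint rank) and does not reproduce the package's own output classes; the function names are hypothetical.

```python
def assign_deciles(incomes, weights):
    """Return a 1-10 decile for each record, ranked by income and weight."""
    order = sorted(range(len(incomes)), key=lambda i: incomes[i])
    total = sum(weights)
    deciles = [0] * len(incomes)
    cumulative = 0.0
    for i in order:
        # Place each record by the midpoint of its weighted rank interval.
        midpoint = cumulative + weights[i] / 2.0
        deciles[i] = min(int(midpoint / total * 10) + 1, 10)
        cumulative += weights[i]
    return deciles


def mean_change_by_decile(baseline, reformed, weights):
    """Weighted mean change in income within each baseline decile."""
    deciles = assign_deciles(baseline, weights)
    sums, wts = {}, {}
    for d, b, r, w in zip(deciles, baseline, reformed, weights):
        sums[d] = sums.get(d, 0.0) + w * (r - b)
        wts[d] = wts.get(d, 0.0) + w
    return {d: sums[d] / wts[d] for d in sorted(sums)}


# Ten equally weighted households; the illustrative reform pays 500 to
# households with baseline income of 50,000 or less.
baseline = [10_000 * (i + 1) for i in range(10)]
weights = [1.0] * 10
reformed = [b + 500 if b <= 50_000 else b for b in baseline]

changes = mean_change_by_decile(baseline, reformed, weights)
```

Ranking by baseline income keeps each household in the same decile under both scenarios, so the per-decile figure reflects the reform rather than re-ranking.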

+

The following household-level example computes a household’s net +income under baseline law and under a reform that doubles the US +single-filer standard deduction (to $32,200 for 2026):

+
import datetime
+from policyengine.core import Parameter, ParameterValue, Policy
+from policyengine.tax_benefit_models.us import (
+    USHouseholdInput, calculate_household_impact, us_latest,
+)
+
+param = Parameter(
+    name="gov.irs.deductions.standard.amount.SINGLE",
+    tax_benefit_model_version=us_latest,
+)
+reform = Policy(
+    name="Double standard deduction",
+    parameter_values=[ParameterValue(
+        parameter=param,
+        start_date=datetime.date(2026, 1, 1),
+        end_date=datetime.date(2026, 12, 31),
+        value=32_200,
+    )],
+)
+
+household = USHouseholdInput(
+    people=[{"age": 40, "employment_income": 50_000,
+             "is_tax_unit_head": True}],
+    tax_unit={"filing_status": "SINGLE"},
+    household={"state_code_str": "CA"},
+    year=2026,
+)
+baseline = calculate_household_impact(household)
+reformed = calculate_household_impact(household, policy=reform)
+# The reform increases this household's net income relative to baseline.
+

The us_latest sentinel resolves to the bundled +policyengine-us version installed alongside +policyengine, so results are stable for a given pinned +environment. This paper describes policyengine version +3.4.4 (PolicyEngine +Contributors 2026), and the checked-in UK reproduction script +examples/paper_repro_uk.py documents an executable +population-level workflow using a pinned interpreter +(uv run --python 3.14 --extra uk python examples/paper_repro_uk.py).
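Since results depend on the pinned environment, it can help to record the installed package versions next to any output. The sketch below uses the standard library's `importlib.metadata`, a generic technique rather than the package's own certification mechanism. The helper name `installed_versions` is hypothetical, and packages that are not installed are reported as `None`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Distribution names taken from the paper; any of them may be absent locally.
report = installed_versions(
    ["policyengine", "policyengine-us", "policyengine-uk"]
)
for name, installed in report.items():
    print(f"{name}: {installed or 'not installed'}")
```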

+

Research Impact Statement

+

PolicyEngine has seen use in government, policy, and research +settings. In the UK, HM Treasury registered PolicyEngine in the +Algorithmic Transparency Recording Standard as a tool under evaluation +by its Personal Tax, Welfare and Pensions team (HM Treasury 2024), and co-author Nikhil +Woodruff adapted PolicyEngine during an Innovation Fellowship with the +data science team at 10 Downing Street (Woodruff 2026). In the US, the +Joint Economic Committee built an immigration fiscal impact calculator +on top of PolicyEngine’s microsimulation model (Joint Economic Committee, U.S. Congress +2026).

+

The package has also been used in external policy analysis and +validation exercises. Under a memorandum of understanding with the +Federal Reserve Bank of Atlanta, PolicyEngine runs three-way comparisons +against TAXSIM and the Atlanta Fed’s Policy Rules Database (Federal Reserve Bank of +Atlanta 2021). Organizations including the Niskanen Center and +the National Institute of Economic and Social Research have used +PolicyEngine in published distributional analyses (McCabe and Sargeant 2024; +Mosley et al. 2025), and co-author Max Ghenis and Jason DeBacker +presented related methodology work on the Enhanced CPS at the 117th +Annual Conference on Taxation of the National Tax Association (Ghenis and DeBacker +2024). Additional examples of public-facing use include research +collaborations and published analyses with the Better Government Lab, +the Beeck Center at Georgetown University, the Institute of Economic +Affairs, and Matt Unrath at the University of Southern California (Ghenis +2024b; Kennan et al. 2023, 2025; Woodruff 2024, 2025; Institute for +Research on Poverty 2025).

+

Acknowledgements

+

Arnold Ventures (Arnold Ventures 2023), NEO +Philanthropy (Ghenis 2024a), the Gerald Huff +Fund for Humanity, and the National Science Foundation (NSF POSE Phase +I, Award 2518372) (National +Science Foundation 2025) funded this work in the US. The Nuffield +Foundation has funded the UK work since September 2024 (Nuffield Foundation +2024). These funders had no involvement in the design, +development, or content of this software or paper. All authors are +employed by PolicyEngine and may benefit reputationally from the +software’s adoption; this relationship is disclosed here as a potential +conflict of interest.

+

We thank all PolicyEngine contributors and the OpenFisca community +for the microsimulation framework from which PolicyEngine was forked +(OpenFisca Contributors +2024). We acknowledge the US Census Bureau for providing access +to the Current Population Survey, and the UK Data Service and the +Department for Work and Pensions for providing access to the Family +Resources Survey.

+

AI Usage Disclosure

+

The authors used generative AI tools, specifically Claude Opus 4 by +Anthropic (Anthropic +2026), to assist with code refactoring. Human authors reviewed, +edited, and validated all AI-assisted outputs and made all design +decisions regarding software architecture, policy modeling, and +parameter implementation. The authors remain fully responsible for the +accuracy, originality, and correctness of all submitted materials.

+

References

+
+
+Anthropic. 2026. Claude. Released. https://www.anthropic.com/claude. +
+
+Arnold Ventures. 2023. Public Finance Program. https://www.arnoldventures.org/work/public-finance. +
+
+Bourguignon, François, and Amedeo Spadaro. 2006. “Microsimulation +as a Tool for Evaluating Redistribution Policies.” The +Journal of Economic Inequality 4: 77–106. https://doi.org/10.1007/s10888-005-9012-6. +
+
+Congressional Budget Office. 2018. An Overview of CBO’s +Microsimulation Tax Model. https://www.cbo.gov/system/files/2018-06/54096-taxmodel.pdf. +
+
+Department for Work and Pensions, Office for National Statistics, and +NatCen Social Research. 2021. Family Resources Survey, +2019-2020. UK Data Service. https://doi.org/10.5255/UKDA-SN-8802-1. +
+
+EUROMOD. 2026. Download. https://euromod-web.jrc.ec.europa.eu/download-euromod. +
+
+Federal Reserve Bank of Atlanta. 2021. Policy Rules Database. +https://github.com/Research-Division/policy-rules-database. +
+
+Feenberg, Daniel R., and Elisabeth Coutts. 1993. +“TAXSIM: A Tool for Calculating Federal and State +Income Tax Liabilities.” National Tax Journal 46 (3): +271–80. https://doi.org/10.2307/3325474. +
+
+Ghenis, Max. 2024a. “NEO Philanthropy Awards $200,000 +Grant to PolicyEngine.” https://policyengine.org/us/research/neo-philanthropy. +
+
+Ghenis, Max. 2024b. “PolicyEngine and Better +Government Lab Collaboration.” https://www.policyengine.org/us/research/policyengine-better-government-lab-collaboration. +
+
+Ghenis, Max, and Jason DeBacker. 2024. Enhanced Current +Population Survey: Integrating IRS Public Use File +Data Using Quantile Regression Forests. https://ntanet.org/2024/07/117th-annual-conference-on-taxation-full/. +
+
+HM Treasury. 2024. HMT: PolicyEngine UK – +Algorithmic Transparency Recording Standard. https://www.gov.uk/algorithmic-transparency-records/hmt-modelling-policy-engine. +
+
+HM Treasury. 2025. Impact on Households: Distributional Analysis to +Accompany Spring Statement 2025. https://www.gov.uk/government/publications/supporting-documents-for-spring-statement-2025/impact-on-households-distributional-analysis-to-accompany-spring-statement-2025. +
+
+Institute for Research on Poverty. 2025. 2025–2026 IRP +Extramural Large Grants. https://www.irp.wisc.edu/2025-2026-irp-extramural-large-grants/. +
+
+Joint Economic Committee, U.S. Congress. 2026. Immigration Fiscal +Impact Calculator. https://www.jec.senate.gov/public/index.cfm/republicans/2026/3/immigration-fiscal-impact-calculator. +
+
+Kennan, Ariel, Alessandra Garcia Guevara, and Jason Goodman. 2025. +AI-Powered Rules as Code: Experiments with Public +Benefits Policy. Beeck Center for Social Impact and Innovation, +Georgetown University. https://beeckcenter.georgetown.edu/report/ai-powered-rules-as-code-experiments-with-public-benefits-policy/. +
+
+Kennan, Ariel, Lisa Singh, Bianca Dammholz, Keya Sengupta, and Jason Yi. +2023. Exploring Rules Communication: Moving Beyond Static Documents +to Standardized Code for U.S. Public Benefits +Programs. Beeck Center for Social Impact and Innovation, Georgetown +University. https://beeckcenter.georgetown.edu/report/exploring-rules-communication-moving-beyond-static-documents-to-standardized-code-for-u-s-public-benefits-programs/. +
+
+McCabe, Joshua, and Leah Sargeant. 2024. Building a Stronger +Foundation for American Families: Options for Child +Tax Credit Reform. Niskanen Center. https://www.niskanencenter.org/building-a-stronger-foundation-for-american-families-options-for-child-tax-credit-reform/. +
+
+Mosley, Max, Ryan Wattam, and Carol Vincent. 2025. UK +Living Standards Review 2025. National Institute of Economic and +Social Research. https://niesr.ac.uk/publications/uk-living-standards-review-2025. +
+
+National Science Foundation. 2025. POSE: Phase +I: PolicyEngine – Advancing Public Policy +Analysis. https://www.nsf.gov/awardsearch/showAward.jsp?AWD_ID=2518372. +
+
+Nuffield Foundation. 2024. Enhancing, Localising and Democratising +Tax-Benefit Policy Analysis. https://www.nuffieldfoundation.org/project/enhancing-localising-and-democratising-tax-benefit-policy-analysis. +
+
+OpenFisca Contributors. 2024. OpenFisca: Open Rules as +Code for Tax-Benefit Systems. Released. https://openfisca.org. +
+
+Orcutt, Guy H. 1957. “A New Type of Socio-Economic System.” +Review of Economics and Statistics 39 (2): 116–23. https://doi.org/10.2307/1928528. +
+
+Policy Simulation Library. 2026. Policy Simulation +Library (PSL). https://pslmodels.org/. +
+
+PolicyEngine Contributors. 2026. policyengine. V. 3.4.4. Released. https://github.com/PolicyEngine/policyengine.py. +
+
+Sutherland, Holly, and Francesco Figari. 2013. +“EUROMOD: The European Union Tax-Benefit +Microsimulation Model.” International Journal of +Microsimulation 6 (1): 4–26. https://doi.org/10.34196/ijm.00075. +
+
+Tax Policy Center. 2022. Brief Description of the Tax Model. https://taxpolicycenter.org/resources/brief-description-tax-model. +
+
+The Budget Lab at Yale. 2024. Tax Microsimulation at the Budget +Lab. https://budgetlab.yale.edu/research/tax-microsimulation-budget-lab. +
+
+The Budget Lab at Yale. 2025. The Budget Lab’s Model for Tax +Depreciation. https://budgetlab.yale.edu/research/budget-labs-model-tax-depreciation. +
+
+TRACE Project. 2024. TRACE Transparent Research Object +Vocabulary (TROV). https://w3id.org/trace/trov/. +
+
+Waters, Tom. 2017. TAXBEN: The IFS Tax and Benefit Microsimulation +Model. The IFS. https://ifs.org.uk/publications/taxben-ifs-tax-and-benefit-microsimulation-model. +
+
+Woodruff, Nikhil. 2024. Raising Employer NIC in the +Autumn Budget. Institute of Economic Affairs. https://iea.org.uk/publications/raising-employer-nic-in-the-autumn-budget/. +
+
+Woodruff, Nikhil. 2025. Impact of Tax Changes 2025–2026. +Institute of Economic Affairs. https://iea.org.uk/publications/impact-of-tax-changes-2025-2026/. +
+
+Woodruff, Nikhil. 2026. Informing Policy Using +Micro-Simulations. https://policyengine.org/ca/research/policyengine-10-downing-street. +
+
+Woodruff, Nikhil, and Max Ghenis. 2024. “Enhancing Survey +Microdata with Administrative Records: A Novel Approach to +Microsimulation Dataset Construction.” https://github.com/PolicyEngine/policyengine-us-data/tree/main/paper. +
+
+Woodruff, Nikhil, Max Ghenis, and Anthony Volk. 2024. +PolicyEngine Core: A Microsimulation Framework. +Released. https://github.com/PolicyEngine/policyengine-core. +
+
+
+ + + + diff --git a/paper.bib b/paper.bib new file mode 100644 index 00000000..a2360005 --- /dev/null +++ b/paper.bib @@ -0,0 +1,371 @@ +@article{orcutt1957, + title={A New Type of Socio-Economic System}, + author={Orcutt, Guy H.}, + journal={Review of Economics and Statistics}, + volume={39}, + number={2}, + pages={116--123}, + year={1957}, + doi={10.2307/1928528} +} + +@article{bourguignon2006, + title={Microsimulation as a Tool for Evaluating Redistribution Policies}, + author={Bourguignon, Fran{\c{c}}ois and Spadaro, Amedeo}, + journal={The Journal of Economic Inequality}, + volume={4}, + pages={77--106}, + year={2006}, + doi={10.1007/s10888-005-9012-6} +} + +@article{sutherland2013euromod, + title={{EUROMOD}: the {European Union} tax-benefit microsimulation model}, + author={Sutherland, Holly and Figari, Francesco}, + journal={International Journal of Microsimulation}, + volume={6}, + number={1}, + pages={4--26}, + year={2013}, + doi={10.34196/ijm.00075} +} + +@software{openfisca, + title={{OpenFisca}: Open Rules as Code for Tax-Benefit Systems}, + author={{OpenFisca Contributors}}, + url={https://openfisca.org}, + year={2024} +} + +@misc{psl2026, + title={{Policy Simulation Library} ({PSL})}, + author={{Policy Simulation Library}}, + year={2026}, + note={A collection of open-source models and data preparation routines for policy analysis}, + url={https://pslmodels.org/} +} + +@software{policyengine_core, + title={{PolicyEngine Core}: A Microsimulation Framework}, + author={Woodruff, Nikhil and Ghenis, Max and Volk, Anthony}, + url={https://github.com/PolicyEngine/policyengine-core}, + year={2024} +} + +@software{policyengine_py, + title={{policyengine}}, + author={{PolicyEngine Contributors}}, + version={3.4.4}, + url={https://github.com/PolicyEngine/policyengine.py}, + year={2026} +} + +@unpublished{woodruff2024enhanced_cps, + title={Enhancing Survey Microdata with Administrative Records: A Novel Approach to Microsimulation Dataset Construction}, + 
author={Woodruff, Nikhil and Ghenis, Max}, + year={2024}, + note={PolicyEngine working paper}, + url={https://github.com/PolicyEngine/policyengine-us-data/tree/main/paper} +} + +@article{taxsim, + title={{TAXSIM}: A Tool for Calculating Federal and State Income Tax Liabilities}, + author={Feenberg, Daniel R. and Coutts, Elisabeth}, + journal={National Tax Journal}, + volume={46}, + number={3}, + pages={271--280}, + year={1993}, + doi={10.2307/3325474} +} + +@misc{cbo2018taxmodel, + title={An Overview of {CBO}'s Microsimulation Tax Model}, + author={{Congressional Budget Office}}, + year={2018}, + month={6}, + note={Presentation}, + url={https://www.cbo.gov/system/files/2018-06/54096-taxmodel.pdf} +} + +@misc{tpc2022taxmodel, + title={Brief Description of the Tax Model}, + author={{Tax Policy Center}}, + year={2022}, + month={3}, + day={9}, + url={https://taxpolicycenter.org/resources/brief-description-tax-model} +} + +@misc{euromod_download_2026, + title={Download}, + author={{EUROMOD}}, + year={2026}, + note={Documents that the software source code is open source, the coded policy rules are open, and a Python connector is available}, + url={https://euromod-web.jrc.ec.europa.eu/download-euromod} +} + +@misc{frs2020, + title={Family Resources Survey, 2019-2020}, + author={{Department for Work and Pensions} and {Office for National Statistics} and {NatCen Social Research}}, + year={2021}, + publisher={UK Data Service}, + note={SN: 8802}, + doi={10.5255/UKDA-SN-8802-1} +} + +@misc{hansard2026nic, + title={National Insurance Contributions (Employer Pensions Contributions) Bill -- Grand Committee}, + author={{House of Lords}}, + year={2026}, + month={2}, + day={24}, + note={Hansard, GC 371--372. 
Baroness Altmann citing PolicyEngine and its interactive dashboard for distributional analysis of pension contribution reforms}, + url={https://hansard.parliament.uk/Lords/2026-02-24/debates/A381F7D6-0A3C-48FD-8D9E-67751E25877A/NationalInsuranceContributions(EmployerPensionsContributions)Bill} +} + +@techreport{niesr2025living, + title={{UK} Living Standards Review 2025}, + author={Mosley, Max and Wattam, Ryan and Vincent, Carol}, + institution={National Institute of Economic and Social Research}, + year={2025}, + url={https://niesr.ac.uk/publications/uk-living-standards-review-2025} +} + +@misc{hmt2024atrs, + title={{HMT}: {PolicyEngine UK} -- Algorithmic Transparency Recording Standard}, + author={{HM Treasury}}, + year={2024}, + month={12}, + day={17}, + note={ATRS v3.0. HM Treasury Personal Tax, Welfare and Pensions team exploring PolicyEngine UK for advising policymakers on the impact of tax and welfare measures on households}, + url={https://www.gov.uk/algorithmic-transparency-records/hmt-modelling-policy-engine} +} + +@misc{hmt2025igotm, + title={Impact on households: distributional analysis to accompany Spring Statement 2025}, + author={{HM Treasury}}, + year={2025}, + month={4}, + day={2}, + note={Describes HM Treasury's Intra-Governmental Tax and Benefit Microsimulation model (IGOTM)}, + url={https://www.gov.uk/government/publications/supporting-documents-for-spring-statement-2025/impact-on-households-distributional-analysis-to-accompany-spring-statement-2025} +} + +@misc{waters2017taxben, + title={TAXBEN: The IFS tax and benefit microsimulation model}, + author={Waters, Tom}, + year={2017}, + month={11}, + day={15}, + publisher={The IFS}, + url={https://ifs.org.uk/publications/taxben-ifs-tax-and-benefit-microsimulation-model} +} + +@misc{budgetlab_taxsim_2024, + title={Tax Microsimulation at The Budget Lab}, + author={{The Budget Lab at Yale}}, + year={2024}, + month={4}, + day={12}, + note={Documents the public Tax-Simulator and Tax-Data codebases}, + 
url={https://budgetlab.yale.edu/research/tax-microsimulation-budget-lab} +} + +@misc{budgetlab_costrecovery_2025, + title={The Budget Lab's Model for Tax Depreciation}, + author={{The Budget Lab at Yale}}, + year={2025}, + month={1}, + day={27}, + note={Documents the open-source Cost-Recovery-Simulator model}, + url={https://budgetlab.yale.edu/research/budget-labs-model-tax-depreciation} +} + +@article{youngman2026carbon, + title={Agent-based macroeconomics for the {UK}'s {Seventh Carbon Budget}}, + author={Youngman, Tom and Lennox, Tim and Lopes Alves, M. and Palola, Pirta and Tankwa, Brendon and Bailey, Emma and Ravigne, Emilien and Ter Horst, Thijs and Wagenvoort, Benjamin and Lightfoot Brown, Harry and Moran, Jose and Farmer, Doyne}, + year={2026}, + eprint={2602.15607}, + archiveprefix={arXiv}, + primaryclass={econ.GN}, + doi={10.48550/arXiv.2602.15607}, + url={https://arxiv.org/abs/2602.15607} +} + +@techreport{woodruff2024nic, + title={Raising employer {NIC} in the {Autumn Budget}}, + author={Woodruff, Nikhil}, + institution={Institute of Economic Affairs}, + year={2024}, + month={10}, + url={https://iea.org.uk/publications/raising-employer-nic-in-the-autumn-budget/} +} + +@techreport{woodruff2025tax, + title={Impact of Tax Changes 2025--2026}, + author={Woodruff, Nikhil}, + institution={Institute of Economic Affairs}, + year={2025}, + month={3}, + url={https://iea.org.uk/publications/impact-of-tax-changes-2025-2026/} +} + +@misc{mcgarvey2024yatc, + title={Congressman {Morgan McGarvey} Introduces {Young Adult Tax Credit Act}}, + author={{Office of Representative Morgan McGarvey}}, + year={2024}, + month={3}, + day={5}, + note={Press release citing PolicyEngine analysis of H.R.7547}, + url={https://mcgarvey.house.gov/media/press-releases/congressman-morgan-mcgarvey-introduces-young-adult-tax-credit-act} +} + +@misc{ghenis2024nta, + title={Enhanced {Current Population Survey}: Integrating {IRS} Public Use File Data Using Quantile Regression Forests}, + 
author={Ghenis, Max and DeBacker, Jason}, + year={2024}, + month={11}, + note={Presented at the 117th Annual Conference on Taxation, National Tax Association, Detroit, Michigan}, + url={https://ntanet.org/2024/07/117th-annual-conference-on-taxation-full/} +} + +@techreport{mccabe2024ctc, + title={Building a Stronger Foundation for {American} Families: Options for {Child Tax Credit} Reform}, + author={McCabe, Joshua and Sargeant, Leah}, + institution={Niskanen Center}, + year={2024}, + month={3}, + url={https://www.niskanencenter.org/building-a-stronger-foundation-for-american-families-options-for-child-tax-credit-reform/} +} + +@online{pe_bgl, + title={{PolicyEngine} and {Better Government Lab} Collaboration}, + author={Ghenis, Max}, + year={2024}, + url={https://www.policyengine.org/us/research/policyengine-better-government-lab-collaboration} +} + +@misc{pe_usc, + title={2025--2026 {IRP} Extramural Large Grants}, + author={{Institute for Research on Poverty}}, + year={2025}, + note={University of Wisconsin--Madison. 
Includes PolicyEngine collaboration with Matt Unrath (USC) on effective marginal tax rates}, + url={https://www.irp.wisc.edu/2025-2026-irp-extramural-large-grants/} +} + +@techreport{beeck2023rac, + title={Exploring Rules Communication: Moving Beyond Static Documents to Standardized Code for {U.S.} Public Benefits Programs}, + author={Kennan, Ariel and Singh, Lisa and Dammholz, Bianca and Sengupta, Keya and Yi, Jason}, + institution={Beeck Center for Social Impact and Innovation, Georgetown University}, + year={2023}, + month={6}, + url={https://beeckcenter.georgetown.edu/report/exploring-rules-communication-moving-beyond-static-documents-to-standardized-code-for-u-s-public-benefits-programs/} +} + +@techreport{beeck2025ai, + title={{AI}-Powered Rules as Code: Experiments with Public Benefits Policy}, + author={Kennan, Ariel and Garcia Guevara, Alessandra and Goodman, Jason}, + institution={Beeck Center for Social Impact and Innovation, Georgetown University}, + year={2025}, + month={3}, + url={https://beeckcenter.georgetown.edu/report/ai-powered-rules-as-code-experiments-with-public-benefits-policy/} +} + +@misc{pe_dctc, + title={{District Child Tax Credit Amendment Act} of 2023}, + author={{Council of the District of Columbia}}, + year={2023}, + note={Bill B25-0190, introduced by Councilmember Zachary Parker}, + url={https://lims.dccouncil.gov/Legislation/B25-0190} +} + +@misc{pe_keepyourpay, + title={Booker Announces {Keep Your Pay Act}}, + author={{Office of Senator Cory Booker}}, + year={2026}, + month={3}, + url={https://www.booker.senate.gov/news/press/booker-announces-keep-your-pay-act} +} + +@misc{arnold_ventures, + title={Public Finance Program}, + author={{Arnold Ventures}}, + year={2023}, + note={Grant to PolicyEngine for congressional district-level policy analysis}, + url={https://www.arnoldventures.org/work/public-finance} +} + +@misc{nsf_pose, + title={{POSE}: Phase {I}: {PolicyEngine} -- Advancing Public Policy Analysis}, + author={{National 
Science Foundation}}, + year={2025}, + note={Award 2518372. PI: Max Ghenis, PSL Foundation. \$299,974}, + url={https://www.nsf.gov/awardsearch/showAward.jsp?AWD_ID=2518372} +} + +@online{neo_philanthropy, + title={{NEO Philanthropy} Awards \$200,000 Grant to {PolicyEngine}}, + author={Ghenis, Max}, + year={2024}, + url={https://policyengine.org/us/research/neo-philanthropy} +} + +@misc{atlanta_fed_prd, + title={Policy Rules Database}, + author={{Federal Reserve Bank of Atlanta}}, + year={2021}, + note={Collaboration between the Atlanta Fed, National Center for Children in Poverty, and PolicyEngine for multi-model validation}, + url={https://github.com/Research-Division/policy-rules-database} +} + +@misc{jec2026immigration, + title={Immigration Fiscal Impact Calculator}, + author={{Joint Economic Committee, U.S. Congress}}, + year={2026}, + month={3}, + url={https://www.jec.senate.gov/public/index.cfm/republicans/2026/3/immigration-fiscal-impact-calculator} +} + +@misc{tlaib2023endchildpoverty, + title={Tlaib Re-Introduces the {End Child Poverty Act} to Cut Child Poverty by Nearly Two-Thirds}, + author={{Office of Representative Rashida Tlaib}}, + year={2023}, + month={4}, + day={6}, + note={Press release citing PolicyEngine analysis}, + url={https://tlaib.house.gov/posts/tlaib-re-introduces-the-end-child-poverty-act-to-cut-child-poverty-by-nearly-two-thirds} +} + +@misc{no10fellowship2026, + title={Informing Policy Using Micro-Simulations}, + author={Woodruff, Nikhil}, + year={2026}, + month={1}, + note={No10 Innovation Fellowship blog. 
Co-author Nikhil Woodruff served as an Innovation Fellow with the 10DS data science team at 10 Downing Street, adapting PolicyEngine for government use}, + url={https://policyengine.org/ca/research/policyengine-10-downing-street} +} + +@software{claude2026, + title={{Claude}}, + author={{Anthropic}}, + year={2026}, + note={Opus 4 model used for code refactoring assistance}, + url={https://www.anthropic.com/claude} +} + +@misc{trace_trov, + title={{TRACE Transparent Research Object Vocabulary (TROV)}}, + author={{TRACE Project}}, + year={2024}, + note={Linked-data vocabulary for research provenance}, + url={https://w3id.org/trace/trov/} +} + +@misc{nuffield2024grant, + title={Enhancing, localising and democratising tax-benefit policy analysis}, + author={{Nuffield Foundation}}, + year={2024}, + note={General Election Analysis and Briefing Fund grant to PolicyEngine}, + url={https://www.nuffieldfoundation.org/project/enhancing-localising-and-democratising-tax-benefit-policy-analysis} +} diff --git a/paper.md b/paper.md new file mode 100644 index 00000000..3d92027c --- /dev/null +++ b/paper.md @@ -0,0 +1,124 @@ +--- +title: "policyengine: A Microsimulation Tool for Tax-Benefit Policy Analysis" +tags: + - Python + - microsimulation + - tax + - benefit + - public policy + - economic analysis +authors: + - name: Vahid Ahmadi + orcid: 0009-0004-1093-6272 + affiliation: '1' + corresponding: true + - name: Max Ghenis + orcid: 0000-0002-1335-8277 + affiliation: '1' + - name: Nikhil Woodruff + orcid: 0009-0009-5004-4910 + affiliation: '1' + - name: Pavel Makarchuk + orcid: 0009-0003-4869-7409 + affiliation: '1' +affiliations: + - name: PolicyEngine, Washington, DC, United States + index: '1' +date: 17 April 2026 +bibliography: paper.bib +--- + +# Summary + +The `policyengine` Python package [@policyengine_py] is open-source software for analyzing how tax and benefit policies affect household incomes and government budgets in the US and UK. 
It gives analysts a common workflow for running simulations on representative microdata or custom households and for comparing current law with proposed reforms. Country-specific rules live in dedicated packages (`policyengine-us` and `policyengine-uk`), while `policyengine` provides shared tools for datasets, reforms, outputs, and reproducible release bundles. The package also powers the interactive web application at [policyengine.org](https://policyengine.org). + +# Statement of Need + +Tax-benefit microsimulation models are standard tools for evaluating the distributional impacts of fiscal policy. Governments, think tanks, and researchers use them to estimate how policy reforms affect household incomes, poverty rates, and government budgets. In practice, however, analysts work across separate components: statutory rules, representative microdata, reform definitions, and distributional outputs live in different tools and interfaces. Reproducing a baseline-versus-reform workflow, or carrying the same analysis pattern from one country model to another, therefore often requires bespoke scripts and project-specific conventions. Historical replication is especially difficult when policy rules, analysis tooling, and representative microdata are versioned independently and the analyst must reconstruct which combination produced a published estimate. + +The `policyengine` package provides a consistent Python API for tax-benefit analysis across multiple country models. Users can supply their own microdata or use companion representative datasets, then compute the impact of current law or hypothetical reforms, including parametric changes to existing policy parameters and structural modifications to the tax-benefit system, on any household or a national population. The `calculate_household_impact` function computes results for a single household, while the `Simulation` class runs population-level analysis on representative survey datasets with calibrated weights. 
Optional behavioral-response assumptions, such as labor supply elasticities, are applied after the static reform. Version-pinned releases reduce the bookkeeping needed for replication. + +# State of the Field + +Microsimulation, which Orcutt [-@orcutt1957] pioneered and Bourguignon and Spadaro [-@bourguignon2006] surveyed for redistribution analysis, underpins much of modern fiscal policy evaluation. In the US, TAXSIM [@taxsim] at the National Bureau of Economic Research provides tax calculations, while the Congressional Budget Office and Tax Policy Center maintain microsimulation tax models [@cbo2018taxmodel; @tpc2022taxmodel]. In the UK, the primary microsimulation models are UKMOD, which the Institute for Social and Economic Research (ISER) at the University of Essex maintains as part of the EUROMOD family [@sutherland2013euromod; @euromod_download_2026]; HM Treasury's Intra-Governmental Tax and Benefit Microsimulation model (IGOTM) [@hmt2025igotm]; and TAXBEN, which the Institute for Fiscal Studies maintains [@waters2017taxben]. + +OpenFisca [@openfisca] initiated the open-source approach to tax-benefit microsimulation in France. Other open-source efforts include the Policy Simulation Library, a collection of policy models and data-preparation routines [@psl2026], and The Budget Lab's public US tax-model codebases, including Tax-Simulator and Cost-Recovery-Simulator [@budgetlab_taxsim_2024; @budgetlab_costrecovery_2025]. The PolicyEngine developers originally forked the codebase from OpenFisca and built it on the `policyengine-core` framework [@policyengine_core]. + +Existing tools serve complementary needs — country-specific microsimulation (TAXSIM, TAXBEN, IGOTM) or simulation infrastructure without a shared cross-country analyst API (`policyengine-core`, OpenFisca) — leaving a gap in the open-source ecosystem. 
The `policyengine` package fills this gap by adding shared dataset management, a stable baseline-versus-reform pattern, structured output types for distributional and regional analysis, and interfaces for downstream dashboards and reports. Concretely, the package provides reusable outputs such as `Aggregate`, `ChangeAggregate`, and `IntraDecileImpact`, together with bundled analyses such as `economic_impact_analysis()`. This separation lets country-model packages focus on statutory rules while shared analysis methods evolve independently. + +Table 1: Comparison of policyengine with selected tax-benefit microsimulation tools. Entries refer to capabilities documented for external users at the time of submission. + +| Dimension | policyengine | TAXSIM | UKMOD | OpenFisca | +|---|---|---|---|---| +| Open source | Yes | Partial | Yes | Yes | +| Country coverage | US and UK | US | UK | US, UK, and ~10 other jurisdictions | +| Tax and benefit analysis | Yes | Tax only | Yes | Yes | +| Python-native implementation | Yes | No | No | Yes | +| Shared reform and output API across countries | Yes | No | No | Shared core, country-specific parameters | + +# Software Design + +The PolicyEngine software stack has four components. `policyengine-core` provides reusable simulation abstractions, versioned parameters, and dataset interfaces that country packages share [@policyengine_core]. The `policyengine-us` and `policyengine-uk` packages contain statutory logic, variables, and entity structures specific to each tax-benefit system. The `policyengine` package is the analyst-facing component: it defines shared simulation orchestration, structured output types, and canonical baseline-versus-reform workflows such as `economic_impact_analysis()`. Companion data repositories hold enhanced survey microdata derived from the Current Population Survey (CPS) [@woodruff2024enhanced_cps] and Family Resources Survey [@frs2020]. 
The package does not include a macroeconomic model and does not capture general equilibrium effects. + +This architecture reflects two deliberate trade-offs. Keeping country statutory rules in separate packages, rather than bundling them into a monolithic tool, lets each country model release independently; the cost is that `policyengine` must track and certify compatible combinations. Modeling reforms statically, with optional post-hoc behavioral responses, gives fast and deterministic baselines at the expense of general equilibrium effects, which are better suited to dedicated macroeconomic models. + +![Figure 1: PolicyEngine runtime architecture. Inputs (rules, microdata, and behavioral responses) flow through the simulation pipeline to produce structured outputs.](architecture.png){width="100%"} + +For reproducibility, the top-level package acts as a certification boundary across these components. Country data repositories build immutable microdata artifacts and publish release manifests with checksums and the country-model version used during data construction. Bundled country manifests in `policyengine` then certify the runtime bundle: the runtime country-model version, the microdata-package release, the dataset artifact, and the compatibility basis linking that runtime model to the build-time data provenance. Analysts can request a dataset such as `enhanced_frs_2023_24`, while the runtime resolves it to a specific versioned artifact and records both runtime and build-time provenance. The same certification record can be emitted as a TRACE Transparent Research Object declaration [@trace_trov], so the internal bundle and data manifests remain the operational source of truth while a standardized signed provenance document is available for external exchange. 
+ +At runtime, a simulation combines a country-model version, household microdata, and an optional reform; country packages can also apply behavioral responses, such as labor supply elasticities, after the static policy reform. The `policyengine` package then produces reusable outputs for decile changes, program statistics, poverty, inequality, and regional impacts. Because the runtime exposes the resolved certified bundle and compatibility basis, results can be traced to a specific `policyengine` release, runtime country-model release, microdata release, versioned dataset artifact, and build-time country-model version. + +The following household-level example computes a household's net income under baseline law and under a reform that doubles the US single-filer standard deduction (to \$32,200 for 2026): + +```python +import datetime +from policyengine.core import Parameter, ParameterValue, Policy +from policyengine.tax_benefit_models.us import ( + USHouseholdInput, calculate_household_impact, us_latest, +) + +param = Parameter( + name="gov.irs.deductions.standard.amount.SINGLE", + tax_benefit_model_version=us_latest, +) +reform = Policy( + name="Double standard deduction", + parameter_values=[ParameterValue( + parameter=param, + start_date=datetime.date(2026, 1, 1), + end_date=datetime.date(2026, 12, 31), + value=32_200, + )], +) + +household = USHouseholdInput( + people=[{"age": 40, "employment_income": 50_000, + "is_tax_unit_head": True}], + tax_unit={"filing_status": "SINGLE"}, + household={"state_code_str": "CA"}, + year=2026, +) +baseline = calculate_household_impact(household) +reformed = calculate_household_impact(household, policy=reform) +# The reform increases this household's net income relative to baseline. +``` + +The `us_latest` sentinel resolves to the bundled `policyengine-us` version installed alongside `policyengine`, so results are stable for a given pinned environment. 
This paper describes `policyengine` version 3.4.4 [@policyengine_py], and the checked-in UK reproduction script `examples/paper_repro_uk.py` documents an executable population-level workflow using a pinned interpreter (`uv run --python 3.14 --extra uk python examples/paper_repro_uk.py`). + +# Research Impact Statement + +PolicyEngine has seen use in government, policy, and research settings. In the UK, HM Treasury registered PolicyEngine in the Algorithmic Transparency Recording Standard as a tool under evaluation by its Personal Tax, Welfare and Pensions team [@hmt2024atrs], and co-author Nikhil Woodruff adapted PolicyEngine during an Innovation Fellowship with the data science team at 10 Downing Street [@no10fellowship2026]. In the US, the Joint Economic Committee built an immigration fiscal impact calculator on top of PolicyEngine's microsimulation model [@jec2026immigration]. + +The package has also been used in external policy analysis and validation exercises. Under a memorandum of understanding with the Federal Reserve Bank of Atlanta, PolicyEngine runs three-way comparisons against TAXSIM and the Atlanta Fed's Policy Rules Database [@atlanta_fed_prd]. Organizations including the Niskanen Center and the National Institute of Economic and Social Research have used PolicyEngine in published distributional analyses [@mccabe2024ctc; @niesr2025living], and co-author Max Ghenis and Jason DeBacker presented related methodology work on the Enhanced CPS at the 117th Annual Conference on Taxation of the National Tax Association [@ghenis2024nta]. Additional examples of public-facing use include research collaborations and published analyses with the Better Government Lab, the Beeck Center at Georgetown University, the Institute of Economic Affairs, and Matt Unrath at the University of Southern California [@pe_bgl; @beeck2023rac; @beeck2025ai; @woodruff2024nic; @woodruff2025tax; @pe_usc]. 
+ +# Acknowledgements + +Arnold Ventures [@arnold_ventures], NEO Philanthropy [@neo_philanthropy], the Gerald Huff Fund for Humanity, and the National Science Foundation (NSF POSE Phase I, Award 2518372) [@nsf_pose] funded this work in the US. The Nuffield Foundation has funded the UK work since September 2024 [@nuffield2024grant]. These funders had no involvement in the design, development, or content of this software or paper. All authors are employed by PolicyEngine and may benefit reputationally from the software's adoption; this relationship is disclosed here as a potential conflict of interest. + +We thank all PolicyEngine contributors and the OpenFisca community for the microsimulation framework from which PolicyEngine was forked [@openfisca]. We acknowledge the US Census Bureau for providing access to the Current Population Survey, and the UK Data Service and the Department for Work and Pensions for providing access to the Family Resources Survey. + +# AI Usage Disclosure + +The authors used generative AI tools, specifically Claude Opus 4 by Anthropic [@claude2026], to assist with code refactoring. Human authors reviewed, edited, and validated all AI-assisted outputs and made all design decisions regarding software architecture, policy modeling, and parameter implementation. The authors remain fully responsible for the accuracy, originality, and correctness of all submitted materials. + +# References