Bundle LLM guide files in wheel with get_llm_guide() accessor by igerber · Pull Request #310 · igerber/diff-diff

igerber · 2026-04-16T22:31:44Z

Summary

Move docs/llms*.txt into diff_diff/guides/ so the three LLM guide files ship inside the wheel and are discoverable with no network access.
Add diff_diff.get_llm_guide(variant) accessor using importlib.resources; variants: "concise" (default), "full", "practitioner".
Surface the accessor in help(diff_diff) via the module docstring so AI agents find it from a pip install alone.
Explicit [tool.maturin] include entry as defense-in-depth against auto-inclusion behavior drift across maturin 1.4-2.0.
Internal cross-references inside guide files rewritten to point at diff_diff.get_llm_guide("practitioner") so the air-gap breadcrumb works end-to-end.
pyproject.toml "Practitioner Guide" URL switched to ReadTheDocs (newly publishes llms-practitioner.txt at /en/stable/llms-practitioner.txt).
Path references updated in README.md, CLAUDE.md, docs/doc-deps.yaml, docs/methodology/REGISTRY.md, and .claude/commands/bump-version.md.

Methodology references (required if estimator / math changes)

N/A — no estimator, inference, SE, or identification logic changed. REGISTRY.md update is a path reference only (points at the moved practitioner guide).

Validation

Tests added/updated: tests/test_guides.py — 18 tests (variant loading, default=concise, full>concise>practitioner length, content-stability assertions, UTF-8 encoding roundtrip, package-resource equivalence, error path for 7 invalid variants, namespace export, docstring hint).
pytest tests/test_guides.py -v → 18/18 passing locally.
Manual verification: python -c "import diff_diff; help(diff_diff)" shows the AI-agent hint; get_llm_guide(), get_llm_guide("full"), get_llm_guide("practitioner") return 13166 / 56793 / 19807 chars respectively.
Deferred to pre-merge / CI: local Sphinx build to confirm parent-relative html_extra_path resolves; wheel-install smoke verifying .txt files are inside the built artifact; live check that the new RTD URL resolves after docs deploy.

Security / privacy

Confirm no secrets/PII in this PR: Yes

Move docs/llms*.txt into diff_diff/guides/ so they ship inside the wheel and are discoverable via `diff_diff.get_llm_guide()` with no network access. Surface the helper in the module docstring so help(diff_diff) guides AI agents to it. - Move llms.txt, llms-full.txt, llms-practitioner.txt to diff_diff/guides/ - Add get_llm_guide(variant) using importlib.resources - Explicit [tool.maturin] include entry for defense-in-depth across 1.4-2.0 - Switch pyproject "Practitioner Guide" URL to ReadTheDocs - Rewrite internal cross-refs inside guide files to point at the accessor - Update docs/conf.py html_extra_path to the new location (adds llms-practitioner.txt to the RTD surface) - Update all path references (doc-deps.yaml, README, CLAUDE, REGISTRY, bump-version) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Clarifies that "CONCISE", " concise", etc. raise ValueError so callers don't have to discover this by experimentation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-04-16T22:36:57Z

Overall Assessment

✅ Looks good — no unmitigated P0/P1 findings.

This PR is packaging/docs/API-surface work only. No estimator math, weighting, variance/SE, identification checks, or default behaviors changed. Execution-based validation was not possible in this environment because pytest and required runtime deps such as numpy are unavailable.

Executive Summary

No methodology-sensitive code changed; the only methodology-adjacent edit is a path update in docs/methodology/REGISTRY.md:L2770.
get_llm_guide() is a small whitelist-based resource accessor, so I did not find new correctness, security, or performance risks in the implementation.
P3: the bundled practitioner guide still points to a repo-only docs/methodology/REGISTRY.md, which breaks the offline breadcrumb for wheel-only installs.
P3-informational: build-surface validation is still thin; the added tests exercise source-tree resources, and related .txt guide CI-validation debt is already tracked in TODO.md:L95.

Methodology

No findings. Affected method(s): none. The REGISTRY change at docs/methodology/REGISTRY.md:L2770 is path-only and preserves the existing Baker et al. adaptation note.

Code Quality

No findings.

Performance

No findings. The new code path is an explicit on-demand text read, not estimator-path logic.

Maintainability

No findings.

Tech Debt

P3-informational Impact: The feature depends on the distribution surfaces configured in pyproject.toml:L83-L97 and docs/conf.py:L74-L77, but the added tests in tests/test_guides.py:L35-L42 only validate source-tree resources. Related AI-guide CI debt is already tracked in TODO.md:L95. Concrete fix: when CI is next touched, add a wheel-install smoke test plus a docs-build assertion that the three llms*.txt files are published at the expected output paths.

Security

No findings. get_llm_guide() uses a fixed variant whitelist instead of interpolating user input into filesystem paths, so it does not introduce an arbitrary file-read/path-traversal surface.

Documentation/Tests

P3 Impact: The shipped practitioner guide still tells readers to see docs/methodology/REGISTRY.md in diff_diff/guides/llms-practitioner.txt:L6. That path will not exist in a wheel-only/offline install, leaving a dead local breadcrumb inside the artifact this PR is trying to make self-contained. Concrete fix: replace that repo-only reference with an inline summary of the relevant deviation note, or point to another bundled/offline surface instead.

The guide's opening blockquote previously pointed at `docs/methodology/REGISTRY.md` for details on reorganizations relative to Baker et al. (2025). That path does not exist inside a wheel-only install, so it was a dead breadcrumb for offline AI agents - exactly the scenario this PR is meant to fix. Inline the four deviation bullets (numbering, PT as Step 3, uncertainty folding into Step 5, Step 8 = Robustness & Reporting) directly into the guide so it is self-contained. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

igerber · 2026-04-16T22:44:40Z

/ai-review

github-actions · 2026-04-16T22:49:17Z

🔁 AI review rerun (requested by @igerber)

Head SHA: e65be1665c2d27c2f421619fe1cdcc1164a63cd2

Overall Assessment

✅ Looks good

No unmitigated P0/P1 findings. This re-review resolves the earlier dead offline-breadcrumb issue. The only remaining item I found is a tracked P3-informational validation gap around built artifacts and docs publication.

Executive Summary

No estimator, weighting, variance/SE, identification-assumption, or default-behavior code paths changed.
Methodology cross-check passed: the practitioner-guide deviations now inlined in diff_diff/guides/llms-practitioner.txt:L8-L20 match the documented **Note:** entries in docs/methodology/REGISTRY.md:L2774-L2785.
The prior P3 about the bundled practitioner guide pointing wheel users at a repo-only path is resolved.
Residual P3-informational: guide tests still stop at source-tree resource reads; wheel-install and docs-publication validation remain deferred, but that debt is already tracked in TODO.md:L95-L96.
I could not execute pytest or import diff_diff in this environment because numpy is not installed.

Methodology

Affected method(s): none.

Severity: P3-informational
Impact: The only methodology-adjacent change is the bundled practitioner guide text. The inlined Baker et al. adaptation details in diff_diff/guides/llms-practitioner.txt:L8-L20 mirror the documented deviations in docs/methodology/REGISTRY.md:L2774-L2785, so under the review rubric this is a documented deviation, not an undocumented methodology change.
Concrete fix: None.

Code Quality

No findings.

Performance

No findings. The new guide read in diff_diff/_guides_api.py:L13-L48 is on-demand and outside estimator/inference hot paths.

Maintainability

No findings.

Tech Debt

Severity: P3-informational
Impact: The new “wheel content” check in tests/test_guides.py:L35-L42 still compares source-tree package resources rather than a built wheel, and the new public docs surface depends on docs/conf.py:L74-L78 and pyproject.toml:L83-L97. This validation gap is already tracked in TODO.md:L95-L96, so it is not blocking.
Concrete fix: Keep the tracked follow-up to add wheel-install and docs-build smoke checks.

Security

No findings. The new accessor uses a fixed variant whitelist instead of interpolating user input into filesystem paths in diff_diff/_guides_api.py:L6-L48.

Documentation/Tests

No new untracked findings. The added coverage in tests/test_guides.py:L10-L67 is reasonable for variant loading, invalid inputs, namespace export, and docstring surfacing, and the earlier offline-breadcrumb issue is fixed by diff_diff/guides/llms-practitioner.txt:L3-L20. Verification note: I could not run the tests here because importing diff_diff fails without numpy in this review environment.

maturin 1.13.1's GlobPattern enum only accepts `sdist` or `wheel` for the `format` field, not `all`. That rejected pyproject.toml at line 97 and broke the build step in every Python Tests job on the PR. Use the bare-string form `include = ["diff_diff/guides/*.txt"]`, which includes the glob in both sdist and wheel by default. Verified locally with `maturin sdist`: all three .txt files land in the sdist tarball. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

igerber · 2026-04-16T22:58:31Z

/ai-review

github-actions · 2026-04-16T23:02:59Z

🔁 AI review rerun (requested by @igerber)

Head SHA: b5f063a6d8a141b43a930100701d6f1d209f8ab8

Overall Assessment

✅ Looks good

No unmitigated P0/P1 findings. This re-review stays clear of estimator, weighting, variance/SE, identification-assumption, and default-behavior changes. The only issues I found are non-blocking P3 items around docs/dependency tracking and release validation.

Executive Summary

No methodology-bearing estimator code changed; the PR is limited to packaging, docs, and a public guide accessor.
Methodology cross-check passed: the inlined Baker et al. adaptation notes in diff_diff/guides/llms-practitioner.txt:L3 match the documented **Note:** entries in docs/methodology/REGISTRY.md:L2770.
The prior offline-breadcrumb issue remains resolved: bundled guides now point users to diff_diff.get_llm_guide("practitioner") in diff_diff/guides/llms.txt:L30 and diff_diff/guides/llms-full.txt:L36.
get_llm_guide() is correctly surfaced through the top-level namespace and module docstring in diff_diff/init.py:L12, diff_diff/init.py:L205, and diff_diff/init.py:L409, and the accessor uses a fixed variant whitelist in diff_diff/_guides_api.py:L6.
P3: docs/doc-deps.yaml updates moved guide paths but still does not map the new diff_diff/_guides_api.py source file near docs/doc-deps.yaml:L628; that is a non-blocking maintainability gap and falls under the already-tracked doc-deps integrity debt in TODO.md:L96.
P3: the new guide tests are still source-tree oriented rather than built-artifact oriented; tests/test_guides.py:L35 does not validate a built wheel or the published docs path configured in docs/conf.py:L74 and pyproject.toml:L83.

Methodology

Severity: P3-informational
Impact: Affected method(s): none. The only methodology-adjacent change is the practitioner-guide framing text, and its deviations from Baker et al. are explicitly documented in both diff_diff/guides/llms-practitioner.txt:L3 and docs/methodology/REGISTRY.md:L2770. Under the review rubric, that is a documented deviation, not a defect.
Concrete fix: None.

Code Quality

No findings.

Performance

No findings. get_llm_guide() is an on-demand text read outside estimator/inference hot paths in diff_diff/_guides_api.py:L13.

Maintainability

Severity: P3
Impact: The PR updates many moved-guide references in docs/doc-deps.yaml:L612 and docs/doc-deps.yaml:L628, but it does not add an entry for the new public helper module diff_diff/_guides_api.py. That means docs-impact tooling can miss future documentation drift when the accessor changes.
Concrete fix: Add a diff_diff/_guides_api.py entry to docs/doc-deps.yaml mapping it to the bundled guide docs and/or the public API surface, or explicitly document why it is intentionally excluded.

Tech Debt

Severity: P3-informational
Impact: test_wheel_content_matches_package_resource() in tests/test_guides.py:L35 only compares the accessor against source-tree package resources; it does not prove that the wheel actually contains the files or that the new ReadTheDocs root-path publication configured in docs/conf.py:L74 and linked from pyproject.toml:L83 is present after build/deploy. This is non-blocking and compatible with ✅, especially given the already-tracked docs/CI validation debt in TODO.md:L95 and TODO.md:L96.
Concrete fix: Add one wheel-install smoke test that imports diff_diff from the built artifact and calls get_llm_guide(), plus one docs-build smoke test that asserts llms-practitioner.txt is emitted at the expected site-root path.

Security

No findings. The accessor dispatches through a fixed whitelist rather than interpolating user input into filesystem paths in diff_diff/_guides_api.py:L6.

Documentation/Tests

No new untracked findings. Coverage in tests/test_guides.py:L10 is reasonable for variant loading, invalid inputs, namespace export, and docstring surfacing.

Verification note: I could not run pytest tests/test_guides.py or import diff_diff in this environment because numpy, pandas, scipy, and pytest are not installed.

…ides PR #309 added four validation diagnostics to SyntheticDiDResults: `get_weight_concentration()`, `get_loo_effects_df()`, `in_time_placebo()`, and `sensitivity_to_zeta_omega()`. Because this PR is the first place the guides ship inside the wheel, we want them faithful to main's API at the moment of merge. - `llms-full.txt`: add the four methods to the SyntheticDiDResults Methods line and a short "Validation diagnostics" subsection describing each. - `llms-practitioner.txt`: split the former `SyntheticDiD/TROP` bullet so SyntheticDiD now points at the built-in helpers (with the jackknife caveat for LOO); TROP keeps the generic guidance. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

igerber · 2026-04-17T00:48:20Z

/ai-review

github-actions · 2026-04-17T00:54:32Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 70b547bd8e3f905c9deb0b296778edd14dd1930a

Overall Assessment

✅ Looks good

Executive Summary

No estimator, weighting, variance/SE, identification-assumption, or methodology-bearing default behavior changed. The only methodology-adjacent edits are guide-text updates, and the Baker et al. deviations remain explicitly documented and aligned between diff_diff/guides/llms-practitioner.txt:L3-L20 and docs/methodology/REGISTRY.md:L2782-L2797.
Re-review status: there are still no unmitigated P0/P1 issues in scope, so the assessment stays ✅.
The new get_llm_guide() accessor is wired correctly into the public surface and implemented with a fixed variant whitelist in diff_diff/init.py:L1-L15, diff_diff/init.py:L204-L205, diff_diff/init.py:L406-L409, and diff_diff/_guides_api.py:L1-L48.
P3-informational: the prior doc-deps finding remains. diff_diff/_guides_api.py:L1-L48 is still unmapped in docs/doc-deps.yaml:L616-L637, but that falls under the already-tracked unmapped-source debt in TODO.md:L95-L96.
P3-informational: the prior release-validation gap also remains. tests/test_guides.py:L35-L42 checks source-tree/package-resource equivalence, not a built wheel or docs-build artifact.
P3: the new llms-full.txt wording for get_loo_effects_df() is slightly inaccurate versus the actual API.

Methodology

Severity: P3-informational. Impact: Affected methods: none in estimator/inference code. The changed practitioner-guide text preserves the already-documented deviations from Baker et al., and those deviations remain mirrored in the Methodology Registry at docs/methodology/REGISTRY.md:L2782-L2797. Concrete fix: None.

Code Quality

No findings.

Performance

No findings.

Maintainability

Severity: P3-informational. Impact: diff_diff/_guides_api.py:L1-L48 introduces a new public, doc-bearing source file, but docs/doc-deps.yaml:L616-L637 still has no entry for it. Future accessor/docstring changes can drift without doc-deps tooling noticing. This is already tracked generically in TODO.md:L96-L96.
Concrete fix: Add a diff_diff/_guides_api.py block to docs/doc-deps.yaml mapping it to the bundled guide docs and/or public API documentation.

Tech Debt

Severity: P3-informational. Impact: tests/test_guides.py:L35-L42 proves get_llm_guide() matches package resources in the source tree, but it does not verify the built wheel or the Sphinx extra-path publication configured in pyproject.toml:L85-L98 and docs/conf.py:L74-L78. That leaves the README claim in README.md:L72-L82 validated only indirectly. This is compatible with the tracked .txt-guide CI gap in TODO.md:L95-L96.
Concrete fix: Add one CI smoke test that installs the built wheel and calls get_llm_guide(), plus one docs-build assertion that llms-practitioner.txt is emitted at site root.

Security

No findings.

Documentation/Tests

Severity: P3. Impact: diff_diff/guides/llms-full.txt:L1032-L1034 describes get_loo_effects_df() as leave-one-out influence “per treated unit” and implies a treated-side support condition, but the method actually returns both control and treated LOO rows and also requires more than one control with nonzero/effective weight to be available, per diff_diff/results.py:L1007-L1049 and diff_diff/synthetic_did.py:L1251-L1281. This is a documentation-accuracy issue only, not a methodology defect.
Concrete fix: Reword the bullet to “per-unit leave-one-out influence” and either list the full availability conditions or point readers to the method docstring/error path.
Verification note: I could not rerun pytest tests/test_guides.py or import diff_diff end-to-end in this environment because pytest and numpy are not installed.

AI review flagged that the bullet said "per treated unit" and implied a treated-side-only support condition, but the method actually returns a DataFrame with both control and treated rows and has a broader set of availability conditions than a single clause captures. - Change "per treated unit" to "per-unit (both control and treated rows)" - Defer the full availability conditions to the method docstring, with examples of the two most common failure modes (single treated unit, only one control with nonzero effective weight) Code behavior unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

igerber · 2026-04-17T00:59:11Z

/ai-review

github-actions · 2026-04-17T01:04:59Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 89ee3378a467c6450b840b662a6154d4fd32e7b2

Overall Assessment

✅ Looks good

Executive Summary

Re-review status: still no unmitigated P0/P1 issues. This PR does not change estimator logic, weighting, variance/SE computation, identification assumptions, or methodology-bearing defaults.
The methodology-adjacent practitioner-guide edits in diff_diff/guides/llms-practitioner.txt:L3-L20 match the documented Notes in docs/methodology/REGISTRY.md:L2782-L2797, so there is no undocumented deviation from Baker et al.
The prior P3 documentation issue around get_loo_effects_df() appears resolved: diff_diff/guides/llms-full.txt:L1032-L1035 now matches diff_diff/results.py:L1007-L1038.
Remaining items are P3 only: _guides_api.py is still unmapped in docs/doc-deps.yaml, artifact-level wheel/docs publication is still not exercised by tests, and the SyntheticDiDResults table in llms-full.txt still omits "jackknife" from variance_method.
I could not rerun pytest or import diff_diff end-to-end here because this environment does not have pytest or numpy.

Methodology

No findings. Affected methods: none. The only methodology-adjacent change is guide prose, and its explicit deviations from Baker et al. remain documented in both diff_diff/guides/llms-practitioner.txt:L3-L20 and docs/methodology/REGISTRY.md:L2782-L2797.

Code Quality

No findings.

Performance

No findings.

Maintainability

Severity: P3-informational. Impact: diff_diff/_guides_api.py:L1-L48 introduces a new public, doc-bearing source file, but the infrastructure block in docs/doc-deps.yaml:L616-L637 still has no entry for it. Changes to the accessor or its docstring can drift without the doc-deps map noticing. This limitation is already tracked generically in TODO.md:L96-L96. Concrete fix: add a diff_diff/_guides_api.py block mapping the accessor to the bundled guides and any public API docs that surface get_llm_guide().

Tech Debt

Severity: P3-informational. Impact: pyproject.toml:L83-L98 and docs/conf.py:L74-L77 now advertise wheel-bundled guides and RTD-published text artifacts, but tests/test_guides.py:L35-L67 only exercises source-tree package resources; it does not validate a built wheel or built docs output. Packaging/publishing regressions would therefore slip past CI. This is compatible with the existing tracked guide-validation debt in TODO.md:L95-L95. Concrete fix: add one wheel-install smoke test that imports diff_diff from the built artifact and one docs-build assertion that llms-practitioner.txt is emitted at the site root.

Security

No findings.

Documentation/Tests

Severity: P3. Impact: In the edited SyntheticDiDResults section, diff_diff/guides/llms-full.txt:L1024-L1024 still describes variance_method as "bootstrap" or "placebo" only, even though the same section now documents a jackknife-only diagnostic at diff_diff/guides/llms-full.txt:L1034-L1034 and the implementation supports "bootstrap", "jackknife", and "placebo" in diff_diff/synthetic_did.py:L50-L64 and diff_diff/results.py:L696-L724. This is a minor documentation inconsistency only. Concrete fix: update the attribute row to include "jackknife" or point readers to the estimator/results docstring.
Verification note: I could not execute pytest tests/test_guides.py or import diff_diff locally in this environment because pytest and numpy are not installed.

- diff_diff/guides/llms-practitioner.txt: add reversible-treatment branch to the Step 4 decision tree pointing at ChaisemartinDHaultfoeuille (alias DCDH), and a short Step 5 example showing the survey_design + L_max + TSL fallback for dCDH. The practitioner guide previously had zero dCDH coverage despite the full API being live. - diff_diff/guides/llms-full.txt: fix stale SyntheticDiDResults table (line 1024) — variance_method now accepts "bootstrap", "jackknife", or "placebo" (jackknife was added for synthdid::vcov parity but the table only listed bootstrap/placebo). Bundled-guide paths only. The old docs/llms-*.txt were moved to diff_diff/guides/ in main (PR #310); these edits target the new canonical location consumed by get_llm_guide(). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

igerber and others added 2 commits April 16, 2026 18:22

Note case-sensitivity of variant in get_llm_guide docstring

5646126

Clarifies that "CONCISE", " concise", etc. raise ValueError so callers don't have to discover this by experimentation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

igerber added the ready-for-ci Triggers CI test workflows label Apr 16, 2026

igerber and others added 2 commits April 16, 2026 20:47

Merge remote-tracking branch 'origin/main' into llm-guide-api

8023b19

igerber merged commit bd2b55d into main Apr 17, 2026
19 checks passed

igerber deleted the llm-guide-api branch April 17, 2026 10:01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bundle LLM guide files in wheel with get_llm_guide() accessor#310

Bundle LLM guide files in wheel with get_llm_guide() accessor#310
igerber merged 7 commits intomainfrom
llm-guide-api

igerber commented Apr 16, 2026

Uh oh!

github-actions bot commented Apr 16, 2026

Uh oh!

igerber commented Apr 16, 2026

Uh oh!

github-actions bot commented Apr 16, 2026

Uh oh!

igerber commented Apr 16, 2026

Uh oh!

github-actions bot commented Apr 16, 2026

Uh oh!

igerber commented Apr 17, 2026

Uh oh!

github-actions bot commented Apr 17, 2026

Uh oh!

igerber commented Apr 17, 2026

Uh oh!

github-actions bot commented Apr 17, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

igerber commented Apr 16, 2026

Summary

Methodology references (required if estimator / math changes)

Validation

Security / privacy

Uh oh!

github-actions bot commented Apr 16, 2026

Uh oh!

igerber commented Apr 16, 2026

Uh oh!

github-actions bot commented Apr 16, 2026

Uh oh!

igerber commented Apr 16, 2026

Uh oh!

github-actions bot commented Apr 16, 2026

Uh oh!

igerber commented Apr 17, 2026

Uh oh!

github-actions bot commented Apr 17, 2026

Uh oh!

igerber commented Apr 17, 2026

Uh oh!

github-actions bot commented Apr 17, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant