Merged
3 changes: 3 additions & 0 deletions .gitignore
@@ -9,3 +9,6 @@ dist/

python_modules/
.venv-workers/

# Social-card HTML intermediates (rasterized output lives in public/og/)
build/
18 changes: 18 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,24 @@ The format is inspired by [Keep a Changelog](https://keepachangelog.com/en/1.1.0

## Unreleased

### Added

- Banner position grammar from `docs/visual-explainer-spec.md` is now production: `render_banner(slug, position)` supports `before`, `after-cell-N` (legacy anchor `cell-N`), and `after-walkthrough`, with multiple figures per position rendering as one small-multiple banner. The mutability page ships the canonical two-figure pair (aliased mutation vs. frozen tuple).
- Curated pair banners on contrast cells: `positional-only-parameters` shows the `/` and `*` separator twins side by side, `metaclasses` pairs the metaclass triangle with the familiar class triangle, and `tuples` pairs the frozen tuple with the growing list on the intent-contrast cell. `iterator-vs-iterable` gains the one-pass caret figure on the exhaustion cell.

- `/sitemap.xml` route listing home, journeys, and all example pages; `public/robots.txt` with a Sitemap directive.
- JSON-LD structured data: `WebSite` on the home page, `TechArticle`/`LearningResource` on every example page, enforced by the SEO linter.
- Client-side example search on the home page: a build-step JSON index (`make build-search-index`), fingerprinted `search.js`/`search-index.json` assets, `/` keyboard shortcut, and a Node ranking check (`make search-ranking-test`).
- Dark mode via `prefers-color-scheme`: inverted warm palette, dual-theme Shiki highlighting, a dark CodeMirror highlight style, and marginalia figures rendered on a light paper chip so the locked grammar stays untouched.
- Skip-to-content link on every page.
- Per-example social-card images composed from each example's marginalia figure (`make social-cards`), referenced by `og:image`/`twitter:card` on home and example pages and checked by the SEO linter.
- Learner-behavior report (`scripts/learner_report.py` + `docs/learner-analytics.md`) aggregating exported Worker wide events into most-read pages, most-run examples with edited/error shares and execution percentiles, journey traffic, and missing-example 404s.

### Changed

- Journeys now reference every example: `iterator-vs-iterable`, `classmethods-and-staticmethods`, `bound-and-unbound-methods`, `abstract-base-classes`, and `structured-data-shapes` joined their natural sections, with journey-outcome support lists updated.
- Journey meta descriptions no longer claim gap placeholders; all previously declared gaps are filled.

## 2026-05-16

### Added
20 changes: 15 additions & 5 deletions Makefile
@@ -1,22 +1,32 @@
.PHONY: test embed-examples build check-generated fingerprint browser-layout-test seo-cache-lint verify-examples check-registry-integrity check-confusable-pairs check-broad-surface-tours check-footgun-coverage check-notes-supported score-example-criteria check-quality-scores check-no-figure-rationales check-journey-outcomes quality-checks rubric-audit format-examples verify-python-version verify smoke-deployment dev deploy lint
.PHONY: test embed-examples build-search-index build check-generated fingerprint browser-layout-test search-ranking-test social-cards seo-cache-lint verify-examples check-registry-integrity check-confusable-pairs check-broad-surface-tours check-footgun-coverage check-notes-supported score-example-criteria check-quality-scores check-no-figure-rationales check-journey-outcomes quality-checks rubric-audit format-examples verify-python-version verify smoke-deployment dev deploy lint

test:
uv run --python 3.13 python -m unittest discover -s tests -v

embed-examples:
scripts/embed_example_sources.py

build: embed-examples fingerprint
build-search-index: embed-examples
uv run --python 3.13 scripts/build_search_index.py

build: embed-examples build-search-index fingerprint

check-generated: build
git diff --exit-code src/example_sources_data.py src/asset_manifest.py public/_headers
git diff --exit-code src/example_sources_data.py src/asset_manifest.py public/_headers public/search-index.json

fingerprint: embed-examples
fingerprint: embed-examples build-search-index
scripts/fingerprint_assets.py

browser-layout-test:
scripts/check_browser_layout.mjs

search-ranking-test:
scripts/check_search_ranking.mjs

social-cards:
uv run --python 3.13 scripts/build_social_cards.py
scripts/build_social_cards.mjs

seo-cache-lint:
scripts/lint_seo_cache.py

@@ -64,7 +74,7 @@ verify-python-version: build
lint:
uv run ruff check src tests scripts

verify: build test seo-cache-lint verify-examples quality-checks browser-layout-test lint check-generated
verify: build test seo-cache-lint verify-examples quality-checks browser-layout-test search-ranking-test lint check-generated

dev:
uv run pywrangler dev --port 9696
17 changes: 16 additions & 1 deletion README.md
@@ -15,7 +15,11 @@ Production: <https://www.pythonbyexample.dev> (`workers.dev` remains enabled as
- Workers Assets for static files
- Fingerprinted CSS/JS assets with immutable cache headers
- Versioned Worker Cache API keys for rendered HTML
- SEO metadata and canonical URLs for home and example pages
- SEO metadata, canonical URLs, JSON-LD structured data, and a sitemap for home and example pages
- Client-side example search on the home page (press `/` to focus)
- Dark mode via `prefers-color-scheme`, including dual-theme code highlighting
- Per-example social-card images composed from the marginalia figure set
- Learner-behavior reporting from Worker wide events (`docs/learner-analytics.md`)

## Attribution

@@ -186,6 +190,17 @@ scripts/check_example_migration_parity.py
make check-generated
```

After adding an example (or changing a title, summary, or figure), also
regenerate its social card so the SEO linter finds the image:

```bash
make social-cards
```

This composes a 1200x630 card per example from its marginalia figure and
rasterizes it to `public/og/<slug>.jpg` with headless Chrome (set
`CHROME_PATH` if Chrome is not at the default location).

`src/example_sources_data.py` is generated and committed so Cloudflare Workers can load examples in production. Do not edit it by hand.

For a Python version migration, update `python_version` and `docs_base_url` in `src/example_sources/manifest.toml`, then run:
48 changes: 48 additions & 0 deletions docs/learner-analytics.md
@@ -0,0 +1,48 @@
# Learner analytics

The Worker emits one structured wide event per request (see
`docs/observability-spec.md`). Those events already carry the fields
that matter for content decisions: the page path, the example slug on
POST runs, whether the submitted code was edited, the Dynamic Worker
outcome, and the execution time.

`scripts/learner_report.py` turns an export of those events into a
learner-behavior report so content work is steered by external signal
rather than internal quality scores:

- **Most-read pages** — GET traffic per path.
- **Most-run examples** — POST runs per slug, with the edited share
(are people experimenting or just pressing Run?), the error share
(where edited code fails), and p50/p95 execution times.
- **Journey traffic** — which curated arcs get used.
- **Missing-example requests** — 404s under `/examples/`, i.e. demand
for pages that do not exist yet. These are content candidates.
- **Turnstile outcomes** — how often runs are challenged or blocked.

## Getting events

Live tail (short windows; see the `wrangler tail` caveats in
`docs/lessons-learned.md`):

```bash
uv run --group workers pywrangler tail --format json > events.ndjson
scripts/learner_report.py events.ndjson
```

Or export from the Cloudflare dashboard (Workers Logs) / a Logpush job
and feed the NDJSON file in the same way. The script auto-detects the
raw payload, the `wrangler tail` envelope, and the Workers Logs
envelope, so exports from any of the three sources work unmodified.
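
As a rough illustration of per-line envelope detection, the sketch below handles two of the three shapes: a raw payload, and a `wrangler tail` record whose payloads sit inside `logs[].message[]`. It is not the code in `scripts/learner_report.py`; the function name `unwrap_events` is hypothetical, and the Workers Logs envelope is left out because its shape is not described here.

```python
import json
from typing import Any, Iterator


def unwrap_events(line: str) -> Iterator[dict[str, Any]]:
    """Yield structured payloads found in one NDJSON line (hypothetical helper)."""
    record = json.loads(line)
    logs = record.get("logs")
    if isinstance(logs, list):
        # Tail-style envelope: payloads live inside logs[].message[].
        for entry in logs:
            for message in entry.get("message", []):
                if isinstance(message, str):
                    # Messages may arrive as JSON strings; skip anything else.
                    try:
                        message = json.loads(message)
                    except json.JSONDecodeError:
                        continue
                if isinstance(message, dict):
                    yield message
    else:
        # No envelope detected: treat the line itself as the event payload.
        yield record
```

Detecting the shape per line rather than per file means a file that mixes sources still parses, at the cost of one extra key lookup per record.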

`--json` emits the aggregated report as JSON for further processing;
`--limit N` controls rows per section.

## Reading the report

- A high **edited share** with a low error share means the example
invites successful experimentation — the ideal.
- A high **error share** on edited runs marks pages where learners try
something the example did not prepare them for; consider extending
the walkthrough or notes there.
- **Missing-example requests** that recur are the strongest possible
signal for what to write next.
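
The shares and percentiles above can be sketched as follows. This is a minimal illustration, not the report script itself: it assumes flat event keys `example.slug`, `code_edited`, and `execution_ms`, a hypothetical `outcome` field whose value `"error"` marks a failed run, an error share computed over edited runs only, and nearest-rank percentiles.

```python
import math
from collections import defaultdict


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = math.ceil(pct / 100 * len(sorted_values))
    return sorted_values[max(rank, 1) - 1]


def summarize_runs(events):
    """Aggregate POST-run events per example slug (field names are assumptions)."""
    by_slug = defaultdict(list)
    for event in events:
        slug = event.get("example.slug")
        if slug is not None:
            by_slug[slug].append(event)

    report = {}
    for slug, runs in by_slug.items():
        edited = [r for r in runs if r.get("code_edited")]
        # "outcome" == "error" is a hypothetical marker for a failed run.
        failed = [r for r in edited if r.get("outcome") == "error"]
        times = sorted(r["execution_ms"] for r in runs)
        report[slug] = {
            "runs": len(runs),
            "edited_share": len(edited) / len(runs),
            "error_share": len(failed) / len(edited) if edited else 0.0,
            "p50_ms": percentile(times, 50),
            "p95_ms": percentile(times, 95),
        }
    return report
```

Computing the error share over edited runs only keeps the metric aligned with the second bullet above: unedited runs of a verified example should rarely fail, so including them would dilute the signal.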
14 changes: 13 additions & 1 deletion docs/lessons-learned.md
@@ -100,7 +100,7 @@ git diff --check
- **Two rubrics, one craft section.** Journey-section figures depict a *conceptual shift* across multiple lessons; example-cell figures depict the *single move* the surrounding cell discusses. `docs/journey-visualisation-rubric.md` and `docs/example-figure-rubric.md` score each on 10 points: content fidelity, craft, context. Topic gates per kind of section / cell shape.
- **Constraint-shaped material improves when the constraint is drawn as a boundary, not a caveat.** Runtime-boundary teaching works best when it shows the standard Python contract, the site-specific runner boundary, and the portable evidence that preserves the lesson. If a constraint-shaped section cannot be reframed this way, then use the no-figure rationale registry instead of shipping a weak mechanism picture.
- **Authoring stays on the contributor; figures stay on the curator.** Example markdown does not include figure references. `src/marginalia.py` holds `FIGURES` (paint functions) and `ATTACHMENTS` (slug → cell → figure → caption). Curating figures is a single-file edit that contributors never see.
- **Inline between prose and code is the production layout; banners between cells is the prototyped richer grammar.** Cells with figures drop to single-column stacking (prose, figure, code) via `.lp-cell.has-figure { grid-template-columns: 1fr }`. Cells without figures keep today's `prose | code` 2-column grid bit-for-bit. The banner-between approach (`/prototyping/layout-banner-*`) supports multi-figure small-multiples between cells when one inline figure isn't enough.
- **Banners between cells is the production layout, with a position grammar.** Cells always keep the `prose | code` 2-column grid; figures render in `.cell-banner` rows at `before`, `after-cell-N` (legacy anchor `cell-N`), or `after-walkthrough` positions via `render_banner(slug, position)` in `src/marginalia.py` and `_render_walkthrough` in `src/app.py`. Multiple tuples on one position share a banner as a small multiple (`cell-banner--2` etc.) — the mutability aliasing/tuple pair is the canonical example. The prototypes at `/prototyping/layout-banner-*` validated the grammar before the production rollout.
- **Centralised gestalt pages catch drift that page-by-page review misses.** `/prototyping/marginalia-gestalt`, `/prototyping/journey-figures-gestalt`, and `/prototyping/production-figures-gestalt` show every figure in three different framings. Seeing all section figures of a journey in one 3-up row exposes inconsistencies invisible across six tabs.
- **Mapping reuses existing figures; promoting moves design to production.** Half of example coverage came from attaching existing FIGURES to new examples (no paint code). The other half from new paint code copied or designed from gestalt cards. Both paths must pass the rubric.
- **Tests against the cell layout must allow the `has-figure` class.** When the renderer adds `has-figure` to cells with attached figures, assertions on the literal string `class="lesson-step lp-cell"` fail. Change those tests to check the substring `lesson-step lp-cell` so both variants match.
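
The position grammar in the banner bullet above can be sketched as a small normalizer. This is an illustration only, not the code in `src/marginalia.py`: the helper names and the `(anchor, figure)` input shape are hypothetical, and the way the `cell-banner--N` modifier is combined with the base class is an assumption.

```python
import re

# Accept both the new "after-cell-N" spelling and the legacy "cell-N" anchor.
_CELL_ANCHOR = re.compile(r"^(?:after-)?cell-(\d+)$")


def normalize_position(anchor: str) -> str:
    """Map an anchor string onto the canonical position vocabulary."""
    if anchor in ("before", "after-walkthrough"):
        return anchor
    match = _CELL_ANCHOR.match(anchor)
    if match:
        return f"after-cell-{int(match.group(1))}"
    raise ValueError(f"unknown banner position: {anchor!r}")


def group_by_position(attachments):
    """Group (anchor, figure) pairs so one position yields one banner."""
    banners = {}
    for anchor, figure in attachments:
        banners.setdefault(normalize_position(anchor), []).append(figure)
    return banners


def banner_class(figure_count: int) -> str:
    """CSS classes for a banner; the modifier scheme is an assumption."""
    if figure_count == 1:
        return "cell-banner"
    return f"cell-banner cell-banner--{figure_count}"
```

Normalizing legacy anchors at lookup time lets existing `cell-N` entries keep working without a data migration.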
@@ -129,3 +129,15 @@ git diff --check
- **Deployment smoke belongs beside CI, and POST smoke must assert rendered output.** `scripts/smoke_deployment.py` checks rendered Worker pages, runtime-boundary pages, journey pages, prototype review pages, and representative Dynamic Worker POST runs for HTTP failures, exception markers, and stale edited-code output. Build success is not enough; the deployed Worker must render and execute edited examples. With Turnstile enabled, submitted code can appear in the editor textarea even when it did not run, so POST smoke must inspect the output panel rather than searching the whole HTML document.
- **Observability smoke should assert the custom event, not the whole tail envelope.** Use unique `x-request-id` values, exercise cache miss/hit/bypass and client-error paths, and assert on the structured payload inside `logs[].message[]`. If Turnstile is enabled, Dynamic Worker error-path probes need the smoke bypass secret; otherwise they only verify the Turnstile-fail path.
- **Turnstile should be secret-gated, session-scoped, and invisible until needed.** Protect edited-code POST runs only when `TURNSTILE_SECRET_KEY` and an explicit challenge mode are configured, lazy-load/render the Invisible-mode widget only after the server returns a challenge-required marker, and issue a signed clearance cookie so normal session runs skip Siteverify. The Cloudflare widget mode is `Invisible`; the client-side render option is `execution: "execute"`, not `size: "invisible"`. If production smoke must POST through a protected endpoint, use a separate `PBE_SMOKE_BYPASS_SECRET` header so smoke remains a deployment check rather than a CAPTCHA solver. See `docs/turnstile-runner-protection-spec.md` for the full runner-protection design.

## Discoverability, theming, and learner analytics

- **Extend the existing registry's vocabulary instead of adding a parallel one.** The banner rollout was originally specced as a new `BANNERS` dict keyed by position. But `ATTACHMENTS` is load-bearing for five contract families (score sync, figure usage, caption uniqueness, anchor resolution, gestalt builders), and a second registry would have split that coverage. Extending the anchor vocabulary (`before`, `after-walkthrough`, `after-cell-N` alongside legacy `cell-N`) delivered the same grammar with every existing contract intact. Same lesson as "one paint registry, not two," applied to data shape migrations.
- **Dark mode for locked-palette SVGs: change the mat, not the art.** The marginalia grammar hardcodes the light palette in every figure, and recolouring 109 figures would have meant touching the locked grammar and re-auditing geometry. Instead, dark mode renders each figure on a light "paper" chip (`--figure-paper` background, small padding, rounded corners). The figures stay byte-identical, read as intentional artifacts, and the grammar's palette constraint survives.
- **Dual-theme code highlighting needs both pipelines.** Shiki supports `themes: { light, dark }` and emits `--shiki-dark` CSS variables that a `prefers-color-scheme` block activates — no second render pass. CodeMirror has no equivalent, so `editor.js` picks `defaultHighlightStyle` vs `oneDarkHighlightStyle` from `matchMedia` at init. Two highlighters, two theming mechanisms; forgetting either leaves unreadable code in one scheme.
- **Rasterized artifacts do not belong in byte-parity gates.** Social cards are committed PNGs-turned-JPEGs rendered by headless Chrome, and rasterized bytes vary across Chrome versions and platforms. Putting `public/og/` under `make check-generated` would flap on every environment difference. The SEO linter instead checks *existence* — every `og:image` URL must resolve to a committed file — which catches the real failure (a new example without a card) without the false ones. Corollary: JPEG q90 beats PNG ~4x on file size for cards with gradient backgrounds, with no visible text degradation at 1200x630.
- **Social cards should reuse the curated figure set.** Each example's card composes its marginalia figure beside the title and summary, so a shared link carries the same diagram the page teaches with. `render_first_figure(slug)` is the only new marginalia surface the card builder needed; one Chrome session rasterizes all 110 cards in seconds via CDP navigation + `Page.captureScreenshot` clips.
- **Copy that describes data state goes stale silently.** Journey meta descriptions still claimed "explicit placeholders for missing examples" long after the last gap placeholder was filled, and the gap-rendering UI suggested holes that no longer existed. Prose that asserts a property of the data ("all examples run," "placeholders mark gaps") should either be derived from the data or covered by a check; otherwise fixing the data falsifies the copy.
- **Make targets that import the example loader must go through uv.** The loader executes example code that uses 3.12+ syntax (`type UserId = int`), so any script importing `src.app` breaks under an older system `python3` even though its shebang says `python3`. CI masks this by installing 3.13 as the default interpreter. Build steps that import the catalog (`build_search_index`, `build_social_cards`) run as `uv run --python 3.13 scripts/...` in the Makefile so local machines with an older `python3` behave like CI.
- **Index notes text, not just titles, and normalize at build time.** The search index concatenates the slug words and every note line, lowercased at build time, so concept queries (vocabulary such as "walrus" or "GIL" that titles avoid) hit the right example and the client never re-normalizes entry content. Exporting the ranking function from `search.js` lets a plain Node script (`make search-ranking-test`) assert ranking behaviour against the real generated index without a JS test framework.
- **The wide events already knew what learners do; nobody was asking.** `example.slug`, `code_edited`, `execution_ms`, turnstile outcome, and 404 paths were all in the observability payload before any analytics existed. `scripts/learner_report.py` is a pure consumer: it auto-detects the raw payload, `wrangler tail` envelope, and Workers Logs envelope per line, so exports from any source work unmodified. The most valuable section costs nothing to collect: recurring 404s under `/examples/` are direct demand for pages that do not exist.
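
The build-time normalization described in the search-index bullet can be sketched as below. This is a minimal illustration, not `scripts/build_search_index.py` or the ranking in `search.js`: the entry shape, the function names, the sample slug, and the substring match standing in for real ranking are all assumptions.

```python
def build_entry(slug: str, title: str, notes: list[str]) -> dict[str, str]:
    """Build one index entry with its searchable text lowercased up front."""
    slug_words = slug.replace("-", " ")
    text = " ".join([slug_words, *notes]).lower()
    return {"slug": slug, "title": title, "text": text}


def search(index: list[dict[str, str]], query: str) -> list[str]:
    """Return matching slugs; only the query is normalized at search time."""
    needle = query.strip().lower()
    if not needle:
        return []
    return [entry["slug"] for entry in index if needle in entry["text"]]


# Hypothetical example: the title never says "walrus", but a note does.
index = [
    build_entry(
        "assignment-expressions",
        "Assignment Expressions",
        ["The walrus operator binds a name inside an expression."],
    )
]
print(search(index, "Walrus"))
```

Because entry text is lowercased once when the index is built, the search path only has to normalize the query string.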
5 changes: 5 additions & 0 deletions docs/quality-registries.toml
@@ -316,6 +316,7 @@ journey = "iteration"
section = "See the protocol behind `for`."
support = [
"iterating-over-iterables",
"iterator-vs-iterable",
"iterators",
"generators",
]
@@ -434,12 +435,14 @@ support = [
"inheritance-and-super",
"dataclasses",
"properties",
"classmethods-and-staticmethods",
"special-methods",
"truth-and-size",
"container-protocols",
"callable-objects",
"operator-overloading",
"attribute-access",
"bound-and-unbound-methods",
"descriptors",
"metaclasses",
]
@@ -456,6 +459,7 @@ section = "Keep runtime and static analysis separate."
support = [
"type-hints",
"protocols",
"abstract-base-classes",
"enums",
"runtime-type-checks",
]
@@ -473,6 +477,7 @@ support = [
"union-and-optional-types",
"type-aliases",
"typed-dicts",
"structured-data-shapes",
"literal-and-final",
"callable-types",
]