Skip to content

book deploy: retry the flaky redirect check, and upload the Playwright report on failure#2177

Merged
dscho merged 2 commits into
git:gh-pagesfrom
dscho:harden-book-deploy-playwright
Jun 25, 2026
Merged

book deploy: retry the flaky redirect check, and upload the Playwright report on failure#2177
dscho merged 2 commits into
git:gh-pagesfrom
dscho:harden-book-deploy-playwright

Conversation

@dscho

@dscho dscho commented Jun 25, 2026

Copy link
Copy Markdown
Member

The scheduled update-book.yml workflow rebuilds the ProGit book and, in its push-updates job, deploys to GitHub Pages through the deploy-to-github-pages composite Action, which runs the Playwright smoke tests against the freshly deployed site. A recent scheduled run failed there, and the failure was needlessly hard to diagnose because, unlike the pull-request workflow in ci.yml, the deploy Action never uploaded the Playwright report that embeds the failure screenshots.

The first commit closes that observability gap. The report's upload-artifact step is guarded by if: always() && steps.playwright.outputs.result != '', but the Playwright step in the Action never wrote anything to $GITHUB_OUTPUT, so the guard was always false and the report was silently dropped. The PR workflow avoids this by writing result=$PLAYWRIGHT_TEST_URL before running the tests; this mirrors that, so the deploy job now leaves the report behind on failure.

The second commit fixes the failure itself. The book test navigates to /book/ and expects to land on /book/en/v2, but that redirect is purely client-side: Hugo emits /book/index.html as a stub whose only content is <meta http-equiv="refresh" content="0; url=../book/en/v2">. When that zero-delay refresh fires, Chrome occasionally aborts the load that page.goto() started and strands the page on its internal error page, chrome-error://chromewebdata/. Because that is a committed navigation, the existing expect(page).toHaveURL(...), which only polls the URL, can never recover. This is a known transient Playwright/Chromium issue. The fix wraps the navigate-and-assert in expect(async () => { ... }).toPass(), which re-triggers the navigation until the expected URL is reached.

In the interest of full disclosure: the chrome-error is a live-network race that does not reproduce against a local localhost build, so I could not exercise it directly. The refactor was verified only to not regress the normal pass, by running the full suite against a local Hugo build before and after the change with identical results.

@dscho

dscho commented Jun 25, 2026

Copy link
Copy Markdown
Member Author

@To1ne @jvns could I bother any of you for a review?

Comment thread tests/git-scm.spec.js Outdated
await expect(async () => {
await page.goto(gotoURL)
await expect(page).toHaveURL(expectedURL)
}).toPass()

@jvns jvns Jun 25, 2026

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the timeout on this?

edit: the docs say "by default toPass has timeout 0 and does not respect custom expect timeout.", should we set an explicit timeout?

https://playwright.dev/docs/test-assertions#expecttopass

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jvns good point. The docs say:

Note that by default toPass has timeout 0 and does not respect custom expect timeout.

We should probably use something like this instead:

  // Retry every second for up to a minute in total
}).toPass({ intervals: [1_000], timeout: 60_000 });

What do you think?

@jvns jvns Jun 25, 2026

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe start with a shorter timeout, in case there are some actual failures that need to be caught more quickly? I don't feel strongly though

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jvns here's the...

range-diff
  • 1: aa7ed1d ! 1: 7da7e6f tests: retry the book redirect to dodge "chrome-error://chromewebdata/"

    @@ tests/git-scm.spec.js: test.afterEach(async ({ page }, testInfo) => {
     +  await expect(async () => {
     +    await page.goto(gotoURL)
     +    await expect(page).toHaveURL(expectedURL)
    -+  }).toPass()
    ++    // Retry every second for up to 15 seconds in total
    ++  }).toPass({ intervals: [1_000], timeout: 15_000 });
     +}
     +
      test('generator is Hugo', async ({page}) => {

@dscho dscho force-pushed the harden-book-deploy-playwright branch from aa7ed1d to 7da7e6f Compare June 25, 2026 12:33
@jvns

jvns commented Jun 25, 2026

Copy link
Copy Markdown
Collaborator

I find the commit messages kind of hard to read (there's some jargon like "committed navigation" that reminds me of how LLMs talk that's hard for me to understand, and it feels more verbose than it needs to be)

i think the messages could be a lot more concise, like for instance for the first one:

When there are test failures like in https://github.com/git/git-scm.com/actions/runs/28149818103, the Playwright report isn't uploaded.

This is because upload-artifact step is guarded by if: always() && steps.playwright.outputs.result != '', but the Playwright step in the Action never wrote anything to $GITHUB_OUTPUT, so steps.playwright.outputs.result was always the empty string.

This writes result=$PLAYWRIGHT_TEST_URL before running the tests to make the Playwright report upload.

dscho added 2 commits June 25, 2026 14:43
When there are test failures like in
https://github.com/git/git-scm.com/actions/runs/28149818103, the
Playwright report isn't uploaded.

This is because upload-artifact step is guarded by if: always() &&
steps.playwright.outputs.result != '', but the Playwright step in the
Action never wrote anything to $GITHUB_OUTPUT, so
steps.playwright.outputs.result was always the empty string.

This writes result=$PLAYWRIGHT_TEST_URL before running the tests to make
the Playwright report upload.

Assisted-by: Opus 4.8
Helped-by: Julia Evans <julia@jvns.ca>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The `book` Playwright test assert that `/book/` redirects to
`/book/en/v2`. Occasionally this fails, landing on
`chrome-error://chromewebdata/` instead, see e.g.
https://github.com/git/git-scm.com/actions/runs/28149818103.

This is a known Playwright/Chromium issue, see
microsoft/playwright#19161.

Let's work around this (transient) problem by retrying multiple times.

Assisted-by: Opus 4.8
Helped-by: Julia Evans <julia@jvns.ca>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho force-pushed the harden-book-deploy-playwright branch from 7da7e6f to 830490c Compare June 25, 2026 12:46
@dscho

dscho commented Jun 25, 2026

Copy link
Copy Markdown
Member Author

When there are test failures like in https://github.com/git/git-scm.com/actions/runs/28149818103, the Playwright report isn't uploaded.

This is because upload-artifact step is guarded by if: always() && steps.playwright.outputs.result != '', but the Playwright step in the Action never wrote anything to $GITHUB_OUTPUT, so steps.playwright.outputs.result was always the empty string.

This writes result=$PLAYWRIGHT_TEST_URL before running the tests to make the Playwright report upload.

You're right, @jvns, I did let LLM write those messages (while working on some Git for Windows v2.55.0 release stuff, MinGit-BusyBox is currently quite broken, for example).

I've now reworded both commits (taking your suggestion verbatim). Good?

@jvns

jvns commented Jun 25, 2026

Copy link
Copy Markdown
Collaborator

yep, those are so much better, thank you!

@dscho dscho merged commit 1924a65 into git:gh-pages Jun 25, 2026
1 check passed
@dscho dscho deleted the harden-book-deploy-playwright branch June 25, 2026 14:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants