Improve CI Workflow #887

Open
Symmetricity wants to merge 9 commits into systemed:master from Symmetricity:ci/workflow-improvements
Symmetricity commented May 13, 2026

Improve CI Workflow

This PR was AI generated.

Summary

This updates the GitHub Actions CI workflow so it is faster, less repetitive,
and checks more than whether tilemaker compiles.

  • update GitHub Actions to current major versions
  • add explicit workflow permissions and concurrency cancellation
  • cache the Geofabrik Liechtenstein fixture using freshness metadata, validate
    it against Geofabrik's published .md5, then share it with build jobs as an
    artifact
  • add a fast static-validation job before fixture download and build-heavy jobs
  • skip build-heavy jobs when a change cannot affect runtime behavior
  • use runner-image-scoped vcpkg binary caches instead of caching the installed
    vcpkg tree directly
  • add Docker Buildx cache and split PR image builds from master publishes
  • generate MBTiles and PMTiles outputs from each CI build path
  • fail immediately if a tilemaker invocation fails or does not create the
    expected output archive
  • run tile generation twice per build path and verify repeatability
  • upload generated tile archives for inspection
  • verify PMTiles archive structure with pmtiles verify
  • compare generated tile contents semantically, not just byte-for-byte
  • normalize MVT geometry encodings during semantic comparison, including
    point ordering and polygon ring rotation while preserving polygon ring
    grouping
  • report progress and ETA during expensive semantic tile comparisons
  • compare PMTiles semantic content in parallel worker processes when raw tile
    bytes differ
  • keep the published Docker action output out of generated-tile validation
    because it is built from the published master image, not the PR build

Rationale

The old workflow downloaded the same PBF independently in several jobs and only
verified that the build commands completed. That made CI slower than necessary
and left important runtime behavior unchecked.

The shared fixture job makes every build use the same input file. It keys the
Actions cache from Geofabrik response metadata (URL, effective URL, ETag,
Last-Modified, and Content-Length) so repeat runs can reuse the fixture
without pinning CI to an old file forever. When Geofabrik republishes the PBF,
the metadata-derived key changes and CI downloads the new fixture. The restored
or downloaded fixture is validated against Geofabrik's published .md5 before
it is uploaded for build jobs. If Geofabrik is unavailable or the checksum does
not match, the workflow reports that tile generation was not verified rather
than making the failure look like a code regression.
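The checksum step described above can be sketched as follows. This is a minimal illustration, not the workflow's actual code: it assumes the published .md5 file uses the usual md5sum layout (hex digest, whitespace, file name), and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def parse_md5_file(text: str) -> str:
    """Return the hex digest from md5sum-style text: '<hex>  <filename>'."""
    parts = text.split()
    if not parts:
        raise ValueError("empty .md5 content")
    digest = parts[0].lower()
    if len(digest) != 32 or any(c not in "0123456789abcdef" for c in digest):
        raise ValueError(f"not an MD5 hex digest: {digest!r}")
    return digest


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so a large PBF is never fully held in memory."""
    md5 = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def fixture_matches(path: Path, md5_text: str) -> bool:
    """True when the fixture on disk matches the published digest."""
    return file_md5(path) == parse_md5_file(md5_text)
```

Running the same check on both the restored and the freshly downloaded file means a corrupted cache entry is treated the same way as a corrupted download.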

The vcpkg cache now stores vcpkg binary packages and includes the GitHub runner
image identity in the key. That avoids blindly restoring an installed tree from
a different runner image while still allowing warm dependency restores on stable
runners. Docker builds also use the GitHub Actions build cache.
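As a rough sketch of what a runner-image-scoped key could look like: GitHub-hosted runners expose ImageOS and ImageVersion environment variables, but whether this workflow reads those exact variables is an assumption, and the key layout and function name below are hypothetical.

```python
import hashlib
import os


def vcpkg_cache_key(manifest_bytes: bytes, env=os.environ) -> str:
    """Build a cache key that changes with the runner image or the manifest."""
    # Assumption: ImageOS / ImageVersion identify the runner image.
    image_os = env.get("ImageOS", "unknown-os")
    image_version = env.get("ImageVersion", "unknown-version")
    # Hash the dependency manifest so dependency changes also rotate the key.
    manifest_hash = hashlib.sha256(manifest_bytes).hexdigest()[:16]
    return f"vcpkg-bincache-{image_os}-{image_version}-{manifest_hash}"
```

With a key of this shape, a runner image update produces a cache miss instead of restoring binaries built against the previous image.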

The generated tile verification exists because tilemaker's most important output
is the generated vector-tile archive. A build can compile cleanly while still
producing invalid PMTiles, non-repeatable tile content, or different content
between build paths. CI should catch those cases.

Each tilemaker invocation is now checked as soon as it returns. This avoids
multi-command CI steps masking a failed archive generation when a later command
succeeds, and it makes the failing output visible in the generation job rather
than later as a missing artifact or verifier failure.
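The per-invocation check can be illustrated with a small wrapper. This is a sketch under assumptions: the helper name is hypothetical, and the real workflow may implement the same idea in shell rather than Python.

```python
import subprocess
from pathlib import Path


def run_and_check(command: list[str], expected_output: Path) -> None:
    """Run one command; raise if it fails or leaves no non-empty output file."""
    result = subprocess.run(command)
    if result.returncode != 0:
        # Fail here, before any later command in the step can mask the error.
        raise RuntimeError(f"{command[0]} exited with {result.returncode}")
    if not expected_output.is_file() or expected_output.stat().st_size == 0:
        raise RuntimeError(f"missing or empty output: {expected_output}")
```

A step would call this once per archive, for example with a command such as `["tilemaker", "--input", pbf_path, "--output", "liechtenstein.mbtiles"]`, so a crash on one output stops the job at that output.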

The static-validation job gives the workflow a cheaper failure path for changes
to CI, profiles, and scripts. It validates GitHub workflow files with
actionlint, shell scripts with shellcheck, tracked JSON files with Python,
Lua profiles with luac, and Python CI helper scripts with py_compile before
starting the fixture download, Docker builds, or C++ build matrix.
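For the JSON part of that job, a check along these lines would be enough. It is only a sketch: the function name is hypothetical, and collecting the list of tracked files (for instance from `git ls-files`) is left to the caller.

```python
import json
from pathlib import Path


def invalid_json_files(paths):
    """Return (path, message) pairs for files that fail to parse as JSON."""
    failures = []
    for path in paths:
        try:
            with Path(path).open(encoding="utf-8") as handle:
                json.load(handle)
        except (OSError, ValueError) as error:
            # json.JSONDecodeError and UnicodeDecodeError are both ValueErrors.
            failures.append((str(path), str(error)))
    return failures
```

Collecting every failure before exiting lets one run report all broken files rather than only the first.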

Output verification

The new verifier checks two levels of output equality:

  • raw decompressed tile bytes, which catches byte-for-byte differences after
    removing archive compression noise
  • semantic MVT content, which canonicalizes layer order, feature order, and tag
    order before comparing. It also decodes geometry command streams so equivalent
    point ordering and polygon ring rotation do not fail as content changes, while
    preserving polygon outer-ring and inner-ring grouping.
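The ring-rotation part of that normalization can be sketched as below. This follows the Mapbox Vector Tile geometry encoding (command integers, zigzag-encoded deltas, exterior rings with positive signed area), but it is an illustration only: the function names are hypothetical and the PR's verifier may canonicalize differently.

```python
def zigzag_decode(value: int) -> int:
    """Undo the zigzag encoding used for MVT parameter integers."""
    return (value >> 1) ^ -(value & 1)


def decode_rings(geometry):
    """Decode a polygon command stream into rings of absolute (x, y) points."""
    rings, current = [], []
    x = y = 0
    i = 0
    while i < len(geometry):
        command, count = geometry[i] & 0x7, geometry[i] >> 3
        i += 1
        if command == 7:  # ClosePath: ring complete, cursor does not move
            rings.append(current)
            current = []
            continue
        if command not in (1, 2):  # only MoveTo (1) and LineTo (2) remain
            raise ValueError(f"unknown geometry command {command}")
        for _ in range(count):
            x += zigzag_decode(geometry[i])
            y += zigzag_decode(geometry[i + 1])
            i += 2
            current.append((x, y))
    return rings


def rotate_ring(ring):
    """Pick the lexicographically smallest rotation, keeping direction."""
    if not ring:
        return ring
    return min(ring[i:] + ring[:i] for i in range(len(ring)))


def signed_area(ring):
    """Shoelace area; positive for exterior rings in tile coordinates."""
    return sum(x1 * y2 - x2 * y1
               for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1])) / 2


def canonical_polygons(geometry):
    """Group rings into polygons: each exterior ring starts a new group."""
    polygons = []
    for ring in decode_rings(geometry):
        ring = rotate_ring(ring)
        if signed_area(ring) > 0 or not polygons:
            polygons.append([ring])
        else:
            polygons[-1].append(ring)  # interior ring stays with its exterior
    return polygons
```

Two encodings of the same triangle that start at different vertices decode to the same canonical form, while a hole remains attached to the exterior ring that precedes it.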

Raw byte differences with identical semantic content are reported as notices.
Semantic tile-content differences fail the job and include the first differing
tile. When the layer feature counts differ, the error also reports the affected
layer and counts.
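The two-level decision can be expressed compactly. In this sketch the semantic canonicalizer is passed in as `semantic_key`, a hypothetical callable standing in for the MVT decoding step; the gzip detection by magic bytes is also an assumption about how tiles are stored.

```python
import gzip
import hashlib


def raw_tile_bytes(payload: bytes) -> bytes:
    """Strip gzip compression if present so compression noise is ignored."""
    if payload[:2] == b"\x1f\x8b":
        return gzip.decompress(payload)
    return payload


def classify_tile_pair(first: bytes, second: bytes, semantic_key) -> str:
    """Return 'identical', 'ordering-only', or 'semantic-difference'."""
    raw_a, raw_b = raw_tile_bytes(first), raw_tile_bytes(second)
    if hashlib.sha256(raw_a).digest() == hashlib.sha256(raw_b).digest():
        return "identical"
    # Only pay for semantic decoding when the raw bytes disagree.
    if semantic_key(raw_a) == semantic_key(raw_b):
        return "ordering-only"  # reported as a notice
    return "semantic-difference"  # fails the job
```

Checking raw bytes first keeps fully deterministic runs cheap, since the expensive canonicalization only runs for tiles that actually differ.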

PMTiles semantic comparisons use paired worker processes for raw-different
tiles. This keeps the same semantic comparison rules while avoiding repeated
archive scans and giving progress/ETA output on larger fixtures where many raw
tile payloads differ.
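A minimal sketch of the worker-pool idea, with progress and ETA reporting, might look like this. The function names, the reporting interval, and the chunk size are hypothetical; with the default process pool, `compare` has to be a picklable top-level function.

```python
import time
from concurrent.futures import ProcessPoolExecutor


def eta_seconds(done: int, total: int, elapsed: float) -> float:
    """Estimate remaining seconds assuming a constant per-item rate."""
    if done <= 0:
        return float("inf")
    return elapsed * (total - done) / done


def compare_pairs(pairs, compare, workers=4, report_every=1000,
                  executor_cls=ProcessPoolExecutor):
    """Run compare(pair) in workers; return the first mismatching pair or None."""
    started = time.monotonic()
    total = len(pairs)
    with executor_cls(max_workers=workers) as pool:
        # map() yields results in input order, so the first False is the
        # first differing tile in archive order.
        results = pool.map(compare, pairs, chunksize=64)
        for done, (pair, same) in enumerate(zip(pairs, results), 1):
            if not same:
                return pair
            if done % report_every == 0:
                remaining = eta_seconds(done, total, time.monotonic() - started)
                print(f"{done}/{total} tiles compared, ETA {remaining:.0f}s")
    return None
```

Stopping at the first mismatch keeps failures fast, while the ETA line gives some visibility during long runs where every differing tile turns out to be semantically equal.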

PMTiles archives are also checked with pmtiles verify before their tile content
is compared. The PMTiles verifier output is included in the GitHub annotation so
the failure is visible in the web UI without digging through raw logs.

The GitHub Action job uses the published
ghcr.io/systemed/tilemaker:master Docker image from action.yml, not the
PR-built CMake or Makefile binaries. It is therefore kept out of generated-tile
validation. On master, it runs after docker-publish as a smoke test for the
published Action image.

Current findings

The verifier was run locally against artifacts downloaded from fork CI runs.
Those tests showed that the new check is catching real output issues, not just
formatting or compression differences:

  • Windows PMTiles archives failed structural verification with a header length
    mismatch reported by pmtiles verify.
  • Repeat runs produced semantic tile-content differences, including examples
    where the same tile/layer had different feature counts, such as poi having
    102 features in one run and 101 in the repeat run.
  • Many byte-level differences were only ordering differences, which is why the
    verifier now separates raw-byte mismatches from semantic MVT mismatches.
  • Some raw MVT geometry differences were equivalent encodings of the same
    feature geometry; the verifier now normalizes those before calculating
    semantic hashes.
  • One fork CI run uploaded an incomplete generated-tile artifact after a build
    job produced only three of the four expected archives. The workflow now fails
    at the tile generation step if an expected archive is missing or empty.
  • A later fork CI run with the fail-fast wrapper caught the same class of
    problem at source: Windows CMake exited with -1073741819 (0xC0000005, the
    Windows access-violation status) while writing the stored
    liechtenstein.mbtiles output, instead of uploading a partial artifact.
  • A larger local Austria PMTiles repeat test produced 47,742 addressed tiles,
    with 43,124 raw-byte-different tiles between warmed repeat runs. The
    parallel semantic verifier checked those differences with four worker
    processes in 14m57.97s and confirmed that semantic content matched.

These findings are not fixed by this PR. This PR adds CI coverage that exposes
them consistently.

Related issues and PRs

This PR does not include the build-fix changes from #886. If master still has
those build failures when this PR is tested, this PR should be applied after
#886 or rebased once equivalent fixes land.

Directly related:

This PR may improve future detection and debugging for output-related issues,
but does not claim to fix them:

No directly relevant GitHub Discussions were found for deterministic generated
tile output checks.

Testing

  • actionlint .github/workflows/ci.yml
  • ruby -e 'require "yaml"; YAML.load_file(".github/workflows/ci.yml"); puts "yaml ok"'
  • git ls-files -z -- '*.sh' | xargs -0 -r shellcheck
  • git ls-files -z -- 'resources/*.lua' | xargs -0 -r luac5.4 -p
  • tracked JSON files parsed with Python json.load
  • git ls-files -z -- '.github/scripts/*.py' | xargs -0 -r python3 -m py_compile
  • python3 -B -c 'import py_compile; py_compile.compile(".github/scripts/verify-generated-tiles.py", cfile="/tmp/verify-generated-tiles.pyc", doraise=True)'
  • git diff --check
  • local verifier run against downloaded fork CI tile-output artifacts
  • local verifier run against two warmed Austria PMTiles outputs using the
    parallel semantic comparator; raw tile bytes differed, semantic content
    matched

The clean branch currently contains only CI-related files:

  • .github/workflows/ci.yml
  • .github/scripts/verify-generated-tiles.py
  • .gitignore

Update the workflow defaults before adding deeper output checks. This adds explicit workflow permissions, run cancellation, shared fixture download and verification, newer GitHub Actions versions, runner-scoped vcpkg binary caches, and Docker build caching.

This reduces duplicated work across jobs, avoids using an unchecked PBF download independently on each runner, makes cache reuse less brittle across runner image changes, and keeps PR Docker builds separate from master publishes.

Generate MBTiles and PMTiles artifacts from each CI build path, run the generation twice, and compare the results after the build jobs complete.

The verification step checks PMTiles archive structure, records raw decompressed tile-byte hashes, and also canonicalizes MVT layer, feature, and tag ordering so ordering-only changes are reported separately from semantic tile-content differences.

This turns CI into a correctness check for generated output, not just a compiler check. Current runs show that the new check is finding real issues: Windows PMTiles archives fail structural verification, and repeat runs can produce semantic differences such as different feature counts in the same layer/tile.

Because this adds a Python CI helper, ignore Python bytecode and cache directories produced by local validation.

Decode MVT geometry before calculating semantic hashes so equivalent point ordering and polygon ring rotation do not fail CI as content changes.

Keep the published Docker action output in repeat-output checks, but exclude it from cross-runner comparisons because that action uses the ghcr.io master image rather than the PR-built binaries.

Symmetricity force-pushed the ci/workflow-improvements branch from fd48625 to 57b1c31 May 13, 2026 05:11
Symmetricity mentioned this pull request May 13, 2026

CI-generated tile artifacts can be incomplete when one tilemaker invocation fails but a later command in the same step succeeds. This made the Windows CMake job upload only three of the four expected tile files while the job itself still passed.

Wrap direct tilemaker calls so each output is checked immediately, and verify generated files before artifact upload. This makes the failing output visible at the generation step instead of deferring the problem to the verifier or silently uploading partial artifacts.

Hash ordered tile payloads before falling back to semantic decoding so repeat outputs that differ only in archive layout can pass without expensive MVT canonicalization.

When raw tile bytes differ, decode only the differing tiles and stop at the first semantic mismatch. This keeps the existing verification behaviour while reducing work for deterministic runs and making failures report quickly.

Generated tile verification can spend a long time canonicalizing MVT payloads when PMTiles archives differ at the raw tile-byte level. Large fixtures can have tens of thousands of raw-different tiles even when the semantic content matches.

Compare PMTiles tile pairs in worker processes and report progress with an ETA. This keeps the existing semantic comparison while making larger local and CI artifacts practical to diagnose.

The CI fixture job downloaded the Liechtenstein PBF on every runtime-impacting run, even when Geofabrik had not republished the file.

Resolve the current Geofabrik freshness metadata with a HEAD request and use a normalized hash of URL, effective URL, ETag, Last-Modified, and Content-Length as the Actions cache key. Restore the cached PBF when that metadata matches, otherwise download the current file.
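One way to derive such a key is sketched below. The HEAD request itself is omitted; `headers` stands for whatever header mapping it returned. The function name, field layout, and `fixture-` prefix are hypothetical, and the exact normalization used in the workflow may differ.

```python
import hashlib


def fixture_cache_key(url: str, effective_url: str, headers: dict) -> str:
    """Hash the freshness metadata into a stable cache key."""
    # Header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    fields = [
        ("url", url),
        ("effective-url", effective_url),
        ("etag", normalized.get("etag", "")),
        ("last-modified", normalized.get("last-modified", "")),
        ("content-length", normalized.get("content-length", "")),
    ]
    # A fixed field order keeps the key independent of header ordering.
    canonical = "\n".join(f"{name}={value}" for name, value in fields)
    return "fixture-" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Any change to the ETag, Last-Modified, or Content-Length value yields a new key, which is what triggers a fresh download after a republish.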

Validate both cached and newly downloaded fixtures against Geofabrik's published MD5 before uploading the artifact. This keeps repeat runs faster without pinning CI to a stale checked-in fixture.

The workflow can spend runner time downloading fixtures, restoring dependencies, and building C++ before catching simple profile, workflow, or script syntax errors.

Add a cheap static-validation job after the changed-file gate and before fixture download, Docker, and build-heavy jobs. It validates GitHub workflows with actionlint, shell scripts with shellcheck, tracked JSON files with Python, Lua profiles with luac, and Python CI helpers with py_compile.

This gives CI a faster failure path for the profile and workflow files we have been changing, without adding compiler or vcpkg work to the early validation layer.

The generated-tile verifier compares outputs from binaries built by this workflow. The local GitHub Action uses action.yml, which points at the published ghcr.io/systemed/tilemaker:master image instead of the PR-built binary.

Exclude the Action artifact from tile validation and run that job only after docker-publish on master, where it is a smoke test for the published Action image rather than a cross-build equivalence input.