
Use fixed polygon clip cache level#892

Open
Symmetricity wants to merge 1 commit into systemed:master from
Symmetricity:fix/deterministic-polygon-clip-cache

Conversation

@Symmetricity

Use fixed polygon clip cache level

This PR was AI generated.

Summary

This makes multipolygon clip-cache reuse deterministic by caching polygon clips
at tilemaker's existing z6 object-clustering level, then writing z0..z6 tiles
before z7+ tiles.

Higher zoom polygon tiles previously reused whichever lower-zoom cached clip
happened to be present. In a multithreaded run, that ancestor could depend on
worker scheduling. Clipping an already clipped polygon through different
ancestor paths can produce different encoded polygon geometry, even when the
source data and command line are unchanged.

Linestring clip-cache behavior is unchanged.

Background

The existing clip cache is intentional and performance-sensitive. It was added
as part of #612, which extracted ClipCache and reduced memory use by lazily
reconstructing way-backed geometries. PR #607 explains the intended cache model:
when writing higher zooms, tilemaker prefers clipping a cached lower-zoom
geometry over clipping the original geometry again. That avoids repeated work
for very large polygons.

This PR keeps that optimization but makes the parent level deterministic for
polygons. Rather than searching z-1, z-2, and so on until any cached ancestor is
found, multipolygons now use the z6 ancestor. z6 is already tilemaker's object
clustering level, so the change follows an existing locality boundary instead
of adding a new one.

Investigation

This was found while investigating generated-tile semantic differences. A
separate fix exposed a repeatability issue where normal multithreaded runs could
emit different polygon coordinates for the same tile. The first observed local
case was:

  • tile: z11/1077/1329
  • layer: water
  • tags: class=lake
  • source: OSM relation 9404214, a large Rhine water multipolygon with inner
    rings
  • symptom: matching feature counts and tags, but different polygon coordinates
    between repeated runs

Additional checks:

  • running with --threads=1 was stable
  • temporarily disabling polygon ClipCache reads made repeated normal
    multithreaded runs semantically match again
  • revalidating clipped polygons after spike removal did not fix the scheduling
    sensitivity
  • a full zoom-barrier scheduler made local repeats deterministic, but was too
    expensive on a medium fixture

The narrowed root cause is shared parent-tile clip-cache reuse. A higher zoom
tile could reuse a z10 parent in one run and a z6 parent in another run,
depending on which worker populated the cache first.

Implementation

This PR adds an optional fixed-zoom mode to ClipCache.

For multipolygons only:

  • cache entries are written only at min(basezoom, CLUSTER_ZOOM)
  • cache lookups for higher zooms map directly to that fixed parent tile
  • z0..z6 tiles are written before z7+ tiles so the fixed parent cache entries
    are available before child tiles are processed

The existing ancestor-search behavior remains available and is still used for
linestrings.

Results

Small fixture / CI

The fix was tested stacked on the currently submitted PR changes in fork CI run
25960612652.

After rerunning a job that hit a known intermittent Windows tile-generation
crash, the run passed native generated-tile verification on:

  • macOS 14 Makefile
  • macOS latest Makefile
  • Ubuntu 22.04 CMake
  • Ubuntu 22.04 Makefile
  • Windows CMake

Repeat outputs and cross-runner outputs matched semantically. Raw MBTiles and
PMTiles bytes still differed, but decoded MVT content matched.

The clean, isolated branch was also checked locally against the Liechtenstein
fixture using upstream resources/config-openmaptiles.json and
resources/process-openmaptiles.lua.

That isolated check is useful, but it also shows why this PR should be read as
one part of the broader determinism work:

  • upstream master repeat output differed in 15 decoded MBTiles layers:
    1 water geometry difference and 14 poi differences
  • the clean fixed-z6 branch removed the observed z11 water repeat difference,
    but still showed unrelated poi differences because the isolated branch does
    not include the Lua POI-order determinism fix
  • upstream-vs-fixed output differed in polygon geometry as expected:
    204 decoded MBTiles layer differences, mostly landcover, landuse,
    water, and park

Two visual checks were prepared from that fixture:

  • Rhine water polygon repeat comparison
  • High-zoom water edge comparison

Larger local fixture

A local Austria fixture was used for a larger timing check:

  • input: Austria extract, 754 MB
  • profile: resources/config-openmaptiles.json and
    resources/process-openmaptiles.lua
  • output: PMTiles
  • command output redirected to log files for both runs
  • source PBF effectively warm in the page cache for both runs
  • semantic verification was not run for the timing comparison

Results:

Build            Wall time   User time   System time   CPU    Max RSS
upstream master  3:53.30     540.47s     466.40s       431%   3,537,824 KB
fixed z6 cache   3:33.61     552.21s     398.39s       445%   3,461,776 KB

On this fixture, the fixed-z6 branch was about 8.4% faster wall-clock, used
about 5.6% less total CPU time, and used about 2.1% less peak RSS. Treat these
as fixture-specific results rather than a guarantee.

An earlier larger PMTiles repeat test on the same Austria fixture produced
47,742 addressed tiles. The fixed-z6 outputs differed at the raw tile-byte
level, but the semantic verifier checked 43,124 raw-different PMTiles tile
pairs and confirmed matching decoded content.

Planet Estimate

This has not been measured on a planet build.

PR #612 reported a planet build at 67m51s and 42 GB RAM on a 48-core Hetzner
CCX63. If the Austria wall-time ratio transferred directly, that would estimate
roughly 62 minutes for the same planet build. Applying the Austria RSS ratio
directly would estimate roughly 41 GB RAM.

Those numbers are only directional:

  • planet output has a different mix of very large polygons and high-zoom tiles
  • memory is dominated by node/way/object stores, not only the clip cache
  • the z6 phase boundary may help or hurt depending on workload balance
  • the local Austria run used a different machine and a much smaller input

The safer expectation is that this should be approximately performance-neutral
to modestly faster on large polygon-heavy outputs, with memory roughly neutral.

Expected Improvement

The main improvement is deterministic decoded tile content for multipolygon
outputs that previously depended on worker scheduling.

This should make repeated tilemaker runs less sensitive to thread timing and
platform scheduling. It also keeps the monster-polygon clip cache rather than
disabling it.

Possible Regressions

This can change polygon geometry compared with previous output because clipping
is not perfectly associative:

  clip(original, z6 parent), then clip(child)

can differ slightly from:

  clip(original, z8 parent), then clip(child)

Expected visual differences, if any, should be localized to polygon edges,
especially water, landuse, landcover, and building polygons that cross tile or
z6-cluster boundaries.

This does not make generated archives byte-for-byte deterministic. Archive
ordering, compression, and raw tile bytes can still differ while decoded MVT
content matches.

There is also a scheduling tradeoff: z0..z6 tiles now complete before z7+ tiles
are posted. The tested Austria fixture did not show a slowdown, but a workload
with unusually expensive low-zoom/z6 tiles could see reduced parallelism during
that phase.

Related Issues and PRs

No directly matching upstream issue or PR was found for the scheduling-sensitive
deterministic-output bug.

Testing

  • cmake -S . -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo
  • cmake --build build -j2
  • fork CI run 25960612652, stacked test branch, native generated-tile
    semantic verification passed after rerunning the known intermittent Windows
    tile-generation crash
  • local Austria upstream/fixed-z6 timing comparison, PMTiles output, no
    semantic verification for the timing run

