File-backed mmap for XNNPACK packed weights (#19862)
@doggeral has exported this pull request. If you are a Meta employee, you can view the originating Diff in D106673663.
Summary:

Add file-backed mmap support to `XNNWeightsCache` so that packed weight allocations go to a `MAP_SHARED` file instead of dirty heap. After `msync(MS_ASYNC)`, pages become clean file-backed and drop out of iOS `phys_footprint`.

## How it works

1. `set_packed_cache_path()` configures the cache file path via `BackendOptions`
2. `initialize_for_runtime()` opens the cache file
3. Each `reserve_space()` call extends the file via `ftruncate` and creates a `MAP_SHARED` mmap region, so XNNPACK packs weights directly into file-backed pages
4. `finalize_for_runtime()` calls `msync(MS_ASYNC)` on newly added regions only (incremental sync), making pages clean
5. On Windows, mmap is unavailable, so all code paths fall back to heap allocation automatically (`packed_file_fd_` stays -1)

## Expected savings

~400 MB packed weights move from dirty heap to clean file-backed pages (0 `phys_footprint` on iOS).

Differential Revision: D106673663
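The flow described under "How it works" can be sketched roughly as follows. This is a minimal, POSIX-only illustration and not the code in this PR: the `FileBackedArena` class, its method names, and the page-rounding policy are all hypothetical, chosen only to show the `ftruncate` + `MAP_SHARED` + incremental `msync(MS_ASYNC)` pattern. A failed `open` leaves the descriptor at -1, which is where a heap fallback would take over.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper for illustration; NOT the XNNWeightsCache implementation.
class FileBackedArena {
 public:
  // Opens (or creates and truncates) the cache file. fd_ stays -1 on failure.
  explicit FileBackedArena(const char* path)
      : fd_(::open(path, O_RDWR | O_CREAT | O_TRUNC, 0600)) {}

  ~FileBackedArena() {
    for (const Region& r : regions_) {
      ::munmap(r.addr, r.len);
    }
    if (fd_ >= 0) {
      ::close(fd_);
    }
  }

  FileBackedArena(const FileBackedArena&) = delete;
  FileBackedArena& operator=(const FileBackedArena&) = delete;

  bool file_backed() const { return fd_ >= 0; }

  // Extends the file and maps the new tail as MAP_SHARED.
  // Returns nullptr on failure so a caller could fall back to heap memory.
  void* reserve(size_t bytes) {
    if (fd_ < 0 || bytes == 0) {
      return nullptr;
    }
    // mmap offsets must be page-aligned, so round each region up to a page.
    const size_t page = static_cast<size_t>(::sysconf(_SC_PAGESIZE));
    const size_t len = (bytes + page - 1) / page * page;
    if (::ftruncate(fd_, static_cast<off_t>(file_size_ + len)) != 0) {
      return nullptr;
    }
    void* addr = ::mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED,
                        fd_, static_cast<off_t>(file_size_));
    if (addr == MAP_FAILED) {
      return nullptr;
    }
    file_size_ += len;
    regions_.push_back({addr, len});
    return addr;
  }

  // Schedules write-back only for regions added since the previous call.
  // Returns how many regions were successfully passed to msync.
  size_t finalize() {
    size_t synced_now = 0;
    for (; synced_ < regions_.size(); ++synced_) {
      const Region& r = regions_[synced_];
      if (::msync(r.addr, r.len, MS_ASYNC) == 0) {
        ++synced_now;
      }
    }
    return synced_now;
  }

 private:
  struct Region {
    void* addr;
    size_t len;
  };

  int fd_ = -1;
  size_t file_size_ = 0;        // always a multiple of the page size
  size_t synced_ = 0;           // index of the first region not yet synced
  std::vector<Region> regions_;
};
```

Tracking a `synced_` index is one simple way to get the "newly added regions only" behavior: a second `finalize()` with no new reservations does no work. Note that `MS_ASYNC` only schedules write-back, so when pages actually become clean is up to the kernel.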