Skip to content

fix(mcp): replace .db-targeting stat() with cbm_file_size on Windows (> 2 GB)#579

Open
BombaxCeiba wants to merge 1 commit into
DeusData:mainfrom
BombaxCeiba:fix/list-projects-2gb-windows
Open

fix(mcp): replace .db-targeting stat() with cbm_file_size on Windows (> 2 GB)#579
BombaxCeiba wants to merge 1 commit into
DeusData:mainfrom
BombaxCeiba:fix/list-projects-2gb-windows

Conversation

@BombaxCeiba

Copy link
Copy Markdown

Fixes #578

Summary

The reported bug: on Windows, list_projects drops any .db > 2 GB from the
UI project list. Root cause is a bare stat(), whose st_size is a 32-bit
long on Windows — for files > 2 GB the call fails with EOVERFLOW and the
entry is silently skipped.

While triaging that, two more .db-targeting stat() sites in mcp.c turned
out to share the exact same > 2 GB bug, so this PR fixes all three in one go:

  • list_projects — drops the project from the UI list (the reported symptom).
  • maybe_auto_index — mis-detects the .db as absent and re-indexes the
    project on every server start
    (when auto_index is on).
  • try_artifact_bootstrap — mis-detects it as absent and wastes an artifact
    import
    (immediately overwritten by the ensuing pipeline rebuild).

All three now use the codebase's existing cbm_file_size helper (already
wide-char + 64-bit clean, already used by the UI layer) instead of stat().

Root cause

struct stat::st_size is a 32-bit long on Windows (MinGW / MSVC), so
stat() on a > 2 GB file fails with EOVERFLOW (returns -1). Each of the
three sites treated that -1 as "file missing" or "skip", so a valid > 2 GB
.db was silently dropped / re-indexed / re-imported.
(build_project_json_entry also read (int64_t)st->st_size — moot on Windows
since the entry never reached it, and would truncate anyway.)

The fix

Single file changed: src/mcp/mcp.c (+7 / −9). No new function, no new
include. Each site swaps stat() for the existing cbm_file_size() (declared
in foundation/platform.h, which mcp.c already includes):

  • list_projects / build_project_json_entry — size is now taken as
    int64_t from cbm_file_size(full_path) (was (int64_t)st->st_size); the
    skip guard becomes if (size_bytes < 0) continue;.
  • try_artifact_bootstrapstat(db_buf) != 0 (missing) →
    cbm_file_size(db_buf) < 0.
  • maybe_auto_indexstat(db_check) == 0 (present) →
    cbm_file_size(db_check) >= 0.

cbm_file_size is wide-char + 64-bit clean on Windows (GetFileAttributesExW

  • cbm_utf8_to_wide, composing nFileSizeHigh/nFileSizeLow into int64_t)
    and uses plain stat() on POSIX (off_t is already 64-bit under LFS). It
    returns CBM_NOT_FOUND (-1) on failure, so the existing missing-file logic
    is preserved at every site.

Relation to prior work (#386 / #394)

Distinct from the merged #386 (fix(windows): use wide-char API for non-ASCII path support), despite sharing the stat surface:

  • fix(windows): use wide-char API for non-ASCII path support #386 addressed path encoding (non-ASCII UTF-8 paths via
    stat_wstat64). Its changes (discover.c, compat_fs.c, platform.c,
    win_utf8.h) do not touch mcp.c, so these sites kept their bare
    stat().
  • This PR addresses the file-size dimension: struct stat::st_size is
    a 32-bit long on Windows, so a > 2 GB .db makes stat() fail with
    EOVERFLOW. Orthogonal to path encoding.

This PR adds no new helper. It reuses cbm_file_size, which already existed
on main (introduced in 94a9a92, well before #386) and which #386 happened to
upgrade to wide-char safety on the Windows side. A naive migration to #386's
wide_stat would not have fixed this: wide_stat writes the _stat64
result back into a struct stat, where st_size stays 32-bit on Windows, so
the size would truncate instead of failing. cbm_file_size returns int64_t
directly, which is what actually closes the hole.

Not a duplicate of anything under #394 (Windows cluster) — none of its 8
sub-issues cover the large-file / st_size-overflow class.

Verification

cbm_file_size returns the size as int64_t on both platforms — on Windows
via GetFileAttributesExW (no 32-bit st_size in play, so EOVERFLOW is
impossible), on POSIX via stat() (identical to the old path). The > 2 GB case
can no longer be dropped / re-indexed / re-imported, and POSIX behaviour is
unchanged. CI runs the full test suite on Windows and POSIX on PR open.

…(>2GB)

On Windows struct stat::st_size is a 32-bit long, so any .db larger than 2 GB
makes stat() fail with EOVERFLOW. Three call sites in src/mcp/mcp.c checked
.db files this way and silently mishandled the > 2 GB case: list_projects
dropped the project from the UI list, maybe_auto_index re-indexed it on every
server start, and try_artifact_bootstrap re-imported it from the artifact.
Replace all three with the existing cbm_file_size (platform.c), which is
wide-char + 64-bit clean on Windows (GetFileAttributesExW) and returns int64_t
on both platforms, so the size never rides through the 32-bit st_size.

Signed-off-by: Dusk_NM02 <dusk.11th@outlook.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Windows] list_projects silently drops any .db file ≥ 2 GB — UI shows an empty project list

2 participants