fix: seal child-process stdin on Windows (first-run hang) #74

Merged
Sunrisepeak merged 2 commits into main from fix/windows-stdin-hang on May 23, 2026

@Sunrisepeak
Member

Summary

mcpp's first-run flow on Windows was hanging at xlings / xim / curl / git grandchildren that block on terminal stdin, forcing users to press Enter repeatedly to advance bootstrap and toolchain install.

This PR seals child-process stdin on Windows (matching the POSIX behavior added in #55 for the macOS xcrun hang) and adds a deterministic regression test at both the unit and CI integration layers.

Companion analysis: .agents/docs/2026-05-23-windows-stdin-hang-analysis.md (not in this PR — kept as working notes).

Root cause

Status before this PR:

  • process.cppm seal_stdin(): no-op on Windows (POSIX-only </dev/null)
  • xlings.cppm install_with_progress direct path: explicitly bypasses seal_stdin, even on POSIX
  • shell.cppm silent_redirect: doc claimed stdin+stdout+stderr, implementation only >/dev/null 2>&1
  • PR #55 (macOS hang fix): POSIX-only
  • PR #57 ("suppress xlings noise on Windows"): stdout/stderr only, never touched stdin

Net result: on Windows, any subprocess descendant that calls read(stdin) blocks on the user's terminal until they press Enter.

Changes

  • src/platform/process.cppm: seal_stdin now appends <NUL on Windows. All capture / run_silent / run_streaming / run_passthrough callers gain the protection automatically.
  • src/xlings.cppm: install_with_progress direct path explicitly appends <NUL on Windows (this path deliberately bypasses seal_stdin). POSIX keeps the original behavior to stay conservative.
  • src/platform/shell.cppm: silent_redirect docstring corrected (it never touched stdin; implementation unchanged).
  • tests/unit/test_process_seal_stdin.cpp: new unit test, see below.
  • .github/workflows/ci-windows.yml: new regression step, see below.
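
The process.cppm change amounts to appending a platform-specific stdin redirect to the command string. A minimal sketch of that idea follows. The real signature of seal_stdin is not shown in this PR, so the Platform enum, the host_platform helper, and the function shape here are assumptions made for illustration only.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical stand-in for process::seal_stdin; names and signature are assumed.
enum class Platform { Windows, Posix };

constexpr Platform host_platform() {
#if defined(_WIN32)
    return Platform::Windows;
#else
    return Platform::Posix;
#endif
}

// Append a stdin redirect so the child, and anything it spawns, reads an
// immediate EOF from the null device instead of waiting on the terminal.
std::string seal_stdin(std::string_view cmd, Platform p = host_platform()) {
    std::string out{cmd};
    out += (p == Platform::Windows) ? " <NUL" : " </dev/null";
    return out;
}

// Usage: the platform is passed explicitly so both spellings can be checked
// on any host.
inline void seal_stdin_examples() {
    assert(seal_stdin("git --version", Platform::Windows) == "git --version <NUL");
    assert(seal_stdin("git --version", Platform::Posix) == "git --version </dev/null");
}
```

Because the redirect is appended to the string handed to the shell, every helper that routes its command through this one function picks up the protection without changes at the call sites.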

How the fix is tested

Unit test (tests/unit/test_process_seal_stdin.cpp)

Deterministic reproduction at the unit level:

  1. Rebinds the test process's own STDIN to an open, empty, never-closing pipe (Win32 CreatePipe + SetStdHandle + CRT _dup2 on Windows; pipe() + dup2() on POSIX).
  2. Calls run_silent / capture / run_streaming with a child that reads stdin (more on Windows, cat on POSIX).
  3. Without the fix → child inherits our pipe → blocks forever → test (and CI) times out.
  4. With the fix → child reads from NUL / /dev/null → exits in <100ms.
  5. Test asserts elapsed < 5s.
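
The steps above can be sketched for the POSIX side roughly as follows. This is not the code in test_process_seal_stdin.cpp: the function names are hypothetical, std::system stands in for mcpp's run_silent / capture / run_streaming helpers, and the sealed command is written out by hand rather than produced by seal_stdin.

```cpp
#include <cassert>
#include <chrono>
#include <cstdlib>
#include <unistd.h>

// Step 1: rebind this process's stdin to the read end of a pipe whose write
// end stays open. A reader sees no data and no EOF, so it blocks.
// Returns the write end (caller keeps it open), or -1 on failure.
int bind_stdin_to_open_empty_pipe() {
    int fds[2];
    if (pipe(fds) != 0) return -1;
    if (fds[0] != STDIN_FILENO) {
        if (dup2(fds[0], STDIN_FILENO) < 0) return -1;
        close(fds[0]);
    }
    return fds[1];
}

// Steps 2-5: run a stdin-reading child with stdin sealed and time it.
// The sketch does not restore the original stdin afterwards.
double time_sealed_child_under_hostile_stdin() {
    int write_end = bind_stdin_to_open_empty_pipe();
    assert(write_end >= 0);

    auto start = std::chrono::steady_clock::now();
    // `cat` would block forever on the inherited pipe; the explicit
    // </dev/null redirect gives it an immediate EOF instead.
    int rc = std::system("cat </dev/null >/dev/null");
    auto end = std::chrono::steady_clock::now();

    close(write_end);
    assert(rc == 0);
    return std::chrono::duration<double>(end - start).count();
}
```

A test built on this would assert that the returned duration is below the 5-second bound. Dropping the </dev/null redirect from the command reproduces the hang, because cat then inherits the open, empty pipe.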

Runs on every CI (Linux + macOS + Windows) via mcpp test.

Integration test (ci-windows.yml)

Adds a new step Regression: mcpp survives open-empty-stdin (Windows hang fix). Launches mcpp via [System.Diagnostics.Process]::Start with RedirectStandardInput = $true (parent holds the child's stdin open, never writes, never closes). Runs three commands inside this hostile-stdin scenario:

  • mcpp --version (sanity)
  • mcpp build (full bootstrap → toolchain resolve → dep resolve → compile)
  • mcpp run (post-build run path)

Without the fix → any grandchild reading stdin blocks → step times out (15 min step budget, per-command 5/10/2 min) → CI fails.
With the fix → all three complete cleanly.
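
The CI step itself is PowerShell and uses System.Diagnostics.Process. For readers more at home in the project's language, the same "parent holds stdin open, never writes, enforces a timeout" scenario can be sketched as a POSIX analogue in C++. The function name, the polling interval, and the -1 return convention are all assumptions of this sketch, not part of the PR.

```cpp
#include <cassert>
#include <chrono>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

// Run `shell_cmd` with its stdin attached to a pipe that the parent keeps
// open and never writes to. Returns the exit code, or -1 if the child had to
// be killed after `timeout` or could not be started.
int run_with_hostile_stdin(const char* shell_cmd, std::chrono::milliseconds timeout) {
    int fds[2];
    if (pipe(fds) != 0) return -1;

    pid_t pid = fork();
    if (pid < 0) { close(fds[0]); close(fds[1]); return -1; }
    if (pid == 0) {
        // Child: stdin becomes the read end; drop the inherited write end.
        if (fds[0] != STDIN_FILENO) { dup2(fds[0], STDIN_FILENO); close(fds[0]); }
        close(fds[1]);
        execl("/bin/sh", "sh", "-c", shell_cmd, static_cast<char*>(nullptr));
        _exit(127);
    }

    close(fds[0]);  // Parent keeps only the write end and never writes to it.
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    int status = 0;
    int result = -1;
    for (;;) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid) {
            if (WIFEXITED(status)) result = WEXITSTATUS(status);
            break;
        }
        if (r < 0) break;
        if (std::chrono::steady_clock::now() >= deadline) {
            kill(pid, SIGKILL);        // Timed out: treat as a hang.
            waitpid(pid, &status, 0);
            break;
        }
        struct timespec pause = {0, 10 * 1000 * 1000};  // poll every 10 ms
        nanosleep(&pause, nullptr);
    }
    close(fds[1]);  // Any orphaned reader now sees EOF and exits.
    return result;
}

// Usage: a sealed reader finishes; an unsealed one is reported as a hang.
inline void hostile_stdin_examples() {
    assert(run_with_hostile_stdin("cat </dev/null", std::chrono::milliseconds(5000)) == 0);
    assert(run_with_hostile_stdin("cat", std::chrono::milliseconds(300)) == -1);
}
```

The second call is the failure mode the regression step is designed to catch: a descendant that reads the inherited stdin never returns on its own, so only the timeout ends it.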

Test plan

  • CI on Linux (ci.yml) — must stay green (POSIX path unchanged)
  • CI on macOS (ci-macos.yml) — must stay green (POSIX path unchanged)
  • CI on Windows (ci-windows.yml) — including the new regression step
  • New unit test test_process_seal_stdin runs on all three platforms

Risk assessment

Low. Plan A (seal_stdin Windows branch) only adds <NUL to the command string. Plan B (install_with_progress) only adds <NUL to Windows path; POSIX unchanged. Both are append-only — no API or behavior change for callers that don't read stdin.

The only theoretical concern (raised in the pre-fix comment "xlings may need stdin for subprocess coordination during large package extraction") was never observed in practice on Linux/macOS over the past two months; we preserve POSIX behavior conservatively and only seal on Windows where the hang was reported.

mcpp's first-run flow on Windows was hanging at xlings / xim / curl / git
grandchildren that block on terminal stdin, forcing users to press Enter
repeatedly to advance bootstrap and toolchain install.

Root cause: process::seal_stdin was a no-op on Windows, and
install_with_progress's direct-install path deliberately bypassed it.
The POSIX side has had </dev/null sealing since PR #55 (macOS xcrun hang
fix); Windows never received the equivalent fix. PR #57 only suppressed
stdout/stderr noise (>/dev/null 2>&1) and did not touch stdin.

Changes:
  - process.cppm: seal_stdin now appends "<NUL" on Windows (matches POSIX
    behavior). All capture / run_silent / run_streaming / run_passthrough
    callers gain the protection automatically.
  - xlings.cppm: install_with_progress's direct path explicitly appends
    "<NUL" on Windows. POSIX keeps the original behavior conservatively.
  - shell.cppm: silent_redirect docstring corrected — it never touched
    stdin, that's seal_stdin's job. Implementation unchanged.

Regression coverage:
  - tests/unit/test_process_seal_stdin.cpp — deterministic reproduction
    test. Rebinds the test process's own stdin to an open, empty,
    never-closing pipe, then calls run_silent / capture / run_streaming
    with a child that reads stdin (more on Windows, cat on POSIX).
    Without the fix the child would block forever waiting on our pipe;
    with the fix it reads NUL / /dev/null and exits immediately. 5-second
    upper bound (real runs complete in <100ms).
  - ci-windows.yml — adds a step that launches mcpp via
    System.Diagnostics.Process with RedirectStandardInput=$true (parent
    holds the child's stdin open but never writes). Runs mcpp --version,
    mcpp build, mcpp run. Without the fix, any grandchild reading stdin
    blocks → step times out → CI fails. With the fix → all complete.
…$Args

Two issues with the regression step from the previous commit (both showed
up only on the actual Windows runner, not in local validation):

1. MCPP_SELF was set in an earlier bash step via `pwd` (git-bash) so the
   value is MSYS-style (e.g. /d/a/mcpp/...). Bash steps tolerate it but
   pwsh's `&` operator can't exec it ("not recognized as a name of a
   cmdlet, function, script file, or executable program"). Convert via
   cygpath -w before use.

2. `$Args` is a PowerShell automatic variable inside function scope; a
   `param([string]$Args)` does not bind cleanly. Renamed to $McppArgs
   to avoid the collision (also updated call sites).
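
For the simple drive-letter case in item 1, the conversion that cygpath -w performs can be illustrated with a small C++ helper. This is only a sketch of the path shapes involved: msys_to_windows_path is a hypothetical name, it handles just the "/<drive>/rest" form, and the real cygpath also resolves mount points such as /usr, which this does not attempt.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical helper: rewrite "/d/a/mcpp" as "D:\a\mcpp". Any path that is
// not of the form "/<letter>" or "/<letter>/..." is returned unchanged.
std::string msys_to_windows_path(std::string_view p) {
    const bool drive_form =
        p.size() >= 2 && p[0] == '/' &&
        std::isalpha(static_cast<unsigned char>(p[1])) != 0 &&
        (p.size() == 2 || p[2] == '/');
    if (!drive_form) return std::string{p};

    std::string out;
    out += static_cast<char>(std::toupper(static_cast<unsigned char>(p[1])));
    out += ':';
    if (p.size() == 2) {
        out += '\\';  // Bare drive: "/d" becomes "D:\".
        return out;
    }
    for (char c : p.substr(2)) out += (c == '/') ? '\\' : c;
    return out;
}

// Usage: the MSYS-style prefix from the commit message, plus two pass-throughs.
inline void msys_path_examples() {
    assert(msys_to_windows_path("/d/a/mcpp") == "D:\\a\\mcpp");
    assert(msys_to_windows_path("/usr/bin") == "/usr/bin");
    assert(msys_to_windows_path("C:\\tools") == "C:\\tools");
}
```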
@Sunrisepeak Sunrisepeak merged commit 8662905 into main May 23, 2026
3 checks passed