fix: remove subos sysroot override, use payload paths for toolchain sysroot #62

Merged
Sunrisepeak merged 22 commits into main from fix/linux-sysroot-payload-first
May 21, 2026
@Sunrisepeak
Member

Summary

  • Remove the M5.5 logic that forced tc->sysroot to mcpp's subos (~/.mcpp/registry/subos/default), which lacks Linux kernel headers (linux/limits.h) on user machines
  • Parse clang++.cfg for --sysroot= when -print-sysroot fails (Clang doesn't support it), so tc->sysroot reflects the payload's actual configuration
  • Generalize --no-default-config from macOS-only to all Clang toolchains with a cfg file (cfg paths become stale after mcpp copies the payload)

Root cause

Commit 063fb6f changed MCPP_HOME to ~/.mcpp/, where the subos skeleton exists but is incomplete (no linux/, asm/, asm-generic/ headers). M5.5 only checked exists(usr/include) and overwrote the toolchain's correct sysroot. CI never caught this because its subos is fully populated by xlings self install.
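
A stricter completeness check than exists(usr/include) could look like the sketch below. It is not the code from this PR: the function name `sysroot_has_kernel_headers` is hypothetical, and the list of required entries is simply the set of headers named in the root cause above.

```cpp
#include <array>
#include <cassert>
#include <filesystem>
#include <string_view>

namespace fs = std::filesystem;

// Hypothetical helper (not part of mcpp): true only when the sysroot
// carries the kernel header entries that the incomplete subos lacked.
bool sysroot_has_kernel_headers(const fs::path& sysroot) {
    const fs::path inc = sysroot / "usr" / "include";
    constexpr std::array<std::string_view, 3> required = {
        "linux/limits.h", "asm", "asm-generic"};
    for (std::string_view rel : required) {
        if (!fs::exists(inc / fs::path(rel))) return false;
    }
    return true;
}
```

With a check of this shape, a skeleton sysroot that only contains usr/include would be rejected instead of overwriting the toolchain's own sysroot.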

Design principle

mcpp uses xlings only as a package index + download tool. The toolchain sysroot comes from the payload itself (GCC -print-sysroot, Clang clang++.cfg), not from xlings subos.

Test plan

  • CI passes on Linux (GCC + LLVM std module precompilation)
  • CI passes on macOS (--no-default-config + xcrun SDK path)
  • CI passes on Windows (no sysroot involved, MSVC STL)

…ysroot (#62)

Remove M5.5 logic that forced tc->sysroot to mcpp's xlings subos
(~/.mcpp/registry/subos/default). The subos created by `xlings self init`
lacks linux kernel headers (linux/limits.h, asm/, asm-generic/), causing
std module precompilation to fail on user machines.

Root cause: commit 063fb6f changed MCPP_HOME to ~/.mcpp/, where the
subos exists but is incomplete. M5.5 only checked exists(usr/include)
and overwrote the toolchain's correct sysroot. CI never caught this
because its subos is fully populated by `xlings self install`.

Design principle: mcpp uses xlings only as a package index + download
tool. Sysroot comes from the toolchain payload itself, not from subos.

Changes:
- cli.cppm: delete M5.5 subos sysroot override entirely
- probe.cppm: parse clang++.cfg --sysroot= when -print-sysroot fails
  (Clang doesn't support -print-sysroot), so tc->sysroot reflects the
  payload's actual configuration
- stdmod.cppm: generalize --no-default-config from macOS-only to all
  Clang toolchains with a cfg file (cfg paths become stale after mcpp
  copies the payload to its sandbox)
- flags.cppm: sync the same --no-default-config logic for regular
  compilation flags
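
The cfg parsing step in probe.cppm can be pictured with the following minimal sketch. It assumes the cfg text has already been read into a string, that paths contain no spaces or quotes, and that a later --sysroot= overrides an earlier one; the name `sysroot_from_cfg` is hypothetical and the real implementation may differ.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper: returns the value of the last --sysroot=<path>
// token found in the text of a clang++.cfg file, if any.
std::optional<std::string> sysroot_from_cfg(const std::string& cfg_text) {
    const std::string key = "--sysroot=";
    std::optional<std::string> result;
    std::istringstream lines(cfg_text);
    std::string line;
    while (std::getline(lines, line)) {
        // Skip blank lines and lines whose first non-blank character is '#'.
        const auto first = line.find_first_not_of(" \t");
        if (first == std::string::npos || line[first] == '#') continue;
        std::istringstream tokens(line);
        std::string token;
        while (tokens >> token) {
            if (token.compare(0, key.size(), key) == 0) {
                result = token.substr(key.size());
            }
        }
    }
    return result;
}
```

A caller would fall back to this only after -print-sysroot has failed, and would treat an empty result as "no sysroot configured by the payload".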

GCC bakes the build-time sysroot into the binary via --with-sysroot.
For xlings-built GCC this is a path like <buildhost>/.xlings/subos/default
that doesn't exist on the user's machine. When -print-sysroot returns
such a non-existent path ending in subos/default, remap it to the
equivalent sysroot relative to the compiler's own xpkgs directory.

This is payload-derived (from the compiler binary's location in the
registry), not a config-level dependency on subos.
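
The remapping described above might be sketched as follows. The registry layout is an assumption for illustration only: the sketch walks up from the compiler binary until it finds an ancestor that contains subos/default, and the name `remap_stale_sysroot` is hypothetical.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>

namespace fs = std::filesystem;

// Hypothetical helper: keeps a probed sysroot that exists, and remaps a
// non-existent one ending in subos/default to a subos/default directory
// found in an ancestor of the compiler binary.
std::optional<fs::path> remap_stale_sysroot(const fs::path& probed,
                                            const fs::path& compiler) {
    if (!probed.empty() && fs::exists(probed)) return probed;
    const bool looks_like_subos =
        probed.filename() == "default" &&
        probed.parent_path().filename() == "subos";
    if (!looks_like_subos) return std::nullopt;
    // Walk up from the directory that holds the compiler binary.
    for (fs::path dir = compiler.parent_path(); dir.has_relative_path();
         dir = dir.parent_path()) {
        const fs::path candidate = dir / "subos" / "default";
        if (fs::exists(candidate)) return candidate;
    }
    return std::nullopt;
}
```

Because the result depends only on where the compiler binary lives, the remap stays payload-derived in the sense used above.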

When bypassing clang++.cfg with --no-default-config, we must provide
both libc++ include paths that the cfg originally supplied:
  -isystem <llvmRoot>/include/c++/v1
  -isystem <llvmRoot>/include/<triple>/c++/v1

The target-specific path contains __config_site which is required by
__config. Without it, std module precompilation fails with
'__config_site' file not found.
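
As a rough illustration, the two include paths can be assembled like this. The function name `libcxx_bypass_flags` and its parameters are hypothetical; only the flag spellings and path shapes come from the description above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: flags that replace what clang++.cfg supplied for
// libc++ headers once the cfg is bypassed.
std::vector<std::string> libcxx_bypass_flags(const std::string& llvm_root,
                                             const std::string& triple) {
    return {
        "--no-default-config",
        // Generic libc++ headers.
        "-isystem", llvm_root + "/include/c++/v1",
        // Target-specific headers, including __config_site.
        "-isystem", llvm_root + "/include/" + triple + "/c++/v1",
    };
}
```

Dropping the second -isystem pair reproduces the '__config_site' file not found failure, since __config includes that file.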

On Linux, clang++.cfg contains essential linker flags (-fuse-ld=lld,
--rtlib=compiler-rt, --unwindlib=libunwind). Using --no-default-config
strips these, causing "cannot find crtbeginS.o" link failures because
clang falls back to system GNU ld looking for GCC runtime objects.

On Linux, let the cfg apply normally. The cfg's --sysroot points to
the xlings subos which is valid and complete. Pass --sysroot explicitly
only when needed (to override a stale cfg value), leveraging the fact
that command-line --sysroot takes precedence over the cfg's value.

Keep --no-default-config for macOS only, where the cfg-baked paths
genuinely become stale (pointing to CommandLineTools SDK when Xcode
SDK is active).

Replace --sysroot dependency on xlings subos with fine-grained -isystem
paths derived from sibling xpkgs payloads (glibc, linux-headers).

Phase 2: PayloadPaths model
- Add PayloadPaths struct to Toolchain model (glibcInclude, glibcLib,
  linuxInclude)
- probe_payload_paths() finds sibling glibc and linux-headers xpkgs
  via find_sibling_package() which searches across all index prefixes
- Falls back to host /usr/include for linux kernel headers if no
  xpkg found
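
A minimal sketch of the Phase 2 model follows. The field names come from the list above, but their types, the signature of `find_sibling_package`, and the "<index>-x-<package>" directory naming (inferred from names such as xim-x-glibc) are assumptions. The layout below each package directory is not stated in this PR, so the sketch stops at locating the package.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Field names from the PR; the optional<path> types are an assumption.
struct PayloadPaths {
    std::optional<fs::path> glibcInclude;
    std::optional<fs::path> glibcLib;
    std::optional<fs::path> linuxInclude;
};

// Assumed behaviour: match any "<index>-x-<name>" directory under xpkgs,
// regardless of which index prefix it carries.
std::optional<fs::path> find_sibling_package(const fs::path& xpkgs,
                                             const std::string& name) {
    const std::string suffix = "-x-" + name;
    std::error_code ec;  // a missing xpkgs directory yields an empty range
    for (const auto& entry : fs::directory_iterator(xpkgs, ec)) {
        const std::string dir = entry.path().filename().string();
        const bool matches =
            dir.size() > suffix.size() &&
            dir.compare(dir.size() - suffix.size(), suffix.size(), suffix) == 0;
        if (matches && entry.is_directory()) return entry.path();
    }
    return std::nullopt;
}
```

probe_payload_paths() would then fill each PayloadPaths field from the located packages and leave a field empty when no package is found.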

Phase 3: Payload-first flags
- flags.cppm: use -isystem for glibc + linux-headers instead of
  --sysroot; Clang with cfg uses --no-default-config + explicit flags
  including -fuse-ld=lld, --rtlib=compiler-rt, --unwindlib=libunwind
- stdmod.cppm: unified Clang cfg bypass on all platforms for std
  module precompile (no linker needed, so --no-default-config is safe)

Phase 4: Clang cfg fixup
- fixup_clang_cfg() rewrites clang++.cfg paths after payload copy,
  similar to fixup_gcc_specs() for GCC
- Called during `mcpp toolchain install llvm`

Phase 5: Sysroot dependency auto-install
- Toolchain install ensures glibc and linux-headers xpkgs are
  installed before the main toolchain package

GCC's include-fixed directory contains stdlib.h wrappers that use
#include_next to find the sysroot's stdlib.h. This mechanism only
works with --sysroot, not standalone -isystem paths.

Fix: for GCC, keep --sysroot from probe_sysroot() and supplement
with -isystem for linux kernel headers from payload when the probed
sysroot is missing them. For Clang, continue using --no-default-config
+ explicit -isystem (Clang doesn't have include-fixed).

When bypassing clang++.cfg, the cfg's -nostdinc++ and -stdlib=libc++
flags are also stripped. Without -nostdinc++, Clang may find host
libstdc++ headers before the payload's libc++ headers. Without
-stdlib=libc++, Clang defaults to libstdc++ runtime.

Also remove the /usr/include fallback for linux-headers — mixing
host headers with xpkg glibc causes bits/wordsize.h conflicts.

When GCC's probed sysroot (subos/default) is missing linux kernel
headers or glibc headers, symlink them from the payload xpkgs:
  - linux/, asm/, asm-generic/ ← scode-x-linux-headers xpkg
  - features.h, bits/, etc.    ← xim-x-glibc xpkg

This makes mcpp self-sufficient: it uses subos/default as a sysroot
directory for GCC's include-fixed mechanism, but actively populates
it from payload rather than depending on xlings init completeness.

Principle: subos is just a directory layout that mcpp manages.
Content comes from xpkgs payloads. Clang doesn't use subos at all
(--no-default-config + explicit -isystem from payload).
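
The populate step could be sketched as below, assuming a POSIX host where creating symlinks needs no extra privileges and an absolute payload path. The name `link_missing_headers` and its parameters are hypothetical, not mcpp's actual API.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: for each named entry that the payload provides and
// the sysroot lacks, create a symlink in the sysroot's include directory.
// Returns the number of links created.
int link_missing_headers(const fs::path& payload_include,
                         const fs::path& sysroot_include,
                         const std::vector<std::string>& entries) {
    int created = 0;
    fs::create_directories(sysroot_include);
    for (const auto& name : entries) {
        const fs::path src = payload_include / name;
        const fs::path dst = sysroot_include / name;
        // symlink_status treats an existing (even dangling) link as present.
        if (!fs::exists(src) || fs::exists(fs::symlink_status(dst))) continue;
        if (fs::is_directory(src)) {
            fs::create_directory_symlink(src, dst);
        } else {
            fs::create_symlink(src, dst);
        }
        ++created;
    }
    return created;
}
```

Entries that already exist in the sysroot are left untouched, so a fully populated subos (as on CI) is not modified.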

Clang on Windows auto-detects the MSVC version at compile time and
embeds it in module AST files (e.g. x86_64-pc-windows-msvc19.44.35227).
But -dumpmachine returns just x86_64-pc-windows-msvc (no version).

When MSVC updated a patch version (35226 → 35227), the fingerprint
did not change, so mcpp reused the cached std.pcm compiled for the old
version and failed with "AST file was compiled for different target".

Fix: probe clang's -print-effective-triple which includes the MSVC
version, and append to driverIdent for fingerprint computation.

Also: ensure the sysroot is complete by symlinking linux kernel headers
from payload xpkgs into the GCC sysroot directory.
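
To illustrate the fingerprint fix for the Windows std.pcm cache, the sketch below appends the effective triple to an identity string. How mcpp actually composes driverIdent is not shown in this PR, so `make_driver_ident`, its parameters, and the separator are assumptions.

```cpp
#include <cassert>
#include <string>

// Hypothetical composition: any change in the effective triple, including
// the MSVC patch version, changes the identity that feeds the fingerprint.
std::string make_driver_ident(const std::string& compiler_version,
                              const std::string& effective_triple) {
    return compiler_version + "|" + effective_triple;
}
```

With -dumpmachine alone both MSVC patch versions map to x86_64-pc-windows-msvc, whereas the effective triple keeps them distinct and so invalidates the cached module.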

Refactor:
- Extract is_msvc_target() to model.cppm alongside is_musl_target()
- Replace 3 scattered tc.targetTriple.find("msvc") checks in
  clang.cppm, detect.cppm, provider.cppm
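
The extracted predicate presumably wraps the same substring test that the scattered checks used. A minimal sketch, where the std::string_view signature is an assumption:

```cpp
#include <cassert>
#include <string_view>

// Same check as the former tc.targetTriple.find("msvc") call sites.
bool is_msvc_target(std::string_view triple) {
    return triple.find("msvc") != std::string_view::npos;
}
```

Call sites can then write is_msvc_target(tc.targetTriple) rather than repeating the find.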

CI:
- Add ci-fresh-install.yml: validates first-time user install flow
  on all platforms (Linux, macOS, Windows) with zero cache.
- Tests: xlings install mcpp → self-host build → mcpp new → mcpp run
- Tests: import std with both GCC and LLVM (Linux), LLVM (macOS),
  LLVM+MSVC STL (Windows)
- Catches issues that cached CI misses: incomplete sysroot, stale
  cfg paths, missing xpkg dependencies
- macOS tarball: macos-aarch64 → macosx-arm64 (matches release assets)
- Windows: use explicit extract dir name
- All steps: export MCPP_VENDORED_XLINGS so freshly-built mcpp
  uses the installed xlings binary for package operations
- Use MCPP_BOOTSTRAP for bootstrap mcpp, MCPP for freshly-built
- Set MCPP_HOME explicitly for consistent sandbox location

Strip all env overrides, self-host builds, and MCPP_VENDORED_XLINGS.
Simulate exactly what a real user does:
  1. Install xlings
  2. xlings install mcpp -y
  3. mcpp new hello && cd hello && mcpp run

Git Bash on Windows mangles GITHUB_PATH entries. Switch to pwsh
which handles Windows paths natively.

This CI tests xlings-distributed mcpp (not the PR's code), so it
will fail until fixes are released to the xlings mcpp package.
Change to manual trigger only — run after a release to verify the
real end-to-end user experience.

Restore PR trigger. Flow:
1. Bootstrap xlings + old mcpp via xlings install
2. Build THIS PR's mcpp from source (self-host)
3. Use freshly-built mcpp to simulate fresh user: new → run
4. Linux: test both GCC (default) and LLVM toolchains
5. macOS: test LLVM (default)
6. Windows: test LLVM + MSVC STL

- Remove all extra env vars (MCPP_VENDORED_XLINGS, XLINGS_BIN)
- Just add xlings bin to GITHUB_PATH, everything else works
- Windows: set PATH inline in bootstrap step so xlings install mcpp
  can find xlings immediately
- Remove $MCPP variable, put freshly-built mcpp dir on PATH instead
- All steps just use `mcpp` command directly
- Linux LLVM step: add `mcpp self config --mirror GLOBAL` before
  toolchain install (CI runners are outside CN)
- Windows: use pwsh throughout for native path handling

Use xlings-installed mcpp directly to test:
  1. mcpp build (self-host compile)
  2. mcpp new hello → mcpp run (default toolchain)
  3. Linux: install LLVM → mcpp new → mcpp run (continue-on-error
     since released mcpp may not have latest fixes yet)

No extra env vars, no $MCPP variable — just mcpp on PATH via xlings.

The fresh-install CI workflow tests the released mcpp binary via
xlings, not this PR's code. Move it to its own branch/PR to keep
this PR focused on the sysroot fix.

Linux CI now tests all 3 toolchains with freshly-built mcpp:
  - GCC 16.1.0: mcpp new → build → run
  - musl-gcc 15.1.0: mcpp new → build → run
  - LLVM 20.1.7: install + mcpp new → build → run

Remove continue-on-error "Fresh user experience" tests from all
3 CI workflows — moved to separate ci-fresh-install.yml (PR #63).
Those tests validate the xlings-distributed mcpp binary, not the
PR's code, so they don't belong in the main CI gate.
@Sunrisepeak Sunrisepeak merged commit 3f9a369 into main May 21, 2026
1 of 2 checks passed
@Sunrisepeak Sunrisepeak deleted the fix/linux-sysroot-payload-first branch May 21, 2026 15:55
Sunrisepeak added a commit that referenced this pull request May 21, 2026
Main CI changes:
- ci-windows.yml: add explicit "Toolchain: LLVM — mcpp new → build → run"
  smoke test (previously only soft fallback via || true)
- All three main CIs (ci.yml, ci-macos.yml, ci-windows.yml) no longer
  have continue-on-error fresh user tests (removed in PR #62)

Toolchain coverage after this change:
  Linux:   gcc@16.1.0 ✓  musl-gcc@15.1.0 ✓  llvm@20.1.7 ✓
  macOS:   llvm@20.1.7 ✓ (comprehensive module tests)
  Windows: llvm@20.1.7 ✓ (explicit smoke test)

New workflow:
- ci-fresh-install.yml: tests released mcpp via xlings on clean machines
  (no caches). Validates real first-time user experience separately
  from PR code testing.
Sunrisepeak added a commit that referenced this pull request May 21, 2026
* ci: add fresh-install workflow for first-time user experience

Validates the released mcpp binary via xlings on all platforms:
  - Linux: xlings install mcpp → mcpp build (self) → mcpp new → mcpp run
           + install LLVM → mcpp new → mcpp run (continue-on-error)
  - macOS: xlings install mcpp → mcpp build → mcpp new → mcpp run
  - Windows: xlings install mcpp → mcpp build → mcpp new → mcpp run

No caches — simulates a clean machine. Catches issues like incomplete
sysroot, stale cfg paths, missing xpkg dependencies.

* ci: add Windows LLVM smoke test + ci-fresh-install workflow

Main CI changes:
- ci-windows.yml: add explicit "Toolchain: LLVM — mcpp new → build → run"
  smoke test (previously only soft fallback via || true)
- All three main CIs (ci.yml, ci-macos.yml, ci-windows.yml) no longer
  have continue-on-error fresh user tests (removed in PR #62)

Toolchain coverage after this change:
  Linux:   gcc@16.1.0 ✓  musl-gcc@15.1.0 ✓  llvm@20.1.7 ✓
  macOS:   llvm@20.1.7 ✓ (comprehensive module tests)
  Windows: llvm@20.1.7 ✓ (explicit smoke test)

New workflow:
- ci-fresh-install.yml: tests released mcpp via xlings on clean machines
  (no caches). Validates real first-time user experience separately
  from PR code testing.

* fix: ci.yml — install musl-gcc before setting default

The musl-gcc test step assumed the toolchain was already cached.
After cache clears, `toolchain default gcc@15.1.0-musl` fails with
"not installed". Add explicit install before setting default.

* ci: toolchain smoke tests build mcpp itself, not hello world

Each toolchain now builds mcpp from source (self-host) instead of
a trivial hello-world project. This validates that the toolchain
can compile a real C++23 modules codebase.

Coverage:
  Linux:   gcc@16.1.0 builds mcpp ✓
           musl-gcc@15.1.0 builds mcpp ✓
           llvm@20.1.7 builds mcpp ✓
  macOS:   llvm@20.1.7 builds mcpp ✓ (default, in self-host smoke)
  Windows: llvm@20.1.7 builds mcpp ✓ (new step)

* ci: each platform's every toolchain builds mcpp itself

Merge redundant self-host smoke steps into unified toolchain steps.
Each supported toolchain per platform does: clean → build mcpp → verify.

  Linux:   gcc@16.1.0 (+ test), musl-gcc@15.1.0, llvm@20.1.7
  macOS:   llvm@20.1.7
  Windows: llvm@20.1.7

* ci: complete CI architecture — dev CI + release validation CI

Two-tier CI design:

Tier 1 — PR/dev CI (ci.yml, ci-macos.yml, ci-windows.yml):
  Runs on every PR. Builds mcpp from source, runs tests + e2e,
  then verifies each platform's supported toolchains can build mcpp:
    Linux:   gcc@16.1.0 (+test), musl-gcc@15.1.0, llvm@20.1.7
    macOS:   llvm@20.1.7
    Windows: llvm@20.1.7

Tier 2 — Release validation CI (ci-fresh-install.yml):
  Manual trigger + daily schedule. Tests released mcpp via xlings
  on clean machines (no caches). For each platform, every supported
  toolchain: new → run (basic project) + build mcpp (self-host).
    Linux:   gcc, musl-gcc, llvm — new+run + build mcpp
    macOS:   llvm — new+run + build mcpp
    Windows: llvm — new+run + build mcpp

ci-fresh-install no longer runs on PRs (it tests released mcpp,
not PR code). Moved to workflow_dispatch + daily cron.

* ci: temporarily enable ci-fresh-install on PR for validation

* ci: ci-fresh-install — fix GCC version, remove PR trigger

- GCC: don't set toolchain default (GCC is already default after
  xlings install mcpp), fixes "gcc@ is not installed" error
- Remove pull_request trigger: ci-fresh-install tests released mcpp
  (not PR code), so it should not block PRs. Runs on push to main,
  workflow_dispatch, and daily schedule only.

* fix: save mcpp binary before clean in toolchain smoke tests

mcpp clean deletes target/ which contains the freshly-built binary.
Copy it to /tmp before running clean so it survives.

Also: clear Windows BMI cache (--bmi-cache) to avoid MSVC version
mismatch errors from stale std.pcm.

* fix: Windows packaging — find mcpp.exe from target/ instead of stale $MCPP_SELF

* fix: CI toolchain tests actually use correct toolchain + assertions

Review findings addressed:

1. Linux musl/LLVM steps were NOT actually using those toolchains
   (mcpp.toml [toolchain].default overrides global default).
   Fix:
   - musl: use `mcpp build --target x86_64-linux-musl`
   - LLVM: sed-override mcpp.toml before build
   - All steps: grep assertion on "Resolved <toolchain>" output

2. Windows: remove `|| true` on critical config/default commands
   that silently swallowed failures.

3. Windows: add explicit `mcpp new hello → mcpp run` smoke test
   (covers first-run user path that E2E skips due to missing
   fresh-sandbox capability).

4. All toolchain build steps: assert via grep that the expected
   toolchain was actually resolved.

* fix: CI — tee to file instead of /dev/stderr (Windows compat), precise grep

- Replace `tee /dev/stderr` with `tee build.log` (Git Bash on Windows
  doesn't have /dev/stderr)
- All grep assertions read from build.log file
- musl grep: "musl" → "Resolved gcc@15.1.0-musl" (more precise)