perf: trim remaining Windows CI test hotspots #176
Merged
Follow-up to #174 targeting the remaining slow Windows tests, which are subprocess-spawn dominated:
- Consolidate sequential python3 invocations in hermes plugin/bridge tests and cache generated read-only artifacts per process.
- Tighten fake codex app-server and LSP fixture timeouts/polling; skip the taskkill spawn for already-exited children on Windows.
- Replace git subprocess spawns with in-process gix equivalents for current-branch, branch-exists, rev distance, remote URL, worktree root, and common-dir lookups, with a cheap .git-ancestor pre-flight before any remaining git fallback.
- Compile tiktoken-rs/regex/base64 at opt-level 2 in dev builds and split the BPE vocabulary unit test so each process pays one model load.
- Reuse the cached empty-schema store template in dashboard LCM fixes and memory eval fixtures.
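For readers unfamiliar with the idea, the ".git-ancestor pre-flight" can be pictured roughly as follows. This is a minimal sketch using only the standard library, not the code from this PR: the function name has_git_ancestor is hypothetical, and it assumes the caller passes an absolute path and that GIT_DIR-style environment overrides are not in play.

```rust
use std::path::Path;

/// Hypothetical helper for illustration (not the project's actual function).
/// Returns true if `start` or any of its ancestors contains a `.git` entry.
/// `.git` can be a directory (normal checkout) or a file (linked worktree or
/// submodule), so only existence is checked, not the entry type.
pub fn has_git_ancestor(start: &Path) -> bool {
    // `ancestors()` yields `start` itself first, then each parent in turn.
    start.ancestors().any(|dir| dir.join(".git").exists())
}

fn main() {
    let cwd = std::env::current_dir().expect("current dir should be readable");
    if has_git_ancestor(&cwd) {
        println!("{}: may be inside a repository, a real lookup is worthwhile", cwd.display());
    } else {
        println!("{}: no .git ancestor, skip spawning git", cwd.display());
    }
}
```

The check costs a handful of filesystem stats, which is far cheaper than a process spawn on Windows. It can only rule a repository out; a positive result still needs a real lookup (gix or the git fallback) to confirm.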
added 2 commits
July 1, 2026 23:16
Consolidate the PYYAML_FALLBACK_PRELUDE/write_pyyaml_shim duplicates from agent_test and hermes_lcm_bridge_test into tests/common, drop the now-unused pyyaml_shim_pythonpath/python3_has_real_yaml probe helpers, and trim repeated subprocess-cost annotations to the canonical git_may_resolve_repo doc.
Summary
Follow-up to #174. The junit data from that run shows the remaining slow Windows tests are subprocess-spawn dominated (python3 via cmd shims, git, taskkill — each spawn ~100-500ms on Windows). This PR removes or consolidates those spawns:
- Hermes plugin/bridge tests (agent_test, hermes_lcm_bridge_test; the 5.1s/4.6s top offenders and the 44.5s binary total): consolidate sequential python3 invocations into single scripts with attributable failures, and cache generated read-only plugin artifacts once per process.
- Fake codex app-server and LSP fixtures (automation_backend_test, lsp_code_diagnostics_test): tighten fixture timeouts and polling; on Windows, skip the taskkill spawn when the child has already exited (src/sessions/codex_app_server.rs).
- Git lookups (src/branch.rs, src/worktree.rs, src/tracedecay.rs): current-branch, branch-exists, rev-list distance, remote URL, worktree root, and common-dir lookups now use gix, with a cheap .git-ancestor pre-flight before any remaining git fallback. Production code benefits too.
- BPE vocabulary test (Cargo.toml, src/dashboard/token_count.rs): compile tiktoken-rs / regex stack / base64 at opt-level 2 in dev/test builds (the 2.8s unit test is a one-time vocabulary decode + regex compile at -O0) and split the vocabulary test so each nextest process pays one model load.
- Dashboard LCM fixes and memory eval fixtures (dashboard_lcm_fixes_test, memory_eval_test): reuse the cached empty-schema store template from #174 (perf: cut Windows CI test fixture overhead).

Verification
cargo fmt --all -- --check, git diff --check, conventional-commit check

Test plan