perf: use CoW (copy-on-write) cloning for directory copies#122
perf: use CoW (copy-on-write) cloning for directory copies#122helizaga merged 6 commits intocoderabbitai:mainfrom
Conversation
|
Important Review skippedThis PR was authored by the user configured for CodeRabbit reviews. By default, CodeRabbit skips reviewing PRs authored by this user. It's recommended to use a dedicated user account to post CodeRabbit review feedback. To trigger a single review, invoke the You can disable this status message by setting the Use the checkbox below for a quick retry:
WalkthroughAdds Changes
Sequence Diagram(s)sequenceDiagram
participant Caller as Script (copy_directories)
participant Helper as _fast_copy_dir()
participant OS as OS cp
participant FS as Filesystem
Caller->>Helper: copy(src, dst)
Helper->>Helper: detect/read cached OS (_fast_copy_os)
alt Linux
Helper->>OS: cp --reflink=auto -RP src dst
else macOS
Helper->>OS: cp -cRP src dst
alt partial CoW created / failure
Helper->>FS: remove partial dst
Helper->>OS: cp -RP src dst
end
else Other
Helper->>OS: cp -RP src dst
end
OS->>FS: perform copy (reflink or full)
Helper->>Caller: return status and log result
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@lib/copy.sh`:
- Around line 88-93: The darwin branch can leave a partially-populated
destination when cp -cRP "$src" "$dest" fails, so modify the fallback to clean
the destination before retrying: after the failed cp -cRP "$src" "$dest"
attempt, detect and remove the partially-created destination (e.g., rm -rf
"$dest" only when it was created by the failed attempt) or perform the initial
clone into a temporary directory and then atomically rename/move it into
"$dest", then run cp -RP "$src" "$dest" as the fallback; update the darwin case
where cp -cRP and cp -RP are invoked to implement this cleanup/temporal-staging
logic.
🧹 Nitpick comments (2)
lib/copy.sh (1)
84-85:detect_osis invoked on every call; consider caching for loop-heavy callers.
copy_directoriescalls_fast_copy_dironce per matched directory. Each invocation spawns a subshell fordetect_os. You could cache the result in a module-level variable on first call.Proposed optimization
+# Cached OS value for _fast_copy_dir; set on first call. +_fast_copy_os="" + _fast_copy_dir() { local src="$1" dest="$2" - local os - os=$(detect_os) + if [ -z "$_fast_copy_os" ]; then + _fast_copy_os=$(detect_os) + fi + local os="$_fast_copy_os"tests/copy_safety.bats (1)
89-101: Temp directory cleanup won't run if an assertion fails.If
[ -f "$dst/mydir/sub/file.txt" ]fails on line 98, lines 100 (rm -rf) is never reached. Same pattern in the other two tests. Consider using a BATSteardownfunction or the$BATS_TEST_TMPDIRbuilt-in (BATS 1.4+) for automatic cleanup.Example using teardown
+teardown() { + # Clean up any temp dirs created during tests + [ -n "${_test_tmpdir:-}" ] && rm -rf "$_test_tmpdir" +} + `@test` "_fast_copy_dir copies directory contents" { - local src dst - src=$(mktemp -d) - dst=$(mktemp -d) + _test_tmpdir=$(mktemp -d) + local src="$_test_tmpdir/src" dst="$_test_tmpdir/dst" + mkdir -p "$src" "$dst" mkdir -p "$src/mydir/sub" echo "hello" > "$src/mydir/sub/file.txt" _fast_copy_dir "$src/mydir" "$dst/" [ -f "$dst/mydir/sub/file.txt" ] [ "$(cat "$dst/mydir/sub/file.txt")" = "hello" ] - rm -rf "$src" "$dst" }
helizaga
left a comment
There was a problem hiding this comment.
Nice contribution! The CoW optimization is well-scoped, correctly handles fallbacks, and addresses a real performance need for includeDirs users copying large directories like node_modules.
Review notes:
- All 239 tests pass (3 new + 236 existing)
- Manually verified: directory copy works, symlinks preserved, CoW cloning active on macOS APFS
set -esafe at the call site (guarded byif)- Fallback chain is correct on all platforms
Fix pushed: I added a commit (a0ce2f5) to fix a pre-existing Bash 3.2 bug discovered during testing — copy_directories() had a case inside $() inside a heredoc, which Bash 3.2 (macOS system shell) can't parse. This caused copy_directories() to silently fail on every invocation. The fix hoists the find command into a variable before the heredoc. This bug existed on main and was not introduced by this PR, but fixing it here means the CoW optimization actually works end-to-end on macOS.
Add _fast_copy_dir() that leverages filesystem-level cloning (macOS APFS cp -c, Linux cp --reflink=auto) for near-instant directory copies on supported filesystems, with automatic fallback to standard cp -RP on ext4/NTFS/others. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Clean up partial clone output before fallback in Darwin _fast_copy_dir - Cache detect_os result to avoid repeated subshell calls - Use teardown function in copy_safety tests for reliable tmpdir cleanup Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bash 3.2 cannot parse case/esac inside $() inside heredocs, causing copy_directories() to silently fail on macOS system Bash. Move the find command into a variable before the heredoc.
- Introduced a new job in the lint workflow to check if completion files are up to date. - Updated the generate-completions.sh script to include a new option for color output mode.
a0ce2f5 to
48a1630
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
lib/copy.sh (1)
78-90: Consider resetting the cache in tests or documenting the stale-cache risk.
_fast_copy_osis a module-level global that persists for the lifetime of the process. Ifdetect_oscould ever return different results (e.g., during testing with mockedOSTYPE), the cached value will be stale. This is unlikely in production but could cause confusing test failures.A simple
_fast_copy_os=""in testsetup()would make tests deterministic.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/copy.sh` around lines 78 - 90, The module-level cache variable _fast_copy_os used by _fast_copy_dir can become stale during tests that mock detect_os; add a small cache-reset mechanism and documentation so tests remain deterministic: create a tiny helper function (e.g., _fast_copy_reset_cache) that sets _fast_copy_os="" and call that in test setup (or alternatively move the cache into a function-scoped static or document the stale-cache risk in the module comment), and reference _fast_copy_os, _fast_copy_dir and detect_os when updating tests/docs so maintainers know how to clear the cache.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/copy.sh`:
- Around line 98-101: The short-circuit "[ -e \"$ _clone_target\" ] && rm -rf
\"$ _clone_target\"" can fail under set -e and abort before the cp -RP fallback;
change this to an explicit guarded check such as: if [ -e "$_clone_target" ];
then rm -rf "$_clone_target"; fi (or append "|| true" to the rm command) so that
absence of the target does not cause the function to exit and the subsequent cp
-RP "$src" "$dest" still runs.
---
Nitpick comments:
In `@lib/copy.sh`:
- Around line 78-90: The module-level cache variable _fast_copy_os used by
_fast_copy_dir can become stale during tests that mock detect_os; add a small
cache-reset mechanism and documentation so tests remain deterministic: create a
tiny helper function (e.g., _fast_copy_reset_cache) that sets _fast_copy_os=""
and call that in test setup (or alternatively move the cache into a
function-scoped static or document the stale-cache risk in the module comment),
and reference _fast_copy_os, _fast_copy_dir and detect_os when updating
tests/docs so maintainers know how to clear the cache.
- Use if/fi instead of && for partial-clone cleanup to avoid exit code 1 leaking under set -e when target doesn't exist - Reset _fast_copy_os in test setup for deterministic test isolation
Summary
_fast_copy_dir()tolib/copy.shthat uses CoW file cloning when supported by the filesystemcp -cRP(clone) with fallback tocp -RPcp --reflink=auto -RP(auto-fallback built in)cp -RP(no behavior change)cp -RPincopy_directories()with_fast_copy_dir_fast_copy_dir(contents, symlinks, error handling)Motivation
Copying large directories like
node_modulesor.venvviagtr.copy.includeDirscan be slow. On CoW-capable filesystems (APFS, Btrfs, XFS), file cloning is near-instant regardless of directory size, as only metadata is copied.Risk
cp -RPbehaviorcopy_patterns) are intentionally unchanged, as CoW overhead outweighs benefit for small config files🤖 Generated with Claude Code
Summary by CodeRabbit
Chores
CI
Tests