Skip to content

perf(recording): reduce recording memory pressure#1831

Open
ibbxx wants to merge 1 commit into
CapSoftware:mainfrom
ibbxx:bounty-109-recording-memory-pressure
Open

perf(recording): reduce recording memory pressure#1831
ibbxx wants to merge 1 commit into
CapSoftware:mainfrom
ibbxx:bounty-109-recording-memory-pressure

Conversation

@ibbxx
Copy link
Copy Markdown

@ibbxx ibbxx commented May 17, 2026

/claim #109

Summary

  • Stream cursor flushes directly to disk without cloning and pretty-printing the full cursor event. history every interval.
    • Skip periodic cursor flushes when no cursor move/click events were added since the last write.
    • Lower instant-mode MP4 muxer queue default from 240 to 96 frames and add CAP_MP4_MUXER_BUFFER_SIZE_INSTANT for instant-only tuning.
    • Serialize muxer env-var tests behind a shared lock so process-wide env mutation does not race under parallel cargo test.

Why this helps

Long recordings currently reallocateand serialize the entire cursor history on every flush tick, even when no cursor event count changes. Streaming writes and no-op flush skipping reduce duration-scaled allocations. Lowering the instant muxer queue reduces retained frame capacity under encode/upload backpressure while keeping the global override available and adding an instant-specific override.

Greptile Summary

This PR reduces recording memory pressure through two independent changes: streaming cursor data directly to disk with borrowed slices instead of cloning and pretty-printing the full history, and lowering the instant-mode MP4 muxer frame queue from 240 to 96 with a new CAP_MP4_MUXER_BUFFER_SIZE_INSTANT env override.

  • cursor.rs: flush_cursor_data now uses File::create + BufWriter + serde_json::to_writer with a BorrowedCursorEvents<'a> wrapper to avoid allocations; a length-comparison guard skips the flush tick when no new cursor events have been recorded since the last write.
  • macos.rs: get_mp4_muxer_buffer_size gains a two-level env-var lookup (instant-specific override → global override → compile-time default); existing env-mutation tests are refactored behind a shared ENV_LOCK mutex but the helper with_muxer_env is not panic-safe, which can cause mutex poisoning and cascade-fail the test module.

Confidence Score: 3/5

The production recording changes are straightforward and low-risk, but the test refactor in macos.rs introduces a fragility that can make the entire test module fail as a cascade from a single assertion error.

The cursor streaming and muxer queue changes are correct. The concern is the with_muxer_env helper: when any test closure panics, Rust poisons the ENV_LOCK mutex, making every subsequent lock().unwrap() call panic — turning one test failure into a whole-module failure. A secondary issue is that File::create truncates the cursor output file before serialization completes, leaving an empty file on I/O error rather than preserving the last valid snapshot.

crates/recording/src/output_pipeline/macos.rs — specifically the with_muxer_env test helper and its mutex-poisoning exposure on panic.

Important Files Changed

Filename Overview
crates/recording/src/cursor.rs Streaming write optimization with BorrowedCursorEvents avoids cloning; skip-flush guard uses length-based comparison; File::create now truncates before serialization, risking data loss on write error.
crates/recording/src/output_pipeline/macos.rs Instant-mode MP4 muxer queue lowered from 240→96; new CAP_MP4_MUXER_BUFFER_SIZE_INSTANT env var added with correct fallback logic; test helper with_muxer_env introduced but not panic-safe — mutex poisoning on test failure will cascade to all sibling tests.

Comments Outside Diff (1)

  1. crates/recording/src/cursor.rs, line 62-88 (link)

    P2 File truncated before serialization succeeds

    File::create opens and truncates the output file before serde_json::to_writer has written anything. If serialization fails (e.g. mid-stream I/O error on the writer), the file is left empty or with partial bytes flushed on BufWriter drop — permanently losing the previously valid cursor data. The old code built the full JSON string in memory first, so the file was not touched on serialization failure. A safer pattern is to write to a temporary file and rename it over the destination only after a successful flush.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: crates/recording/src/cursor.rs
    Line: 62-88
    
    Comment:
    **File truncated before serialization succeeds**
    
    `File::create` opens and truncates the output file before `serde_json::to_writer` has written anything. If serialization fails (e.g. mid-stream I/O error on the writer), the file is left empty or with partial bytes flushed on `BufWriter` drop — permanently losing the previously valid cursor data. The old code built the full JSON string in memory first, so the file was not touched on serialization failure. A safer pattern is to write to a temporary file and rename it over the destination only after a successful flush.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 3
crates/recording/src/output_pipeline/macos.rs:1207-1215
**Mutex poisoning cascades on test failure**

If any test panics inside `with_muxer_env` (e.g. an `assert_eq!` fails), the thread unwinds while holding `_guard`. Rust marks the `Mutex` as poisoned when a thread panics while holding a lock. Every subsequent call to `ENV_LOCK.lock().unwrap()` will then panic with a `PoisonError`, causing all remaining tests in the module to fail regardless of their own correctness. Use `.unwrap_or_else(|e| e.into_inner())` to tolerate a poisoned mutex.

### Issue 2 of 3
crates/recording/src/cursor.rs:62-88
**File truncated before serialization succeeds**

`File::create` opens and truncates the output file before `serde_json::to_writer` has written anything. If serialization fails (e.g. mid-stream I/O error on the writer), the file is left empty or with partial bytes flushed on `BufWriter` drop — permanently losing the previously valid cursor data. The old code built the full JSON string in memory first, so the file was not touched on serialization failure. A safer pattern is to write to a temporary file and rename it over the destination only after a successful flush.

### Issue 3 of 3
crates/recording/src/output_pipeline/macos.rs:1229-1237
Env vars set inside `with_muxer_env` are not cleaned up if `test()` panics. After the panic is caught by the test harness, `CAP_MP4_MUXER_BUFFER_SIZE` / `CAP_MP4_MUXER_BUFFER_SIZE_INSTANT` remain set, which can pollute later tests. Wrapping `test()` in `catch_unwind` and resuming after cleanup ensures the vars are always removed.

```suggestion
            let result = std::panic::catch_unwind(std::panic::AssertUnwindSafe(test));

            unsafe {
                std::env::remove_var("CAP_MP4_MUXER_BUFFER_SIZE");
                std::env::remove_var("CAP_MP4_MUXER_BUFFER_SIZE_INSTANT");
            }

            match result {
                Ok(v) => v,
                Err(e) => std::panic::resume_unwind(e),
            }
        }
```

Reviews (1): Last reviewed commit: "perf(recording): reduce recording memory..." | Re-trigger Greptile

Greptile also left 2 inline comments on this PR.

@superagent-security superagent-security Bot added contributor:verified Contributor passed trust analysis. pr:verified PR passed security analysis. labels May 17, 2026
Comment on lines 1207 to +1215

static ENV_LOCK: std::sync::Mutex<()> = std::sync::Mutex::new(());

fn with_muxer_env<T>(
global: Option<&str>,
instant: Option<&str>,
test: impl FnOnce() -> T,
) -> T {
let _guard = ENV_LOCK.lock().unwrap();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Mutex poisoning cascades on test failure

If any test panics inside with_muxer_env (e.g. an assert_eq! fails), the thread unwinds while holding _guard. Rust marks the Mutex as poisoned when a thread panics while holding a lock. Every subsequent call to ENV_LOCK.lock().unwrap() will then panic with a PoisonError, causing all remaining tests in the module to fail regardless of their own correctness. Use .unwrap_or_else(|e| e.into_inner()) to tolerate a poisoned mutex.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/recording/src/output_pipeline/macos.rs
Line: 1207-1215

Comment:
**Mutex poisoning cascades on test failure**

If any test panics inside `with_muxer_env` (e.g. an `assert_eq!` fails), the thread unwinds while holding `_guard`. Rust marks the `Mutex` as poisoned when a thread panics while holding a lock. Every subsequent call to `ENV_LOCK.lock().unwrap()` will then panic with a `PoisonError`, causing all remaining tests in the module to fail regardless of their own correctness. Use `.unwrap_or_else(|e| e.into_inner())` to tolerate a poisoned mutex.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +1229 to +1237
let result = test();

unsafe {
std::env::remove_var("CAP_MP4_MUXER_BUFFER_SIZE");
std::env::remove_var("CAP_MP4_MUXER_BUFFER_SIZE_INSTANT");
}

result
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Env vars set inside with_muxer_env are not cleaned up if test() panics. After the panic is caught by the test harness, CAP_MP4_MUXER_BUFFER_SIZE / CAP_MP4_MUXER_BUFFER_SIZE_INSTANT remain set, which can pollute later tests. Wrapping test() in catch_unwind and resuming after cleanup ensures the vars are always removed.

Suggested change
let result = test();
unsafe {
std::env::remove_var("CAP_MP4_MUXER_BUFFER_SIZE");
std::env::remove_var("CAP_MP4_MUXER_BUFFER_SIZE_INSTANT");
}
result
}
let result = std::panic::catch_unwind(std::panic::AssertUnwindSafe(test));
unsafe {
std::env::remove_var("CAP_MP4_MUXER_BUFFER_SIZE");
std::env::remove_var("CAP_MP4_MUXER_BUFFER_SIZE_INSTANT");
}
match result {
Ok(v) => v,
Err(e) => std::panic::resume_unwind(e),
}
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/recording/src/output_pipeline/macos.rs
Line: 1229-1237

Comment:
Env vars set inside `with_muxer_env` are not cleaned up if `test()` panics. After the panic is caught by the test harness, `CAP_MP4_MUXER_BUFFER_SIZE` / `CAP_MP4_MUXER_BUFFER_SIZE_INSTANT` remain set, which can pollute later tests. Wrapping `test()` in `catch_unwind` and resuming after cleanup ensures the vars are always removed.

```suggestion
            let result = std::panic::catch_unwind(std::panic::AssertUnwindSafe(test));

            unsafe {
                std::env::remove_var("CAP_MP4_MUXER_BUFFER_SIZE");
                std::env::remove_var("CAP_MP4_MUXER_BUFFER_SIZE_INSTANT");
            }

            match result {
                Ok(v) => v,
                Err(e) => std::panic::resume_unwind(e),
            }
        }
```

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

🙋 Bounty claim contributor:verified Contributor passed trust analysis. pr:verified PR passed security analysis.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant