perf(recording): reduce recording memory pressure#1831
Conversation
|
|
||
| static ENV_LOCK: std::sync::Mutex<()> = std::sync::Mutex::new(()); | ||
|
|
||
| fn with_muxer_env<T>( | ||
| global: Option<&str>, | ||
| instant: Option<&str>, | ||
| test: impl FnOnce() -> T, | ||
| ) -> T { | ||
| let _guard = ENV_LOCK.lock().unwrap(); |
There was a problem hiding this comment.
Mutex poisoning cascades on test failure
If any test panics inside with_muxer_env (e.g. an assert_eq! fails), the thread unwinds while holding _guard. Rust marks the Mutex as poisoned when a thread panics while holding a lock. Every subsequent call to ENV_LOCK.lock().unwrap() will then panic with a PoisonError, causing all remaining tests in the module to fail regardless of their own correctness. Use .unwrap_or_else(|e| e.into_inner()) to tolerate a poisoned mutex.
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/recording/src/output_pipeline/macos.rs
Line: 1207-1215
Comment:
**Mutex poisoning cascades on test failure**
If any test panics inside `with_muxer_env` (e.g. an `assert_eq!` fails), the thread unwinds while holding `_guard`. Rust marks the `Mutex` as poisoned when a thread panics while holding a lock. Every subsequent call to `ENV_LOCK.lock().unwrap()` will then panic with a `PoisonError`, causing all remaining tests in the module to fail regardless of their own correctness. Use `.unwrap_or_else(|e| e.into_inner())` to tolerate a poisoned mutex.
How can I resolve this? If you propose a fix, please make it concise.| let result = test(); | ||
|
|
||
| unsafe { | ||
| std::env::remove_var("CAP_MP4_MUXER_BUFFER_SIZE"); | ||
| std::env::remove_var("CAP_MP4_MUXER_BUFFER_SIZE_INSTANT"); | ||
| } | ||
|
|
||
| result | ||
| } |
There was a problem hiding this comment.
Env vars set inside
with_muxer_env are not cleaned up if test() panics. After the panic is caught by the test harness, CAP_MP4_MUXER_BUFFER_SIZE / CAP_MP4_MUXER_BUFFER_SIZE_INSTANT remain set, which can pollute later tests. Wrapping test() in catch_unwind and resuming after cleanup ensures the vars are always removed.
| let result = test(); | |
| unsafe { | |
| std::env::remove_var("CAP_MP4_MUXER_BUFFER_SIZE"); | |
| std::env::remove_var("CAP_MP4_MUXER_BUFFER_SIZE_INSTANT"); | |
| } | |
| result | |
| } | |
| let result = std::panic::catch_unwind(std::panic::AssertUnwindSafe(test)); | |
| unsafe { | |
| std::env::remove_var("CAP_MP4_MUXER_BUFFER_SIZE"); | |
| std::env::remove_var("CAP_MP4_MUXER_BUFFER_SIZE_INSTANT"); | |
| } | |
| match result { | |
| Ok(v) => v, | |
| Err(e) => std::panic::resume_unwind(e), | |
| } | |
| } |
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/recording/src/output_pipeline/macos.rs
Line: 1229-1237
Comment:
Env vars set inside `with_muxer_env` are not cleaned up if `test()` panics. After the panic is caught by the test harness, `CAP_MP4_MUXER_BUFFER_SIZE` / `CAP_MP4_MUXER_BUFFER_SIZE_INSTANT` remain set, which can pollute later tests. Wrapping `test()` in `catch_unwind` and resuming after cleanup ensures the vars are always removed.
```suggestion
let result = std::panic::catch_unwind(std::panic::AssertUnwindSafe(test));
unsafe {
std::env::remove_var("CAP_MP4_MUXER_BUFFER_SIZE");
std::env::remove_var("CAP_MP4_MUXER_BUFFER_SIZE_INSTANT");
}
match result {
Ok(v) => v,
Err(e) => std::panic::resume_unwind(e),
}
}
```
How can I resolve this? If you propose a fix, please make it concise.
/claim #109
Summary
Why this helps
Long recordings currently reallocateand serialize the entire cursor history on every flush tick, even when no cursor event count changes. Streaming writes and no-op flush skipping reduce duration-scaled allocations. Lowering the instant muxer queue reduces retained frame capacity under encode/upload backpressure while keeping the global override available and adding an instant-specific override.
Greptile Summary
This PR reduces recording memory pressure through two independent changes: streaming cursor data directly to disk with borrowed slices instead of cloning and pretty-printing the full history, and lowering the instant-mode MP4 muxer frame queue from 240 to 96 with a new
CAP_MP4_MUXER_BUFFER_SIZE_INSTANTenv override.flush_cursor_datanow usesFile::create+BufWriter+serde_json::to_writerwith aBorrowedCursorEvents<'a>wrapper to avoid allocations; a length-comparison guard skips the flush tick when no new cursor events have been recorded since the last write.get_mp4_muxer_buffer_sizegains a two-level env-var lookup (instant-specific override → global override → compile-time default); existing env-mutation tests are refactored behind a sharedENV_LOCKmutex but the helperwith_muxer_envis not panic-safe, which can cause mutex poisoning and cascade-fail the test module.Confidence Score: 3/5
The production recording changes are straightforward and low-risk, but the test refactor in macos.rs introduces a fragility that can make the entire test module fail as a cascade from a single assertion error.
The cursor streaming and muxer queue changes are correct. The concern is the with_muxer_env helper: when any test closure panics, Rust poisons the ENV_LOCK mutex, making every subsequent lock().unwrap() call panic — turning one test failure into a whole-module failure. A secondary issue is that File::create truncates the cursor output file before serialization completes, leaving an empty file on I/O error rather than preserving the last valid snapshot.
crates/recording/src/output_pipeline/macos.rs — specifically the with_muxer_env test helper and its mutex-poisoning exposure on panic.
Important Files Changed
Comments Outside Diff (1)
crates/recording/src/cursor.rs, line 62-88 (link)File::createopens and truncates the output file beforeserde_json::to_writerhas written anything. If serialization fails (e.g. mid-stream I/O error on the writer), the file is left empty or with partial bytes flushed onBufWriterdrop — permanently losing the previously valid cursor data. The old code built the full JSON string in memory first, so the file was not touched on serialization failure. A safer pattern is to write to a temporary file and rename it over the destination only after a successful flush.Prompt To Fix With AI
Prompt To Fix All With AI
Reviews (1): Last reviewed commit: "perf(recording): reduce recording memory..." | Re-trigger Greptile