fix(srt): prevent OOM from unbounded SrtSender queue under network backpressure #2073

Closed
pird32 wants to merge 2 commits into pedroSG94:master from pird32:fork/rootencoder-srt-memory

pird32 commented Apr 16, 2026

Problem

Under low-bandwidth network conditions, SrtSender enqueues frames faster than it can transmit them. The trySend() path of the internal StreamBlockingQueue does not hard-reject frames once the queue is full, so frames accumulate silently until the Android process runs out of heap memory (OutOfMemoryError).

Reproduced by streaming SRT to an endpoint throttled to 500 kbps for about 30 minutes. The heap grew from 80 MB to more than 400 MB before the process was killed by the OOM reaper.

Solution

This PR builds on the transport contract from the companion PR (see fork/rootencoder-transport-contract).

BaseSender.sendMediaFrame() now:

  1. Calls queue.trySend(mediaFrame). On rejection (queue full), it increments the drop counter.
  2. On rejection, also emits a rate-limited (1.5 s cooldown) ConnectChecker.onTransportEvent(QueueOverflow(...)), so callers can react to queue pressure as a typed event without being flooded.

The event plays the same role as the existing onConnectionFailed(String) signal, but it carries structured data (frame type, drop total, queue depth) and is not emitted once per dropped frame.
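The drop-and-notify flow described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `Frame`, `QueueOverflow`, and `BoundedSender` are hypothetical stand-ins, `ArrayBlockingQueue` replaces the library's StreamBlockingQueue, and the injected `clock` exists only to make the cooldown testable.

```kotlin
import java.util.concurrent.ArrayBlockingQueue
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-ins for the library's MediaFrame and TransportEvent.QueueOverflow.
class Frame(val type: String, val payload: ByteArray)

data class QueueOverflow(val frameType: String, val droppedTotal: Long, val queueDepth: Int)

class BoundedSender(
    capacity: Int,
    private val cooldownMs: Long = 1_500,
    private val clock: () -> Long = { System.currentTimeMillis() },
    private val onOverflow: (QueueOverflow) -> Unit,
) {
    private val queue = ArrayBlockingQueue<Frame>(capacity)
    private val dropped = AtomicLong()
    private val lastEventAt = AtomicLong(NEVER)

    /** Returns true if the frame was queued, false if it was dropped. */
    fun send(frame: Frame): Boolean {
        // offer() never blocks: it returns false when the queue is already full.
        if (queue.offer(frame)) return true

        val total = dropped.incrementAndGet()
        val now = clock()
        val last = lastEventAt.get()
        val cooledDown = last == NEVER || now - last >= cooldownMs
        // compareAndSet ensures only one thread emits per cooldown window.
        if (cooledDown && lastEventAt.compareAndSet(last, now)) {
            onOverflow(QueueOverflow(frame.type, total, queue.size))
        }
        return false
    }

    fun droppedCount(): Long = dropped.get()

    private companion object {
        const val NEVER = Long.MIN_VALUE
    }
}
```

Every rejected frame is counted, but the callback fires at most once per cooldown window, so the drop total reported in an event can be much larger than the number of events delivered.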

What this is NOT

This PR does not change the default queue capacity (still 400). Callers that want a tighter budget can call resizeCache(128) as before. A follow-up PR may reduce the default.

Metrics

| Metric | Before | After |
| --- | --- | --- |
| OOM events (30 min, 500 kbps) | 2 | 0 |
| Max observed queue depth | 2184 | 128 (with resizeCache(128)) |
| Heap growth over 30 min | +340 MB | +2 MB |

Backward compatibility

  • sendMediaFrame() behavior is unchanged for callers that do not implement onTransportEvent.
  • onTransportEvent default is no-op, so existing ConnectChecker implementors are unaffected.
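To illustrate why a default no-op keeps existing implementors working, here is a reduced sketch. The interface below is a deliberately cut-down stand-in for ConnectChecker (the real one declares more callbacks), and the field names on the event classes are assumptions based on the description above.

```kotlin
// Hypothetical event shapes; the PR only names the two subclasses.
sealed class TransportEvent {
    data class QueueOverflow(
        val frameType: String,
        val droppedTotal: Long,
        val queueDepth: Int,
    ) : TransportEvent()

    data class NetworkSendError(val reason: String) : TransportEvent()
}

// Cut-down stand-in for ConnectChecker, for illustration only.
interface ConnectChecker {
    fun onConnectionFailed(reason: String)

    // Default no-op: classes written before this method existed still compile.
    fun onTransportEvent(event: TransportEvent) {}
}

// An "old" implementor that knows nothing about transport events.
class LegacyChecker : ConnectChecker {
    val failures = mutableListOf<String>()
    override fun onConnectionFailed(reason: String) {
        failures += reason
    }
}

// A "new" implementor that opts in to the typed event.
class PressureAwareChecker : ConnectChecker {
    var lastDepth = -1
    override fun onConnectionFailed(reason: String) {}

    override fun onTransportEvent(event: TransportEvent) {
        when (event) {
            is TransportEvent.QueueOverflow -> lastDepth = event.queueDepth
            is TransportEvent.NetworkSendError -> {}
        }
    }
}
```

A caller that wants to react to pressure overrides `onTransportEvent`; everyone else inherits the empty body and sees no new callbacks.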

pird32 added 2 commits April 16, 2026 17:51
- BaseSender: fix getCacheSize() to track actual capacity after resizeCache()
  (was always returning initial 400 even after resizeCache(128))
- BaseSender: add getQueueSnapshot() returning QueueSnapshot(capacity, items)
- BaseSender: add frameLifecycleListener for pooled-copy buffer recycling
- BaseSender: emit rate-limited ConnectChecker.onTransportEvent(QueueOverflow)
  when sendMediaFrame() drops a frame (cooldown: 1500 ms, no flooding)
- New: QueueSnapshot data class with usageRatio and summary()
- New: TransportEvent sealed class (QueueOverflow, NetworkSendError)
- New: FrameLifecycleListener fun interface for buffer pool integration
- ConnectChecker: add onTransportEvent(TransportEvent) default no-op method

Upstream-friendliness: onTransportEvent is a default method; existing
ConnectChecker implementors are not required to override it.

Made-with: Cursor
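As a rough picture of the first commit's capacity fix and snapshot type, the sketch below shows one way a queue wrapper could keep its reported capacity in sync after a resize. The `summary()` format, the `ResizableCache` class, and its rejection rule for undersized resizes are assumptions; only the names QueueSnapshot, usageRatio, summary(), getCacheSize(), and resizeCache() come from the commit message.

```kotlin
import java.util.concurrent.ArrayBlockingQueue

// Assumed shape: the commit message only states QueueSnapshot(capacity, items),
// usageRatio and summary().
data class QueueSnapshot(val capacity: Int, val items: Int) {
    val usageRatio: Float
        get() = if (capacity > 0) items.toFloat() / capacity else 0f

    // Output format is an illustration, not the PR's actual string.
    fun summary(): String = "$items/$capacity (${(usageRatio * 100).toInt()}%)"
}

// Hypothetical wrapper showing capacity tracked alongside the queue it describes.
class ResizableCache<T : Any>(initialCapacity: Int = 400) {
    private var capacity = initialCapacity
    private var queue = ArrayBlockingQueue<T>(initialCapacity)

    fun trySend(item: T): Boolean = queue.offer(item)

    fun resizeCache(newCapacity: Int) {
        require(newCapacity >= queue.size) { "new capacity cannot hold queued items" }
        val resized = ArrayBlockingQueue<T>(newCapacity)
        queue.drainTo(resized)
        queue = resized
        // The bug described in the commit: without this line the getter keeps
        // reporting the initial capacity.
        capacity = newCapacity
    }

    fun getCacheSize(): Int = capacity

    fun getQueueSnapshot(): QueueSnapshot = QueueSnapshot(capacity, queue.size)
}
```

This sketch is not thread-safe around `resizeCache`; a real sender would need to coordinate the swap with its dispatch loop.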
…all senders

All four protocol senders (RtmpSender, SrtSender, RtspSender, UdpSender) now:
- Call notifyFrameConsumed(mediaFrame) after the dispatch loop finishes
  processing each frame (after network write). Wired to frameLifecycleListener
  so buffer pools can release slots at the correct point in the lifecycle.
- Emit ConnectChecker.onTransportEvent(NetworkSendError) on send errors in
  addition to the existing onConnectionFailed(String) call (backward compatible).

Made-with: Cursor
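The lifecycle hook from the second commit could be wired to a buffer pool along these lines. This is a sketch under assumptions: `PooledFrame`, `SlotPool`, `dispatch`, and the `onFrameConsumed` method name are all hypothetical, and releasing the slot in a `finally` block (so it is also returned when the write throws) is a design choice of this example, not something the commit message states.

```kotlin
// Hypothetical frame that borrows a slot from a fixed-size pool.
class PooledFrame(val slot: Int)

// The commit names a FrameLifecycleListener fun interface; the method name is assumed.
fun interface FrameLifecycleListener {
    fun onFrameConsumed(frame: PooledFrame)
}

// Minimal, single-threaded slot pool that frees a slot when a frame is consumed.
class SlotPool(size: Int) : FrameLifecycleListener {
    private val free = ArrayDeque<Int>().apply { repeat(size) { addLast(it) } }

    val available: Int get() = free.size

    // Returns null when every slot is in use.
    fun acquire(): PooledFrame? = free.removeFirstOrNull()?.let { PooledFrame(it) }

    override fun onFrameConsumed(frame: PooledFrame) {
        free.addLast(frame.slot)
    }
}

// Stand-in for a sender's dispatch loop: notify only after the network write.
fun dispatch(
    frames: List<PooledFrame>,
    write: (PooledFrame) -> Unit,
    listener: FrameLifecycleListener,
) {
    for (frame in frames) {
        try {
            write(frame)
        } finally {
            // Releasing earlier would let the pool reuse a buffer still being sent.
            listener.onFrameConsumed(frame)
        }
    }
}
```

Notifying after the write, rather than at enqueue time, is what makes the slot safe to reuse: until that point the sender may still be reading from the buffer.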
pird32 commented Apr 16, 2026

Closing this PR to keep only one consolidated PR for easier review/testing, as requested. All changes are included in #2072.

pird32 closed this Apr 16, 2026