TricklePublisher.SegmentWriter masks CancelledError as TrickleSegmentWriteError on shutdown

## Symptom

Normal session shutdown produces tracebacks that look like errors but happen on every clean teardown:

```
livepeer_gateway.trickle_publisher.TrickleSegmentWriteError:
    Trickle segment writer timed out after 5.0s
```

```
Trickle segment close suppressed seq=0
...
TimeoutError
```

## Root cause (revised after investigation)

**Not** an `asyncio.wait_for` cancellation race (initial hypothesis was wrong). Verified via timestamped logs after migrating to `asyncio.timeout` — the timeouts still fire because they're **legitimate**:

1. Mid-stream, the HTTP post-body consumer for the active segment becomes wedged or closed (the orchestrator-side reader stops reading, or the connection drops).
2. `SegmentWriter`'s internal queue fills up.
3. The next `await self.queue.put(data)` in `_stream_pipe_to_trickle` blocks waiting for queue space.
4. After 5s, the timeout fires → `TrickleSegmentWriteError` → `MediaPublish` logs `"dropped segment ... mid-stream"` and enters drain mode.
5. On `/stream/stop`, `SegmentWriter.close()` calls `queue.put(None)` to signal end-of-stream — but the consumer is gone, so it hangs.
6. After 5s the timeout fires → `"Trickle segment close suppressed"` warning logged.

So during shutdown, every active segment with a wedged consumer adds 5s + a misleading-looking traceback before teardown completes. PR #7 / `live_grayscale/test.sh` reproduces it consistently.

## Proposed fix (needs work)

`SegmentWriter` needs to know when its associated HTTP post-task is dead and short-circuit `close()` / `write()` rather than block on `queue.put` forever. Options:

- Track the post-task on the segment; on `close()` check `post_task.done()` before `queue.put(None)`.
- Use `put_nowait()` and absorb `QueueFull`.
- Reduce the close-path timeout to e.g. 0.5s (still misleading but faster).

## What PR #11 does *not* fix

PR #11 migrates `wait_for` → `asyncio.timeout` which is **modernization**, not a fix for this issue. The shutdown noise persists after that migration; investigation here is separate.

## Reproduction

```bash
cd examples/runner/live_grayscale
docker compose up -d --wait --build
SKIP_VIEWER=1 bash test.sh        # passes
docker logs live_grayscale        # shows the tracebacks during shutdown
```

## Severity

Cosmetic — teardown completes correctly, but the noise causes false alarms during debugging and adds 5s × N-active-segments to shutdown latency.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TricklePublisher.SegmentWriter masks CancelledError as TrickleSegmentWriteError on shutdown #10

Symptom

Root cause (revised after investigation)

Proposed fix (needs work)

What PR #11 does not fix

Reproduction

Severity

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

TricklePublisher.SegmentWriter masks CancelledError as TrickleSegmentWriteError on shutdown #10

Description

Symptom

Root cause (revised after investigation)

Proposed fix (needs work)

What PR #11 does not fix

Reproduction

Severity

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions