Barge-in refactoring and remote session event support #4834
chenghao-mou merged 27 commits into chenghaomou/v1.5.0 from
Replace flat turn-handling fields with a single turn_handling
TurnHandlingConfig field. Properties provide type-safe access to
the resolved endpointing/interruption sub-dicts. Remove bool
shorthand for interruption config — callers must use
{"enabled": False} instead.
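A minimal sketch of the shape this commit message describes, assuming a TypedDict-based config. `SessionOptions`, `EndpointingConfig`, `InterruptionConfig`, the `min_delay` key, and the default values are hypothetical names chosen for illustration; only `turn_handling`, `TurnHandlingConfig`, and `{"enabled": False}` come from the description above.

```python
from typing import Optional, TypedDict


class EndpointingConfig(TypedDict, total=False):
    min_delay: float  # hypothetical key, for illustration only


class InterruptionConfig(TypedDict, total=False):
    enabled: bool


class TurnHandlingConfig(TypedDict, total=False):
    endpointing: EndpointingConfig
    interruption: InterruptionConfig


class SessionOptions:
    """Hypothetical holder with a single turn_handling field."""

    def __init__(self, turn_handling: Optional[TurnHandlingConfig] = None) -> None:
        cfg = dict(turn_handling or {})
        interruption = cfg.get("interruption", {})
        # The bool shorthand is no longer accepted; require a dict.
        if isinstance(interruption, bool):
            raise TypeError('use {"enabled": False} instead of a bare bool')
        # Resolve defaults (assumed values) so both sub-dicts always exist.
        self.turn_handling: TurnHandlingConfig = {
            "endpointing": {"min_delay": 0.5, **cfg.get("endpointing", {})},
            "interruption": {"enabled": True, **interruption},
        }

    @property
    def endpointing(self) -> EndpointingConfig:
        # Typed access to the resolved endpointing sub-dict.
        return self.turn_handling["endpointing"]

    @property
    def interruption(self) -> InterruptionConfig:
        # Typed access to the resolved interruption sub-dict.
        return self.turn_handling["interruption"]
```

Under these assumptions, `SessionOptions({"interruption": {"enabled": False}})` disables interruptions, while `SessionOptions({"interruption": False})` raises `TypeError`.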
bd82e82 to ce8b953
Primary release flow now runs from cloud-browser repo's build-native.yml. This workflow is a manual fallback that takes a cloud-browser CI run ID and creates a release from its artifacts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
changes made to the new RemoteSession lgtm!
Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
             """
    -        if not self._interruption_enabled:
    -            return
    -        if not is_given(self._ignore_user_transcript_until):
    -            return
    -        if not self._transcript_buffer:
    -            return
    -
    -        if not self._input_started_at:
    -            self._transcript_buffer.clear()
    -            self._ignore_user_transcript_until = NOT_GIVEN
    +        if (
    +            not self._interruption_enabled
    +            or not is_given(self._ignore_user_transcript_until)
    +            or not self._transcript_buffer
    +            or self._input_started_at is None
    +        ):
    +            self._reset_interruption_detection()
🟡 _flush_held_transcripts prematurely clears _ignore_user_transcript_until when transcript buffer is empty
When the agent finishes speaking, _ignore_user_transcript_until is set to filter out STT transcripts that were generated from audio played during agent speech. Then _flush_held_transcripts is scheduled as an async task. If this task runs before any held transcripts have arrived from the STT pipeline (i.e., the buffer is still empty), the new code clears _ignore_user_transcript_until — the old code intentionally kept it.
Root Cause and Impact
In the old code at audio_recognition.py, the early-return conditions were separate:

    if not self._transcript_buffer:
        return  # kept _ignore_user_transcript_until intact

In the new code, ALL four conditions trigger _reset_interruption_detection(), which clears both the buffer and _ignore_user_transcript_until:

    if (
        not self._interruption_enabled
        or not is_given(self._ignore_user_transcript_until)
        or not self._transcript_buffer  # <-- this branch now clears the ignore timestamp
        or self._input_started_at is None
    ):
        self._reset_interruption_detection()  # clears buffer AND _ignore_user_transcript_until
        return

This matters in the following race:

1. on_end_of_agent_speech sets _ignore_user_transcript_until and schedules _flush_held_transcripts
2. _flush_held_transcripts runs; the buffer is still empty (STT hasn't delivered events yet)
3. New code: _ignore_user_transcript_until is cleared
4. A delayed STT event with a timestamp during agent speech arrives → _should_hold_stt_event returns False (ignore is cleared) → stale transcript leaks through
The old code kept _ignore_user_transcript_until so step 4 would still filter the stale event.
Impact: Stale transcripts from during agent speech can leak into the conversation, potentially causing the agent to respond to backchannel noise or the user's echo of the agent's own words.
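The race can be replayed with a toy model. This is a sketch under stated assumptions, not the real audio_recognition.py: `RecognitionModel`, `replay_race`, and all attribute names are hypothetical, `None` stands in for NOT_GIVEN, the flush runs synchronously instead of as a scheduled task, and a stale STT event is simply dropped rather than held. The `combined_check` flag selects between the new combined condition and a variant that returns on an empty buffer without clearing the ignore timestamp, as the old code did.

```python
from typing import List, Optional


class RecognitionModel:
    """Toy stand-in for the recognition state; not the real class."""

    def __init__(self, combined_check: bool) -> None:
        self.combined_check = combined_check
        self.interruption_enabled = True
        self.input_started_at: Optional[float] = 0.0
        self.ignore_until: Optional[float] = None  # None plays the role of NOT_GIVEN
        self.buffer: List[str] = []
        self.delivered: List[str] = []

    def _reset(self) -> None:
        # Mirrors the described reset: clears the buffer AND the ignore timestamp.
        self.buffer.clear()
        self.ignore_until = None

    def on_end_of_agent_speech(self, ended_at: float) -> None:
        self.ignore_until = ended_at
        self.flush_held_transcripts()  # the real code schedules this as a task

    def flush_held_transcripts(self) -> None:
        unusable = (
            not self.interruption_enabled
            or self.ignore_until is None
            or self.input_started_at is None
        )
        if self.combined_check:
            # New behavior: an empty buffer also triggers the reset.
            if unusable or not self.buffer:
                self._reset()
                return
        else:
            if unusable:
                self._reset()
                return
            if not self.buffer:
                return  # keep ignore_until for late STT events
        # Simplified flush: deliver whatever was held, then reset.
        self.delivered.extend(self.buffer)
        self._reset()

    def on_stt_event(self, text: str, timestamp: float) -> None:
        # Simplification: stale events are dropped instead of being held.
        if self.ignore_until is not None and timestamp < self.ignore_until:
            return
        self.delivered.append(text)


def replay_race(combined_check: bool) -> List[str]:
    model = RecognitionModel(combined_check)
    model.on_end_of_agent_speech(ended_at=10.0)  # steps 1-3: flush sees an empty buffer
    model.on_stt_event("echo of agent", timestamp=9.5)  # step 4: delayed stale event
    return model.delivered
```

In this model, `replay_race(True)` lets the stale transcript through, while `replay_race(False)` filters it because the ignore timestamp survives the empty-buffer flush.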
    -        if (
    -            not self._interruption_enabled
    -            or not is_given(self._ignore_user_transcript_until)
    -            or not self._transcript_buffer
    -            or self._input_started_at is None
    -        ):
    -            self._reset_interruption_detection()
    +        if (
    +            not self._interruption_enabled
    +            or not is_given(self._ignore_user_transcript_until)
    +            or self._input_started_at is None
    +        ):
    +            self._reset_interruption_detection()
    +            return
    +        if not self._transcript_buffer:
    +            return