Barge-in refactoring and remote session event support#4834

Merged
chenghao-mou merged 27 commits into chenghaomou/v1.5.0 from fix/final-cleanup
Feb 26, 2026
@chenghao-mou
Member

@chenghao-mou chenghao-mou commented Feb 15, 2026

  • Final cleanup for barge-in
  • Drop BaseModel support
  • Add remote session events for the playground debug panel

devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

Replace flat turn-handling fields with a single turn_handling
TurnHandlingConfig field. Properties provide type-safe access to
the resolved endpointing/interruption sub-dicts. Remove bool
shorthand for interruption config — callers must use
{"enabled": False} instead.
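
A minimal sketch of the shape this commit message describes, assuming dict-based configs. Only turn_handling, TurnHandlingConfig, and the {"enabled": False} form come from the commit message; EndpointingConfig, InterruptionConfig, SessionOptions, their keys, and the default values are hypothetical names chosen for illustration and may not match the real code.

```python
from dataclasses import dataclass, field
from typing import TypedDict


class EndpointingConfig(TypedDict, total=False):
    # Hypothetical keys, for illustration only.
    min_delay: float
    max_delay: float


class InterruptionConfig(TypedDict, total=False):
    enabled: bool


class TurnHandlingConfig(TypedDict, total=False):
    endpointing: EndpointingConfig
    interruption: InterruptionConfig


@dataclass
class SessionOptions:
    """Hypothetical options holder with a single nested turn-handling field."""

    turn_handling: TurnHandlingConfig = field(default_factory=dict)

    def __post_init__(self) -> None:
        # The bool shorthand is rejected; callers pass {"enabled": False}.
        if isinstance(self.turn_handling.get("interruption"), bool):
            raise TypeError('use {"enabled": False} instead of a bool')

    @property
    def endpointing(self) -> EndpointingConfig:
        # Resolve the sub-dict against assumed defaults.
        defaults = {"min_delay": 0.5, "max_delay": 3.0}
        return {**defaults, **self.turn_handling.get("endpointing", {})}

    @property
    def interruption(self) -> InterruptionConfig:
        return {"enabled": True, **self.turn_handling.get("interruption", {})}


opts = SessionOptions(turn_handling={"interruption": {"enabled": False}})
```

Here opts.interruption resolves to the caller's override while opts.endpointing falls back to the assumed defaults, so callers read one nested field instead of several flat ones.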
theomonnom and others added 8 commits February 15, 2026 20:37
Primary release flow now runs from cloud-browser repo's build-native.yml.
This workflow is a manual fallback that takes a cloud-browser CI run ID
and creates a release from its artifacts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

@chenghao-mou chenghao-mou changed the title from "Support remote session events" to "Barge-in refactoring and remote session event support" Feb 17, 2026
devin-ai-integration[bot]

This comment was marked as resolved.

@chenghao-mou chenghao-mou requested a review from a team February 18, 2026 14:51
Member

changes made to the new RemoteSession lgtm!

Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@chenghao-mou chenghao-mou merged commit e7c3e15 into chenghaomou/v1.5.0 Feb 26, 2026
4 checks passed
@chenghao-mou chenghao-mou deleted the fix/final-cleanup branch February 26, 2026 11:26
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 1 new potential issue.

View 34 additional findings in Devin Review.

Comment on lines 284 to +291
  """
- if not self._interruption_enabled:
-     return
- if not is_given(self._ignore_user_transcript_until):
-     return
- if not self._transcript_buffer:
-     return
-
- if not self._input_started_at:
-     self._transcript_buffer.clear()
-     self._ignore_user_transcript_until = NOT_GIVEN
+ if (
+     not self._interruption_enabled
+     or not is_given(self._ignore_user_transcript_until)
+     or not self._transcript_buffer
+     or self._input_started_at is None
+ ):
+     self._reset_interruption_detection()
🟡 _flush_held_transcripts prematurely clears _ignore_user_transcript_until when transcript buffer is empty

When the agent finishes speaking, _ignore_user_transcript_until is set to filter out STT transcripts that were generated from audio played during agent speech. Then _flush_held_transcripts is scheduled as an async task. If this task runs before any held transcripts have arrived from the STT pipeline (i.e., the buffer is still empty), the new code clears _ignore_user_transcript_until — the old code intentionally kept it.

Root Cause and Impact

In the old code at audio_recognition.py, the early-return conditions were separate:

if not self._transcript_buffer:
    return  # kept _ignore_user_transcript_until intact

In the new code, any one of the four conditions triggers _reset_interruption_detection(), which clears both the buffer and _ignore_user_transcript_until:

if (
    not self._interruption_enabled
    or not is_given(self._ignore_user_transcript_until)
    or not self._transcript_buffer  # <-- this branch now clears the ignore timestamp
    or self._input_started_at is None
):
    self._reset_interruption_detection()  # clears buffer AND _ignore_user_transcript_until
    return

This matters in the following race:

  1. on_end_of_agent_speech sets _ignore_user_transcript_until and schedules _flush_held_transcripts
  2. _flush_held_transcripts runs — buffer is still empty (STT hasn't delivered events yet)
  3. New code: _ignore_user_transcript_until is cleared
  4. A delayed STT event with a timestamp during agent speech arrives → _should_hold_stt_event returns False (ignore is cleared) → stale transcript leaks through

The old code kept _ignore_user_transcript_until so step 4 would still filter the stale event.

Impact: Stale transcripts from during agent speech can leak into the conversation, potentially causing the agent to respond to backchannel noise or the user's echo of the agent's own words.
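
The race above can be reproduced with a toy model. This is a sketch under stated assumptions, not the real audio_recognition.py class: ToyRecognizer, its fields, and run_race are hypothetical, None stands in for NOT_GIVEN, and the actual flushing of a non-empty buffer is left out because the comment does not show it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyRecognizer:
    """Toy stand-in that models only the guards discussed in the review."""

    reset_on_empty_buffer: bool  # True mimics the merged guard, False the suggested one
    interruption_enabled: bool = True
    input_started_at: Optional[float] = 0.0
    ignore_until: Optional[float] = None  # None plays the role of NOT_GIVEN
    buffer: List[str] = field(default_factory=list)
    delivered: List[str] = field(default_factory=list)

    def _reset(self) -> None:
        self.buffer.clear()
        self.ignore_until = None

    def on_end_of_agent_speech(self, ended_at: float) -> None:
        self.ignore_until = ended_at

    def flush_held_transcripts(self) -> None:
        if (
            not self.interruption_enabled
            or self.ignore_until is None
            or self.input_started_at is None
        ):
            self._reset()
            return
        if not self.buffer:
            if self.reset_on_empty_buffer:
                self._reset()  # merged behaviour: the ignore timestamp is lost
            return
        # Flushing a non-empty buffer is omitted in this toy.

    def on_stt_event(self, text: str, timestamp: float) -> None:
        # Hold events stamped before the end of agent speech.
        if self.ignore_until is not None and timestamp < self.ignore_until:
            self.buffer.append(text)
            return
        self.delivered.append(text)


def run_race(reset_on_empty_buffer: bool) -> List[str]:
    rec = ToyRecognizer(reset_on_empty_buffer=reset_on_empty_buffer)
    rec.on_end_of_agent_speech(ended_at=10.0)  # step 1
    rec.flush_held_transcripts()               # step 2: buffer still empty
    rec.on_stt_event("echo of agent", 9.5)     # step 4: delayed stale event
    return rec.delivered


leaked = run_race(reset_on_empty_buffer=True)
filtered = run_race(reset_on_empty_buffer=False)
```

With the merged guard the stale event ends up in leaked; with the empty-buffer check separated out, filtered stays empty because the event is still held.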

Suggested change
  """
- if not self._interruption_enabled:
-     return
- if not is_given(self._ignore_user_transcript_until):
-     return
- if not self._transcript_buffer:
-     return
- if not self._input_started_at:
-     self._transcript_buffer.clear()
-     self._ignore_user_transcript_until = NOT_GIVEN
- if (
-     not self._interruption_enabled
-     or not is_given(self._ignore_user_transcript_until)
-     or not self._transcript_buffer
-     or self._input_started_at is None
- ):
-     self._reset_interruption_detection()
+ if (
+     not self._interruption_enabled
+     or not is_given(self._ignore_user_transcript_until)
+     or self._input_started_at is None
+ ):
+     self._reset_interruption_detection()
+     return
+ if not self._transcript_buffer:
+     return