fix: skip STT after audio conversion failure#8704
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces tracking for failed audio conversions during the preprocessing stage to prevent sending failed records to the speech-to-text (STT) provider. It adds a set failed_record_ids to track failed records by their ID and skips STT processing for them. Unit tests are also added to verify this behavior. The reviewer suggests refactoring the duplicate WAV conversion and error handling logic for direct and reply records into a shared helper function to improve code maintainability.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| logger.debug(f"路径映射: {url} -> {component.url}") | ||
| message_chain[idx] = component | ||
|
|
||
| failed_record_ids: set[int] = set() |
There was a problem hiding this comment.
There is significant code duplication between processing direct Record components (lines 70-84) and processing Record components inside Reply chains (lines 87-103). Both blocks perform the exact same WAV conversion, temporary file tracking, and error handling.
Following the general rule to avoid code duplication when implementing similar functionality for direct vs. quoted attachments, we should refactor this logic into a shared helper function.
Here is an example of how you can refactor this:
async def _process_record(record_comp: Record, context: str = "") -> bool:
try:
original_path = await record_comp.convert_to_file_path()
record_path = await ensure_wav(original_path)
if record_path != original_path:
event.track_temporary_local_file(record_path)
record_comp.file = record_path
record_comp.path = record_path
return True
except Exception as e:
failed_record_ids.add(id(record_comp))
logger.warning(f"Voice processing {context}failed: {e}")
return FalseThen, the loops can be simplified to:
# Process direct Record components
message_chain = event.get_messages()
for idx, component in enumerate(message_chain):
if isinstance(component, Record):
if await _process_record(component):
message_chain[idx] = component
# Process Record components inside Reply chains
for component in event.get_messages():
if isinstance(component, Reply) and component.chain:
for idx, reply_comp in enumerate(component.chain):
if isinstance(reply_comp, Record):
if await _process_record(reply_comp, "in reply chain "):
component.chain[idx] = reply_compReferences
- When implementing similar functionality for different cases (e.g., direct vs. quoted attachments), refactor the logic into a shared helper function to avoid code duplication.
There was a problem hiding this comment.
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="astrbot/core/pipeline/preprocess_stage/stage.py" line_range="68-72" />
<code_context>
logger.debug(f"路径映射: {url} -> {component.url}")
message_chain[idx] = component
+ failed_record_ids: set[int] = set()
+
# In here, we convert all Record components to wav format and update the file path.
</code_context>
<issue_to_address>
**issue (bug_risk):** Tracking failed records by `id()` may be brittle if components are reconstructed or copied.
This relies on the same `Record` instances being reused throughout the STT pass. If `event.get_messages()` or related logic can recreate or copy `Record`s, their `id()` values will change and `failed_record_ids` will no longer prevent reprocessing failed items. In that case, prefer a stable identifier (e.g., a `uid` field, file path, or a boolean flag on the record) instead of `id()` and confirm which invariant actually holds in this pipeline.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| failed_record_ids: set[int] = set() | ||
|
|
||
| # In here, we convert all Record components to wav format and update the file path. | ||
| message_chain = event.get_messages() | ||
| for idx, component in enumerate(message_chain): |
There was a problem hiding this comment.
issue (bug_risk): Tracking failed records by id() may be brittle if components are reconstructed or copied.
This relies on the same Record instances being reused throughout the STT pass. If event.get_messages() or related logic can recreate or copy Records, their id() values will change and failed_record_ids will no longer prevent reprocessing failed items. In that case, prefer a stable identifier (e.g., a uid field, file path, or a boolean flag on the record) instead of id() and confirm which invariant actually holds in this pipeline.
|
Checked the pipeline invariant behind this: |
Summary
Root cause
When
ensure_wav()failed, preprocessing logged a warning but left the originalRecordin the message chain. The STT pass then sent that unconverted file to the configured provider, producing a second, less useful provider error such as HTTP 400 for unsupported AMR input.Impact
A missing or failing ffmpeg installation no longer causes AstrBot to submit the known-unconverted audio to STT. Other valid voice components in the same message still get transcribed, and non-voice content remains untouched.
Validation
pytest tests/test_preprocess_stage.py tests/test_whisper_api_source.py -q(3 passed)python -m ruff check astrbot/core/pipeline/preprocess_stage/stage.py tests/test_preprocess_stage.pypython -m ruff format --check astrbot/core/pipeline/preprocess_stage/stage.py tests/test_preprocess_stage.pygit diff --checkFixes #8651
Summary by Sourcery
Ensure speech-to-text processing skips voice components whose audio conversion to WAV failed while leaving other message content unaffected.
Bug Fixes:
Tests: