fix: filter empty assistant messages in streaming requests#7725
fix: filter empty assistant messages in streaming requests#7725bugkeep wants to merge 2 commits intoAstrBotDevs:masterfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request refactors the assistant message cleaning logic in the OpenAI source provider into a dedicated method, _clean_assistant_messages_for_provider, and ensures it is applied to both standard and streaming queries. The logic filters out empty assistant messages without tool calls and normalizes empty content to None when tool calls are present. Additionally, new test cases were added to verify these behaviors. A review comment suggests optimizing the cleaning loop to improve readability and reduce redundant checks.
| for idx, msg in enumerate(messages): | ||
| if msg.get("role") == "assistant": | ||
| content = msg.get("content") | ||
| tool_calls = msg.get("tool_calls") | ||
|
|
||
| if not tool_calls and cls._is_empty_assistant_content(content): | ||
| logger.warning( | ||
| f"Filtered empty assistant message at index {idx} " | ||
| "(without tool calls)" | ||
| ) | ||
| continue | ||
|
|
||
| if tool_calls and cls._is_empty_assistant_content(content): | ||
| msg["content"] = None | ||
|
|
||
| cleaned_messages.append(msg) |
There was a problem hiding this comment.
The logic for cleaning assistant messages can be optimized by checking if the content is empty first. This avoids redundant calls to _is_empty_assistant_content and only fetches tool_calls when necessary, improving both readability and performance.
| for idx, msg in enumerate(messages): | |
| if msg.get("role") == "assistant": | |
| content = msg.get("content") | |
| tool_calls = msg.get("tool_calls") | |
| if not tool_calls and cls._is_empty_assistant_content(content): | |
| logger.warning( | |
| f"Filtered empty assistant message at index {idx} " | |
| "(without tool calls)" | |
| ) | |
| continue | |
| if tool_calls and cls._is_empty_assistant_content(content): | |
| msg["content"] = None | |
| cleaned_messages.append(msg) | |
| for idx, msg in enumerate(messages): | |
| if msg.get("role") == "assistant": | |
| content = msg.get("content") | |
| if cls._is_empty_assistant_content(content): | |
| tool_calls = msg.get("tool_calls") | |
| if not tool_calls: | |
| logger.warning( | |
| f"Filtered empty assistant message at index {idx} " | |
| "(without tool calls)" | |
| ) | |
| continue | |
| msg["content"] = None | |
| cleaned_messages.append(msg) |
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- Consider downgrading the
logger.warningin_clean_assistant_messages_for_providertoinfoordebug, as filtering empty assistant messages is now expected behavior and may otherwise produce noisy logs in normal operation. - In
_clean_assistant_messages_for_provider, you might want to document (or enforce) whether whitespace-only strings should be treated as empty content as well, to avoid provider-specific 400 errors if some backends treat them equivalently to"".
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider downgrading the `logger.warning` in `_clean_assistant_messages_for_provider` to `info` or `debug`, as filtering empty assistant messages is now expected behavior and may otherwise produce noisy logs in normal operation.
- In `_clean_assistant_messages_for_provider`, you might want to document (or enforce) whether whitespace-only strings should be treated as empty content as well, to avoid provider-specific 400 errors if some backends treat them equivalently to `""`.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Summary
Fixes #7721
Testing
uv run pytest tests/test_openai_source.py -k "query_stream_filters_empty_assistant_messages or query_filters_empty_list_assistant_message_without_tool_calls" -quv run pytest tests/test_openai_source.py -q --deselect tests/test_openai_source.py::test_file_uri_to_path_preserves_windows_drive_letter --deselect tests/test_openai_source.py::test_file_uri_to_path_preserves_windows_netloc_drive_letter --deselect tests/test_openai_source.py::test_file_uri_to_path_preserves_remote_netloc_as_unc_path --deselect tests/test_openai_source.py::test_prepare_chat_payload_materializes_context_localhost_file_uri_image_urlsuv run pytest tests/unit -quv run ruff format .uv run ruff check .Note: The full
tests/test_openai_source.py -qrun has the same four Windows file URI failures on current upstreamastrbotdevs/master; they were excluded in the OpenAI adapter regression run above.Summary by Sourcery
Normalize and filter empty assistant messages across OpenAI-compatible chat requests, including streaming, to avoid provider errors and ensure spec-compliant payloads.
Bug Fixes:
Tests: