Skip to content

fix: filter empty assistant messages in streaming requests#7725

Open
bugkeep wants to merge 2 commits intoAstrBotDevs:masterfrom
bugkeep:codex/fix-stream-empty-assistant
Open

fix: filter empty assistant messages in streaming requests#7725
bugkeep wants to merge 2 commits intoAstrBotDevs:masterfrom
bugkeep:codex/fix-stream-empty-assistant

Conversation

@bugkeep
Copy link
Copy Markdown
Contributor

@bugkeep bugkeep commented Apr 22, 2026

Summary

  • Share assistant-message cleanup between non-streaming and streaming OpenAI-compatible requests.
  • Filter assistant messages with empty content and no tool calls before streaming requests.
  • Treat empty list content like other empty assistant content and normalize empty tool-call assistant content to null.

Fixes #7721

Testing

  • uv run pytest tests/test_openai_source.py -k "query_stream_filters_empty_assistant_messages or query_filters_empty_list_assistant_message_without_tool_calls" -q
  • uv run pytest tests/test_openai_source.py -q --deselect tests/test_openai_source.py::test_file_uri_to_path_preserves_windows_drive_letter --deselect tests/test_openai_source.py::test_file_uri_to_path_preserves_windows_netloc_drive_letter --deselect tests/test_openai_source.py::test_file_uri_to_path_preserves_remote_netloc_as_unc_path --deselect tests/test_openai_source.py::test_prepare_chat_payload_materializes_context_localhost_file_uri_image_urls
  • uv run pytest tests/unit -q
  • uv run ruff format .
  • uv run ruff check .

Note: The full tests/test_openai_source.py -q run has the same four Windows file URI failures on current upstream astrbotdevs/master; they were excluded in the OpenAI adapter regression run above.

Summary by Sourcery

Normalize and filter empty assistant messages across OpenAI-compatible chat requests, including streaming, to avoid provider errors and ensure spec-compliant payloads.

Bug Fixes:

  • Filter assistant messages with empty content and no tool calls before both standard and streaming chat completion requests to prevent strict API errors.
  • Normalize assistant messages that have tool calls but empty content (including empty lists) to use null content in accordance with the OpenAI specification.

Tests:

  • Add regression tests verifying that streaming queries filter empty assistant messages while preserving valid tool-call messages.
  • Add tests ensuring empty-list assistant content without tool calls is removed from requests and that non-empty conversations are preserved.

@auto-assign auto-assign Bot requested review from Raven95676 and anka-afk April 22, 2026 07:08
@dosubot dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. labels Apr 22, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the assistant message cleaning logic in the OpenAI source provider into a dedicated method, _clean_assistant_messages_for_provider, and ensures it is applied to both standard and streaming queries. The logic filters out empty assistant messages without tool calls and normalizes empty content to None when tool calls are present. Additionally, new test cases were added to verify these behaviors. A review comment suggests optimizing the cleaning loop to improve readability and reduce redundant checks.

Comment on lines +83 to +98
for idx, msg in enumerate(messages):
if msg.get("role") == "assistant":
content = msg.get("content")
tool_calls = msg.get("tool_calls")

if not tool_calls and cls._is_empty_assistant_content(content):
logger.warning(
f"Filtered empty assistant message at index {idx} "
"(without tool calls)"
)
continue

if tool_calls and cls._is_empty_assistant_content(content):
msg["content"] = None

cleaned_messages.append(msg)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic for cleaning assistant messages can be optimized by checking if the content is empty first. This avoids redundant calls to _is_empty_assistant_content and only fetches tool_calls when necessary, improving both readability and performance.

Suggested change
for idx, msg in enumerate(messages):
if msg.get("role") == "assistant":
content = msg.get("content")
tool_calls = msg.get("tool_calls")
if not tool_calls and cls._is_empty_assistant_content(content):
logger.warning(
f"Filtered empty assistant message at index {idx} "
"(without tool calls)"
)
continue
if tool_calls and cls._is_empty_assistant_content(content):
msg["content"] = None
cleaned_messages.append(msg)
for idx, msg in enumerate(messages):
if msg.get("role") == "assistant":
content = msg.get("content")
if cls._is_empty_assistant_content(content):
tool_calls = msg.get("tool_calls")
if not tool_calls:
logger.warning(
f"Filtered empty assistant message at index {idx} "
"(without tool calls)"
)
continue
msg["content"] = None
cleaned_messages.append(msg)

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • Consider downgrading the logger.warning in _clean_assistant_messages_for_provider to info or debug, as filtering empty assistant messages is now expected behavior and may otherwise produce noisy logs in normal operation.
  • In _clean_assistant_messages_for_provider, you might want to document (or enforce) whether whitespace-only strings should be treated as empty content as well, to avoid provider-specific 400 errors if some backends treat them equivalently to "".
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Consider downgrading the `logger.warning` in `_clean_assistant_messages_for_provider` to `info` or `debug`, as filtering empty assistant messages is now expected behavior and may otherwise produce noisy logs in normal operation.
- In `_clean_assistant_messages_for_provider`, you might want to document (or enforce) whether whitespace-only strings should be treated as empty content as well, to avoid provider-specific 400 errors if some backends treat them equivalently to `""`.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]使用 DeepSeek Reasoner 开启工具调用时报 400 错误

1 participant