Skip to content

fix: extend empty assistant message filter to _query_stream and add empty-list coverage#7723

Open
lingyun14beta wants to merge 7 commits intoAstrBotDevs:masterfrom
lingyun14beta:fix/empty-assistant-message-query-stream
Open

fix: extend empty assistant message filter to _query_stream and add empty-list coverage#7723
lingyun14beta wants to merge 7 commits intoAstrBotDevs:masterfrom
lingyun14beta:fix/empty-assistant-message-query-stream

Conversation

@lingyun14beta
Copy link
Copy Markdown

@lingyun14beta lingyun14beta commented Apr 22, 2026

Fixes #7721 .PR #7202 added empty assistant message filtering in _query (non-streaming) but
_query_stream (streaming) was left unprotected. DeepSeek Reasoner only returns
reasoning_content when making tool calls, leaving content as an empty string
or empty list after serialisation. Strict APIs like DeepSeek reject such messages
with HTTP 400. The existing filter in _query also did not cover content == [].

Modifications / 改动点

  • Extracted the filtering logic into a new _filter_empty_assistant_messages
    static method to avoid code duplication.
  • Called the method in both _query and _query_stream before the API call.
  • Extended coverage to handle content == [] (empty list) in addition to
    content == "" and content is None.
  • Added 9 unit/integration tests in tests/test_openai_source.py covering the
    new helper method and _query_stream filtering behaviour.
  • This is NOT a breaking change. / 这不是一个破坏性变更。

Screenshots or Test Results / 运行截图或测试结果

image image image The filter is working as expected — empty assistant messages are now caught and logged as warnings before the API call, and the conversation continues normally.

Checklist / 检查清单

  • 😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
    / 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。

  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
    / 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”

  • 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
    / 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到 requirements.txtpyproject.toml 文件相应位置。

  • 😮 My changes do not introduce malicious code.
    / 我的更改没有引入恶意代码。

Summary by Sourcery

Extend assistant message filtering to both streaming and non-streaming OpenAI-compatible queries and improve handling of empty assistant contents.

Bug Fixes:

  • Prevent strict OpenAI-compatible APIs from returning 400 errors by filtering or normalizing empty assistant messages, including those with empty-list content, before making chat completion calls.
  • Ensure tool-call assistant messages with empty content are preserved by converting their content to null instead of dropping them.
  • Avoid executing tools whose definitions are not present in the current tool set when parsing tool call completions.

Enhancements:

  • Extract shared assistant message filtering into a dedicated helper to keep query implementations consistent across streaming and non-streaming paths.
  • Allow the HTTP client factory to return no client when no proxy is configured, better reflecting its usage semantics.

Tests:

  • Add unit tests for the assistant message filtering helper covering empty string, null, empty list, tool-call scenarios, and no-op behavior when messages are absent.
  • Add streaming query tests to verify empty and empty-list assistant messages are filtered or normalized correctly before being sent to the API.

@dosubot dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. labels Apr 22, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a centralized method for filtering empty assistant messages to prevent 400 errors on strict APIs, applying it to both standard and streaming query paths, and includes comprehensive tests. Feedback identifies significant regressions in the tool call parsing logic, including increased complexity and the removal of necessary error handling for malformed JSON, and suggests a simplification for the empty content check.

Comment on lines +875 to +892
for tool in tools.func_list:
if (
tool_call.type == "function"
and tool.name == tool_call.function.name
):
# workaround for #1454
if isinstance(tool_call.function.arguments, str):
args = json.loads(tool_call.function.arguments)
except json.JSONDecodeError as e:
logger.error(f"解析参数失败: {e}")
args = {}
else:
args = tool_call.function.arguments
args_ls.append(args)
func_name_ls.append(tool_call.function.name)
tool_call_ids.append(tool_call.id)

# gemini-2.5 / gemini-3 series extra_content handling
extra_content = getattr(tool_call, "extra_content", None)
if extra_content is not None:
tool_call_extra_content_dict[tool_call.id] = extra_content

else:
args = tool_call.function.arguments
args_ls.append(args)
func_name_ls.append(tool_call.function.name)
tool_call_ids.append(tool_call.id)

# gemini-2.5 / gemini-3 series extra_content handling
extra_content = getattr(tool_call, "extra_content", None)
if extra_content is not None:
tool_call_extra_content_dict[tool_call.id] = extra_content
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This change introduces several regressions in the tool call parsing logic:

  1. Efficiency: It adds an unnecessary nested loop over tools.func_list, resulting in $O(N \times M)$ complexity where $N$ is the number of tool calls and $M$ is the number of available tools.
  2. Robustness: It removes the try...except block for JSON parsing of tool arguments. LLMs occasionally return malformed JSON, and removing this error handling will cause the entire request to crash.
  3. Correctness: It silently ignores tool calls that are not found in tools.func_list. This can lead to an inconsistent state where the response role is set to "tool" but no tool calls are actually processed, potentially causing errors in downstream logic.

This logic should be reverted to the previous implementation which was more robust and efficient.

Suggested change
for tool in tools.func_list:
if (
tool_call.type == "function"
and tool.name == tool_call.function.name
):
# workaround for #1454
if isinstance(tool_call.function.arguments, str):
args = json.loads(tool_call.function.arguments)
except json.JSONDecodeError as e:
logger.error(f"解析参数失败: {e}")
args = {}
else:
args = tool_call.function.arguments
args_ls.append(args)
func_name_ls.append(tool_call.function.name)
tool_call_ids.append(tool_call.id)
# gemini-2.5 / gemini-3 series extra_content handling
extra_content = getattr(tool_call, "extra_content", None)
if extra_content is not None:
tool_call_extra_content_dict[tool_call.id] = extra_content
else:
args = tool_call.function.arguments
args_ls.append(args)
func_name_ls.append(tool_call.function.name)
tool_call_ids.append(tool_call.id)
# gemini-2.5 / gemini-3 series extra_content handling
extra_content = getattr(tool_call, "extra_content", None)
if extra_content is not None:
tool_call_extra_content_dict[tool_call.id] = extra_content
if tool_call.type == "function":
# workaround for #1454
if isinstance(tool_call.function.arguments, str):
try:
args = json.loads(tool_call.function.arguments)
except json.JSONDecodeError as e:
logger.error(f"解析参数失败: {e}")
args = {}
else:
args = tool_call.function.arguments
args_ls.append(args)
func_name_ls.append(tool_call.function.name)
tool_call_ids.append(tool_call.id)
# gemini-2.5 / gemini-3 series extra_content handling
extra_content = getattr(tool_call, "extra_content", None)
if extra_content is not None:
tool_call_extra_content_dict[tool_call.id] = extra_content

Comment on lines +543 to +548
empty_content = (
content == ""
or content is None
or content == []
or (isinstance(content, list) and len(content) == 0)
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The check for empty content can be simplified. In Python, content == [] is sufficient to check for an empty list, making the additional isinstance and len check redundant.

                empty_content = (
                    content == ""
                    or content is None
                    or content == []
                )

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 3 issues, and left some high level feedback:

  • In _filter_empty_assistant_messages, the empty_content check redundantly tests both content == [] and isinstance(content, list) and len(content) == 0; consider simplifying to a single emptiness check (e.g., content in ("", None, []) or a small helper) to make the intent clearer.
  • The updated tool-calls handling in _parse_openai_completion now nests a loop over tools.func_list inside the for tool_call in choice.message.tool_calls loop; consider building a lookup (e.g., dict keyed by function name) or breaking once a match is found to avoid unnecessary iterations and make the control flow clearer.
  • The three new _query_stream tests share a lot of setup for provider, chunks, and the fake_create/fake_stream machinery; consider parametrizing or extracting a shared helper to reduce duplication and make future changes to the streaming fixture easier.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `_filter_empty_assistant_messages`, the `empty_content` check redundantly tests both `content == []` and `isinstance(content, list) and len(content) == 0`; consider simplifying to a single emptiness check (e.g., `content in ("", None, [])` or a small helper) to make the intent clearer.
- The updated tool-calls handling in `_parse_openai_completion` now nests a loop over `tools.func_list` inside the `for tool_call in choice.message.tool_calls` loop; consider building a lookup (e.g., dict keyed by function name) or breaking once a match is found to avoid unnecessary iterations and make the control flow clearer.
- The three new `_query_stream` tests share a lot of setup for `provider`, `chunks`, and the `fake_create`/`fake_stream` machinery; consider parametrizing or extracting a shared helper to reduce duplication and make future changes to the streaming fixture easier.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/sources/openai_source.py" line_range="872-881" />
<code_context>
-
-                if tool_call.type == "function":
-                    # workaround for #1454
-                    if isinstance(tool_call.function.arguments, str):
-                        try:
+                for tool in tools.func_list:
+                    if (
+                        tool_call.type == "function"
+                        and tool.name == tool_call.function.name
+                    ):
+                        # workaround for #1454
+                        if isinstance(tool_call.function.arguments, str):
                             args = json.loads(tool_call.function.arguments)
-                        except json.JSONDecodeError as e:
-                            logger.error(f"解析参数失败: {e}")
</code_context>
<issue_to_address>
**issue (bug_risk):** Restore defensive handling for invalid JSON in `tool_call.function.arguments`.

The previous version caught `json.JSONDecodeError`, logged, and fell back to `{}` for malformed JSON. The new code will now raise instead, changing behavior and potentially surfacing user-visible errors when providers send invalid arguments (which does occur). Please reintroduce a `try/except` or similar fallback so a single malformed tool call doesn’t fail the entire completion parsing flow.
</issue_to_address>

### Comment 2
<location path="astrbot/core/provider/sources/openai_source.py" line_range="875-874" />
<code_context>
+                for tool in tools.func_list:
</code_context>
<issue_to_address>
**question (bug_risk):** Clarify behavior when a tool call name doesn’t match any tool and avoid potential duplicate handling.

Because arguments are only appended when `tool.name == tool_call.function.name`:
- Tool calls with no matching entry in `tools.func_list` are now silently dropped.
- Duplicate tool names in `tools.func_list` will cause the same tool call to be handled multiple times.
If the goal is to accept only known tools, consider breaking after the first match and logging unmatched tool calls to surface misconfigurations.
</issue_to_address>

### Comment 3
<location path="tests/test_openai_source.py" line_range="1642-1651" />
<code_context>
+    assert payloads["messages"][1]["role"] == "user"
+
+
+def test_filter_empty_assistant_messages_removes_none_content():
+    payloads = {
+        "messages": [
+            {"role": "user", "content": "hello"},
+            {"role": "assistant", "content": None},
+            {"role": "user", "content": "world"},
+        ]
+    }
+    ProviderOpenAIOfficial._filter_empty_assistant_messages(payloads)
+    assert len(payloads["messages"]) == 2
+
</code_context>
<issue_to_address>
**suggestion (testing):** Strengthen the `None`-content test with more explicit assertions on list contents.

Right now this only checks the final list length. Please also assert that the two remaining items are the expected user messages (e.g., with explicit `role` and `content` checks) so the test fails if filtering ever mutates or reorders non‑assistant messages.

```suggestion
def test_filter_empty_assistant_messages_removes_none_content():
    payloads = {
        "messages": [
            {"role": "user", "content": "hello"},
            {"role": "assistant", "content": None},
            {"role": "user", "content": "world"},
        ]
    }
    ProviderOpenAIOfficial._filter_empty_assistant_messages(payloads)
    assert len(payloads["messages"]) == 2
    # Ensure the remaining messages are the original user messages in order
    assert payloads["messages"][0]["role"] == "user"
    assert payloads["messages"][0]["content"] == "hello"
    assert payloads["messages"][1]["role"] == "user"
    assert payloads["messages"][1]["content"] == "world"
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +872 to +881
# 工具集未提供
# Should be unreachable
raise Exception("工具集未提供")

if tool_call.type == "function":
# workaround for #1454
if isinstance(tool_call.function.arguments, str):
try:
for tool in tools.func_list:
if (
tool_call.type == "function"
and tool.name == tool_call.function.name
):
# workaround for #1454
if isinstance(tool_call.function.arguments, str):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Restore defensive handling for invalid JSON in tool_call.function.arguments.

The previous version caught json.JSONDecodeError, logged, and fell back to {} for malformed JSON. The new code will now raise instead, changing behavior and potentially surfacing user-visible errors when providers send invalid arguments (which does occur). Please reintroduce a try/except or similar fallback so a single malformed tool call doesn’t fail the entire completion parsing flow.

if tools is None:
# 工具集未提供
# Should be unreachable
raise Exception("工具集未提供")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question (bug_risk): Clarify behavior when a tool call name doesn’t match any tool and avoid potential duplicate handling.

Because arguments are only appended when tool.name == tool_call.function.name:

  • Tool calls with no matching entry in tools.func_list are now silently dropped.
  • Duplicate tool names in tools.func_list will cause the same tool call to be handled multiple times.
    If the goal is to accept only known tools, consider breaking after the first match and logging unmatched tool calls to surface misconfigurations.

Comment on lines +1642 to +1651
def test_filter_empty_assistant_messages_removes_none_content():
payloads = {
"messages": [
{"role": "user", "content": "hello"},
{"role": "assistant", "content": None},
{"role": "user", "content": "world"},
]
}
ProviderOpenAIOfficial._filter_empty_assistant_messages(payloads)
assert len(payloads["messages"]) == 2
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Strengthen the None-content test with more explicit assertions on list contents.

Right now this only checks the final list length. Please also assert that the two remaining items are the expected user messages (e.g., with explicit role and content checks) so the test fails if filtering ever mutates or reorders non‑assistant messages.

Suggested change
def test_filter_empty_assistant_messages_removes_none_content():
payloads = {
"messages": [
{"role": "user", "content": "hello"},
{"role": "assistant", "content": None},
{"role": "user", "content": "world"},
]
}
ProviderOpenAIOfficial._filter_empty_assistant_messages(payloads)
assert len(payloads["messages"]) == 2
def test_filter_empty_assistant_messages_removes_none_content():
payloads = {
"messages": [
{"role": "user", "content": "hello"},
{"role": "assistant", "content": None},
{"role": "user", "content": "world"},
]
}
ProviderOpenAIOfficial._filter_empty_assistant_messages(payloads)
assert len(payloads["messages"]) == 2
# Ensure the remaining messages are the original user messages in order
assert payloads["messages"][0]["role"] == "user"
assert payloads["messages"][0]["content"] == "hello"
assert payloads["messages"][1]["role"] == "user"
assert payloads["messages"][1]["content"] == "world"

@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Apr 22, 2026
@dosubot dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Apr 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]使用 DeepSeek Reasoner 开启工具调用时报 400 错误

1 participant