fix: extend empty assistant message filter to _query_stream and add empty-list coverage by lingyun14beta · Pull Request #7723 · AstrBotDevs/AstrBot

lingyun14beta · 2026-04-22T06:25:47Z

Fixes #7721 .PR #7202 added empty assistant message filtering in _query (non-streaming) but
_query_stream (streaming) was left unprotected. DeepSeek Reasoner only returns
reasoning_content when making tool calls, leaving content as an empty string
or empty list after serialisation. Strict APIs like DeepSeek reject such messages
with HTTP 400. The existing filter in _query also did not cover content == [].

Modifications / 改动点

Extracted the filtering logic into a new _filter_empty_assistant_messages
static method to avoid code duplication.
Called the method in both _query and _query_stream before the API call.
Extended coverage to handle content == [] (empty list) in addition to
content == "" and content is None.
Added 9 unit/integration tests in tests/test_openai_source.py covering the
new helper method and _query_stream filtering behaviour.

This is NOT a breaking change. / 这不是一个破坏性变更。

Screenshots or Test Results / 运行截图或测试结果

The filter is working as expected — empty assistant messages are now caught and logged as warnings before the API call, and the conversation continues normally.

Checklist / 检查清单

😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
/ 如果 PR 中有新加入的功能，已经通过 Issue / 邮件等方式和作者讨论过。
👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
/ 我的更改经过了良好的测试，并已在上方提供了“验证步骤”和“运行截图”。
🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
/ 我确保没有引入新依赖库，或者引入了新依赖库的同时将其添加到 requirements.txt 和 pyproject.toml 文件相应位置。
😮 My changes do not introduce malicious code.
/ 我的更改没有引入恶意代码。

Summary by Sourcery

Extend assistant message filtering to both streaming and non-streaming OpenAI-compatible queries and improve handling of empty assistant contents.

Bug Fixes:

Prevent strict OpenAI-compatible APIs from returning 400 errors by filtering or normalizing empty assistant messages, including those with empty-list content, before making chat completion calls.
Ensure tool-call assistant messages with empty content are preserved by converting their content to null instead of dropping them.
Avoid executing tools whose definitions are not present in the current tool set when parsing tool call completions.

Enhancements:

Extract shared assistant message filtering into a dedicated helper to keep query implementations consistent across streaming and non-streaming paths.
Allow the HTTP client factory to return no client when no proxy is configured, better reflecting its usage semantics.

Tests:

Add unit tests for the assistant message filtering helper covering empty string, null, empty list, tool-call scenarios, and no-op behavior when messages are absent.
Add streaming query tests to verify empty and empty-list assistant messages are filtered or normalized correctly before being sent to the API.

gemini-code-assist

Code Review

This pull request introduces a centralized method for filtering empty assistant messages to prevent 400 errors on strict APIs, applying it to both standard and streaming query paths, and includes comprehensive tests. Feedback identifies significant regressions in the tool call parsing logic, including increased complexity and the removal of necessary error handling for malformed JSON, and suggests a simplification for the empty content check.

gemini-code-assist · 2026-04-22T06:27:37Z

+                for tool in tools.func_list:
+                    if (
+                        tool_call.type == "function"
+                        and tool.name == tool_call.function.name
+                    ):
+                        # workaround for #1454
+                        if isinstance(tool_call.function.arguments, str):
                            args = json.loads(tool_call.function.arguments)
-                        except json.JSONDecodeError as e:
-                            logger.error(f"解析参数失败: {e}")
-                            args = {}
-                    else:
-                        args = tool_call.function.arguments
-                    args_ls.append(args)
-                    func_name_ls.append(tool_call.function.name)
-                    tool_call_ids.append(tool_call.id)
-
-                    # gemini-2.5 / gemini-3 series extra_content handling
-                    extra_content = getattr(tool_call, "extra_content", None)
-                    if extra_content is not None:
-                        tool_call_extra_content_dict[tool_call.id] = extra_content
-
+                        else:
+                            args = tool_call.function.arguments
+                        args_ls.append(args)
+                        func_name_ls.append(tool_call.function.name)
+                        tool_call_ids.append(tool_call.id)
+
+                        # gemini-2.5 / gemini-3 series extra_content handling
+                        extra_content = getattr(tool_call, "extra_content", None)
+                        if extra_content is not None:
+                            tool_call_extra_content_dict[tool_call.id] = extra_content


This change introduces several regressions in the tool call parsing logic:

Efficiency: It adds an unnecessary nested loop over tools.func_list, resulting in $O(N \times M)$ complexity where $N$ is the number of tool calls and $M$ is the number of available tools.

Robustness: It removes the try...except block for JSON parsing of tool arguments. LLMs occasionally return malformed JSON, and removing this error handling will cause the entire request to crash.

Correctness: It silently ignores tool calls that are not found in tools.func_list. This can lead to an inconsistent state where the response role is set to "tool" but no tool calls are actually processed, potentially causing errors in downstream logic.

This logic should be reverted to the previous implementation which was more robust and efficient.

Suggested change

for tool in tools.func_list:

if (

tool_call.type == "function"

and tool.name == tool_call.function.name

):

# workaround for #1454

if isinstance(tool_call.function.arguments, str):

args = json.loads(tool_call.function.arguments)

except json.JSONDecodeError as e:

logger.error(f"解析参数失败: {e}")

args = {}

else:

args = tool_call.function.arguments

args_ls.append(args)

func_name_ls.append(tool_call.function.name)

tool_call_ids.append(tool_call.id)

# gemini-2.5 / gemini-3 series extra_content handling

extra_content = getattr(tool_call, "extra_content", None)

if extra_content is not None:

tool_call_extra_content_dict[tool_call.id] = extra_content

else:

args = tool_call.function.arguments

args_ls.append(args)

func_name_ls.append(tool_call.function.name)

tool_call_ids.append(tool_call.id)

# gemini-2.5 / gemini-3 series extra_content handling

extra_content = getattr(tool_call, "extra_content", None)

if extra_content is not None:

tool_call_extra_content_dict[tool_call.id] = extra_content

if tool_call.type == "function":

# workaround for #1454

if isinstance(tool_call.function.arguments, str):

try:

args = json.loads(tool_call.function.arguments)

except json.JSONDecodeError as e:

logger.error(f"解析参数失败: {e}")

args = {}

else:

args = tool_call.function.arguments

args_ls.append(args)

func_name_ls.append(tool_call.function.name)

tool_call_ids.append(tool_call.id)

# gemini-2.5 / gemini-3 series extra_content handling

extra_content = getattr(tool_call, "extra_content", None)

if extra_content is not None:

tool_call_extra_content_dict[tool_call.id] = extra_content

gemini-code-assist · 2026-04-22T06:27:37Z

+                empty_content = (
+                    content == ""
+                    or content is None
+                    or content == []
+                    or (isinstance(content, list) and len(content) == 0)
+                )


The check for empty content can be simplified. In Python, content == [] is sufficient to check for an empty list, making the additional isinstance and len check redundant.

empty_content = ( content == "" or content is None or content == [] )

sourcery-ai

Hey - I've found 3 issues, and left some high level feedback:

In _filter_empty_assistant_messages, the empty_content check redundantly tests both content == [] and isinstance(content, list) and len(content) == 0; consider simplifying to a single emptiness check (e.g., content in ("", None, []) or a small helper) to make the intent clearer.
The updated tool-calls handling in _parse_openai_completion now nests a loop over tools.func_list inside the for tool_call in choice.message.tool_calls loop; consider building a lookup (e.g., dict keyed by function name) or breaking once a match is found to avoid unnecessary iterations and make the control flow clearer.
The three new _query_stream tests share a lot of setup for provider, chunks, and the fake_create/fake_stream machinery; consider parametrizing or extracting a shared helper to reduce duplication and make future changes to the streaming fixture easier.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- In `_filter_empty_assistant_messages`, the `empty_content` check redundantly tests both `content == []` and `isinstance(content, list) and len(content) == 0`; consider simplifying to a single emptiness check (e.g., `content in ("", None, [])` or a small helper) to make the intent clearer.
- The updated tool-calls handling in `_parse_openai_completion` now nests a loop over `tools.func_list` inside the `for tool_call in choice.message.tool_calls` loop; consider building a lookup (e.g., dict keyed by function name) or breaking once a match is found to avoid unnecessary iterations and make the control flow clearer.
- The three new `_query_stream` tests share a lot of setup for `provider`, `chunks`, and the `fake_create`/`fake_stream` machinery; consider parametrizing or extracting a shared helper to reduce duplication and make future changes to the streaming fixture easier.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/sources/openai_source.py" line_range="872-881" />
<code_context>
-
-                if tool_call.type == "function":
-                    # workaround for #1454
-                    if isinstance(tool_call.function.arguments, str):
-                        try:
+                for tool in tools.func_list:
+                    if (
+                        tool_call.type == "function"
+                        and tool.name == tool_call.function.name
+                    ):
+                        # workaround for #1454
+                        if isinstance(tool_call.function.arguments, str):
                             args = json.loads(tool_call.function.arguments)
-                        except json.JSONDecodeError as e:
-                            logger.error(f"解析参数失败: {e}")
</code_context>
<issue_to_address>
**issue (bug_risk):** Restore defensive handling for invalid JSON in `tool_call.function.arguments`.

The previous version caught `json.JSONDecodeError`, logged, and fell back to `{}` for malformed JSON. The new code will now raise instead, changing behavior and potentially surfacing user-visible errors when providers send invalid arguments (which does occur). Please reintroduce a `try/except` or similar fallback so a single malformed tool call doesn’t fail the entire completion parsing flow.
</issue_to_address>

### Comment 2
<location path="astrbot/core/provider/sources/openai_source.py" line_range="875-874" />
<code_context>
+                for tool in tools.func_list:
</code_context>
<issue_to_address>
**question (bug_risk):** Clarify behavior when a tool call name doesn’t match any tool and avoid potential duplicate handling.

Because arguments are only appended when `tool.name == tool_call.function.name`:
- Tool calls with no matching entry in `tools.func_list` are now silently dropped.
- Duplicate tool names in `tools.func_list` will cause the same tool call to be handled multiple times.
If the goal is to accept only known tools, consider breaking after the first match and logging unmatched tool calls to surface misconfigurations.
</issue_to_address>

### Comment 3
<location path="tests/test_openai_source.py" line_range="1642-1651" />
<code_context>
+    assert payloads["messages"][1]["role"] == "user"
+
+
+def test_filter_empty_assistant_messages_removes_none_content():
+    payloads = {
+        "messages": [
+            {"role": "user", "content": "hello"},
+            {"role": "assistant", "content": None},
+            {"role": "user", "content": "world"},
+        ]
+    }
+    ProviderOpenAIOfficial._filter_empty_assistant_messages(payloads)
+    assert len(payloads["messages"]) == 2
+
</code_context>
<issue_to_address>
**suggestion (testing):** Strengthen the `None`-content test with more explicit assertions on list contents.

Right now this only checks the final list length. Please also assert that the two remaining items are the expected user messages (e.g., with explicit `role` and `content` checks) so the test fails if filtering ever mutates or reorders non‑assistant messages.

```suggestion
def test_filter_empty_assistant_messages_removes_none_content():
    payloads = {
        "messages": [
            {"role": "user", "content": "hello"},
            {"role": "assistant", "content": None},
            {"role": "user", "content": "world"},
        ]
    }
    ProviderOpenAIOfficial._filter_empty_assistant_messages(payloads)
    assert len(payloads["messages"]) == 2
    # Ensure the remaining messages are the original user messages in order
    assert payloads["messages"][0]["role"] == "user"
    assert payloads["messages"][0]["content"] == "hello"
    assert payloads["messages"][1]["role"] == "user"
    assert payloads["messages"][1]["content"] == "world"
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

sourcery-ai · 2026-04-22T06:27:44Z

                    # 工具集未提供
                    # Should be unreachable
                    raise Exception("工具集未提供")
-
-                if tool_call.type == "function":
-                    # workaround for #1454
-                    if isinstance(tool_call.function.arguments, str):
-                        try:
+                for tool in tools.func_list:
+                    if (
+                        tool_call.type == "function"
+                        and tool.name == tool_call.function.name
+                    ):
+                        # workaround for #1454
+                        if isinstance(tool_call.function.arguments, str):


issue (bug_risk): Restore defensive handling for invalid JSON in tool_call.function.arguments.

The previous version caught json.JSONDecodeError, logged, and fell back to {} for malformed JSON. The new code will now raise instead, changing behavior and potentially surfacing user-visible errors when providers send invalid arguments (which does occur). Please reintroduce a try/except or similar fallback so a single malformed tool call doesn’t fail the entire completion parsing flow.

sourcery-ai · 2026-04-22T06:27:44Z

                if tools is None:
                    # 工具集未提供
                    # Should be unreachable
                    raise Exception("工具集未提供")


question (bug_risk): Clarify behavior when a tool call name doesn’t match any tool and avoid potential duplicate handling.

Because arguments are only appended when tool.name == tool_call.function.name:

Tool calls with no matching entry in tools.func_list are now silently dropped.

Duplicate tool names in tools.func_list will cause the same tool call to be handled multiple times.
If the goal is to accept only known tools, consider breaking after the first match and logging unmatched tool calls to surface misconfigurations.

sourcery-ai · 2026-04-22T06:27:44Z

+def test_filter_empty_assistant_messages_removes_none_content():
+    payloads = {
+        "messages": [
+            {"role": "user", "content": "hello"},
+            {"role": "assistant", "content": None},
+            {"role": "user", "content": "world"},
+        ]
+    }
+    ProviderOpenAIOfficial._filter_empty_assistant_messages(payloads)
+    assert len(payloads["messages"]) == 2


suggestion (testing): Strengthen the None-content test with more explicit assertions on list contents.

Right now this only checks the final list length. Please also assert that the two remaining items are the expected user messages (e.g., with explicit role and content checks) so the test fails if filtering ever mutates or reorders non‑assistant messages.

Suggested change

def test_filter_empty_assistant_messages_removes_none_content():

payloads = {

"messages": [

{"role": "user", "content": "hello"},

{"role": "assistant", "content": None},

{"role": "user", "content": "world"},

]

}

ProviderOpenAIOfficial._filter_empty_assistant_messages(payloads)

assert len(payloads["messages"]) == 2

def test_filter_empty_assistant_messages_removes_none_content():

payloads = {

"messages": [

{"role": "user", "content": "hello"},

{"role": "assistant", "content": None},

{"role": "user", "content": "world"},

]

}

ProviderOpenAIOfficial._filter_empty_assistant_messages(payloads)

assert len(payloads["messages"]) == 2

# Ensure the remaining messages are the original user messages in order

assert payloads["messages"][0]["role"] == "user"

assert payloads["messages"][0]["content"] == "hello"

assert payloads["messages"][1]["role"] == "user"

assert payloads["messages"][1]["content"] == "world"

lingyun14beta added 2 commits April 22, 2026 14:07

Update test_openai_source.py

7edf6bc

Update openai_source.py

a4f8439

dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. labels Apr 22, 2026

gemini-code-assist Bot reviewed Apr 22, 2026

View reviewed changes

sourcery-ai Bot reviewed Apr 22, 2026

View reviewed changes

lingyun14beta added 2 commits April 22, 2026 14:45

Update openai_source.py

6564230

Update tool_loop_agent_runner.py

65fcf4b

dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Apr 22, 2026

lingyun14beta added 2 commits April 22, 2026 15:16

Update tool_loop_agent_runner.py

bb44c84

Update tool_loop_agent_runner.py

4755a41

dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Apr 22, 2026

Update tool_loop_agent_runner.py

45856e7

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: extend empty assistant message filter to _query_stream and add empty-list coverage#7723

fix: extend empty assistant message filter to _query_stream and add empty-list coverage#7723
lingyun14beta wants to merge 7 commits intoAstrBotDevs:masterfrom
lingyun14beta:fix/empty-assistant-message-query-stream

lingyun14beta commented Apr 22, 2026 •

edited by sourcery-ai Bot

Loading

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Apr 22, 2026

Uh oh!

gemini-code-assist Bot Apr 22, 2026

Uh oh!

sourcery-ai Bot left a comment

Uh oh!

sourcery-ai Bot Apr 22, 2026

Uh oh!

sourcery-ai Bot Apr 22, 2026

Uh oh!

sourcery-ai Bot Apr 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

lingyun14beta commented Apr 22, 2026 • edited by sourcery-ai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Modifications / 改动点

Screenshots or Test Results / 运行截图或测试结果

Checklist / 检查清单

Summary by Sourcery

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

lingyun14beta commented Apr 22, 2026 •

edited by sourcery-ai Bot

Loading