docs(openai): document empty response from thinking-mode models by planetf1 · Pull Request #1062 · generative-computing/mellea

planetf1 · 2026-05-12T13:25:17Z

Misc PR

Type of PR

Bug Fix
New Feature
Documentation
Other

Description

Link to Issue: Fixes OpenAI backend silently returns empty string when model produces no text content (content=None) #1060

Document the empty-value symptom that thinking-mode models exhibit on the OpenAI backend, rather than enforcing it with a RuntimeError in post_processing().

A model returning content=None with finish_reason=stop and non-zero completion_tokens is the literal response from the API — the reasoning content is preserved on ModelOutputThunk._thinking. Raising would break legitimate thinking-mode flows and bypass Mellea's sampling and validator machinery, so the right fix is discoverability, not enforcement.

Adds an Empty value from a thinking-mode model subsection to docs/docs/integrations/openai.md covering:

How to diagnose: result.value, result.generation.usage, result._thinking
The vLLM/Qwen3 case as the most common concrete trigger
The chat_template_kwargs.enable_thinking=False workaround for callers who did not intend thinking mode
A pointer to result._thinking for callers who did

Testing

Tests added to the respective file if code was changed
New code has 100% coverage if code as added
Ensure existing tests and github automation passes (a maintainer will kick off the github automation when the rest of the PR is populated)

Docs-only change — pre-commit (including markdownlint) passes locally.

Attribution

AI coding assistants used

Assisted-by: Claude Code

github-actions · 2026-05-13T09:18:34Z

The PR description has been updated. Please fill out the template for your PR to be reviewed.

ajbozarth

A few notes inline — nothing on the error-path span cleanup before the raise since that's getting rewritten anyway.

jakelorocco

I think I disagree with this change. Users should be able to enable thinking and we should handle that gracefully. If a model only produces thinking tokens, that is the reality of the response we received.

We can maybe log a warning or have better documentation that it's a possibility; but I think this is expected behavior.

jakelorocco · 2026-05-18T14:44:17Z

+                'model_options={"extra_body": {"chat_template_kwargs": '
+                '{"enable_thinking": False}}}.'


If we go this route, we should advertise the Mellea specific ModelOption.THINKING model option instead.

While I was doing a re-review Claude made the following observation on this that's worth noting:

Jake's right that ModelOption.THINKING is the canonical Mellea-side knob, but there's a wrinkle worth flagging before we change the message.

The OpenAI backend currently maps ModelOption.THINKING to reasoning_effort (openai.py:672-686). For OpenAI proper / o-series that's correct, but for Qwen3 via vLLM the lever to disable thinking is chat_template_kwargs.enable_thinking, which is a different parameter — reasoning_effort=False would just get ignored (or rejected) by vLLM. So if we swap the error message to advertise model_options={ModelOption.THINKING: False} today, it'd send users at the exact failure mode this PR is trying to surface to a workaround that doesn't actually work for them.

Two options:

Land this PR as-is with the raw extra_body form (which is what works on vLLM/Qwen3 today), and open a follow-up to broaden ModelOption.THINKING in the OpenAI backend so THINKING=False also emits chat_template_kwargs.enable_thinking=False for compatible providers. Then update the message.

Do that broadening in this PR before advertising the abstraction.

I'd lean toward (1) to keep the fix scoped — the error message is already a big improvement over the silent empty string, and we shouldn't block it on a semantics change to THINKING.

Now docs only — the runtime change is being scrapped (see my comment above). The advertised workaround moves into the integration docs rather than an error message, so the ModelOption.THINKING vs extra_body question does not apply here any more.

Closes generative-computing#1060. When a thinking-mode model (e.g. Qwen3 via vLLM with --reasoning-parser) emits only reasoning tokens, the OpenAI backend faithfully returns content=None — surfacing as result.value == "" with non-zero completion_tokens. The reasoning trace is preserved on ModelOutputThunk._thinking. This is expected behaviour, not a backend bug, so document the symptom in the OpenAI integration troubleshooting section: how to diagnose, where the reasoning content lives, and how to disable thinking via chat_template_kwargs for vLLM/Qwen3. Assisted-by: Claude Code

planetf1 · 2026-05-19T10:23:38Z

I'm thinking now we should not make this change. I hit it when trying qwen as I didn't appreciate the different behaviour. (We did make another change in the backend relating to the response field). I agree with Jake in that we can't change behaviour - if no real response is returned, well that's the response - and why we have validators etc. Even a warning is potentially noise so probably not right either.

I therefore am scrapping the code change, and instead just offering a small docs tweak

jakelorocco

lgtm; opened #1093 to handle getting ModelOption.Thinking to work in these edge cases as well

github-actions Bot added the bug Something isn't working label May 12, 2026

planetf1 force-pushed the fix/1060-openai-empty-content branch from b37b587 to 601a224 Compare May 12, 2026 13:28

planetf1 marked this pull request as ready for review May 13, 2026 11:25

planetf1 requested a review from a team as a code owner May 13, 2026 11:25

planetf1 requested review from ajbozarth and jakelorocco May 13, 2026 11:25

ajbozarth reviewed May 13, 2026

View reviewed changes

Comment thread mellea/backends/openai.py Outdated

Comment thread mellea/backends/openai.py Outdated

Comment thread test/backends/test_openai_unit.py Outdated

planetf1 force-pushed the fix/1060-openai-empty-content branch from 42e83d9 to 5e01a8d Compare May 18, 2026 12:32

jakelorocco requested changes May 18, 2026

View reviewed changes

planetf1 changed the title ~~fix(backends): raise error when OpenAI backend receives content=None~~ docs(openai): document empty response from thinking-mode models May 19, 2026

github-actions Bot added the documentation Improvements or additions to documentation label May 19, 2026

planetf1 force-pushed the fix/1060-openai-empty-content branch from 5e01a8d to ae70f9a Compare May 19, 2026 10:43

planetf1 requested a review from jakelorocco May 19, 2026 10:44

jakelorocco mentioned this pull request May 19, 2026

fix: improve ModelOption.THINKING with vLLM through OpenAI #1093

Open

jakelorocco approved these changes May 19, 2026

View reviewed changes

planetf1 added this pull request to the merge queue May 19, 2026

Merged via the queue into generative-computing:main with commit 801bbfd May 19, 2026
8 checks passed

planetf1 deleted the fix/1060-openai-empty-content branch May 19, 2026 13:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs(openai): document empty response from thinking-mode models#1062

docs(openai): document empty response from thinking-mode models#1062
planetf1 merged 1 commit into
generative-computing:mainfrom
planetf1:fix/1060-openai-empty-content

planetf1 commented May 12, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 13, 2026

Uh oh!

ajbozarth left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jakelorocco left a comment

Uh oh!

jakelorocco May 18, 2026

Uh oh!

ajbozarth May 18, 2026

Uh oh!

planetf1 May 19, 2026

Uh oh!

planetf1 commented May 19, 2026

Uh oh!

jakelorocco left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		'model_options={"extra_body": {"chat_template_kwargs": '
		'{"enable_thinking": False}}}.'

Conversation

planetf1 commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Misc PR

Type of PR

Description

Testing

Attribution

Uh oh!

github-actions Bot commented May 13, 2026

Uh oh!

ajbozarth left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jakelorocco left a comment

Choose a reason for hiding this comment

Uh oh!

jakelorocco May 18, 2026

Choose a reason for hiding this comment

Uh oh!

ajbozarth May 18, 2026

Choose a reason for hiding this comment

Uh oh!

planetf1 May 19, 2026

Choose a reason for hiding this comment

Uh oh!

planetf1 commented May 19, 2026

Uh oh!

jakelorocco left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

planetf1 commented May 12, 2026 •

edited

Loading