feat(stt): honor proxy in OpenAI Whisper provider#8668
Conversation
There was a problem hiding this comment.
Code Review
This pull request adds proxy support to the OpenAI Whisper API source by creating and passing a custom HTTP client configured with a proxy to AsyncOpenAI. It also includes unit tests to verify that the proxy configuration is correctly propagated. The reviewer identified a potential connection leak, noting that custom HTTP clients passed to AsyncOpenAI are not automatically closed by the OpenAI SDK. To resolve this, the reviewer suggested storing the custom HTTP client as an instance variable so it can be explicitly closed during termination.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| def _create_http_client(self, provider_config: dict) -> httpx.AsyncClient: | ||
| proxy = provider_config.get("proxy", "") | ||
| httpx_module: Any = httpx | ||
| try: | ||
| from openai import _base_client as openai_base_client | ||
|
|
||
| httpx_module = getattr(openai_base_client, "httpx", httpx) | ||
| except ImportError: | ||
| pass | ||
| return create_proxy_client( | ||
| "OpenAI Whisper", | ||
| proxy, | ||
| httpx_module=httpx_module, | ||
| ) |
There was a problem hiding this comment.
When a custom http_client is passed to AsyncOpenAI, calling self.client.close() does not close the custom client (the OpenAI SDK explicitly skips closing custom clients to avoid lifecycle conflicts). This leads to unclosed client/connection leaks.
To fix this, store the created client in self.http_client so that it can be closed in terminate():
async def terminate(self):
if self.client:
await self.client.close()
if hasattr(self, "http_client") and self.http_client:
await self.http_client.aclose()| def _create_http_client(self, provider_config: dict) -> httpx.AsyncClient: | |
| proxy = provider_config.get("proxy", "") | |
| httpx_module: Any = httpx | |
| try: | |
| from openai import _base_client as openai_base_client | |
| httpx_module = getattr(openai_base_client, "httpx", httpx) | |
| except ImportError: | |
| pass | |
| return create_proxy_client( | |
| "OpenAI Whisper", | |
| proxy, | |
| httpx_module=httpx_module, | |
| ) | |
| def _create_http_client(self, provider_config: dict) -> httpx.AsyncClient: | |
| proxy = provider_config.get("proxy", "") | |
| httpx_module: Any = httpx | |
| try: | |
| from openai import _base_client as openai_base_client | |
| httpx_module = getattr(openai_base_client, "httpx", httpx) | |
| except ImportError: | |
| pass | |
| self.http_client = create_proxy_client( | |
| "OpenAI Whisper", | |
| proxy, | |
| httpx_module=httpx_module, | |
| ) | |
| return self.http_client |
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- In
_create_http_client, consider passingNonewhenproxyis not configured instead of an empty string socreate_proxy_clientcan fall back to its default behavior (e.g., respecting environment configuration) rather than receiving a potentially misleading empty proxy value. - The regression test reaches into
openai._base_client.httpx, which is an internal detail of the OpenAI SDK and may change; it might be more robust to assert thathttp_clientis constructed viacreate_proxy_clientwith the expected module passed in (e.g., by stubbing the module parameter) without relying on a private_base_clientattribute.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `_create_http_client`, consider passing `None` when `proxy` is not configured instead of an empty string so `create_proxy_client` can fall back to its default behavior (e.g., respecting environment configuration) rather than receiving a potentially misleading empty proxy value.
- The regression test reaches into `openai._base_client.httpx`, which is an internal detail of the OpenAI SDK and may change; it might be more robust to assert that `http_client` is constructed via `create_proxy_client` with the expected module passed in (e.g., by stubbing the module parameter) without relying on a private `_base_client` attribute.
## Individual Comments
### Comment 1
<location path="tests/test_whisper_api_source.py" line_range="40-49" />
<code_context>
+def test_provider_passes_configured_proxy_to_openai_http_client(monkeypatch):
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a complementary test case for when no proxy is configured
Please also add a test where `ProviderOpenAIWhisperAPI` is constructed without a `proxy` key in `provider_config` (e.g. `test_provider_uses_default_http_client_when_proxy_missing`). That test should verify `create_proxy_client` is called with the correct provider label and the default `proxy` value (empty string or `None`, per the intended contract), so the no-proxy behavior remains covered during refactors.
Suggested implementation:
```python
def test_provider_passes_configured_proxy_to_openai_http_client(monkeypatch):
captured: dict[str, object] = {}
fake_http_client = object()
def fake_create_proxy_client(
provider_label: str,
proxy: str | None = None,
headers: dict[str, str] | None = None,
verify=None,
httpx_module=None,
):
captured["provider_label"] = provider_label
captured["proxy"] = proxy
captured["headers"] = headers
captured["verify"] = verify
captured["httpx_module"] = httpx_module
return fake_http_client
# Adjust the target string here to match where `create_proxy_client` is imported/used
monkeypatch.setattr(
"whisper_api_source.create_proxy_client",
fake_create_proxy_client,
)
provider = ProviderOpenAIWhisperAPI(
provider_config={
# include whatever keys are normally required by your provider_config
"provider_label": "openai_whisper_api",
"proxy": "http://configured-proxy.example.com",
}
)
# Trigger the code path that builds the HTTP client. Adjust this call as needed.
http_client = provider._get_http_client()
assert http_client is fake_http_client
assert captured["provider_label"] == "openai_whisper_api"
assert captured["proxy"] == "http://configured-proxy.example.com"
def test_provider_uses_default_http_client_when_proxy_missing(monkeypatch):
captured: dict[str, object] = {}
fake_http_client = object()
def fake_create_proxy_client(
provider_label: str,
proxy: str | None = None,
headers: dict[str, str] | None = None,
verify=None,
httpx_module=None,
):
captured["provider_label"] = provider_label
captured["proxy"] = proxy
captured["headers"] = headers
captured["verify"] = verify
captured["httpx_module"] = httpx_module
return fake_http_client
# Adjust the target string here to match where `create_proxy_client` is imported/used
monkeypatch.setattr(
"whisper_api_source.create_proxy_client",
fake_create_proxy_client,
)
# Construct the provider WITHOUT a `proxy` key in provider_config
provider = ProviderOpenAIWhisperAPI(
provider_config={
# include whatever keys are normally required by your provider_config
"provider_label": "openai_whisper_api",
# NOTE: no "proxy" key here on purpose
}
)
# Trigger the code path that builds the HTTP client. Adjust this call as needed.
http_client = provider._get_http_client()
assert http_client is fake_http_client
assert captured["provider_label"] == "openai_whisper_api"
# Depending on your intended contract, assert the default value here:
# - if default is "", use `== ""`
# - if default is None, use `is None`
assert captured["proxy"] in ("", None)
```
To integrate this cleanly with your existing codebase, you will likely need to:
1. **Adjust the monkeypatch target**: Replace `"whisper_api_source.create_proxy_client"` with the actual import path used in your production code (e.g. `"app.providers.whisper_api_source.create_proxy_client"`).
2. **Align `provider_config`**:
- Replace `"provider_label": "openai_whisper_api"` with the real key/value(s) your `ProviderOpenAIWhisperAPI` expects (e.g. `{"label": "whisper_api"}` or whatever your config schema is).
- Ensure all required config keys (API key, base URL, etc.) are included so the provider initializes correctly.
3. **Use the proper provider label assertion**:
- If `create_proxy_client` is called with some constant like `ProviderOpenAIWhisperAPI.PROVIDER_LABEL`, assert against that instead of the hardcoded `"openai_whisper_api"`.
4. **Match the HTTP client creation API**:
- If your provider does not expose `_get_http_client()`, replace that call with whatever actually triggers the creation of the OpenAI HTTP client (e.g. accessing `provider._openai_http_client`, calling `provider._ensure_client()`, etc.).
5. **Proxy default contract**:
- If the intended default proxy is always `None`, change `assert captured["proxy"] in ("", None)` to `assert captured["proxy"] is None`.
- If it is always an empty string, change it to `assert captured["proxy"] == ""`.
</issue_to_address>
### Comment 2
<location path="tests/test_whisper_api_source.py" line_range="11-16" />
<code_context>
from astrbot.core.provider.sources.whisper_api_source import ProviderOpenAIWhisperAPI
+class _FakeAsyncOpenAI:
+ def __init__(self, **kwargs):
+ self.kwargs = kwargs
+
+ async def close(self):
+ return None
+
+
</code_context>
<issue_to_address>
**suggestion (testing):** Tighten the fake client assertions to better lock in the HTTP client wiring
`_FakeAsyncOpenAI` currently only stores `kwargs`, and tests assert individual keys later. To better lock in the client wiring, also assert that `kwargs` either contains `http_client` (at minimum) or only has the expected keys (`api_key`, `base_url`, `timeout`, `http_client`). This will help catch any unexpected parameters passed to `AsyncOpenAI` in future changes.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| def test_provider_passes_configured_proxy_to_openai_http_client(monkeypatch): | ||
| captured: dict[str, object] = {} | ||
| fake_http_client = object() | ||
|
|
||
| def fake_create_proxy_client( | ||
| provider_label: str, | ||
| proxy: str | None = None, | ||
| headers: dict[str, str] | None = None, | ||
| verify=None, | ||
| httpx_module=None, |
There was a problem hiding this comment.
suggestion (testing): Consider adding a complementary test case for when no proxy is configured
Please also add a test where ProviderOpenAIWhisperAPI is constructed without a proxy key in provider_config (e.g. test_provider_uses_default_http_client_when_proxy_missing). That test should verify create_proxy_client is called with the correct provider label and the default proxy value (empty string or None, per the intended contract), so the no-proxy behavior remains covered during refactors.
Suggested implementation:
def test_provider_passes_configured_proxy_to_openai_http_client(monkeypatch):
captured: dict[str, object] = {}
fake_http_client = object()
def fake_create_proxy_client(
provider_label: str,
proxy: str | None = None,
headers: dict[str, str] | None = None,
verify=None,
httpx_module=None,
):
captured["provider_label"] = provider_label
captured["proxy"] = proxy
captured["headers"] = headers
captured["verify"] = verify
captured["httpx_module"] = httpx_module
return fake_http_client
# Adjust the target string here to match where `create_proxy_client` is imported/used
monkeypatch.setattr(
"whisper_api_source.create_proxy_client",
fake_create_proxy_client,
)
provider = ProviderOpenAIWhisperAPI(
provider_config={
# include whatever keys are normally required by your provider_config
"provider_label": "openai_whisper_api",
"proxy": "http://configured-proxy.example.com",
}
)
# Trigger the code path that builds the HTTP client. Adjust this call as needed.
http_client = provider._get_http_client()
assert http_client is fake_http_client
assert captured["provider_label"] == "openai_whisper_api"
assert captured["proxy"] == "http://configured-proxy.example.com"
def test_provider_uses_default_http_client_when_proxy_missing(monkeypatch):
captured: dict[str, object] = {}
fake_http_client = object()
def fake_create_proxy_client(
provider_label: str,
proxy: str | None = None,
headers: dict[str, str] | None = None,
verify=None,
httpx_module=None,
):
captured["provider_label"] = provider_label
captured["proxy"] = proxy
captured["headers"] = headers
captured["verify"] = verify
captured["httpx_module"] = httpx_module
return fake_http_client
# Adjust the target string here to match where `create_proxy_client` is imported/used
monkeypatch.setattr(
"whisper_api_source.create_proxy_client",
fake_create_proxy_client,
)
# Construct the provider WITHOUT a `proxy` key in provider_config
provider = ProviderOpenAIWhisperAPI(
provider_config={
# include whatever keys are normally required by your provider_config
"provider_label": "openai_whisper_api",
# NOTE: no "proxy" key here on purpose
}
)
# Trigger the code path that builds the HTTP client. Adjust this call as needed.
http_client = provider._get_http_client()
assert http_client is fake_http_client
assert captured["provider_label"] == "openai_whisper_api"
# Depending on your intended contract, assert the default value here:
# - if default is "", use `== ""`
# - if default is None, use `is None`
assert captured["proxy"] in ("", None)To integrate this cleanly with your existing codebase, you will likely need to:
- Adjust the monkeypatch target: Replace
"whisper_api_source.create_proxy_client"with the actual import path used in your production code (e.g."app.providers.whisper_api_source.create_proxy_client"). - Align
provider_config:- Replace
"provider_label": "openai_whisper_api"with the real key/value(s) yourProviderOpenAIWhisperAPIexpects (e.g.{"label": "whisper_api"}or whatever your config schema is). - Ensure all required config keys (API key, base URL, etc.) are included so the provider initializes correctly.
- Replace
- Use the proper provider label assertion:
- If
create_proxy_clientis called with some constant likeProviderOpenAIWhisperAPI.PROVIDER_LABEL, assert against that instead of the hardcoded"openai_whisper_api".
- If
- Match the HTTP client creation API:
- If your provider does not expose
_get_http_client(), replace that call with whatever actually triggers the creation of the OpenAI HTTP client (e.g. accessingprovider._openai_http_client, callingprovider._ensure_client(), etc.).
- If your provider does not expose
- Proxy default contract:
- If the intended default proxy is always
None, changeassert captured["proxy"] in ("", None)toassert captured["proxy"] is None. - If it is always an empty string, change it to
assert captured["proxy"] == "".
- If the intended default proxy is always
| class _FakeAsyncOpenAI: | ||
| def __init__(self, **kwargs): | ||
| self.kwargs = kwargs | ||
|
|
||
| async def close(self): | ||
| return None |
There was a problem hiding this comment.
suggestion (testing): Tighten the fake client assertions to better lock in the HTTP client wiring
_FakeAsyncOpenAI currently only stores kwargs, and tests assert individual keys later. To better lock in the client wiring, also assert that kwargs either contains http_client (at minimum) or only has the expected keys (api_key, base_url, timeout, http_client). This will help catch any unexpected parameters passed to AsyncOpenAI in future changes.
Summary
openai_whisper_apiSTT provider honor its configuredproxyvalue when constructing the OpenAI SDK client.httpxmodule when building the custom HTTP client, matching the chat-completion provider pattern.http_client.中文说明
这个 PR 把我们本地插件桥里验证过的最小修复拆到 AstrBot 本体:
openai_whisper_api语音转文字 provider 之前会读取api_base/timeout,但没有把配置里的proxy传给 OpenAI SDK,因此健康检查或实际转写在需要代理的环境中仍可能直连失败。这里复用现有create_proxy_client(),不改变模型、API key、转写流程或其他 STT 行为。Duplicate check
Checked open PRs for
Whisper proxy,STT proxy,openai_whisper_api, andwhisper_api_source proxy; no direct duplicate was found. Existing Whisper-related PRs appear to cover other options such as language/prompt, not proxy passthrough.Tests
python -m pytest tests/test_whisper_api_source.py -qpython -m ruff format --check .python -m ruff check .Summary by Sourcery
Make the OpenAI Whisper STT provider use a proxy-aware HTTP client consistent with other OpenAI providers.
Bug Fixes:
Enhancements:
Tests: