Skip to content

fix(kosong): add default 120s timeout to create_openai_client#2409

Open
wintrover wants to merge 2 commits into
MoonshotAI:mainfrom
wintrover:fix/openai-default-timeout
Open

fix(kosong): add default 120s timeout to create_openai_client#2409
wintrover wants to merge 2 commits into
MoonshotAI:mainfrom
wintrover:fix/openai-default-timeout

Conversation

@wintrover
Copy link
Copy Markdown

@wintrover wintrover commented May 31, 2026

Problem

create_openai_client() in kosong/chat_provider/openai_common.py does not pass a timeout to AsyncOpenAI(). The OpenAI SDK defaults to 600s, so when an upstream proxy times out earlier (e.g. MiMo API proxy at ~300s), the client silently waits another 5+ minutes before the SDK timeout fires. Users see "infinite loading."

Root Cause

22:43:43  API call starts (450K tokens, openai_legacy → MiMo proxy)
22:48:48  APIConnectionError after 5m5s (proxy-side timeout)
          → connection recovery → retrying once → also hangs
22:49:29  User Ctrl+C after 41s

The AsyncOpenAI client inherits the SDK default of 600s timeout. The proxy times out at ~300s, leaving the client hanging.

Fix

Inject httpx.Timeout(120.0, connect=30.0) as a default when the caller does not specify one. Callers can still override by passing timeout=... in client_kwargs.

def create_openai_client(*, api_key, base_url, client_kwargs):
    kwargs = dict(client_kwargs)
    if "timeout" not in kwargs:
        kwargs["timeout"] = httpx.Timeout(120.0, connect=30.0)
    return AsyncOpenAI(api_key=api_key, base_url=base_url, **kwargs)

Trade-offs

Parameter Value Rationale
connect 30s TCP handshake
read/write/pool 120s Covers normal large-context requests (30-60s typical)
  • DeepSeek, OpenAI: responses typically <30s → no impact
  • MiMo large context: 30-60s normal → 120s provides headroom
  • Fails 4x faster than current 600s default

Fixes #2384 (for openai_legacy providers)


Open in Devin Review

wintrover added 2 commits May 26, 2026 23:03
…-trip

- contrib/deepseek/deepseek_proxy.py: zero-dependency proxy for DeepSeek API
  that injects thinking mode and stores reasoning_content for multi-turn
  reliability
- contrib/deepseek/config.toml.example: example Kimi CLI config showing both
  native (OpenAILegacy) and proxy approaches
- contrib/deepseek/README.md: comprehensive documentation explaining native
  support vs proxy approach, configuration, and related code references

The OpenAILegacy provider's reasoning_key parameter already natively supports
DeepSeek's reasoning_content field — no source patches needed.
When the caller does not specify a timeout, inject httpx.Timeout(120.0,
connect=30.0) instead of relying on the OpenAI SDK's 600s default.

Without this, openai_legacy and openai_responses providers appear
"stuck" when an upstream proxy times out earlier (e.g. MiMo API
proxy at ~300s) — the client silently waits another 5 minutes before
the SDK timeout fires, and the user sees infinite loading.

Fixes MoonshotAI#2384 (for openai_legacy providers)
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 2 potential issues.

View 4 additional findings in Devin Review.

Open in Devin Review

Comment on lines +79 to +86
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s [%(levelname)s] %(message)s",
handlers=[
logging.FileHandler(LOG_FILE, mode="a"),
logging.StreamHandler(sys.stderr),
],
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Logging FileHandler created at module level before log directory exists

The logging.basicConfig call at module level (line 79–86) instantiates a logging.FileHandler(LOG_FILE) for ~/.kimi-code/proxy.log. This executes at import time, before main() runs os.makedirs(os.path.dirname(LOG_FILE), exist_ok=True) at line 333. If the ~/.kimi-code/ directory does not yet exist (e.g. on first run), FileHandler.__init__ calls open(...) which raises FileNotFoundError, crashing the script before it ever reaches main().

Prompt for agents
The fix is to ensure the log directory exists before configuring the logging handlers. Move the `os.makedirs(os.path.dirname(LOG_FILE), exist_ok=True)` call from `main()` (line 333) to just before the `logging.basicConfig(...)` call at module level (around line 78). This way the directory is guaranteed to exist when the FileHandler tries to open the log file. Alternatively, move the entire logging configuration into `main()` so it runs after the directory creation.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment on lines +152 to +165
for key, value in resp.headers.items():
k = key.lower()
if k not in (
"transfer-encoding",
"content-encoding",
"connection",
"keep-alive",
):
try:
self.send_header(key, value)
except Exception:
pass
# Override content-length to match our read body
self.send_header("Content-Length", str(len(body)))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Duplicate Content-Length headers sent in _send_upstream_response

In _send_upstream_response, the header-copying loop (lines 152–163) forwards all upstream headers except transfer-encoding, content-encoding, connection, and keep-alive — but content-length is not in the exclusion list. After the loop, line 165 adds another Content-Length header. Since BaseHTTPRequestHandler.send_header appends rather than replaces, this results in duplicate Content-Length headers in the response. If urllib decompressed the body (after stripping content-encoding), the two values will differ, which violates HTTP spec and can confuse clients.

Suggested change
for key, value in resp.headers.items():
k = key.lower()
if k not in (
"transfer-encoding",
"content-encoding",
"connection",
"keep-alive",
):
try:
self.send_header(key, value)
except Exception:
pass
# Override content-length to match our read body
self.send_header("Content-Length", str(len(body)))
for key, value in resp.headers.items():
k = key.lower()
if k not in (
"transfer-encoding",
"content-encoding",
"content-length",
"connection",
"keep-alive",
):
try:
self.send_header(key, value)
except Exception:
pass
# Override content-length to match our read body
self.send_header("Content-Length", str(len(body)))
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

1 participant