fix: update Groq model metadata for openai/gpt-oss-120b #576
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 16b813669e
  "databricks-meta-llama-3-1-405b-instruct": ["databricks"],
  "databricks-meta-llama-3-1-8b-instruct": ["databricks"],
- "openai/gpt-oss-120b": ["together", "groq", "baseten"],
+ "openai/gpt-oss-120b": ["groq","together","baseten"],
Preserve Together as first endpoint for gpt-oss-120b
Changing the provider order here alters runtime behavior in the API-key lookup fallback path: EdgeProxy uses getModelEndpointTypes(model) and then selects endpointTypes[0] as the secret type when lookup fails (packages/proxy/edge/index.ts lines 275-279). With this reorder, fallback requests for openai/gpt-oss-120b now default to groq instead of together, which can cause authentication/routing failures for users supplying a Together key during lookup outages or 5xx conditions.
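The effect described in this comment can be illustrated with a minimal sketch. It is not the EdgeProxy code: `fallbackSecretType` and the two lookup tables are hypothetical stand-ins, and the only behavior taken from the review is that the first entry of the endpoint-type list is used as the secret type when the API-key lookup fails.

```typescript
// Hypothetical, trimmed-down copies of the model -> endpoint-types map,
// before and after the change shown in the diff.
type EndpointType = "together" | "groq" | "baseten" | "databricks";

const endpointTypesBefore: Record<string, EndpointType[]> = {
  "openai/gpt-oss-120b": ["together", "groq", "baseten"],
};

const endpointTypesAfter: Record<string, EndpointType[]> = {
  "openai/gpt-oss-120b": ["groq", "together", "baseten"],
};

// Stand-in for the fallback path described in the review: when the API-key
// lookup fails, the first endpoint type becomes the secret type.
function fallbackSecretType(
  table: Record<string, EndpointType[]>,
  model: string,
): EndpointType | undefined {
  const endpointTypes = table[model] ?? [];
  return endpointTypes[0];
}

const before = fallbackSecretType(endpointTypesBefore, "openai/gpt-oss-120b");
const after = fallbackSecretType(endpointTypesAfter, "openai/gpt-oss-120b");

console.log(before, after); // together groq
```

Under these assumptions, a caller holding only a Together key would be treated as a Groq caller on the fallback path after the reorder, which is the failure mode the review flags.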
Closes #566
Source issue: #566
Summary
openai/gpt-oss-120b | openai/gpt-oss-120b | 2
Verified metadata
Verification notes
Verification
Official sources consulted
- input_cost_per_mil_tokens=0.15, output_cost_per_mil_tokens=0.60, max_input_tokens=131072, max_output_tokens=65536. Model listed as production.
- input_cost_per_mil_tokens=0.15, output_cost_per_mil_tokens=0.60. Context window listed as 128,000. Max output tokens not published.
sync_models (LiteLLM) cross-check
The sync_models catalog only has azure_ai/gpt-oss-120b (no groq/ or together_ai/ prefixed entries). Fields compared against azure_ai/gpt-oss-120b:
- input_cost_per_mil_tokens
- output_cost_per_mil_tokens
- max_output_tokens
- max_input_tokens
Fields not changed
- input_cache_read_cost_per_mil_tokens (0.075): Retained from existing catalog. Neither Groq nor Together publishes cache pricing for this model, and no official source contradicts the current value.
- available_providers: Retained as ["groq", "together", "baseten"]. Baseten pricing not independently verified, but the provider mapping already exists in the catalog.
- displayName, format, flavor, reasoning: Retained unchanged from existing catalog entry; no official source suggests these should change.
Fields not applicable or not published
- parent: Not applicable; this is a base model, not a dated snapshot or variant.
- deprecated / deprecation_date: Not applicable; model is active/production on both Groq and Together.
- multimodal: Not published by either provider for this model.
- supported_regions: Not applicable; no vertex provider.
- locations: Not applicable; no vertex provider.
sync_models vs proposed update
sync_models cross-check found differences. Official provider verification was used for the applied values, and sync_models discrepancies are listed below for review.
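A field-by-field comparison of this kind can be sketched as below. This is an illustration only, not the sync_models tooling: `ModelMetadata`, `diffMetadata`, `listingA`, and `listingB` are hypothetical names. The values are the two provider listings quoted under "Official sources consulted", and treating the 128,000 context window as `max_input_tokens` is an assumption.

```typescript
// Hypothetical shape for the metadata fields discussed in this PR.
interface ModelMetadata {
  input_cost_per_mil_tokens?: number;
  output_cost_per_mil_tokens?: number;
  max_input_tokens?: number;
  max_output_tokens?: number;
}

// First listing quoted under "Official sources consulted".
const listingA: ModelMetadata = {
  input_cost_per_mil_tokens: 0.15,
  output_cost_per_mil_tokens: 0.6,
  max_input_tokens: 131072,
  max_output_tokens: 65536,
};

// Second listing: context window 128,000 (assumed to map to max_input_tokens),
// max output tokens not published, so that field is left undefined.
const listingB: ModelMetadata = {
  input_cost_per_mil_tokens: 0.15,
  output_cost_per_mil_tokens: 0.6,
  max_input_tokens: 128000,
};

const FIELDS: (keyof ModelMetadata)[] = [
  "input_cost_per_mil_tokens",
  "output_cost_per_mil_tokens",
  "max_input_tokens",
  "max_output_tokens",
];

// Return the names of fields whose values differ, including fields that one
// side does not publish at all.
function diffMetadata(a: ModelMetadata, b: ModelMetadata): string[] {
  return FIELDS.filter((field) => a[field] !== b[field]);
}

const discrepancies = diffMetadata(listingA, listingB);
console.log(discrepancies); // ["max_input_tokens", "max_output_tokens"]
```

Reporting an unpublished field as a discrepancy, rather than silently skipping it, keeps missing data visible to the reviewer.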