Specify model parameters directly in model strings using URL query parameter syntax for explicit per-request control.
The URI Model Parameters feature allows you to specify model parameters directly in the model string using URL query parameter syntax. This provides explicit per-request control over model behavior without modifying configuration files or headers.
This feature is particularly useful when you need to:
- Override default model parameters for specific requests
- Apply different parameters to reasoning and execution models in hybrid backends
- Test different parameter combinations without changing configuration
- Provide per-request parameter control in automated workflows
- Inline Parameter Specification: Append parameters to model strings using URI syntax (e.g.,
backend:model?temperature=0.5) - Multiple Parameters: Support for multiple parameters in a single model string (e.g.,
?temperature=0.5&reasoning_effort=high) - Hybrid Backend Support: Apply different parameters to reasoning and execution models independently
- Clear Precedence: URI parameters on the model selector override A-leg JSON body fields,
extra_body, and config; session commands and connector-forced settings can still override URI - Graceful Error Handling: Invalid parameters are logged but don't break requests
The following parameters can be specified via URI syntax:
- temperature: Controls randomness in model outputs (0.0-2.0)
- reasoning_effort: Controls computational effort for reasoning models (
low/medium/high; OpenAI Codex backends also supportxhighwhere the upstream API allows it) - top_p: Controls diversity via nucleus sampling (e.g., 0.9)
- top_k: Controls diversity by filtering to the K most likely next tokens (e.g., 40)
Parameters are resolved from multiple sources with the following precedence (highest to lowest):
- Connector-forced settings (backend
extra/ connector policy) — hard overrides from configuration - Interactive session — session reasoning mode and commands such as
!/temperature(0.5)(and edit-precision promotions where applicable) - URI parameters — query string on the routed model id, e.g.
openai-codex:gpt-5.4-mini?reasoning_effort=xhigh - A-leg request body — top-level OpenAI-style fields on the inbound request (for example
temperature,reasoning_effort) when they were actually supplied by the client (schema defaults are not treated as overrides) extra_bodysampling fields — same parameter names carried inextra_body(lower than top-level body for resolution)- Backend / app configuration — defaults from
config.yamland backend blocks
When the same parameter is specified in multiple sources, the higher priority source wins. This allows you to:
- Set defaults in config files
- Let the client send common API fields, but prefer the model string when you encode tuning in
backend:model?... - Override dynamically with session commands (or connector-forced policy when operators require it)
When debug logging is enabled, the proxy provides detailed information about parameter resolution:
DEBUG: Parsed URI parameters from model string 'openai:gpt-4?temperature=0.5': {'temperature': 0.5}
DEBUG: Parameter resolution for openai:
temperature: 0.5 (source: uri, overrode: config=0.8)
reasoning_effort: high (source: session)
This helps you understand:
- Which parameters were parsed from the URI
- The effective value of each parameter
- Which source provided each parameter
- Which sources were overridden
Simple Model with Temperature:
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "openai:gpt-4?temperature=0.5",
"messages": [{"role": "user", "content": "Hello"}]
}'Multiple Parameters:
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "anthropic:claude-3?temperature=0.7&reasoning_effort=high",
"messages": [{"role": "user", "content": "Solve this problem"}]
}'Complex Model Path:
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "openrouter:anthropic/claude-3-haiku:beta?temperature=0.3&reasoning_effort=medium",
"messages": [{"role": "user", "content": "Quick task"}]
}'Apply different parameters to reasoning and execution models:
# Use high temperature for creative reasoning, low for precise execution
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "hybrid:[openai:gpt-4?temperature=0.9,anthropic:claude-3?temperature=0.1]",
"messages": [{"role": "user", "content": "Write a creative story"}]
}'# config.yaml
model_defaults:
openai:gpt-4:
temperature: 0.8# Request with URI parameter overrides config
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "openai:gpt-4?temperature=0.3",
"messages": [{"role": "user", "content": "Hello"}]
}'
# Effective temperature: 0.3 (URI overrides config)# Start with URI parameter
model: openai:gpt-4?temperature=0.5
# Override with session command (takes precedence)
!/temperature(0.8)
# Effective temperature: 0.8 (session command wins)The proxy validates URI parameters and handles errors gracefully:
Valid Parameter:
openai:gpt-4?temperature=0.5
# Parsed and applied
Invalid Value:
openai:gpt-4?temperature=3.5
# Logged as error: "temperature: 3.5 above maximum value (2.0)"
# Request continues with default/config temperature
Unknown Parameter:
openai:gpt-4?unknown_param=value
# Logged as warning: "Unknown URI parameter 'unknown_param'"
# Request continues, parameter ignored
Malformed Query String:
openai:gpt-4?invalid
# Logged as warning: "Malformed URI query string"
# Request continues without URI parameters
Quickly test different parameter values without modifying configuration files:
# Test low temperature
curl ... -d '{"model": "openai:gpt-4?temperature=0.1", ...}'
# Test high temperature
curl ... -d '{"model": "openai:gpt-4?temperature=0.9", ...}'
# Test with reasoning effort
curl ... -d '{"model": "openai:gpt-4?temperature=0.5&reasoning_effort=high", ...}'Apply specific parameters for different types of requests:
# Creative writing - high temperature
curl ... -d '{"model": "openai:gpt-4?temperature=0.9", "messages": [{"role": "user", "content": "Write a story"}]}'
# Code generation - low temperature
curl ... -d '{"model": "openai:gpt-4?temperature=0.1", "messages": [{"role": "user", "content": "Write a function"}]}'
# Analysis - medium temperature
curl ... -d '{"model": "openai:gpt-4?temperature=0.5", "messages": [{"role": "user", "content": "Analyze this"}]}'Optimize reasoning and execution phases independently:
# Creative reasoning + precise execution
curl ... -d '{
"model": "hybrid:[openai:gpt-4?temperature=0.9&reasoning_effort=high,anthropic:claude-3?temperature=0.1]",
"messages": [...]
}'
# Fast reasoning + detailed execution
curl ... -d '{
"model": "hybrid:[openai:gpt-4?reasoning_effort=low,anthropic:claude-3?temperature=0.5]",
"messages": [...]
}'Programmatically control model parameters in scripts:
import requests
def call_llm(prompt, temperature=0.5, reasoning_effort="medium"):
model = f"openai:gpt-4?temperature={temperature}&reasoning_effort={reasoning_effort}"
response = requests.post(
"http://localhost:8000/v1/chat/completions",
json={
"model": model,
"messages": [{"role": "user", "content": prompt}]
}
)
return response.json()
# Use different parameters for different tasks
creative_response = call_llm("Write a story", temperature=0.9, reasoning_effort="high")
code_response = call_llm("Write a function", temperature=0.1, reasoning_effort="low")Compare model behavior with different parameters:
# Version A - conservative
curl ... -d '{"model": "openai:gpt-4?temperature=0.3", ...}'
# Version B - creative
curl ... -d '{"model": "openai:gpt-4?temperature=0.8", ...}'
# Compare outputs to determine optimal parametersProblem: URI parameters don't seem to affect model behavior
Solutions:
- Check parameter precedence — session and connector-forced settings override URI; URI overrides duplicate fields on the A-leg request body or in
extra_body - Verify parameter name spelling (case-sensitive)
- Enable debug logging to see parameter resolution
- Check that the backend supports the parameter
Problem: Getting warnings about invalid parameter values
Solutions:
- Check parameter ranges (e.g., temperature: 0.0-2.0)
- Verify parameter format (e.g.,
reasoning_effort:low/medium/high, andxhighon supported Codex routes) - Review logs for specific validation errors
- Consult backend documentation for supported values
Problem: URI parameters not being parsed
Solutions:
- Ensure proper URL encoding for special characters
- Use
&to separate multiple parameters - Use
=to assign values to parameters - Check for typos in the query string syntax
Problem: Parameters not applying correctly to hybrid backend models
Solutions:
- Specify parameters for each model independently
- Use square brackets to group hybrid models:
hybrid:[model1?param=value,model2?param=value] - Check logs to verify which parameters are applied to which model
- Ensure both models support the specified parameters
- Hybrid Backend - Use URI parameters with hybrid reasoning/execution models
- Model Name Rewrites - Transform model names before URI parameter parsing
- Planning Phase Overrides - Combine with planning phase model selection
- Identity Override - Control client identity alongside model parameters