URI Model Parameters

Specify model parameters directly in model strings using URL query parameter syntax for explicit per-request control.

Overview

The URI Model Parameters feature allows you to specify model parameters directly in the model string using URL query parameter syntax. This provides explicit per-request control over model behavior without modifying configuration files or headers.

This feature is particularly useful when you need to:

Override default model parameters for specific requests
Apply different parameters to reasoning and execution models in hybrid backends
Test different parameter combinations without changing configuration
Provide per-request parameter control in automated workflows

Key Features

Inline Parameter Specification: Append parameters to model strings using URI syntax (e.g., backend:model?temperature=0.5)
Multiple Parameters: Support for multiple parameters in a single model string (e.g., ?temperature=0.5&reasoning_effort=high)
Hybrid Backend Support: Apply different parameters to reasoning and execution models independently
Clear Precedence: URI parameters on the model selector override A-leg JSON body fields, extra_body, and config; session commands and connector-forced settings can still override URI
Graceful Error Handling: Invalid parameters are logged but don't break requests

Configuration

Supported Parameters

The following parameters can be specified via URI syntax:

temperature: Controls randomness in model outputs (0.0-2.0)
reasoning_effort: Controls computational effort for reasoning models (low / medium / high; OpenAI Codex backends also support xhigh where the upstream API allows it)
top_p: Controls diversity via nucleus sampling (e.g., 0.9)
top_k: Controls diversity by filtering to the K most likely next tokens (e.g., 40)

Parameter Precedence

Parameters are resolved from multiple sources with the following precedence (highest to lowest):

Connector-forced settings (backend extra / connector policy) — hard overrides from configuration
Interactive session — session reasoning mode and commands such as !/temperature(0.5) (and edit-precision promotions where applicable)
URI parameters — query string on the routed model id, e.g. openai-codex:gpt-5.4-mini?reasoning_effort=xhigh
A-leg request body — top-level OpenAI-style fields on the inbound request (for example temperature, reasoning_effort) when they were actually supplied by the client (schema defaults are not treated as overrides)
extra_body sampling fields — same parameter names carried in extra_body (lower than top-level body for resolution)
Backend / app configuration — defaults from config.yaml and backend blocks

When the same parameter is specified in multiple sources, the higher priority source wins. This allows you to:

Set defaults in config files
Let the client send common API fields, but prefer the model string when you encode tuning in backend:model?...
Override dynamically with session commands (or connector-forced policy when operators require it)

Debug Logging

When debug logging is enabled, the proxy provides detailed information about parameter resolution:

DEBUG: Parsed URI parameters from model string 'openai:gpt-4?temperature=0.5': {'temperature': 0.5}
DEBUG: Parameter resolution for openai:
  temperature: 0.5 (source: uri, overrode: config=0.8)
  reasoning_effort: high (source: session)

This helps you understand:

Which parameters were parsed from the URI
The effective value of each parameter
Which source provided each parameter
Which sources were overridden

Usage Examples

Basic Usage

Simple Model with Temperature:

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai:gpt-4?temperature=0.5",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Multiple Parameters:

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic:claude-3?temperature=0.7&reasoning_effort=high",
    "messages": [{"role": "user", "content": "Solve this problem"}]
  }'

Complex Model Path:

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openrouter:anthropic/claude-3-haiku:beta?temperature=0.3&reasoning_effort=medium",
    "messages": [{"role": "user", "content": "Quick task"}]
  }'

Hybrid Backend with Independent Parameters

Apply different parameters to reasoning and execution models:

# Use high temperature for creative reasoning, low for precise execution
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "hybrid:[openai:gpt-4?temperature=0.9,anthropic:claude-3?temperature=0.1]",
    "messages": [{"role": "user", "content": "Write a creative story"}]
  }'

Override Config Temperature

# config.yaml
model_defaults:
  openai:gpt-4:
    temperature: 0.8

# Request with URI parameter overrides config
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai:gpt-4?temperature=0.3",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
# Effective temperature: 0.3 (URI overrides config)

Session Command Override

# Start with URI parameter
model: openai:gpt-4?temperature=0.5

# Override with session command (takes precedence)
!/temperature(0.8)

# Effective temperature: 0.8 (session command wins)

Validation and Error Handling

The proxy validates URI parameters and handles errors gracefully:

Valid Parameter:

openai:gpt-4?temperature=0.5
# Parsed and applied

Invalid Value:

openai:gpt-4?temperature=3.5
# Logged as error: "temperature: 3.5 above maximum value (2.0)"
# Request continues with default/config temperature

Unknown Parameter:

openai:gpt-4?unknown_param=value
# Logged as warning: "Unknown URI parameter 'unknown_param'"
# Request continues, parameter ignored

Malformed Query String:

openai:gpt-4?invalid
# Logged as warning: "Malformed URI query string"
# Request continues without URI parameters

Use Cases

Testing Different Parameter Combinations

Quickly test different parameter values without modifying configuration files:

# Test low temperature
curl ... -d '{"model": "openai:gpt-4?temperature=0.1", ...}'

# Test high temperature
curl ... -d '{"model": "openai:gpt-4?temperature=0.9", ...}'

# Test with reasoning effort
curl ... -d '{"model": "openai:gpt-4?temperature=0.5&reasoning_effort=high", ...}'

Per-Request Parameter Control

Apply specific parameters for different types of requests:

# Creative writing - high temperature
curl ... -d '{"model": "openai:gpt-4?temperature=0.9", "messages": [{"role": "user", "content": "Write a story"}]}'

# Code generation - low temperature
curl ... -d '{"model": "openai:gpt-4?temperature=0.1", "messages": [{"role": "user", "content": "Write a function"}]}'

# Analysis - medium temperature
curl ... -d '{"model": "openai:gpt-4?temperature=0.5", "messages": [{"role": "user", "content": "Analyze this"}]}'

Hybrid Backend Optimization

Optimize reasoning and execution phases independently:

# Creative reasoning + precise execution
curl ... -d '{
  "model": "hybrid:[openai:gpt-4?temperature=0.9&reasoning_effort=high,anthropic:claude-3?temperature=0.1]",
  "messages": [...]
}'

# Fast reasoning + detailed execution
curl ... -d '{
  "model": "hybrid:[openai:gpt-4?reasoning_effort=low,anthropic:claude-3?temperature=0.5]",
  "messages": [...]
}'

Automated Workflows

Programmatically control model parameters in scripts:

import requests

def call_llm(prompt, temperature=0.5, reasoning_effort="medium"):
    model = f"openai:gpt-4?temperature={temperature}&reasoning_effort={reasoning_effort}"
    response = requests.post(
        "http://localhost:8000/v1/chat/completions",
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}]
        }
    )
    return response.json()

# Use different parameters for different tasks
creative_response = call_llm("Write a story", temperature=0.9, reasoning_effort="high")
code_response = call_llm("Write a function", temperature=0.1, reasoning_effort="low")

A/B Testing

Compare model behavior with different parameters:

# Version A - conservative
curl ... -d '{"model": "openai:gpt-4?temperature=0.3", ...}'

# Version B - creative
curl ... -d '{"model": "openai:gpt-4?temperature=0.8", ...}'

# Compare outputs to determine optimal parameters

Troubleshooting

Parameters Not Applied

Problem: URI parameters don't seem to affect model behavior

Solutions:

Check parameter precedence — session and connector-forced settings override URI; URI overrides duplicate fields on the A-leg request body or in extra_body
Verify parameter name spelling (case-sensitive)
Enable debug logging to see parameter resolution
Check that the backend supports the parameter

Invalid Parameter Values

Problem: Getting warnings about invalid parameter values

Solutions:

Check parameter ranges (e.g., temperature: 0.0-2.0)
Verify parameter format (e.g., reasoning_effort: low / medium / high, and xhigh on supported Codex routes)
Review logs for specific validation errors
Consult backend documentation for supported values

Malformed Query Strings

Problem: URI parameters not being parsed

Solutions:

Ensure proper URL encoding for special characters
Use & to separate multiple parameters
Use = to assign values to parameters
Check for typos in the query string syntax

Hybrid Backend Parameter Conflicts

Problem: Parameters not applying correctly to hybrid backend models

Solutions:

Specify parameters for each model independently
Use square brackets to group hybrid models: hybrid:[model1?param=value,model2?param=value]
Check logs to verify which parameters are applied to which model
Ensure both models support the specified parameters

Related Features

Hybrid Backend - Use URI parameters with hybrid reasoning/execution models
Model Name Rewrites - Transform model names before URI parameter parsing
Planning Phase Overrides - Combine with planning phase model selection
Identity Override - Control client identity alongside model parameters

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

URI Model Parameters

Overview

Key Features

Configuration

Supported Parameters

Parameter Precedence

Debug Logging

Usage Examples

Basic Usage

Hybrid Backend with Independent Parameters

Override Config Temperature

Session Command Override

Validation and Error Handling

Use Cases

Testing Different Parameter Combinations

Per-Request Parameter Control

Hybrid Backend Optimization

Automated Workflows

A/B Testing

Troubleshooting

Parameters Not Applied

Invalid Parameter Values

Malformed Query Strings

Hybrid Backend Parameter Conflicts

Related Features

FilesExpand file tree

uri-model-parameters.md

Latest commit

History

uri-model-parameters.md

File metadata and controls

URI Model Parameters

Overview

Key Features

Configuration

Supported Parameters

Parameter Precedence

Debug Logging

Usage Examples

Basic Usage

Hybrid Backend with Independent Parameters

Override Config Temperature

Session Command Override

Validation and Error Handling

Use Cases

Testing Different Parameter Combinations

Per-Request Parameter Control

Hybrid Backend Optimization

Automated Workflows

A/B Testing

Troubleshooting

Parameters Not Applied

Invalid Parameter Values

Malformed Query Strings

Hybrid Backend Parameter Conflicts

Related Features