[client-v2] Default sends query parameters in URI, causing HTTP 414 for large IN-lists / vector embeddings #2856

@claude

Description

In client-v2, query parameters are serialized into the request URI by default. When parameters are large — e.g. an Array(Array(Float32)) of high-dimensional vector embeddings, or a long IN (...) list — the resulting URI exceeds limits enforced by typical load balancers / reverse proxies (AWS ALB, CloudFront, nginx default large_client_header_buffers, etc.), and the request fails with HTTP 414 Request-URI Too Long before ever reaching the ClickHouse server.
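To give a sense of scale, the following is a minimal sketch that estimates how long a URL-encoded Array(Array(Float32)) parameter becomes. It is not client-v2 code: the class UriLengthEstimate, its methods, and the 8 KiB comparison value are assumptions made for illustration (8 KiB is the size of a single nginx large_client_header_buffers buffer by default), and the exact wire format the client produces may differ.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.StringJoiner;

// Hypothetical helper, not part of client-v2: estimates how large a
// URL-encoded array parameter becomes.
final class UriLengthEstimate {

    // Assumed illustrative limit (8 KiB); real intermediaries vary.
    static final int ASSUMED_URI_LIMIT_BYTES = 8 * 1024;

    // Renders vectors as an array literal such as [[0.1,0.2],[0.3]].
    static String renderArrayLiteral(float[][] vectors) {
        StringJoiner outer = new StringJoiner(",", "[", "]");
        for (float[] vector : vectors) {
            StringJoiner inner = new StringJoiner(",", "[", "]");
            for (float x : vector) {
                inner.add(Float.toString(x));
            }
            outer.add(inner.toString());
        }
        return outer.toString();
    }

    // Length of the value after percent-encoding. The encoded form is
    // pure ASCII, so characters and bytes are the same count.
    static int encodedLength(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).length();
    }

    // Deterministic pseudo-random vectors with components in [0, 1).
    static float[][] randomVectors(int count, int dim, long seed) {
        Random rnd = new Random(seed);
        float[][] out = new float[count][dim];
        for (float[] vector : out) {
            for (int j = 0; j < dim; j++) {
                vector[j] = rnd.nextFloat();
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Same shape as the reproduction in this issue: 50 x 1024-dim vectors.
        String literal = renderArrayLiteral(randomVectors(50, 1024, 42L));
        int length = encodedLength(literal);
        System.out.println("encoded parameter length: " + length + " bytes");
        System.out.println("exceeds assumed limit: " + (length > ASSUMED_URI_LIMIT_BYTES));
    }
}
```

Each bracket and comma expands to three characters when percent-encoded, so the encoded value is noticeably longer than the raw literal.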

A multipart POST-body alternative exists (executeMultiPartRequest) and is gated by ClientConfigProperties.HTTP_SEND_PARAMS_IN_BODY (config key client.http.use_form_request_for_query, builder method useHttpFormDataForQuery(boolean)). However, this is opt-in: the property defaults to "false", so users who send large parameter payloads through such an intermediary will hit 414 until they discover and enable the flag.

Issue #2324 (closed as completed, milestone 0.9.7) shipped the opt-in mechanism but did not change the default. The defect therefore persists for users who keep the default configuration. This is the cross-client analogue of clickhouse-connect issue #526.

Code evidence

  • client-v2/src/main/java/com/clickhouse/client/api/ClientConfigProperties.java:191:

    HTTP_SEND_PARAMS_IN_BODY("client.http.use_form_request_for_query", Boolean.class, "false"),

    Default is "false".

  • client-v2/src/main/java/com/clickhouse/client/api/Client.java:1685-1693:

    boolean  useMultipart = ClientConfigProperties.HTTP_SEND_PARAMS_IN_BODY.getOrDefault(requestSettings.getAllSettings());
    if (queryParams != null && useMultipart) {
        httpResponse = httpClientHelper.executeMultiPartRequest(selectedEndpoint,
                requestSettings.getAllSettings(), sqlQuery);
    } else {
        httpResponse = httpClientHelper.executeRequest(selectedEndpoint,
                requestSettings.getAllSettings(),
                sqlQuery);
    }

    The body path is taken only when queryParams is non-null and useMultipart is true; otherwise executeRequest is used and any parameters are serialized into the URI.

ClickHouse server version

Code analysis only; not verified against a running server. The defect is in the client's request construction, and the server has supported parameters in the request body for years (see ClickHouse/ClickHouse#8842).

Reproduction

import com.clickhouse.client.api.Client;
import com.clickhouse.client.api.query.QuerySettings;
import java.util.HashMap;
import java.util.Map;
import java.util.ArrayList;
import java.util.List;

public class ReproduceUri414 {
    public static void main(String[] args) throws Exception {
        // Default builder — note: NO useHttpFormDataForQuery(true)
        Client client = new Client.Builder()
                .addEndpoint("http://your-alb-or-nginx-fronted-clickhouse:8123")
                .setUsername("default")
                .setPassword("")
                .build();

        // Build a large Array(Array(Float32)) param: 50 x 1024-dim vectors
        List<List<Float>> embeddings = new ArrayList<>();
        for (int i = 0; i < 50; i++) {
            List<Float> v = new ArrayList<>(1024);
            for (int j = 0; j < 1024; j++) v.add((float) Math.random());
            embeddings.add(v);
        }

        Map<String, Object> params = new HashMap<>();
        params.put("embs", embeddings);

        QuerySettings settings = new QuerySettings();

        // This serializes `embs` into the request URI by default.
        // Expected:  query reaches the server and returns rows.
        // Actual:    intermediary returns HTTP 414 Request-URI Too Long.
        client.query("SELECT arrayMap(x -> arraySum(x), {embs:Array(Array(Float32))})",
                     params, settings).get();
    }
}

Workaround today: add .useHttpFormDataForQuery(true) to the builder. With the workaround, the same call is expected to succeed, because executeMultiPartRequest sends the parameters in the POST body instead of the URI.

Suggested fix

Two reasonable options, in order of preference:

  1. Flip the default of HTTP_SEND_PARAMS_IN_BODY from "false" to "true" in ClientConfigProperties.java:191. The server has long supported params-in-body (ClickHouse/ClickHouse#8842, "Send Query Parameters through POST body"), and the multipart code path is already exercised by HttpTransportTests. The URI default penalizes the increasingly common vector / large-IN workloads.

  2. Auto-promote to multipart POST when the rendered query-string length exceeds a safe threshold (e.g. 4 KiB). Keeps the URI behavior for tiny params (useful for server-side logging) while transparently handling large ones.
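Option 2 could look roughly like the sketch below. It is not the client-v2 implementation: ParamTransportChooser, its Transport enum, and the method names are hypothetical, and it assumes parameters reach the URI as param_<name>=<value> pairs, as in the ClickHouse HTTP interface. The 4 KiB threshold is the example value from the list above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical sketch of threshold-based auto-promotion; none of these
// names exist in client-v2.
final class ParamTransportChooser {

    enum Transport { URI_QUERY_STRING, MULTIPART_BODY }

    // Example threshold from the suggestion above (4 KiB).
    static final int DEFAULT_THRESHOLD_BYTES = 4 * 1024;

    private static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    // Length of "param_a=1&param_b=2..." after percent-encoding.
    static int renderedQueryStringLength(Map<String, String> params) {
        int total = 0;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (total > 0) {
                total += 1; // '&' separator between pairs
            }
            total += encode("param_" + e.getKey()).length()
                    + 1 // '='
                    + encode(e.getValue()).length();
        }
        return total;
    }

    // forceBody mirrors the existing opt-in flag; when it is false the
    // rendered length decides which transport is used.
    static Transport choose(Map<String, String> params, boolean forceBody, int thresholdBytes) {
        if (params == null || params.isEmpty()) {
            return Transport.URI_QUERY_STRING; // nothing to move into the body
        }
        if (forceBody) {
            return Transport.MULTIPART_BODY;
        }
        return renderedQueryStringLength(params) > thresholdBytes
                ? Transport.MULTIPART_BODY
                : Transport.URI_QUERY_STRING;
    }
}
```

Measuring the encoded length rather than the raw length matters because brackets and commas in array literals triple in size once percent-encoded. A real implementation would probably also count the rest of the request line, which this sketch ignores.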

If the project prefers to keep the flag opt-in, please add a prominent warning about the HTTP 414 failure mode to the Javadoc of Client.Builder.useHttpFormDataForQuery and to the README, and consider logging once at WARN when an outgoing request URI exceeds a threshold.
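The log-once idea can be sketched as follows. LongUriWarning is a hypothetical class, not existing client code, and it uses java.util.logging only to stay dependency-free; the client's actual logging framework and the right threshold are left open. It assumes the URI passed in is already percent-encoded, so its character count equals its byte count.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch: warns at most once per instance when a request
// URI is longer than a configured threshold.
final class LongUriWarning {

    private static final Logger LOG = Logger.getLogger(LongUriWarning.class.getName());

    private final int thresholdBytes;
    private final AtomicBoolean warned = new AtomicBoolean(false);

    LongUriWarning(int thresholdBytes) {
        this.thresholdBytes = thresholdBytes;
    }

    // Returns true only for the single call that actually emits the warning.
    boolean check(String requestUri) {
        if (requestUri.length() <= thresholdBytes) {
            return false;
        }
        // compareAndSet lets exactly one thread win, even under concurrency.
        if (!warned.compareAndSet(false, true)) {
            return false;
        }
        LOG.log(Level.WARNING,
                "Request URI is {0} bytes (threshold {1}); intermediaries may reject it with "
                        + "HTTP 414. Consider useHttpFormDataForQuery(true).",
                new Object[] {String.valueOf(requestUri.length()), String.valueOf(thresholdBytes)});
        return true;
    }
}
```

Keeping the flag per instance means each client would warn once; a static flag would warn once per JVM instead.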

Link

Cross-client source: ClickHouse/clickhouse-connect#526
Prior tracker (opt-in implementation): #2324
Central tracking: ClickHouse/integrations-ai-playground#150
