feat(qwp): scope the SYMBOL dictionary to a single query#58
Conversation
Add client support for the QWP per-query query_flags trailer so a caller can scope the connection-scoped SYMBOL dict to a single query. QwpQueryClient gains execute(..., boolean resetSymbolDict) overloads. When set, resolveQueryFlags maps the request to QUERY_FLAG_RESET_DICT, but only when the server advertised CAP_QUERY_FLAGS; otherwise the flag is dropped. QwpEgressIoThread carries the resolved flags on QueryRequest and appends the query_flags varint trailer to QUERY_REQUEST only when non-zero, so a baseline frame stays byte-identical and the server defaults the flags to 0. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
QwpQueryClientQueryFlagsTest drives execute() against the mock WebSocket server, captures the emitted QUERY_REQUEST, and asserts the optional query_flags trailer: - reset flag + server CAP_QUERY_FLAGS -> QUERY_FLAG_RESET_DICT trailer - the trailer is the only difference from the byte-identical flag-off baseline - no capability -> flag dropped; the flag-less overloads send no trailer TestWebSocketServer gains setCapabilities(int) so its SERVER_INFO frame can advertise CAP_QUERY_FLAGS; the default of 0 keeps the existing byte layout for every other test. Covers the previously-untested client paths: QwpQueryClient.resolveQueryFlags and the execute(..., resetSymbolDict) overloads, plus the QwpEgressIoThread query_flags trailer append. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
[PR Coverage check]😍 pass : 18 / 18 (100.00%) file detail
|
Code review (level 3 — full pass)Change-surface map + 10-agent fan-out (incl. fresh-context adversarial & cross-context) with per-finding source verification. Verdict: Approve — no blocking issuesThe change is small, additive, and correctly wired. Existing callers are untouched (the new ModerateM1 — Feature correctness depends entirely on the tandem server PR; confirm two contracts there (cross-repo, fails safe). So the feature is coherent only if the server, when honoring
The cited Minorm1 — m2 — No test for the reset flag together with bind parameters. m3 — m4 — Silent flag-drop gives the caller no direct signal (intended). PR metadataClean. Valid Conventional Commits title, correct Dropped during verification (false positives)
Findings: 10 draft → 5 verified (1 Moderate, 4 Minor), 5 dropped. 0 out-of-diff caller bugs (all callsites + 🤖 Review generated with Claude Code (Opus 4.8, 1M context) |
Tandem PR (server): questdb/questdb#7321 — it carries the protocol change and, because most of this client's real-server coverage runs against a live
ServerMainin the questdb repo, the end-to-end and protocol tests for this change. Review the two together.Why
QWP carries a
SYMBOLcolumn as a dictionary (the distinct string values) plus per-row integer codes. The fastest egress consumers — the Python client'sto_polars()/to_pandas()dataframe paths — reuse that wire dictionary directly as the pandas/polars Categorical dictionary: the per-row codes are adopted zero-copy, with no re-interning and no per-row remap. That direct reuse is what makesSYMBOL → dataframemaximally efficient.The catch is that the QWP dictionary is connection-scoped: it accumulates every symbol seen across all queries on a connection and only resets at a soft cap. Because the consumer reuses the dictionary as-is, a result that references few symbols but lands on a connection an earlier high-cardinality query already grew inherits a bloated Categorical — full of categories it never uses. The very zero-copy reuse that makes the path fast is what drags the cross-query bloat in.
The protocol fix (server PR #7321) adds a per-query
query_flagsfield toQUERY_REQUESTwhose first bit,QUERY_FLAG_RESET_DICT, asks the server to reset the connection-scoped dictionary before the query streams — keeping the zero-copy speed without the bloat. This PR is the Java client half: it lets a Java caller set that flag per query.What
QwpQueryClientgainsexecute(..., boolean resetSymbolDict)overloads. The existing two- and three-argument forms delegate withfalse, so current callers are unchanged. When the flag is set, the client asks the server to reset the connection-scoped SYMBOL dictionary before the query streams, scoping the dictionary to that query's own symbols. The flag is opt-in and defaults off.resolveQueryFlagsmaps the request to theQUERY_FLAG_RESET_DICTbit, but only when the server advertised theCAP_QUERY_FLAGScapability inSERVER_INFO; against a server without it the flag is silently dropped.QwpEgressIoThreadappends thequery_flagsvarint trailer toQUERY_REQUESTonly when the resolved flags are non-zero, so a baseline frame stays byte-identical on the wire.Compatibility
Tradeoff
Setting the flag gives up cross-query dictionary reuse for that query: the dictionary and its per-batch delta are rebuilt from scratch, so the query's symbols are re-sent rather than amortized across the connection. That is the intended exchange — a lean, query-scoped Categorical in return for re-sending the query's own symbols. Callers that benefit from connection-scoped accumulation should leave the flag off (the default).
Testing
Most of this client's real-server coverage lives in the questdb repo (it runs against a live
ServerMain), so the protocol and end-to-end tests for this change ship in the tandem PR #7321:QwpEgressServerInfoTest— the OSS server advertisesCAP_QUERY_FLAGS, which is what gates the client appending the trailer.QwpEgressRequestDecoderTest— decoding of the optionalquery_flagstrailer: the reset bit, a trailer after a non-empty bind section, multi-byte varint values, and high/sign bits all decode; a baseline frame leaves it 0.QwpEgressQueryFlagsResetTest— the decode → dict-reset effect: flag absent keeps the conn dict; flag set clears it and stages theCACHE_RESET; an empty dict suppresses the redundant frame; an unknown flag bit is ignored.QwpEgressProcessorStateCacheResetTest— reset-mask boundaries.QwpEgressCacheResetWireTest— the end-to-end client↔serverCACHE_RESETwire path this builds on, driving this client against a live server.In this repo, the existing QWP client/egress unit tests still pass and a flag-off frame stays byte-identical on the wire.