Skip to content

feat(query): fetch results as Arrow IPC instead of JSON#103

Merged
eddietejeda merged 5 commits into
mainfrom
feat/query-arrow-ipc
May 23, 2026
Merged

feat(query): fetch results as Arrow IPC instead of JSON#103
eddietejeda merged 5 commits into
mainfrom
feat/query-arrow-ipc

Conversation

@eddietejeda
Copy link
Copy Markdown
Contributor

Summary

  • Switches the async query path and poll subcommand from JSON to Arrow IPC (application/vnd.apache.arrow.stream) for result fetching
  • Arrow format is ~39% smaller on the wire and ~150 ms faster on large result sets compared to JSON
  • Inlines the async polling loop (500 ms interval with spinner) so slow queries complete without requiring a second hotdata query status invocation

Changes

  • src/api.rs: Added get_bytes(path, accept) for binary HTTP responses with a custom Accept header
  • src/query.rs: Added arrow_cell, arrow_ipc_to_query_response, fetch_arrow_result for Arrow IPC decoding; async 202 path now polls inline; poll subcommand uses Arrow fetch
  • Cargo.toml: Added arrow = { version = "54", default-features = false, features = ["ipc"] }

Notes

  • The sync fast-path (200 response from /query) remains JSON since the server returns the result inline and does not support an Arrow response there
  • Dates, timestamps, and other exotic Arrow types fall back to ArrayFormatter for string rendering

Test plan

  • SELECT COUNT(*) against managed database returns correct row count
  • Multi-column query with date + float columns renders correctly in table, JSON, and CSV output
  • Window function query (12-month rolling average) executes and returns correct results

🤖 Generated with Claude Code

Switch the async query path and poll command from JSON to Arrow IPC
(application/vnd.apache.arrow.stream) for result fetching. Arrow is
~39% smaller on the wire and ~150 ms faster on large result sets.

- Add `get_bytes()` to ApiClient for binary HTTP responses
- Add Arrow decoding: `arrow_cell`, `arrow_ipc_to_query_response`,
  `fetch_arrow_result` in query.rs
- Inline async polling (500 ms interval with spinner) so callers
  never need to re-run the command for slow queries
- `poll` subcommand updated to use Arrow fetch

The sync fast-path (200 from /query) remains JSON since the server
returns the result inline and does not support Arrow there.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@sentry
Copy link
Copy Markdown

sentry Bot commented May 23, 2026

Codecov Report

❌ Patch coverage is 0% with 164 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/query.rs 0.00% 103 Missing ⚠️
src/main.rs 0.00% 27 Missing ⚠️
src/util.rs 0.00% 24 Missing ⚠️
src/api.rs 0.00% 9 Missing ⚠️
src/results.rs 0.00% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

Comment thread src/query.rs Outdated
Comment thread src/query.rs Outdated
Comment thread src/api.rs
claude[bot]
claude Bot previously approved these changes May 23, 2026
- Route get_bytes through send_debug_bytes so Arrow requests appear
  in --debug output (was bypassing debug logging entirely)
- Replace unwrap_or_default() with resp.bytes()? to propagate body
  read errors instead of silently returning empty bytes
- Remove dead req.build() error arm; send_debug_bytes handles it
- Change execution_time_ms to Option<u64>; async Arrow results show
  "—" in the table footer instead of a misleading "0 ms"
- Add 5-minute polling timeout to the async execute loop with a hint
  to re-check via `hotdata query status <run_id>`

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the `_ => continue` catch-all in the async polling loop with
explicit known in-flight states (`running | queued | pending`) and an
unknown-status arm that clears the spinner, prints the unexpected
status, and exits with code 2 — matching the behaviour of `poll()`.

This prevents infinite loops on terminal statuses like `cancelled` or
`timed_out` that the server may add in the future.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
claude[bot]
claude Bot previously approved these changes May 23, 2026
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prior nit threads addressed. No new issues.

- Rename QueryRunResponse.error -> error_message to match the actual
  server field name; previously failed async queries always printed
  "query failed: unknown error" instead of the real message
- Make fetch_arrow_result pub(crate) and use it in results::get so
  that `hotdata results get <id>` fetches Arrow IPC like the rest of
  the query path, rather than falling back to JSON

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
claude[bot]
claude Bot previously approved these changes May 23, 2026
skills list
- Add List variant to SkillCommands as an alias for status

databases show <id>
- Add Show subcommand to DatabasesCommands; dispatches to databases::get()
- Bare positional (hotdata databases <id>) preserved for backward compat

databases tables <db_id>
- Make Tables subcommand optional; bare positional db_id triggers tables_list
- hotdata databases tables <id> now lists tables without requiring `list` subcommand
- hotdata databases tables list --database <id> still works (backward compat)
- hotdata databases tables with no args prints help

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@eddietejeda eddietejeda merged commit 0cbcf47 into main May 23, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant