Skip to content

Read SQL from stdin or file in query, memory, and bulk #765

Description

@fuleinist

Problem

sqlite-utils query, sqlite-utils memory, and sqlite-utils bulk all accept SQL as a single positional string argument. Long or multi-statement scripts (table migrations, conditional UPSERTs, INSERTs with many columns) need shell escaping ($(), \\\", \\$, etc.) or wrapper scripts to invoke. The same codebase already supports the equivalent for Python code (in convert) and for data file inputs (- / stdin in insert, bulk, memory, insert-files via click.argument("file", type=click.File("rb"))). Only the SQL argument hasn't been given the same treatment.

Today this is what the user has to write for a 30-line migration script:

sqlite-utils query data.db "$(cat migrations/2026_07_01_backfill.sql)"

Two pain points:

  1. Quoting hell — semicolons, single quotes inside SQL strings, and $(...) references in the SQL body all collide with shell quoting rules. Multi-line SQL containing ' is a known footgun.
  2. Stream chaining — when the upstream pipeline already produces SQL (e.g. a generated WITH-CTE from another tool), there is no way to pipe it.

The repo's other CLI conventions already handle exactly this for similar inputs:

  • convert (sqlite_utils/cli.py:3085):
    if code == "-":
        # Read code from standard input
        code = sys.stdin.read()
  • insert/upsert/bulk and memory already accept - as a sentinel for stdin on file arguments (e.g. cat animals.csv | sqlite-utils insert data.db birds -).
  • insert-files (sqlite_utils/cli.py:2791) accepts - for a path argument and dispatches to sys.stdin.read().

Proposal

Treat sql == "-" as a sentinel in the three SQL-accepting commands; read from stdin. Optionally also add a --sql-file PATH flag for the explicit case where the user wants both sql and --sql-file semantics (mirrors how --functions "Python code or file path" already works on query, memory, and bulk).

# New: stdin
sqlite-utils query data.db - < migrations/2026_07_01_backfill.sql
cat <<'SQL' | sqlite-utils memory stdin:csv - "select * from stdin where species = 'dog'"
select 1;
SQL

# New: explicit file (matches the --functions precedent)
sqlite-utils query data.db --sql-file migrations/2026_07_01_backfill.sql

Behavior:

  • sql == "-"sql = sys.stdin.read().strip(); emit the same friendly error path as convert if stdin is closed (raise click.ClickException("Reading SQL from stdin requires an interactive terminal or a piped input")).
  • --sql-file PATHsql = pathlib.Path(PATH).read_text(). Mutually exclusive with each other and with a literal sql argument (raise click.ClickException("--sql-file cannot be combined with a SQL argument") if both are passed).
  • All existing flags (--attach, --param, --functions, --flatten, --raw, --csv, --tsv, --nl) compose unchanged.
  • Applies to query, memory, bulk. (rows_from_file-style commands already handle their own stdin path; they don't take a SQL argument.)

Implementation surface

Three locations, all bounded, mechanical:

  • sqlite_utils/cli.py:1426 (bulk) — add if sql == "-": sql = sys.stdin.read() after the click parsing, like convert's pattern at line 3085.
  • sqlite_utils/cli.py:1811 (query) — same.
  • sqlite_utils/cli.py:1894 (memory) — same.
  • New load_extension_option-style helper decorator sql_input_options(fn) that adds --sql-file PATH and reuses the -/file detection logic, applied via @sql_input_options to all three commands. Keeps the click decorator ordering clean.

Each touchpoint is ~5 lines. No signature changes, no breaking changes to existing callers, no schema or migration work.

Use cases

  • CI scriptsrun: sqlite-utils query $DB - < sql/maintenance.sql instead of inlining the SQL in run: blocks (GitHub Actions, Drone, Woodpecker).
  • Migration tooling — when migrating from MySQL/Postgres, tools often emit a generated SQL dump; pipe it directly: mysqldump ... | sqlite-utils memory stdin:nl - "$sql".
  • Notebook-eval workflows — Jupyter/Codespaces users can copy-paste a multi-statement SQL block into a shell pipe instead of saving to a file first.
  • Editor integrations — SQL clients that emit select ...; per query (some LSP-driven tools) can pipe rather than write to a tmpfile each time.

What I considered and rejected

  • sqlite-utils memory is not a substitutememory requires a data source on the command line; it doesn't help when you want the SQL+path-form of query.
  • sqlite-utils query ... "$(cat file.sql)" is the current workaround. It works for simple cases but breaks under nested quotes, command substitution in the SQL, and pipeline chaining.
  • Stream-from-stdin as a separate sql-stream subcommand — adds vocabulary; better to fix the existing commands in place.
  • Universal --sql-file only, no - sentinel — drops the pipeline case, which is the more interesting use case. The code == "-" precedent in convert makes the sentinel coherent.

Out of scope

  • Multi-statement semantics — if the user pipes select 1; select 2;, query runs only the first (matches current behavior on a literal string). To run multiple, the --sql-file form can later be extended; this issue stays surgical.
  • --functions file inference — already covered by --functions "Python code or file path"; not changing it.
  • Schema introspection of files — not relevant.

Filing this from fuleinist, an automated GitHub-issue helper run from MiniMax. Happy to narrow the proposal if a different surface (just query, just --sql-file, etc.) would land better — let me know which shape passes review.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions