sql-pipe reads CSV, JSON, or NDJSON from stdin, loads it into an in-memory SQLite database, runs a SQL query, and prints the results. No server, no schema files, no setup.
It exists because awk is cryptic, spinning up a Python interpreter for a one-liner feels wrong, and sqlite3 :memory: takes four commands before you can query anything. If you know SQL and work with CSV in the terminal, this is the tool you've been reaching for.
$ curl -s https://example.com/data.csv | sql-pipe 'SELECT region, SUM(revenue) FROM t GROUP BY region ORDER BY 2 DESC'macOS / Linux via Homebrew:
brew tap vmvarela/homebrew-tap
brew install sql-pipePre-built binaries for Linux, macOS (Intel + Apple Silicon), and Windows are also available on the Releases page.
Shell installer (Linux/macOS):
curl -sSL https://raw.githubusercontent.com/vmvarela/sql-pipe/master/install.sh | shBy default it installs to /usr/local/bin. Override with INSTALL_DIR:
curl -sSL https://raw.githubusercontent.com/vmvarela/sql-pipe/master/install.sh | INSTALL_DIR="$HOME/.local/bin" shDebian / Ubuntu (APT repository):
curl -sSL https://vmvarela.github.io/apt-packages/key.gpg \
| sudo tee /etc/apt/keyrings/vmvarela.asc
echo "deb [signed-by=/etc/apt/keyrings/vmvarela.asc] https://vmvarela.github.io/apt-packages stable main" \
| sudo tee /etc/apt/sources.list.d/vmvarela.list
sudo apt update && sudo apt install sql-pipeOr install a single release asset directly:
wget https://github.com/vmvarela/sql-pipe/releases/latest/download/sql-pipe_VERSION_linux_amd64.deb
sudo dpkg -i sql-pipe_VERSION_linux_amd64.debReplace VERSION with the release version (e.g. 0.9.0) and amd64 with your architecture (arm64, arm7, or 386).
Fedora / RHEL / openSUSE (RPM repository):
sudo curl -fsSL https://vmvarela.github.io/rpm-packages/vmvarela.repo \
-o /etc/yum.repos.d/vmvarela.repo
sudo dnf install sql-pipeOr install a single release asset directly:
sudo rpm -i https://github.com/vmvarela/sql-pipe/releases/latest/download/sql-pipe_VERSION_linux_amd64.rpmReplace VERSION with the release version (e.g. 0.9.0) and amd64 with your architecture (arm64).
Alpine Linux (APK repository):
wget -qO /etc/apk/keys/vmvarela.rsa.pub \
https://vmvarela.github.io/apk-packages/vmvarela.rsa.pub
echo "https://vmvarela.github.io/apk-packages" >> /etc/apk/repositories
apk update && apk add sql-pipeOr install a single release asset directly:
wget https://github.com/vmvarela/sql-pipe/releases/latest/download/sql-pipe_VERSION_linux_amd64.apk
sudo apk add --allow-untrusted sql-pipe_VERSION_linux_amd64.apkReplace VERSION with the release version (e.g. 0.9.0) and amd64 with your architecture (arm64).
Arch Linux (AUR): install with your preferred AUR helper:
yay -S sql-pipe
# or
paru -S sql-pipeNix / NixOS:
# Run without installing
nix run github:vmvarela/nix-packages#sql-pipe -- 'SELECT * FROM t'
# Install to profile
nix profile install github:vmvarela/nix-packages#sql-pipe
# Non-flake
nix-env -if https://github.com/vmvarela/nix-packages/archive/main.tar.gzWindows (Chocolatey):
choco install sql-pipeWindows (WinGet):
winget install vmvarela.sql-pipeWindows (Scoop):
scoop bucket add vmvarela https://github.com/vmvarela/scoop-bucket
scoop install sql-pipeTo build from source (requires Zig 0.15+):
git clone https://github.com/vmvarela/sql-pipe
cd sql-pipe
mkdir -p lib
curl -fsSL https://www.sqlite.org/2025/sqlite-amalgamation-3490100.zip -o sqlite.zip
unzip -j sqlite.zip '*/sqlite3.c' '*/sqlite3.h' -d lib/
zig build -Dbundle-sqlite=true -Doptimize=ReleaseSafeBinary lands at ./zig-out/bin/sql-pipe. SQLite is compiled from the official amalgamation — no system dependencies.
The input comes from stdin. For CSV and TSV, the first row must be a header — those column names become the schema for a table called t. Results go to stdout as comma-separated values by default.
$ printf 'name,age\nAlice,30\nBob,25\nCarol,35' | sql-pipe 'SELECT * FROM t'
Alice,30
Bob,25
Carol,35For JSON and NDJSON input, pass -I json (reads an array of objects) or -I ndjson (one object per line). Column names are taken from the keys of the first object:
$ printf '[{"name":"Alice","score":95},{"name":"Bob","score":72}]' \
| sql-pipe -I json 'SELECT name, score FROM t WHERE score > 80'
Alice,95Columns are auto-detected as INTEGER, REAL, DATE, DATETIME, or TEXT based on the first 100 rows. Date and datetime values are normalized to ISO 8601 on insert, so SQLite date functions (date(), strftime(), julianday()) work immediately. Use --no-type-inference to force all columns to TEXT:
$ cat orders.csv | sql-pipe 'SELECT COUNT(*), AVG(amount) FROM t WHERE status = "paid"'
142,87.35Column names with spaces work — quote them in SQL:
$ cat report.csv | sql-pipe 'SELECT "first name", "last name" FROM t WHERE "dept id" = "42"'Use a custom input delimiter with -d / --delimiter (1–8 characters), or --tsv for tab-separated files:
$ cat data.psv | sql-pipe -d '|' 'SELECT * FROM t'
$ cat data.tsv | sql-pipe --tsv 'SELECT * FROM t'
# equivalent:
$ cat data.tsv | sql-pipe --delimiter '\t' 'SELECT * FROM t'
# multi-character delimiters:
$ cat data.psv | sql-pipe -d '||' 'SELECT * FROM t'
$ cat report.txt | sql-pipe --delimiter ' ' 'SELECT * FROM t' # two spacesOutput results as a JSON array of objects with --json:
$ printf 'name,age\nAlice,30\nBob,25' | sql-pipe --json 'SELECT * FROM t'
[{"name":"Alice","age":30},{"name":"Bob","age":25}]--json is mutually exclusive with -H/--header. It can be combined with -d/--delimiter and --tsv to read non-comma-separated input.
For XML input and output, use -I xml / -O xml. By default the root element is <results> and each row is <row>. Override with --xml-root and --xml-row:
$ printf 'name,age\nAlice,30\nBob,25' | sql-pipe -O xml 'SELECT * FROM t'
<?xml version="1.0" encoding="UTF-8"?>
<results>
<row><name>Alice</name><age>30</age></row>
<row><name>Bob</name><age>25</age></row>
</results>
$ cat data.xml | sql-pipe -I xml 'SELECT name FROM t WHERE age > 25'Real-world XML documents (RSS feeds, API responses) nest rows inside a container element. Use --xml-root to navigate to the row container and --xml-row to filter by element tag:
# Query an RSS feed: channel/item → rows
$ curl -s https://feeds.feedburner.com/TheHackersNews \
| sql-pipe -I xml --xml-root channel --xml-row item \
'SELECT title FROM t LIMIT 5'
# Query XML with a custom root and row name
$ cat events.xml | sql-pipe -I xml --xml-root events --xml-row event \
'SELECT name, date FROM t WHERE type = "conference"'Chain queries by piping back in — useful for two-pass aggregations. Pass -H to the first call so the second one sees column names:
$ cat events.csv \
| sql-pipe -H 'SELECT user_id, COUNT(*) as n FROM t GROUP BY user_id' \
| sql-pipe 'SELECT * FROM t WHERE n > 100'| Flag | Description |
|---|---|
-d, --delimiter <char> |
Input field delimiter (single character, default ,) |
--tsv |
Alias for --delimiter '\t' |
-I, --input-format <fmt> |
Input format: csv (default), tsv, json, ndjson, xml |
-O, --output-format <fmt> |
Output format: csv (default), tsv, json, ndjson, xml |
--no-type-inference |
Treat all columns as TEXT (skip auto-detection) |
-H, --header |
Print column names as the first output row |
--json |
Alias for --output-format json (mutually exclusive with -H) |
--max-rows <n> |
Stop if more than n data rows are read (exit 1) |
--validate |
Parse the entire input and print a summary (OK: <n> rows, <m> columns (col TYPE, ...)) to stdout. Exit 0 on success, exit 2 on parse error. No query required. Compatible with --delimiter, --tsv, --no-type-inference, -I/--input-format (csv, tsv, json, ndjson, xml). JSON/NDJSON/XML columns are reported as TEXT. |
--columns |
Read the input header, print each column name on its own line, and exit 0. Supports CSV, TSV, JSON, NDJSON, and XML input. With -v/--verbose, also shows the inferred type per column (name INTEGER). Respects --delimiter and --tsv. Mutually exclusive with a query argument. |
--sample [<n>] |
Print a schema comment block to stderr and the first <n> data rows to stdout as CSV (default: n=10). The schema block lists each column name and its inferred type, prefixed with #. Implies --header. Compatible with --delimiter and --tsv. Mutually exclusive with --json and a query argument. No query required. |
--xml-root <name> |
Root element name for XML I/O (default: results) |
--xml-row <name> |
Row element name for XML I/O (default: row) |
--output <file> |
Write results to the given file instead of stdout. Creates or overwrites the file. Exits 1 if the file cannot be created. |
-v, --verbose |
Print Loaded <n> rows in <t>s to stderr after loading (always on TTY; forced with flag) |
-s, --silent |
Suppress Loaded <n> rows in <t>s and the progress counter from stderr unconditionally. Cannot be combined with -v/--verbose |
-h, --help |
Show usage help and exit |
-V, --version |
Print version and exit |
After loading, sql-pipe prints Loaded <n> rows in <t>s to stderr whenever stderr is a TTY (interactive terminal). The message is suppressed in scripts and pipes to keep them noise-free. Use -v / --verbose to force it regardless of TTY, or -s / --silent to suppress it unconditionally (e.g. when stderr is a TTY but you want clean output):
$ cat sales.csv | sql-pipe --verbose 'SELECT region, SUM(revenue) FROM t GROUP BY region'
# stderr: Loaded 42,317 rows in 1.2sWhen stderr is a TTY and the input exceeds 10,000 rows, a running counter updates in place on stderr during loading:
Loading... 10,000 rows
Loading... 20,000 rows
...
Loaded 42,317 rows in 1.2s
When --max-rows is set, the total limit is shown alongside the current count:
Loading... 10,000 / 100,000 rows
The counter is suppressed in pipes and scripts (zero overhead when stderr is not a TTY). The count uses thousands separators (42,317 not 42317). It is always written to stderr so stdout remains clean for piping.
| Code | Meaning |
|---|---|
0 |
Success |
1 |
Usage error (missing query, bad arguments) |
2 |
CSV parse error (with 1-based row number) |
3 |
SQL error (with sqlite3 error message, available columns, and a "did you mean?" hint when applicable) |
All error messages are prefixed with error: and written to stderr.
On SQL error, sql-pipe also prints the list of columns available in table t and,
when the unknown identifier closely matches a column name (edit distance ≤ 2), a hint:
error: no such column: amout
table "t" has columns: id, amount, region
hint: did you mean "amount"?
Top N rows by a column:
$ cat sales.csv | sql-pipe 'SELECT product, revenue FROM t ORDER BY revenue DESC LIMIT 10'Deduplicate rows:
$ cat contacts.csv | sql-pipe 'SELECT DISTINCT email FROM t'Find rows with missing values:
$ cat users.csv | sql-pipe 'SELECT * FROM t WHERE email = "" OR email IS NULL'Date range filter:
$ cat logs.csv | sql-pipe 'SELECT * FROM t WHERE ts >= "2024-01-01" AND ts < "2024-02-01"'Date columns are auto-detected and stored as ISO 8601 text, so comparison operators and strftime() / julianday() work without any preprocessing.
Compute a derived column:
$ cat products.csv | sql-pipe 'SELECT name, price, ROUND(price * 0.9, 2) as discounted FROM t'Pivot-like aggregation with conditional sums:
$ cat orders.csv | sql-pipe 'SELECT region, SUM(CASE WHEN status="paid" THEN amount ELSE 0 END) as paid, SUM(CASE WHEN status="refunded" THEN amount ELSE 0 END) as refunded FROM t GROUP BY region'These run against live public URLs — no local files needed.
La Liga: all-time home wins (1929–present)
The engsoccerdata dataset covers Spanish first-division football since the inaugural season:
$ curl -s https://raw.githubusercontent.com/jalapic/engsoccerdata/master/data-raw/spain.csv \
| sql-pipe 'SELECT home AS team, COUNT(*) AS wins
FROM t WHERE CAST(hgoal AS INTEGER) > CAST(vgoal AS INTEGER) AND tier=1
GROUP BY home ORDER BY wins DESC LIMIT 8'
Real Madrid,1174
FC Barcelona,1163
Atletico Madrid,956
Athletic Bilbao,942
Valencia CF,917
Sevilla FC,815
Espanyol Barcelona,777
Real Sociedad,721La Liga: highest-scoring seasons as JSON
Same dataset, different angle — output as JSON for downstream tools:
$ curl -s https://raw.githubusercontent.com/jalapic/engsoccerdata/master/data-raw/spain.csv \
| sql-pipe --json \
'SELECT Season, COUNT(*) AS matches,
ROUND(CAST(SUM(CAST(hgoal AS INTEGER)+CAST(vgoal AS INTEGER)) AS REAL)/COUNT(*),2) AS avg_goals
FROM t WHERE tier=1 GROUP BY Season ORDER BY avg_goals DESC LIMIT 5'
[{"Season":1929,"matches":90,"avg_goals":4.67},{"Season":1932,"matches":90,"avg_goals":4.44},...]OWID: countries by solar electricity share (2023)
Our World in Data publishes annual energy statistics for 200+ countries. Find who leads on solar:
$ curl -s https://raw.githubusercontent.com/owid/energy-data/refs/heads/master/owid-energy-data.csv \
| sql-pipe 'SELECT country, ROUND(solar_share_elec,1) AS solar_pct
FROM t WHERE year=2023 AND solar_share_elec IS NOT NULL
AND iso_code NOT LIKE "%OWID%"
ORDER BY solar_pct DESC LIMIT 8'
Cook Islands,50.0
Palestine,40.0
Namibia,27.0
Kiribati,25.0
Lebanon,22.3
Luxembourg,20.6
Chile,20.1
El Salvador,20.1OWID: wind + solar combined — two-pass query
Add wind and solar in a first pass, then filter above 30% in a second.
-H passes column names through to the next stage. Spain sits at 40%:
$ ENERGY=https://raw.githubusercontent.com/owid/energy-data/refs/heads/master/owid-energy-data.csv
$ curl -s "$ENERGY" \
| sql-pipe -H 'SELECT country,
ROUND(solar_share_elec,1) AS solar,
ROUND(wind_share_elec,1) AS wind,
ROUND(solar_share_elec+wind_share_elec,1) AS total
FROM t WHERE year=2023 AND iso_code NOT LIKE "%OWID%"
AND solar_share_elec IS NOT NULL AND wind_share_elec IS NOT NULL' \
| sql-pipe 'SELECT country, solar, wind, total FROM t
WHERE CAST(total AS REAL) >= 30 ORDER BY total DESC LIMIT 10'
Denmark,10.8,57.2,68.0
Lithuania,13.0,47.9,60.9
Luxembourg,20.6,35.5,56.0
Cook Islands,50.0,0.0,50.0
Netherlands,16.3,24.6,41.0
Uruguay,3.8,37.1,41.0
Greece,18.2,22.5,40.7
Spain,17.4,23.0,40.4
Germany,12.6,27.7,40.3
Palestine,40.0,0.0,40.0REST API: European population density
restcountries.com returns a JSON array. Reshape
with jq into NDJSON (one object per line) and query directly with -I ndjson:
$ curl -s https://restcountries.com/v3.1/region/europe \
| jq -c '.[] | {country: .name.common, pop: .population, area: .area}' \
| sql-pipe -I ndjson \
'SELECT country, pop, area, ROUND(CAST(pop AS REAL)/area,1) AS density
FROM t WHERE area > 0 ORDER BY density DESC LIMIT 8'
Monaco,38423,2.02,19021.3
Gibraltar,38000,6.0,6333.3
Malta,574250,316.0,1817.2
Vatican City,882,0.49,1800.0
Jersey,103267,116.0,890.2
Guernsey,64781,78.0,830.5
San Marino,34132,61.0,559.5
Netherlands,18100436,41865.0,432.4Live weather: 7-day Madrid forecast
Open-Meteo serves free forecasts as JSON. The daily
arrays need transposing into objects — jq handles that, then -I ndjson loads
the result:
$ curl -s "https://api.open-meteo.com/v1/forecast?latitude=40.4168&longitude=-3.7038\
&daily=temperature_2m_max,temperature_2m_min,precipitation_sum\
&timezone=Europe%2FMadrid&forecast_days=7" \
| jq -c '.daily
| [.time, .temperature_2m_max, .temperature_2m_min, .precipitation_sum]
| transpose
| .[] | {day:.[0], max_c:.[1], min_c:.[2], rain_mm:.[3]}' \
| sql-pipe -I ndjson 'SELECT day, max_c, min_c, rain_mm FROM t ORDER BY day'
2026-05-01,24.3,11.8,0.0
2026-05-02,19.2,14.5,3.9
2026-05-03,20.5,12.5,7.0
2026-05-04,19.3,11.3,0.2
2026-05-05,16.9,9.1,1.8
2026-05-06,19.7,7.3,0.0
2026-05-07,19.6,10.7,2.1La Liga: season lengths reveal COVID and the World Cup
The same engsoccerdata dataset has a
Date column in YYYY-MM-DD format. sql-pipe auto-detects it as DATE and
stores it as ISO 8601 text, so julianday() works directly — no preprocessing:
$ curl -s https://raw.githubusercontent.com/jalapic/engsoccerdata/master/data-raw/spain.csv \
| sql-pipe 'SELECT Season,
MIN(Date) AS start,
MAX(Date) AS end,
CAST(julianday(MAX(Date)) - julianday(MIN(Date)) AS INTEGER) AS days
FROM t WHERE tier=1 AND Season BETWEEN 2018 AND 2022
GROUP BY Season ORDER BY Season'
2018,2018-08-17,2019-05-19,275
2019,2019-08-16,2020-07-19,338
2020,2020-09-12,2021-05-23,253
2021,2021-08-13,2022-05-22,282
2022,2022-08-12,2023-06-04,296The 2019–20 season spans 338 days: COVID suspended play in March 2020 and pushed the final round to July. The 2022–23 season runs 296 days due to the November World Cup break. A normal season is ~275 days.
Each run opens a fresh :memory: SQLite database. The header row drives a CREATE TABLE t (...) with types inferred from the first 100 rows — INTEGER, REAL, DATE, DATETIME, or TEXT. Date variants use TEXT affinity so ISO 8601 string semantics are preserved and all SQLite date functions work correctly. Rows are loaded in a single transaction via a prepared INSERT statement, then sqlite3_exec runs your query and prints rows one by one.
The database never touches disk and vanishes when the process exits. No state, no cleanup.
- Single table per invocation. For joins, use chained
sql-pipecalls or aWITHCTE.
- q — similar concept in Python; handles quoted CSV fields and more formats. Better if you're already in a Python environment.
- trdsql — Go alternative with multi-format support (JSON, LTSV) and output formatting. Better if you need non-CSV inputs.
- sqlite-utils — better if you need persistent databases, schema management, or Python scripting.