Skip to content

refactor(duckdbservice): extract DuckDB-free arrow helpers into arrowmap subpackage#477

Merged
fuziontech merged 1 commit intomainfrom
feature/binary-split-pr1-arrowmap
May 1, 2026
Merged

refactor(duckdbservice): extract DuckDB-free arrow helpers into arrowmap subpackage#477
fuziontech merged 1 commit intomainfrom
feature/binary-split-pr1-arrowmap

Conversation

@fuziontech
Copy link
Copy Markdown
Member

Summary

  • Step 1 of a 4-PR plan to split duckgres into separate control-plane and worker binaries, so the CP stops linking libduckdb and we can ship per-DuckDB-version worker images via a CI matrix
  • Moves DuckDB-free helpers (DuckDBTypeToArrow, QualifyTableName, QuoteIdent, SupportsLimit, struct/map/decimal type-string parsers) out of duckdbservice/arrow_helpers.go into a new duckdbservice/arrowmap package with zero duckdb-go dependency
  • Pure refactor: no behavior change, no new binary, no CI change. Establishes the package boundary and pattern for the rest of the split.

Why

PR #462 added per-tenant DuckDB/DuckLake version pinning, but every image referenced from the config store today is the all-in-one duckgres binary, which pins one DuckDB version per build. To make that pin per-tenant in production we want to:

  1. Split the binary so the control plane image is DuckDB-free and built once.
  2. Split the binary so the worker image is DuckDB-bound and built per (DuckDB version × arch) from a single CI matrix.
  3. Keep code identical between worker variants (the duckdb-go API is stable across 1.5.x point releases — we just bump the module version).

The CP can't be DuckDB-free until the packages it imports are DuckDB-free. Today controlplane → server → duckdb-go and controlplane → server/flightsqlingress → duckdbservice → duckdb-go. This PR carves out a small piece of duckdbservice (the type-string mapper and identifier quoter) into a leaf package that any caller can use without dragging libduckdb along.

AppendValue is not moved in this PR — it has type switches on duckdb.Interval, duckdb.Decimal, duckdb.UUID, duckdb.OrderedMap, duckdb.Map. Splitting it (DuckDB-free core + DuckDB-bound extension) is the next step.

What's next

  • PR add test user #1.5: move DuckLakeConfig and ducklake_migration.go into a new server/ducklake package. The migration code is already DuckDB-free at the import level (it uses pgx against the tenant metadata Postgres) but currently lives in the same package as server.go, which links DuckDB.
  • PR Add pg_catalog compatibility and binary format support #2: full server/ split into server/wire, server/exec, server/flight, etc. Add a CI step that runs go list -deps ./cmd/duckgres-controlplane and fails if any duckdb-go package shows up.
  • PR Add connection rate limiting to prevent brute-force attacks #3: introduce cmd/duckgres-controlplane and cmd/duckgres-worker binaries; new Dockerfile per binary; standalone all-in-one binary kept for dev.
  • PR Add configurable DuckDB extension loading #4: flip CI/CD and Helm to use the split images; add the per-DuckDB-version worker matrix; update runbooks and the image column semantics.

Test plan

  • go build ./... clean (excluding pre-existing tests/perf issue unrelated to this PR)
  • go build -tags kubernetes ./controlplane/... clean
  • go test ./duckdbservice/arrowmap/... — all moved tests pass in the new package
  • go test ./duckdbservice/... ./server/... ./controlplane/... — green (admin testcontainer failures pre-exist on main and require Docker)
  • go list -deps ./duckdbservice/arrowmap contains zero duckdb-go packages
  • go vet ./... clean

🤖 Generated with Claude Code

…map subpackage

Step 1 of a 4-step plan to split duckgres into separate control-plane and
worker binaries so the control plane can stop linking libduckdb (and the
worker image can ship per-DuckDB-version variants).

This change moves the DuckDB-free helpers (DuckDBTypeToArrow,
QualifyTableName, QuoteIdent, SupportsLimit, and the struct/map/decimal
type-string parsers) out of duckdbservice/arrow_helpers.go into a new
duckdbservice/arrowmap package. arrowmap has zero dependency on
github.com/duckdb/duckdb-go; it's plain string-to-arrow.DataType mapping.

The duckdbservice package keeps the helpers callable as before via thin
re-export shims so existing call sites are unaffected. server/flightsqlingress
is updated to import arrowmap directly for the three call sites that don't
need DuckDB types (the AppendValue call site stays on duckdbservice for now;
splitting AppendValue's duckdb-bound branches lands in a follow-up).

This PR is purely a refactor — no behavior changes, no new binary, no CI
changes. It establishes the package boundary and pattern that subsequent
PRs build on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@fuziontech fuziontech force-pushed the feature/binary-split-pr1-arrowmap branch from d559dee to 763e55b Compare May 1, 2026 16:02
@fuziontech fuziontech enabled auto-merge (squash) May 1, 2026 16:02
@fuziontech fuziontech merged commit 8e9061a into main May 1, 2026
21 checks passed
@fuziontech fuziontech deleted the feature/binary-split-pr1-arrowmap branch May 1, 2026 16:05
fuziontech added a commit that referenced this pull request May 1, 2026
…ink libduckdb (#498)

After the long sequence of subpackage extractions (PRs #477-#497) the
controlplane import graph is fully free of github.com/duckdb/duckdb-go:

  go list -deps ./controlplane                       | grep duckdb-go   # empty
  go list -tags kubernetes -deps ./controlplane      | grep duckdb-go   # empty
  go list -deps ./cmd/duckgres-controlplane          | grep duckdb-go   # empty
  go list -tags kubernetes -deps ./cmd/duckgres-controlplane | grep duckdb-go   # empty

This PR adds the corresponding entry-point binary at
cmd/duckgres-controlplane. The binary builds and links without
libduckdb, but the actual config-loading + RunControlPlane wiring is
still a stub that errors out at runtime — duckgres' YAML / CLI / env
config resolution lives in the package main file at the repo root
alongside the all-in-one binary, and lifting that into a shared
config-resolution package is its own focused PR.

The point of this PR is to lock in the import-graph win: a CP-only
build is now possible. Future PRs will:

  - Move resolveEffectiveConfig + the YAML/env/CLI plumbing out of
    main.go (repo root) into a shared package both binaries can
    import (still duckdb-free, since it's pure config wrangling).
  - Wire that into cmd/duckgres-controlplane/main.go and remove the
    stub error.
  - Add a CI step that runs the deps check above and fails the build
    if duckdb-go ever creeps back in.
  - Introduce cmd/duckgres-worker for the duckdb-service path so the
    matrix-build CI in the binary-split plan can target it.

Verified:
  - go build ./cmd/duckgres-controlplane/... clean
  - go build -tags kubernetes ./cmd/duckgres-controlplane/... clean
  - go list -deps ./cmd/duckgres-controlplane | grep duckdb-go is empty
  - same with -tags kubernetes
  - The all-in-one duckgres binary at the repo root continues to build
    and run unchanged (it imports duckdbservice, which still pulls
    duckdb-go — that's expected for the worker / standalone modes).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fuziontech added a commit that referenced this pull request May 1, 2026
Adds a controlplane-no-libduckdb job that builds cmd/duckgres-controlplane
(both default tags and -tags kubernetes) and then runs

  go list -deps ./cmd/duckgres-controlplane | grep duckdb-go
  go list -deps ./controlplane              | grep duckdb-go

If anything matches, the job fails the build.

This locks in the binary-split achievement from PRs #477-#498. Without
this guard, someone could land a server/* extraction that grows a new
duckdb-go transitive import and silently re-link libduckdb into the CP
build — which would defeat the matrix-build per-DuckDB-version goal.

The error message points at the right diagnostic command and suggests
the right fix (subpackage extraction / registration hook / interface
boundary) rather than encouraging anyone to suppress the check.

Verified locally:
  - go list -deps ./cmd/duckgres-controlplane | grep duckdb-go is empty
  - go list -deps ./controlplane              | grep duckdb-go is empty
  - same with -tags kubernetes

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant