[ZEPPELIN-6208] Enable DuckDB support in JDBC Interpreter by excluding incompatible default properties #5276

Open
hyunw9 wants to merge 1 commit into apache:master from hyunw9:ZEPPELIN-6208


hyunw9 commented Jun 25, 2026

Contributor

What is this PR for?

The JDBC interpreter forwards all of an interpreter's prefixed properties (default.*) straight to the JDBC driver when opening a connection. Many of these keys are consumed by Zeppelin itself or by the DBCP connection pool (e.g. driver, url, precode, statementPrecode, completer.ttlInSeconds, validationQuery, maxIdle) and are not valid JDBC connection properties.
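
To make the forwarding concrete, here is a minimal sketch of how prefixed settings might be collected into the Properties object handed to a driver. The class and method names (PrefixedProperties, forPrefix) and the "other" prefix are hypothetical and are not taken from Zeppelin's code; the point is only that internal keys such as url or completer.ttlInSeconds end up alongside genuine driver properties.

```java
import java.util.Properties;

/** Hypothetical illustration only; not Zeppelin's actual implementation. */
final class PrefixedProperties {

  /** Copies every "prefix.key" entry into a new Properties object keyed by "key". */
  static Properties forPrefix(Properties all, String prefix) {
    String dotted = prefix + ".";
    Properties result = new Properties();
    for (String name : all.stringPropertyNames()) {
      if (name.startsWith(dotted)) {
        result.setProperty(name.substring(dotted.length()), all.getProperty(name));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Properties settings = new Properties();
    settings.setProperty("default.driver", "org.duckdb.DuckDBDriver");
    settings.setProperty("default.url", "jdbc:duckdb:");
    settings.setProperty("default.completer.ttlInSeconds", "120");
    settings.setProperty("other.url", "jdbc:example:"); // hypothetical second prefix

    // Without filtering, driver, url and completer.ttlInSeconds would all
    // reach the JDBC driver as if they were connection properties.
    System.out.println(forPrefix(settings, "default").stringPropertyNames());
  }
}
```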

Lenient drivers (PostgreSQL, MySQL) silently ignore unknown properties, so this went unnoticed, but strict drivers reject the connection outright. For example, DuckDB fails with:

SQLException: Invalid Input Error: The following options were not recognized:
completer.ttlInSeconds, url, driver, maxIdle

This currently makes it impossible to use DuckDB (and other strict drivers such as MS SQL Server) with the JDBC interpreter.

Until now this was worked around with a driver-specific whitelist (PRESTO_PROPERTIES) that applied only when the driver class was Presto/Trino. That approach is hard to maintain: every new strict driver needs its own if branch and allow-list, and it also strips any legitimate driver property the user added that is not on the list.
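
The drawback of the allow-list can be shown with a small sketch. The contents of ALLOWED below are invented for illustration (the real PRESTO_PROPERTIES list is not shown in this description), and sessionProperties merely stands in for "a driver property the user added that is not on the list".

```java
import java.util.Properties;
import java.util.Set;

/** Hypothetical allow-list filter, loosely modelled on the approach being replaced. */
final class AllowListDemo {

  // Assumed contents; the actual PRESTO_PROPERTIES entries are not given here.
  static final Set<String> ALLOWED = Set.of("user", "password", "SSL");

  /** Removes, in place, every key that is not on the allow-list. */
  static void retainAllowed(Properties props) {
    props.keySet().removeIf(key -> !ALLOWED.contains(key));
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("user", "alice");
    props.setProperty("url", "jdbc:trino://localhost:8080");
    props.setProperty("sessionProperties", "query_max_run_time:1h");

    retainAllowed(props);

    // The internal key "url" is gone, but so is the user's own
    // "sessionProperties", and the caller's object itself was modified.
    System.out.println(props.stringPropertyNames());
  }
}
```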

This PR replaces that with a generic, driver-agnostic deny-list:

  • Introduces NON_DRIVER_PROPERTIES, a single source of truth listing only Zeppelin-internal and DBCP pool keys (user/password are deliberately kept, since they are standard JDBC properties).
  • Adds toDriverProperties(), which returns a filtered copy of the properties handed to the driver, leaving the original untouched (the previous Presto code mutated the shared per-user Properties in place — a latent bug).
  • Removes the Presto/Trino special-case branch and the PRESTO_PROPERTIES whitelist; the generic filter subsumes it.
  • Adds an optional escape hatch, zeppelin.jdbc.driver.excludeProperties (comma-separated), so operators can exclude additional keys for future drivers without a code change.

Result: DuckDB, MS SQL Server, Trino/Presto, and any future strict driver work through one consistent rule, while genuine driver properties (e.g. SSL, useSSL, sslmode) still pass through unchanged.
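
The deny-list rule described above might look roughly like the following. This is a sketch under stated assumptions, not the code in this PR: the set contains only the keys named in this description (the real NON_DRIVER_PROPERTIES may list more), and the two-argument signature that takes the excludeProperties value as a string is hypothetical.

```java
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;

/** Hypothetical sketch of a driver-agnostic deny-list filter. */
final class DriverPropertyFilter {

  // Only the keys mentioned in the PR description; the real list may be longer.
  static final Set<String> NON_DRIVER_PROPERTIES = Set.of(
      "driver", "url", "precode", "statementPrecode",
      "completer.ttlInSeconds", "validationQuery", "maxIdle");

  /**
   * Returns a filtered copy of source and leaves source itself untouched.
   * extraExcludes is a comma-separated list of additional keys to drop
   * (the value of zeppelin.jdbc.driver.excludeProperties); it may be null.
   */
  static Properties toDriverProperties(Properties source, String extraExcludes) {
    Set<String> denied = new HashSet<>(NON_DRIVER_PROPERTIES);
    if (extraExcludes != null) {
      for (String key : extraExcludes.split(",")) {
        String trimmed = key.trim();
        if (!trimmed.isEmpty()) {
          denied.add(trimmed);
        }
      }
    }
    Properties filtered = new Properties();
    for (String name : source.stringPropertyNames()) {
      if (!denied.contains(name)) {
        filtered.setProperty(name, source.getProperty(name));
      }
    }
    return filtered;
  }

  public static void main(String[] args) {
    Properties source = new Properties();
    source.setProperty("url", "jdbc:duckdb:");
    source.setProperty("completer.ttlInSeconds", "120");
    source.setProperty("user", "alice");
    source.setProperty("SSL", "true");

    // user and SSL pass through; url and completer.ttlInSeconds are dropped.
    System.out.println(toDriverProperties(source, null).stringPropertyNames());
  }
}
```

Building a copy rather than removing keys from the argument is what keeps the shared per-user Properties intact.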

What type of PR is it?

Bug Fix

Todos

  • Replace Presto/Trino whitelist with a generic internal-property deny-list
  • Keep user/password and arbitrary driver properties (e.g. SSL) flowing to the driver
  • Add unit tests for the filtering logic
  • Add end-to-end tests against real strict drivers (DuckDB embedded, Trino)

Screenshots (not reproduced here):

  • Property settings: two screenshots, "Property setting (1)" and "Property setting (2)".
  • Result, before and after: "Before" shows the DuckDB connection error; "After" shows a successful connection.

What is the Jira issue?

ZEPPELIN-6208

How should this be tested?

Automated tests added in JDBCInterpreterTest:

  • testToDriverProperties — internal/pool keys are removed; user, password, and arbitrary driver props (SSL) are kept; the original Properties is not mutated.
  • testToDriverPropertiesWithUserDefinedExcludes — zeppelin.jdbc.driver.excludeProperties strips additional user-specified keys.
  • testDuckDbConnectionWithInternalProperties — end-to-end: configures the interpreter with internal keys present and runs CREATE/INSERT/SELECT against an embedded DuckDB (no server needed). Fails on the old code, passes here.
  • testTrinoConnectionWithInternalProperties — end-to-end against a real Trino coordinator; auto-skipped via assumeTrue when none is reachable on localhost:8080.

mvn -pl jdbc -am test -Dtest=JDBCInterpreterTest
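
For the Trino test, the "is a coordinator reachable" check that feeds assumeTrue could be as simple as a TCP connect with a timeout. The helper below is an assumption about how such a probe might be written, not the code in JDBCInterpreterTest; the class name, method name, and timeout parameter are hypothetical.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical reachability probe for gating an end-to-end test. */
final class Reachability {

  /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
  static boolean isReachable(String host, int port, int timeoutMillis) {
    try (Socket socket = new Socket()) {
      socket.connect(new InetSocketAddress(host, port), timeoutMillis);
      return true;
    } catch (IOException e) {
      // Connection refused, timeout, or unknown host: treat as unreachable.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(isReachable("localhost", 8080, 1000));
  }
}
```

A test would then pass the result of isReachable("localhost", 8080, 1000) to assumeTrue, so the test is reported as skipped rather than failed when no coordinator is running.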

Manual: add a JDBC interpreter with default.driver=org.duckdb.DuckDBDriver, default.url=jdbc:duckdb:, add the org.duckdb:duckdb_jdbc dependency in the interpreter settings, leave an internal key such as default.completer.ttlInSeconds=120, and run a query — it now connects successfully.

Questions:

  • Do the license files need to be updated? - No
  • Are there breaking changes for older versions? - No
  • Does this need documentation? - Optional
