[core] Add file format provider SPI #8292

Open
tchivs wants to merge 11 commits into apache:master from tchivs:paimon-no-hadoop-format-spi

tchivs (Contributor) commented on Jun 19, 2026

Purpose

Follow-up to #8193, which lets engines create a CatalogContext without loading
Hadoop configuration when they provide their own FileIO loader.

That removes one Hadoop initialization path, but ORC/Parquet format construction
can still fall back to Hadoop-backed implementations in some scenarios. This PR
adds an experimental FileFormatProvider SPI so engines can supply ORC/Parquet
format implementations at runtime.

The built-in FileFormatFactory lookup remains the default fallback. Catalog
table-runtime.* options can inject providers without persisting them to table
schemas.
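
The provider-first lookup with a built-in fallback described above can be sketched roughly as follows. This is only an illustration of the pattern, not the interface added by this PR: `Format`, `FormatProvider`, `resolve`, `engineOrcProvider`, and `builtIn` are all hypothetical names, and the real SPI signatures are not shown in this description.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch: every name here is illustrative, not Paimon's actual API.
final class FormatLookupSketch {

    /** Stand-in for a file format implementation. */
    interface Format {
        String describe();
    }

    /** Stand-in for the provider SPI: empty means "this provider does not handle it". */
    interface FormatProvider {
        Optional<Format> create(String identifier);
    }

    /** Ask engine-supplied providers first; use the built-in lookup only if none matches. */
    static Format resolve(
            String identifier,
            List<FormatProvider> providers,
            Function<String, Format> builtInLookup) {
        for (FormatProvider provider : providers) {
            Optional<Format> format = provider.create(identifier);
            if (format.isPresent()) {
                return format.get();
            }
        }
        return builtInLookup.apply(identifier);
    }

    /** Example provider that only knows how to build an ORC format. */
    static FormatProvider engineOrcProvider() {
        return identifier -> {
            if ("orc".equals(identifier)) {
                Format format = () -> "engine-native orc";
                return Optional.of(format);
            }
            return Optional.empty();
        };
    }

    /** Stand-in for the default factory lookup. */
    static Format builtIn(String identifier) {
        return () -> "built-in " + identifier;
    }

    public static void main(String[] args) {
        List<FormatProvider> providers = Collections.singletonList(engineOrcProvider());
        // prints "engine-native orc"
        System.out.println(resolve("orc", providers, FormatLookupSketch::builtIn).describe());
        // prints "built-in parquet"
        System.out.println(resolve("parquet", providers, FormatLookupSketch::builtIn).describe());
    }
}
```

Returning `Optional.empty()` from a provider, rather than throwing, is one way to keep the default path untouched for formats an engine does not override.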

Runtime table options are only used while constructing and validating tables;
they are not persisted into table schemas, and table-runtime.path does not
override the real table path.
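
A minimal sketch of that behavior, under stated assumptions: it assumes the `table-runtime.` prefix is stripped before the options are overlaid on the schema options in memory, and that `path` is the only key skipped. The class, the method names, and the `example.format-provider` key are hypothetical and are not taken from the PR's code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of runtime-only table options; not Paimon's actual code.
final class RuntimeOptionsSketch {

    static final String RUNTIME_PREFIX = "table-runtime.";

    /** Collects catalog options carrying the runtime prefix, with the prefix stripped. */
    static Map<String, String> runtimeTableOptions(Map<String, String> catalogOptions) {
        Map<String, String> runtime = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : catalogOptions.entrySet()) {
            if (entry.getKey().startsWith(RUNTIME_PREFIX)) {
                String key = entry.getKey().substring(RUNTIME_PREFIX.length());
                runtime.put(key, entry.getValue());
            }
        }
        return runtime;
    }

    /** In-memory view: schema options overlaid by runtime options, never replacing "path". */
    static Map<String, String> effectiveOptions(
            Map<String, String> schemaOptions, Map<String, String> runtimeOptions) {
        // Copy so the persisted schema options are never mutated.
        Map<String, String> effective = new LinkedHashMap<>(schemaOptions);
        for (Map.Entry<String, String> entry : runtimeOptions.entrySet()) {
            if (!"path".equals(entry.getKey())) {
                effective.put(entry.getKey(), entry.getValue());
            }
        }
        return effective;
    }

    public static void main(String[] args) {
        Map<String, String> catalogOptions = new LinkedHashMap<>();
        catalogOptions.put("warehouse", "/tmp/warehouse");
        catalogOptions.put("table-runtime.example.format-provider", "engine"); // hypothetical key
        catalogOptions.put("table-runtime.path", "/somewhere/else");

        Map<String, String> schemaOptions = new LinkedHashMap<>();
        schemaOptions.put("path", "/tmp/warehouse/db.db/t");
        schemaOptions.put("file.format", "orc");

        Map<String, String> effective =
                effectiveOptions(schemaOptions, runtimeTableOptions(catalogOptions));
        System.out.println("effective: " + effective);
        System.out.println("persisted: " + schemaOptions);
    }
}
```

In this sketch the persisted map is left as it was, so anything injected at runtime disappears once the table object is discarded.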

A related motivation is supporting engines that provide their own native file
format implementations while reducing Hadoop/Hive dependencies, e.g.
trinodb/trino#15921.

Tests

  • mvn -pl paimon-common,paimon-core,paimon-format -am -Pfast-build -DfailIfNoTests=false -Dtest=CoreOptionsTest,FileFormatProviderTest,FormatProviderNoHadoopTest test
  • mvn -pl paimon-bundle -am -Pfast-build -DskipTests install
  • mvn -pl paimon-api,paimon-common,paimon-core -am -DskipTests spotless:check
  • git diff --check apache/master...HEAD
  • mvn -N org.apache.rat:apache-rat-plugin:0.15:check
  • mvn -pl paimon-core -am -Pfast-build -DfailIfNoTests=false -Dtest=FallbackReadFileStoreTableTest#testSwitchToBranch test
  • mvn -pl paimon-common,paimon-core,paimon-format -am -Pfast-build -DfailIfNoTests=false -Dtest=FileFormatProviderTest,FormatProviderNoHadoopTest,FileSystemCatalogTest#testCreateTableWithRuntimeCatalogOptions,SchemaValidationTest#testDynamicOptionsCanRemoveSchemaOptionsDuringValidation+testSnapshotSequenceOrderingHonorsDynamicWriteOnlyValue,SchemaManagerTest#testCreateTableWithDynamicOptions+testCommitChangesWithDynamicOptions,FileMetaUtilsTest,FallbackReadFileStoreTableTest#testSwitchToBranch,CoreOptionsTest,FileFormatTest test

tchivs changed the title from "[core] Introduce file format provider boundary for no-Hadoop engines" to "[core] Add file format provider SPI" on Jun 19, 2026
…rmat-spi

# Conflicts:
#	paimon-core/src/main/java/org/apache/paimon/schema/SchemaValidation.java
tchivs marked this pull request as ready for review on June 23, 2026, 02:53