[lake/paimon] Support custom Paimon lake table path #3476

Open
wzx140 wants to merge 4 commits into apache:main from wzx140:codex/custom-paimon-lake-table-path-red

wzx140 commented Jun 12, 2026

Contributor

Summary

  • Support custom Paimon lake table path with table.datalake.database-name and table.datalake.table-name.
  • Route Flink $lake reads and Paimon tiering creation/writes to the mapped lake table path.
  • Route all alter-lake requests to the mapped lake table path.
  • Clear existing tiering progress when lake table path mapping parameters change, so later tiering starts from the beginning for the new sink table.
  • Allow paimon.path to be changed only together with lake table path mapping parameters; reject standalone paimon.path alteration.
  • Reject custom lake table path mapping for Spark lake reads in this phase.
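The mapping described above can be pictured with a small sketch. It is not the PR's `LakeTableUtil`; the class and method names are hypothetical, and it assumes that each option falls back independently to the Fluss database or table name when it is not set.

```java
import java.util.Map;

/** Hypothetical sketch of lake table path resolution; not the PR's LakeTableUtil. */
final class LakePathSketch {
    // Option keys taken from the PR summary.
    static final String DB_KEY = "table.datalake.database-name";
    static final String TABLE_KEY = "table.datalake.table-name";

    /**
     * Returns {database, table} of the effective lake table.
     * Assumption: an unset option falls back to the Fluss name.
     */
    static String[] resolve(String flussDb, String flussTable, Map<String, String> props) {
        String db = props.getOrDefault(DB_KEY, flussDb);
        String table = props.getOrDefault(TABLE_KEY, flussTable);
        return new String[] {db, table};
    }
}
```

With only `table.datalake.table-name` set to `orders_lake`, a Fluss table `fluss_db.orders` would resolve to `fluss_db.orders_lake` under these assumptions.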

UT Coverage

  • FlinkCatalogTest#testGetLakeTableWithCustomLakePath: verifies Flink catalog resolves $lake and metadata-table reads to the custom lake database/table.
  • FlinkUnionReadPrimaryKeyTableITCase#testUnionReadWithCustomLakeTablePath: verifies Flink union read can read from the mapped Paimon lake table and merge lake data with the Fluss tail.
  • PaimonTieringITCase#testTieringWithCustomLakeTablePath: verifies Paimon tiering creates/writes the lake table at the custom mapped path and records the expected snapshot progress.
  • LakeEnabledTableCreateITCase#testAlterLakePathOptionsValidation: verifies lake path mapping options can be altered while datalake is disabled, and cannot be altered after datalake is enabled.
  • LakeEnabledTableCreateWithHiveCatalogITCase#testAlterPaimonPathRequiresLakePathOptions: verifies standalone paimon.path alteration is rejected, while paimon.path altered together with mapping options is accepted and used when enabling datalake with HiveCatalog.
  • CoordinatorEventProcessorTest#testAlterLakePathClearsProgressWhenDatalakeDisabled: verifies changing lake path mapping while datalake is disabled clears existing tiering progress.
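The alter rules that these tests exercise can be summarized in a rough sketch. The class name, method signature, and error messages are hypothetical and do not reflect `TableDescriptorValidation`; the sketch also assumes that changing either one of the two mapping options counts as "together with mapping parameters".

```java
import java.util.Set;

/** Hypothetical sketch of the alter rules; not the PR's TableDescriptorValidation. */
final class AlterLakePathRuleSketch {
    static final String DB_KEY = "table.datalake.database-name";
    static final String TABLE_KEY = "table.datalake.table-name";
    static final String PAIMON_PATH_KEY = "paimon.path";

    /** Throws IllegalArgumentException when the set of altered keys is not allowed. */
    static void validate(Set<String> alteredKeys, boolean datalakeEnabled) {
        // Assumption: touching either mapping option is enough to count as a mapping change.
        boolean touchesMapping = alteredKeys.contains(DB_KEY) || alteredKeys.contains(TABLE_KEY);
        if (touchesMapping && datalakeEnabled) {
            throw new IllegalArgumentException(
                    "lake path mapping cannot be altered while datalake is enabled");
        }
        if (alteredKeys.contains(PAIMON_PATH_KEY) && !touchesMapping) {
            throw new IllegalArgumentException(
                    "paimon.path can only be altered together with lake path mapping options");
        }
    }
}
```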

luoyuxia requested a review from Copilot on June 12, 2026 04:07

Copilot AI left a comment

Contributor

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds support for mapping Fluss lake-enabled tables to a custom Paimon lake table database/table name, and routes lake reads/writes/alter operations through the resolved lake table path while validating/guarding unsupported connectors.

Changes:

  • Introduce table.datalake.database-name / table.datalake.table-name and LakeTableUtil to resolve the effective lake TablePath.
  • Route Flink $lake reads, coordinator lake-catalog operations, and Paimon tiering writer/committer to the mapped lake table path.
  • Clear persisted tiering progress when lake table path mapping changes; add Hive-catalog IT coverage and Spark connector restriction for custom mapping.
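The progress-clearing behavior can be illustrated with an in-memory stand-in. In the PR the progress lives in ZooKeeper together with snapshot metadata; the class below is purely hypothetical, keeps the progress in a map, and represents a lake path as a plain `database.table` string.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for persisted tiering progress. */
final class TieringProgressSketch {
    // Table id -> last committed lake snapshot id.
    private final Map<Long, Long> committedSnapshotByTable = new HashMap<>();

    void recordCommit(long tableId, long snapshotId) {
        committedSnapshotByTable.put(tableId, snapshotId);
    }

    /** Drops stored progress when the resolved lake path changed; returns true if it changed. */
    boolean onLakePathAltered(long tableId, String oldLakePath, String newLakePath) {
        if (oldLakePath.equals(newLakePath)) {
            return false; // same sink table, keep progress
        }
        // New sink table: later tiering has to start from the beginning.
        committedSnapshotByTable.remove(tableId);
        return true;
    }

    /** Returns the last committed snapshot id, or null when there is no progress. */
    Long committedSnapshot(long tableId) {
        return committedSnapshotByTable.get(tableId);
    }
}
```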

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 6 comments.

Show a summary per file
| File | Description |
| --- | --- |
| pom.xml | Adds Hive/Derby/Datanucleus version properties used by new Hive-catalog IT. |
| fluss-spark/fluss-spark-common/src/main/scala/org/apache/fluss/spark/SparkTable.scala | Rejects custom lake path usage in Spark connector (currently). |
| fluss-server/src/test/java/org/apache/fluss/server/zk/data/lake/LakeTableHelperTest.java | Adds test for clearing lake progress snapshot + offset files. |
| fluss-server/src/test/java/org/apache/fluss/server/coordinator/CoordinatorEventProcessorTest.java | Adds test to ensure altering lake path clears committed progress when datalake disabled. |
| fluss-server/src/main/java/org/apache/fluss/server/zk/data/lake/LakeTableHelper.java | Adds clearLakeTableProgress API to delete ZK entry + discard snapshot metadata. |
| fluss-server/src/main/java/org/apache/fluss/server/zk/ZooKeeperClient.java | Adds deleteLakeTable helper to remove lake progress node. |
| fluss-server/src/main/java/org/apache/fluss/server/utils/TableDescriptorValidation.java | Validates custom lake path support + alter-property restrictions (including paimon.path rules). |
| fluss-server/src/main/java/org/apache/fluss/server/coordinator/MetadataManager.java | Routes lake catalog create/alter/schema ops via resolved lake table path. |
| fluss-server/src/main/java/org/apache/fluss/server/coordinator/CoordinatorService.java | Routes lake table creation via resolved lake table path. |
| fluss-server/src/main/java/org/apache/fluss/server/coordinator/CoordinatorEventProcessor.java | Clears lake table progress when resolved lake path changes. |
| fluss-lake/fluss-lake-paimon/src/test/java/org/apache/fluss/lake/paimon/tiering/PaimonTieringITCase.java | Adds tiering IT verifying custom mapped lake table path. |
| fluss-lake/fluss-lake-paimon/src/test/java/org/apache/fluss/lake/paimon/flink/FlinkUnionReadPrimaryKeyTableITCase.java | Adds union-read IT validating $lake reads with custom mapping. |
| fluss-lake/fluss-lake-paimon/src/test/java/org/apache/fluss/lake/paimon/LakeEnabledTableCreateWithHiveCatalogITCase.java | Adds Hive metastore IT verifying paimon.path alteration rules with mapping options. |
| fluss-lake/fluss-lake-paimon/src/test/java/org/apache/fluss/lake/paimon/LakeEnabledTableCreateITCase.java | Adds IT validating alter behavior of lake path options when enabled/disabled. |
| fluss-lake/fluss-lake-paimon/src/main/java/org/apache/fluss/lake/paimon/utils/PaimonConversions.java | Allows path option to be set (removes CoreOptions.PATH from unsettable). |
| fluss-lake/fluss-lake-paimon/src/main/java/org/apache/fluss/lake/paimon/tiering/PaimonLakeWriter.java | Uses resolved lake table path when opening the Paimon table for writing. |
| fluss-lake/fluss-lake-paimon/src/main/java/org/apache/fluss/lake/paimon/tiering/PaimonLakeCommitter.java | Uses resolved lake table path for table access and stats computation. |
| fluss-lake/fluss-lake-paimon/pom.xml | Adds Hive/Derby/Datanucleus test dependencies for Hive-catalog IT. |
| fluss-flink/fluss-flink-common/src/test/java/org/apache/fluss/flink/catalog/FlinkCatalogTest.java | Adds Flink catalog test for $lake resolution with custom mapping. |
| fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/utils/LakeSourceUtils.java | Creates lake sources using resolved lake table path. |
| fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/lake/LakeTableFactory.java | Resolves Flink lake ObjectIdentifier using mapped database/table + metadata suffix. |
| fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/catalog/FlinkCatalog.java | Routes lake table lookup to mapped lake path and correct metadata-table naming. |
| fluss-common/src/main/java/org/apache/fluss/metadata/TableInfo.java | Exposes getLakeTablePath() / hasCustomLakePath() helpers. |
| fluss-common/src/main/java/org/apache/fluss/metadata/LakeTableUtil.java | New utility to resolve lake TablePath and metadata table names with mapping. |
| fluss-common/src/main/java/org/apache/fluss/config/FlussConfigUtils.java | Allows lake mapping options to be altered (when permitted). |
| fluss-common/src/main/java/org/apache/fluss/config/ConfigOptions.java | Adds new config options for lake database/table mapping. |

Comment thread pom.xml
wzx140 commented Jun 12, 2026

Contributor Author

@luoyuxia Could you please help review this one?

2 participants