API, Spark 4.1: Add ignore_missing_files to migrate procedure#16643

Open
drexler-sky wants to merge 1 commit into apache:main from drexler-sky:migrate

@drexler-sky
Contributor

No description provided.

@huaxingao changed the title from "API, SPark 4.1: Add ignore_missing_files to migrate procedure" to "API, Spark 4.1: Add ignore_missing_files to migrate procedure" on Jun 1, 2026
  private static final ProcedureParameter PARALLELISM_PARAM =
      optionalInParameter("parallelism", DataTypes.IntegerType);
  private static final ProcedureParameter IGNORE_MISSING_FILES_PARAM =
      optionalInParameter("ignore_missing_files", DataTypes.BooleanType);
Contributor

Could you document the new parameter in docs/docs/spark-procedures.md?

}

@TestTemplate
public void testMigrateIgnoreMissingFiles() throws IOException {
Contributor

This test only exercises ignore_missing_files => true. Could you add a negative test too, covering the case where the flag is false or omitted?

sql("SELECT * FROM %s", tableName));
}

private static void deleteDirectory(Path dir) throws IOException {
Member

We could use the FileUtils.deleteDirectory method instead.
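
For context, a hand-rolled helper of this kind usually walks the tree and deletes children before parents. The sketch below is an assumption about what such a helper looks like using only the JDK; the PR's actual method body is not shown in this thread, and DeleteDirectorySketch is a hypothetical name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical stand-in for the test helper under review; not the PR's code.
final class DeleteDirectorySketch {
  private DeleteDirectorySketch() {}

  static void deleteDirectory(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      return; // nothing to delete
    }
    // Files.walk yields a directory before its contents, so sort in
    // reverse order to delete children before their parent directory.
    try (Stream<Path> paths = Files.walk(dir)) {
      paths.sorted(Comparator.reverseOrder()).forEach(path -> {
        try {
          Files.delete(path);
        } catch (IOException e) {
          // Lambdas cannot throw checked exceptions; wrap and unwrap below.
          throw new UncheckedIOException(e);
        }
      });
    } catch (UncheckedIOException e) {
      throw e.getCause();
    }
  }
}
```

If Apache Commons IO is available on the test classpath (an assumption here), the whole helper could collapse to a single call such as FileUtils.deleteDirectory(dir.toFile()), which is what the suggestion above amounts to.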

   * @param ignore whether to ignore missing source files
   * @return this for method chaining
   */
  default MigrateTable ignoreMissingFiles(boolean ignore) {
Member

I'm wondering if we really need the boolean ignore argument here. We could remove it and modify MigrateTableProcedure to something like:

      boolean ignoreMissingFiles = input.asBoolean(IGNORE_MISSING_FILES_PARAM, false);
      if (ignoreMissingFiles) {
        migrateTableSparkAction = migrateTableSparkAction.ignoreMissingFiles();
      }

The existing drop_backup parameter employs this style.
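
To make the suggested shape concrete, here is a self-contained sketch of a no-argument opt-in method plus the procedure-side wiring. Every type and name in it (FlagStyleSketch, MigrateAction, SimpleMigrateAction, configure) is hypothetical and only models the idea; it is not Iceberg's MigrateTable interface or the procedure's real input handling.

```java
// Hypothetical model of the no-argument flag style; none of these types are Iceberg's.
final class FlagStyleSketch {
  private FlagStyleSketch() {}

  /** Minimal stand-in for an action with a no-argument opt-in flag. */
  interface MigrateAction {
    // A default method keeps existing implementations source-compatible;
    // they only fail if a caller actually opts in.
    default MigrateAction ignoreMissingFiles() {
      throw new UnsupportedOperationException("ignoreMissingFiles is not supported");
    }

    boolean ignoresMissingFiles();
  }

  /** Implementation that supports the flag. */
  static final class SimpleMigrateAction implements MigrateAction {
    private boolean ignoreMissingFiles = false;

    @Override
    public MigrateAction ignoreMissingFiles() {
      this.ignoreMissingFiles = true;
      return this; // allow method chaining
    }

    @Override
    public boolean ignoresMissingFiles() {
      return ignoreMissingFiles;
    }
  }

  /** Procedure-side wiring: a null (omitted) parameter is treated as false. */
  static MigrateAction configure(MigrateAction action, Boolean ignoreMissingFilesParam) {
    boolean ignoreMissingFiles = ignoreMissingFilesParam != null && ignoreMissingFilesParam;
    MigrateAction configured = action;
    if (ignoreMissingFiles) {
      configured = configured.ignoreMissingFiles();
    }
    return configured;
  }
}
```

One consequence of this style is that there is no ignoreMissingFiles(false) call to reason about: the default behavior is simply what you get when the method is never invoked, and an implementation that lacks support throws only when the user explicitly asks for the feature.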


3 participants