HBASE-30049 RestoreSnapshotHelper creates StoreFileTracker with wrong…#8013

Open
bjomobo wants to merge 1 commit into apache:master from bjomobo:HBASE-30049-fix-restore-sft

@bjomobo bjomobo commented Mar 31, 2026

Description

RestoreSnapshotHelper.restoreRegion() creates a StoreFileTracker using the raw Master configuration, which lacks table-level settings such as hbase.store.file-tracker.impl=FILE. As a result, DefaultStoreFileTracker is used, whose doSetStoreFiles() is a no-op. The .filelist is never updated after the restore, which leads to a FileNotFoundException when regions try to open files that have been archived.
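
The lookup problem can be illustrated with a minimal stand-alone sketch. It models configurations as plain maps and is not HBase code: the class `TrackerConfigSketch`, its methods, and the fallback name `DEFAULT` are hypothetical stand-ins chosen for illustration. Only the key `hbase.store.file-tracker.impl` and the value `FILE` come from the description above.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of layered configuration. None of these names are HBase APIs. */
final class TrackerConfigSketch {
    static final String TRACKER_IMPL_KEY = "hbase.store.file-tracker.impl";
    // Assumed name for the fallback implementation, for illustration only.
    static final String DEFAULT_IMPL = "DEFAULT";

    /** Returns a copy of base with every entry of overrides applied on top. */
    static Map<String, String> merge(Map<String, String> base,
                                     Map<String, String> overrides) {
        Map<String, String> merged = new HashMap<>(base);
        merged.putAll(overrides);
        return merged;
    }

    /** Picks the tracker implementation name, falling back to the default. */
    static String resolveTrackerImpl(Map<String, String> conf) {
        return conf.getOrDefault(TRACKER_IMPL_KEY, DEFAULT_IMPL);
    }

    /** Resolves from the master-level settings only, as the buggy path does. */
    static String resolveWithoutTableSettings(Map<String, String> masterConf) {
        return resolveTrackerImpl(masterConf);
    }

    /** Resolves after layering the table-level settings over the master ones. */
    static String resolveWithTableSettings(Map<String, String> masterConf,
                                           Map<String, String> tableSettings) {
        return resolveTrackerImpl(merge(masterConf, tableSettings));
    }
}
```

With a master map that does not contain the key and a table map that sets it to `FILE`, the first method falls back to the default while the second resolves `FILE`, which is the difference this PR addresses.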

Regression introduced by HBASE-28564.

Changes

  • Merge table descriptor config via StoreUtils.createStoreConfiguration() before creating the tracker in restoreRegion()
  • Move tracker creation inside the snapshotFamilyFiles != null check to avoid a NullPointerException for families that are being removed
  • Add withColumnFamilyDescriptor() to the "Add families not present in the table" code path
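
The reordering in the second change can be sketched as follows. This is a simplified model under stated assumptions, not the actual restoreRegion() code: `RestoreLoopSketch`, its parameters, and the returned action strings are hypothetical. It assumes that a family absent from the snapshot has neither snapshot files nor a descriptor, so any per-family setup has to happen only after the null check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Toy model of the per-family restore loop. Names are illustrative only. */
final class RestoreLoopSketch {
    static final String TRACKER_IMPL_KEY = "hbase.store.file-tracker.impl";

    /**
     * Walks the families found on disk and describes what would happen to each.
     * The descriptor lookup sits inside the null check, so families that are
     * missing from the snapshot never dereference a missing descriptor.
     */
    static List<String> restoreFamilies(
            List<String> familiesOnDisk,
            Map<String, List<String>> snapshotFamilyFiles,
            Map<String, Map<String, String>> snapshotFamilySettings) {
        List<String> actions = new ArrayList<>();
        for (String family : familiesOnDisk) {
            List<String> files = snapshotFamilyFiles.get(family);
            if (files != null) {
                // Family exists in the snapshot: safe to read its settings.
                Map<String, String> settings = snapshotFamilySettings.get(family);
                String impl = settings.getOrDefault(TRACKER_IMPL_KEY, "DEFAULT");
                actions.add("restore " + family + " with " + impl
                    + " tracker (" + files.size() + " files)");
            } else {
                // Family is not in the snapshot: it is removed, no tracker needed.
                actions.add("remove " + family);
            }
        }
        return actions;
    }
}
```

Had the settings lookup been placed before the `files != null` check, the family that exists only on disk would hit a null map in this model, which mirrors the kind of failure the change is meant to avoid.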

New Tests

  • TestRestoreSnapshotProcedureFileBasedSFT — end-to-end restore with FILE tracker
  • TestRestoreSnapshotHelperWithFileBasedSFT — unit-level .filelist verification
  • TestRestoreSnapshotFileTrackerTableLevel — table-level FILE tracker with compaction and multi-family restore

Jira: https://issues.apache.org/jira/browse/HBASE-30049

@bjomobo bjomobo force-pushed the HBASE-30049-fix-restore-sft branch 2 times, most recently from e3e3c55 to 6af91ac Compare April 2, 2026 18:52
… config causing no-op filelist updates

RestoreSnapshotHelper.restoreRegion() creates a StoreFileTracker using
the raw Master Configuration object, which does not contain table-level
settings like hbase.store.file-tracker.impl=FILE. This causes
DefaultStoreFileTracker to be instantiated, whose doSetStoreFiles() is
a complete no-op. The .filelist is never updated after the restore moves
HFiles to the archive and creates link files for the snapshot's HFiles.

When a region subsequently opens, the stale .filelist references HFiles
that were moved to the archive, resulting in FileNotFoundException and
the region getting stuck in OPENING state indefinitely.
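
The failure sequence above can be reproduced in a small in-memory model. This is only a sketch: `StaleFileListSketch` and the file names are hypothetical, the store directory is a set of strings, and the .filelist is a list of strings. The boolean flag stands in for whether the tracker used during restore actually persists the new file list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of a restore followed by a region open. Names are illustrative. */
final class StaleFileListSketch {
    /**
     * Simulates a restore and then a region open.
     * Returns the file list entries that no longer exist in the store directory.
     */
    static List<String> restoreThenOpen(boolean trackerWritesFileList) {
        // State before the restore: one HFile, and a file list that names it.
        Set<String> storeDir = new HashSet<>(Arrays.asList("old-hfile"));
        List<String> fileList = new ArrayList<>(Arrays.asList("old-hfile"));

        // Restore: the old HFile is archived and a link to the snapshot's
        // HFile is created in its place.
        storeDir.remove("old-hfile");
        storeDir.add("snapshot-hfile-link");

        // A tracker that persists its state rewrites the file list here.
        // A no-op tracker leaves the old contents untouched.
        if (trackerWritesFileList) {
            fileList = new ArrayList<>(storeDir);
        }

        // Region open: every listed file is expected to exist.
        List<String> missing = new ArrayList<>();
        for (String file : fileList) {
            if (!storeDir.contains(file)) {
                missing.add(file);
            }
        }
        return missing;
    }
}
```

In this model, `restoreThenOpen(false)` reports `old-hfile` as missing, which corresponds to the FileNotFoundException described above, while `restoreThenOpen(true)` reports nothing missing.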

This is a regression introduced by HBASE-28564, which refactored
reference file creation to go through the StoreFileTracker interface.
The cloneRegion() method in the same commit correctly merges the table
descriptor config via StoreUtils.createStoreConfiguration() before
creating the tracker, but restoreRegion() was missed.

The fix applies the same pattern: merge the table descriptor and column
family descriptor configuration into the Configuration object before
passing it to StoreFileTrackerFactory.create(). This ensures the
correct StoreFileTracker implementation is resolved based on the
table-level setting.

Both locations in restoreRegion() are fixed:
1. For existing families already on disk
2. For new families added from the snapshot
@bjomobo bjomobo force-pushed the HBASE-30049-fix-restore-sft branch from 6af91ac to 282757c Compare April 6, 2026 15:37