Skip to content

Hive: Fix integer overflow in HMS createTime and lastAccessTime#16620

Open
wombatu-kun wants to merge 2 commits into
apache:mainfrom
wombatu-kun:hive-fix-hms-time-overflow
Open

Hive: Fix integer overflow in HMS createTime and lastAccessTime#16620
wombatu-kun wants to merge 2 commits into
apache:mainfrom
wombatu-kun:hive-fix-hms-time-overflow

Conversation

@wombatu-kun

@wombatu-kun wombatu-kun commented May 30, 2026

Copy link
Copy Markdown
Contributor

HiveOperationsBase.newHmsTable and HiveViewOperations.newHMSView set the HMS createTime and lastAccessTime as (int) currentTimeMillis / 1000. Java operator precedence makes this ((int) currentTimeMillis) / 1000, so the 64-bit epoch-millis value is truncated to 32 bits before the division. The resulting seconds value is nonsensical: it lands around January 1970 and is negative for part of every ~49-day wrap cycle, instead of the intended creation time. The fix parenthesizes the division as (int) (currentTimeMillis / 1000).

HMS overwrites createTime server-side on create, so the persisted and observable symptom is lastAccessTime, which HMS stores exactly as the client sends it. A garbage lastAccessTime shows up in DESCRIBE FORMATTED and can mislead staleness or retention tooling that reads TBLS.LAST_ACCESS_TIME.

Tests

Added regression coverage that creates a table and a view through HiveCatalog and asserts the HMS lastAccessTime reflects the current epoch second: TestHiveCatalog.newTableSetsCurrentHmsLastAccessTime and TestHiveViewCatalog.newViewSetsCurrentHmsLastAccessTime. Both fail before the fix (the value lands near 1970) and pass after it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the hive label May 30, 2026

@Baunsgaard Baunsgaard left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I have a few minor nits.

Comment on lines -216 to -217
(int) currentTimeMillis / 1000,
(int) currentTimeMillis / 1000,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you are anyway fixing it, it is a bit odd to do the calculation twice,

How about editing the variable above and call it:

final int currentTime =  (int) (System.currentTimeMillis() / 1000L);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 2334107 - hoisted into a single currentTimeSeconds local (named for the unit) in both the table and view creation paths.

Comment on lines +183 to +185
// HMS overwrites createTime server-side on create, but keeps the lastAccessTime that Iceberg
// sends. Computing it as '(int) currentTimeMillis / 1000' casts the long before dividing and
// overflows to a 1970-era value; assert it instead reflects the current epoch second.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest removing this comment.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 2334107 - also dropped the mirrored comment in the view test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@wombatu-kun wombatu-kun requested a review from Baunsgaard June 5, 2026 10:32

@Baunsgaard Baunsgaard left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants