Added a flag which allows disabling locks with Hive catalog #3121
Open
jcellary wants to merge 3 commits into apache:main
Added lock disable
This PR introduces a new flag, LOCK_ENABLED, in the Hive catalog. It defaults to true for backwards compatibility. When it is set to false, no locks are created when writing/committing a table.
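The intended behaviour can be sketched roughly as follows. This is a self-contained illustration, not the code from this PR: the property key string `"lock-enabled"`, the `FakeMetastore` class, and the `commit_table` helper are all hypothetical names chosen for the example. Only the constant name LOCK_ENABLED and its default of true come from the description above.

```python
from contextlib import contextmanager, nullcontext

# Hypothetical key string; the description only names the constant LOCK_ENABLED.
LOCK_ENABLED = "lock-enabled"
LOCK_ENABLED_DEFAULT = True  # default stated in the description


def lock_enabled(properties: dict[str, str]) -> bool:
    """Read the flag from catalog properties, falling back to the default."""
    raw = properties.get(LOCK_ENABLED)
    if raw is None:
        return LOCK_ENABLED_DEFAULT
    return raw.strip().lower() == "true"


class FakeMetastore:
    """In-memory stand-in for a metastore; records lock and unlock events."""

    def __init__(self) -> None:
        self.lock_events: list[tuple[str, str]] = []

    @contextmanager
    def table_lock(self, table: str):
        self.lock_events.append(("lock", table))
        try:
            yield
        finally:
            self.lock_events.append(("unlock", table))


def commit_table(metastore: FakeMetastore, table: str, properties: dict[str, str]) -> str:
    """Commit under a lock unless the flag disables locking."""
    guard = metastore.table_lock(table) if lock_enabled(properties) else nullcontext()
    with guard:
        return f"committed {table}"
```

With empty properties the commit takes and releases a lock as before; with `{"lock-enabled": "false"}` the commit runs without touching the lock table at all, which is only safe when something external serialises writers.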
Rationale for this change
In our production environment we rely on our own external locking mechanism, so the catalog's locks are not needed and, worse, they cause deadlocks. The current implementation creates a lock, writes the data and then removes the lock. If the application dies abruptly during the write (without a chance to run the finally clause of the try block), the lock is never removed. The library has no mechanism for checking the age of locks and removing stale ones. As a result, when one job dies at the wrong moment, all other jobs are stuck forever. In our prod environment we currently have to clear the contents of the HIVE_LOCKS table every few weeks to unblock jobs. To prevent this from happening, we introduced a flag which allows skipping the creation of locks.
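The failure mode described here can be reproduced in miniature. The sketch below is an analogy, not the Hive catalog code: a lock file stands in for a row in the lock table, and `os._exit` stands in for a process being killed. The names `CHILD` and `lock_survives` are made up for the example.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Child program: take a "lock" (create a file), then fail during the "write".
CHILD = textwrap.dedent("""
    import os, sys
    lock_path, mode = sys.argv[1], sys.argv[2]
    open(lock_path, "w").close()      # acquire: create the lock marker
    try:
        if mode == "hard":
            os._exit(1)               # abrupt death: finally blocks are skipped
        raise RuntimeError("write failed")
    finally:
        os.remove(lock_path)          # release: only reached on a normal unwind
""")


def lock_survives(mode: str) -> bool:
    """Run the child and report whether its lock marker was left behind."""
    with tempfile.TemporaryDirectory() as tmp:
        lock_path = os.path.join(tmp, "table.lock")
        subprocess.run(
            [sys.executable, "-c", CHILD, lock_path, mode],
            stderr=subprocess.DEVNULL,  # hide the child's traceback
        )
        return os.path.exists(lock_path)
```

An ordinary exception still releases the lock through the finally clause, so `lock_survives("soft")` is false, while the abrupt exit leaves the marker in place and `lock_survives("hard")` is true. Without an age check on existing locks, every later writer would wait on that leftover marker indefinitely.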
Are these changes tested?
Yes, we are running a fork with these changes on our production cluster.
Are there any user-facing changes?
No