Fix compactor concurrency limit not enforced across compaction levels #7303
Open
adamalexandru4 wants to merge 2 commits into cortexproject:master from
Signed-off-by: Alexandru Adam <aadam@adobe.com>
Signed-off-by: Alexandru Adam <45754470+adamalexandru4@users.noreply.github.com>
What this PR does
`BucketCompactor.Compact()` has a `for` loop that calls `grouper.Groups()` again after each successful compaction (`shouldRerun=true`). The grouper returns up to `compactionConcurrency` groups per call, but because the loop calls it multiple times, the total number of groups compacted in one `Compact()` invocation is not bounded by `compactionConcurrency`.
With `concurrency=1` and blocks `[6×2h, 2×12h]`:
1. `Groups()` → 1 group (6×2h → 12h), compacts, `shouldRerun=true`
2. `Groups()` → 1 group (3×12h → 24h), compacts, `shouldRerun=true`
3. `Groups()` → 0 groups, loop breaks
Result: 2 compactions in one pass despite `concurrency=1`. Each compaction downloads blocks to local disk, causing unexpected disk usage spikes.
Fix
Track the cumulative number of groups returned across `Groups()` calls within one `Compact()` invocation. Once the total reaches `compactionConcurrency`, `Groups()` returns an empty result.
This is safe because the grouper is created fresh in each `compactUser()` call, so `totalGroupsPlanned` resets on every pass.
The fix currently covers:
- `ShuffleShardingGrouper` (`sharding_strategy: shuffle-sharding`)
- `PartitionCompactionGrouper` (`sharding_strategy: shuffle-sharding` + `compaction_strategy: partitioning`)
Gap: The `DefaultBlocksGrouperFactory` path (used when `sharding_strategy: default` or sharding is disabled) delegates to the Thanos `DefaultGrouper`, which is not patched. This affects single-tenant / non-sharded setups. The plan is to wrap it in a concurrency-limiting adapter.
Tests
- `ShuffleShardingGrouper` and `PartitionCompactionGrouper`: call `Groups()` twice with `concurrency=1` and assert that the second call returns 0 groups
Which issue(s) this PR fixes:
Fixes [#7298]
Checklist
- `CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`