Fix compactor concurrency limit not enforced across compaction levels#7303

Open
adamalexandru4 wants to merge 2 commits into cortexproject:master from
adamalexandru4:fix/compactor-concurrency-limit

Conversation

@adamalexandru4

What this PR does

BucketCompactor.Compact() runs a for loop that calls grouper.Groups() again after each successful compaction (shouldRerun=true). Each call returns at most compactionConcurrency groups, but because the loop calls it repeatedly, the cumulative number of groups compacted in a single pass is unbounded.

With concurrency=1 and blocks [6×2h, 2×12h]:

  • Iteration 1: Groups() → 1 group (6×2h → 12h), compacts, shouldRerun=true
  • Iteration 2: Groups() → 1 group (3×12h → 24h), compacts, shouldRerun=true
  • Iteration 3: Groups() → 0 groups, loop breaks

Result: 2 compactions in one pass despite concurrency=1. Each downloads blocks to local disk, causing unexpected disk usage spikes.
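The walkthrough above can be reduced to a minimal sketch. The types and names here are simplified stand-ins for illustration, not the actual Cortex/Thanos code: the grouper caps each Groups() call at the limit but keeps no cumulative count, so the rerun loop blows past it.

```go
package main

import "fmt"

// perCallLimitedGrouper models the pre-fix behavior: each Groups() call is
// capped at `limit`, but nothing is tracked across calls.
type perCallLimitedGrouper struct {
	pending []string // groups still eligible for compaction
	limit   int      // compactionConcurrency
}

func (g *perCallLimitedGrouper) Groups() []string {
	n := g.limit
	if n > len(g.pending) {
		n = len(g.pending)
	}
	out := g.pending[:n]
	g.pending = g.pending[n:]
	return out
}

// runPass mimics the rerun loop in BucketCompactor.Compact(): keep asking
// for groups and "compacting" them until Groups() returns nothing.
func runPass(g *perCallLimitedGrouper) int {
	total := 0
	for {
		groups := g.Groups()
		if len(groups) == 0 {
			break
		}
		total += len(groups) // compact each group; shouldRerun=true
	}
	return total
}

func main() {
	g := &perCallLimitedGrouper{
		pending: []string{"6x2h -> 12h", "3x12h -> 24h"},
		limit:   1,
	}
	fmt.Println(runPass(g)) // 2 compactions in one pass despite limit=1
}
```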

Fix

Track cumulative groups returned across Groups() calls within one Compact() invocation. Once the total reaches compactionConcurrency, return empty:

func (g *ShuffleShardingGrouper) Groups(blocks map[ulid.ULID]*metadata.Meta) ([]*compact.Group, error) {
    // totalGroupsPlanned accumulates across Groups() calls within one Compact() pass.
    remainingConcurrency := g.compactionConcurrency - g.totalGroupsPlanned
    if remainingConcurrency <= 0 {
        return nil, nil
    }
    // ... existing logic, using remainingConcurrency as the limit ...
    g.totalGroupsPlanned += len(outGroups)
    return outGroups, nil
}

This is safe because the grouper is created fresh in each compactUser() call, so totalGroupsPlanned resets every pass.

The fix currently covers:

  • ShuffleShardingGrouper (sharding_strategy: shuffle-sharding)
  • PartitionCompactionGrouper (sharding_strategy: shuffle-sharding + compaction_strategy: partitioning)

Gap: The DefaultBlocksGrouperFactory path (used when sharding_strategy: default or sharding is disabled) delegates to the Thanos DefaultGrouper, which is not patched, so single-tenant / non-sharded setups are still affected. The plan is to wrap it in a concurrency-limiting adapter.
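One possible shape for that adapter, sketched with an illustrative interface rather than the actual Thanos API: wrap any grouper and enforce the cumulative budget on its behalf, without touching the inner implementation.

```go
package main

import "fmt"

// grouper is a stand-in for the Groups() planner interface.
type grouper interface {
	Groups() []string
}

// limitedGrouper wraps any grouper (e.g. an unpatched default grouper) and
// caps the cumulative number of groups returned across calls in one pass.
type limitedGrouper struct {
	inner   grouper
	limit   int // compactionConcurrency
	planned int // groups already handed out this pass
}

func (g *limitedGrouper) Groups() []string {
	remaining := g.limit - g.planned
	if remaining <= 0 {
		return nil // budget exhausted: ends the rerun loop
	}
	out := g.inner.Groups()
	if len(out) > remaining {
		out = out[:remaining]
	}
	g.planned += len(out)
	return out
}

// staticGrouper always offers the same groups, like a grouper that keeps
// finding new work after each compaction.
type staticGrouper struct{ groups []string }

func (s *staticGrouper) Groups() []string { return s.groups }

func main() {
	g := &limitedGrouper{inner: &staticGrouper{groups: []string{"g1", "g2"}}, limit: 1}
	fmt.Println(len(g.Groups())) // 1: first call capped at the limit
	fmt.Println(len(g.Groups())) // 0: cumulative budget already spent
}
```

Because the adapter would be constructed alongside the grouper in each compactUser() call, its planned counter resets every pass, matching the behavior of the patched groupers.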

Tests

  • Unit tests for both ShuffleShardingGrouper and PartitionCompactionGrouper: call Groups() twice with concurrency=1, assert second call returns 0 groups
  • Integration test reproducing the exact bug scenario (6×2h + 2×12h blocks)

Which issue(s) this PR fixes:
Fixes #7298

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Alexandru Adam <aadam@adobe.com>
@adamalexandru4 force-pushed the fix/compactor-concurrency-limit branch from 72a39f1 to 352b8db on February 28, 2026 10:06
Signed-off-by: Alexandru Adam <45754470+adamalexandru4@users.noreply.github.com>