AI Tool Usage Notice
If you used an AI tool to help draft this issue,
please make sure you have reviewed and validated all content before submitting.
You are responsible for the accuracy and quality of everything in this report.
Low-quality or unreviewed AI-generated submissions may be closed without further investigation.
See our Generative AI Contribution Policy for details.
Describe the bug
TestPartitionCompactor_ShouldCompactOnlyUsersOwnedByTheInstanceOnShardingEnabledAndMultipleInstancesRunning (pkg/compactor) intermittently fails when a compactor does not complete a compaction run within the 60s poll window:
--- FAIL: TestPartitionCompactor_ShouldCompactOnlyUsersOwnedByTheInstanceOnShardingEnabledAndMultipleInstancesRunning (77.67s)
compactor_paritioning_test.go:1217: expected true, got false
The assertion:
// pkg/compactor/compactor_paritioning_test.go:1217
cortex_testutil.Poll(t, 60*time.Second, true, func() any {
return prom_testutil.ToFloat64(c.CompactionRunsCompleted) >= 1
})
With multiple compactors under shuffle-sharding, ring convergence at startup is sometimes slow enough on loaded CI runners that a compactor does not reach one completed run within 60s. (Note the non-partition sibling test, TestCompactor_ShouldCompactOnlyUsersOwnedByTheInstanceOnShardingEnabledAndMultipleInstancesRunning, uses a 120s poll for the same check.) This is the same class of sharding/ring-convergence startup-timing flake as the repeatedly-adjusted compactor ownership tests (e.g. #7486, #7503), but the partition variant has no dedicated issue.
To Reproduce
Steps to reproduce the behavior:
- Start Cortex (recent master)
- Run repeatedly (flaky):
go test -count=20 -run TestPartitionCompactor_ShouldCompactOnlyUsersOwnedByTheInstanceOnShardingEnabledAndMultipleInstancesRunning ./pkg/compactor/
Expected behavior
Each compactor reliably completes at least one compaction run within the poll window and the test passes deterministically (or the poll timeout accounts for realistic ring-convergence time, matching the non-partition variant's 120s).
Environment:
- Infrastructure: GitHub Actions CI, ubuntu-24.04 (amd64), test job
- Deployment tool: N/A (Go unit test)
Additional Context
Observed on CI (2026-05-30): https://github.com/cortexproject/cortex/actions/runs/26632776611 (job test (amd64)). The failure is intermittent; the test passes on the large majority of runs.
Filed from CI failure-log analysis with AI assistance; the run link and compactor_paritioning_test.go:1217 were reviewed and verified against master before submitting.