Optimize ImmutableHashSet<T>.SetEquals to avoid unnecessary allocations#126309
Conversation
1eb3cee to
8c80f9d
Compare
8c80f9d to
6709a9e
Compare
6709a9e to
9910d86
Compare
9910d86 to
ff6af74
Compare
ff6af74 to
5f2749e
Compare
3c685c8 to
45c2c14
Compare
45c2c14 to
6a3ebf6
Compare
eiriktsarpalis
left a comment
There was a problem hiding this comment.
This change is adding a whole lot of runtime type checks. Is there tangible evidence (e.g. in the form of microbenchmarks) showing improvement here (both when other is a set but more importantly when it is not)?.
As the benchmark results indicate, there is no performance regression even in the fallback paths. This demonstrates that the added runtime type checks do not impact performance, while providing massive gains in the optimized paths |
edeeb71 to
d769373
Compare
What do the numbers show when comparing small (0-10 elements) or collections that are not equal? |
d769373 to
5e1f434
Compare
Performance Comparison: Before vs. After Optimization (10 elements)
Based on the data, there is a minor increase in execution time for micro-collections (N=10) due to the necessary type-checks and comparer validation. In exchange, we achieved Zero Allocations for all Set-to-Set comparisons. What do you think, is it a good trade-off ? |
|
The count validation has been moved out of This refactoring enables us to implement logic like the following: private static bool IsProperSubsetOf(IEnumerable<T> other, MutationInput origin)
{
Requires.NotNull(other, nameof(other));
if (origin.Root.IsEmpty)
{
return other.Any();
}
if (other is ICollection<T> otherAsICollectionGeneric)
{
// We check for < instead of != because other is not guaranteed to be a set, it could be a collection with duplicates.
if (otherAsICollectionGeneric.Count <= origin.Count)
{
return false;
}
switch (other)
{
case ImmutableHashSet<T> otherAsImmutableHashSet:
if (origin.EqualityComparer.Equals(otherAsImmutableHashSet.KeyComparer))
{
return SetEqualsWithImmutableHashset(otherAsImmutableHashSet, origin);
}
break;
case HashSet<T> otherAsHashset:
if (origin.EqualityComparer.Equals(otherAsHashset.Comparer))
{
return SetEqualsWithHashset(otherAsHashset, origin);
}
break;
}
}
else if (other is ICollection otherAsICollection && otherAsICollection.Count <= origin.Count)
{
return false;
}
var otherSet = new HashSet<T>(other, origin.EqualityComparer);
if (otherSet.Count <= origin.Count)
{
return false;
}
return SetEqualsWithHashset(otherSet, origin);
} |
5e1f434 to
efd60f6
Compare
efd60f6 to
35150bc
Compare
35150bc to
bc9e969
Compare
eiriktsarpalis
left a comment
There was a problem hiding this comment.
Could you share updated benchmark numbers please?
I have already updated the PR description with the latest benchmark results. You can find the comparison and the performance improvements detailed there |
|
CI seems to be stuck on infrastructure issues. Could someone please trigger a re-run for the failed checks? Thanks! |
|
/ba-g test failures are unrelated |
Fixes #90986, Part of #127279
Summary
ImmutableHashSet<T>.SetEqualsalways creates a new intermediateHashSet<T>for theothercollection, leading to avoidable allocations and GC pressure, especially for large datasetsOptimization Logic
falseifotheris anICollectionwith a smallerCount, avoiding any overhead.ImmutableHashSet<T>andHashSet<T>to bypass intermediate allocations.EqualityComparercompatibility before triggering fast paths to ensure logical consistency.Countwithin specialized paths for an immediate exit beforenew HashSet<T>(other)fallback.IEnumerabletypes.Click to expand Benchmark Source Code
Click to expand Benchmark Results
Benchmark Results (Before Optimization)
Benchmark Results (After Optimization)
Performance Analysis Summary (100,000 Elements)