[benchmark/filtered-search prep] Make benchmarks stateful by hildebrandmw · Pull Request #995 · microsoft/DiskANN

hildebrandmw · 2026-04-29T01:56:52Z

A recurring problem with our current benchmark infrastructure is that the SearchPhase enum (which selects the kind of search to run) does its job a little too well: every time a new variant is added, we must either update every user of SearchPhase (bloating compile times) or explicitly opt out of that search phase, which is brittle, especially when it comes to keeping Benchmark::try_match consistent.

For example see:

This PR takes the first step towards systematically solving this problem by allowing benchmarks registered with diskann_benchmark_runner::Benchmarks to have state rather than being purely type-level constructs. Stateful benchmarks can have "search plugins" dynamically registered at construction time. These plugins participate in Benchmark::try_match, Benchmark::description, and Benchmark::run, allowing individual benchmarks to opt into new search-phase variants without requiring changes across all benchmarks. See #996 for a follow-up implementing this idea.
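As a rough illustration of the plugin idea described above, the sketch below shows a benchmark that owns a list of search plugins and consults them in `try_match`, `description`, and `run`. Every name and signature here (`SearchPlugin`, `IndexBench`, the string-based phase tag, and so on) is hypothetical and chosen only for illustration; the actual plugin design lives in #996 and may look quite different.

```rust
// Hypothetical sketch: none of these names or signatures come from DiskANN.

/// A search-phase handler that a benchmark can hold as runtime state.
trait SearchPlugin {
    /// Short tag identifying the search phase this plugin handles.
    fn kind(&self) -> &'static str;
    /// Execute the phase and return a textual result.
    fn run(&self, queries: usize) -> String;
}

struct TopK;

impl SearchPlugin for TopK {
    fn kind(&self) -> &'static str {
        "topk"
    }
    fn run(&self, queries: usize) -> String {
        format!("topk over {} queries", queries)
    }
}

struct Range;

impl SearchPlugin for Range {
    fn kind(&self) -> &'static str {
        "range"
    }
    fn run(&self, queries: usize) -> String {
        format!("range over {} queries", queries)
    }
}

/// A benchmark that owns its plugins instead of matching on a closed enum.
struct IndexBench {
    plugins: Vec<Box<dyn SearchPlugin>>,
}

impl IndexBench {
    fn new() -> Self {
        Self { plugins: Vec::new() }
    }

    /// Builder-style registration: each call opts this instance into one phase.
    fn search(mut self, plugin: impl SearchPlugin + 'static) -> Self {
        self.plugins.push(Box::new(plugin));
        self
    }

    /// The benchmark matches only the phases it has a plugin for.
    fn try_match(&self, phase: &str) -> bool {
        self.plugins.iter().any(|p| p.kind() == phase)
    }

    /// The description is derived from the registered plugins.
    fn description(&self) -> String {
        let kinds: Vec<&str> = self.plugins.iter().map(|p| p.kind()).collect();
        format!("index benchmark supporting: {}", kinds.join(", "))
    }

    /// Dispatch to the plugin registered for `phase`, if any.
    fn run(&self, phase: &str, queries: usize) -> Option<String> {
        self.plugins
            .iter()
            .find(|p| p.kind() == phase)
            .map(|p| p.run(queries))
    }
}
```

With this shape, `IndexBench::new().search(TopK)` matches only the `"topk"` phase, and adding `Range` support later is a one-line change at the construction site of the benchmarks that want it. Benchmarks that never register the new plugin are untouched, which is the property the description above is after.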

Suggested Reviewing Order

In diskann-benchmark-runner:

benchmark.rs: This is the main change. It changes the methods of the Benchmark and Regression traits to take &self.
registry.rs: Changes the signatures of Benchmarks::register and Benchmarks::register_regression to take the benchmark by value.
The rest of the changes are updates to the test infrastructure.
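The two runner changes listed above can be pictured with a heavily simplified sketch: a trait whose methods take `&self`, and a registry whose `register` takes the benchmark by value and stores it behind a trait object. The trait and registry below are stand-ins written for this example; the real `Benchmark` trait and `Benchmarks` registry in diskann-benchmark-runner have different, richer signatures.

```rust
// Simplified stand-ins; the real runner types have different signatures.

/// A benchmark whose methods receive `&self`, so an instance can carry state.
trait Benchmark {
    fn try_match(&self, input: &str) -> bool;
    fn description(&self) -> String;
    fn run(&self, input: &str) -> Result<String, String>;
}

/// A registry that takes ownership of benchmark values and type-erases them.
struct Benchmarks {
    entries: Vec<(String, Box<dyn Benchmark>)>,
}

impl Benchmarks {
    fn new() -> Self {
        Self { entries: Vec::new() }
    }

    /// By-value registration: the instance (and its state) moves into the registry.
    fn register<B: Benchmark + 'static>(&mut self, name: &str, benchmark: B) {
        self.entries.push((name.to_string(), Box::new(benchmark)));
    }

    /// Run the named benchmark if it accepts the input.
    fn run(&self, name: &str, input: &str) -> Option<Result<String, String>> {
        self.entries
            .iter()
            .find(|(n, b)| n.as_str() == name && b.try_match(input))
            .map(|(_, b)| b.run(input))
    }
}

/// One type, many differently configured instances.
struct Scaled {
    factor: u32,
}

impl Benchmark for Scaled {
    fn try_match(&self, input: &str) -> bool {
        input.parse::<u32>().is_ok()
    }

    fn description(&self) -> String {
        format!("multiplies the input by {}", self.factor)
    }

    fn run(&self, input: &str) -> Result<String, String> {
        input
            .parse::<u32>()
            .map(|v| (v * self.factor).to_string())
            .map_err(|e| e.to_string())
    }
}
```

Registering `Scaled { factor: 2 }` under one name and `Scaled { factor: 3 }` under another yields two distinct benchmarks from a single type, which purely type-level registration cannot express.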

In diskann-benchmark: The main changes involve cleaning up the 'static hack and removing the BuildAndSearch/BuildAndDynamicRun indirection traits that are no longer necessary.

In diskann-benchmark-simd: Feel free to skip.

Copilot

Pull request overview

This PR updates the benchmark framework to support stateful benchmarks by switching Benchmark/Regression APIs from type-level (static) methods to instance methods (&self) and updating the registry to register benchmark values (enabling future dynamic “search plugin” registration per benchmark instance).

Changes:

Convert Benchmark and Regression trait methods (try_match, description, run, check) to take &self.
Update Benchmarks::register / register_regression to accept a benchmark instance by value and store it behind a type-erased wrapper.
Refactor benchmark implementations across diskann-benchmark and diskann-benchmark-simd to remove the prior 'static/dispatcher indirection patterns.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| diskann-benchmark/src/utils/mod.rs | Updates stub benchmark registration and trait method signatures to the new `&self` API. |
| diskann-benchmark/src/backend/index/spherical.rs | Refactors spherical index benchmarks to unit-like/stateful benchmarks and inlines prior dispatch indirection. |
| diskann-benchmark/src/backend/index/scalar.rs | Refactors scalar quantized index benchmarks to constructable/stateful benchmark instances. |
| diskann-benchmark/src/backend/index/product.rs | Refactors PQ index benchmark to a constructable/stateful benchmark instance. |
| diskann-benchmark/src/backend/index/benchmarks.rs | Removes `BuildAndSearch`/`BuildAndDynamicRun` indirection and moves logic directly into `Benchmark::run(&self, ...)`. |
| diskann-benchmark/src/backend/filters/benchmark.rs | Converts metadata index benchmark into a stateful/unit-like benchmark and extracts run logic into a free function. |
| diskann-benchmark/src/backend/exhaustive/spherical.rs | Refactors exhaustive spherical benchmarks to unit-like/stateful benchmarks. |
| diskann-benchmark/src/backend/exhaustive/product.rs | Refactors exhaustive product benchmarks to unit-like/stateful benchmarks. |
| diskann-benchmark/src/backend/exhaustive/minmax.rs | Refactors exhaustive minmax benchmarks to unit-like/stateful benchmarks. |
| diskann-benchmark/src/backend/disk_index/benchmarks.rs | Refactors disk index benchmark/regression to stateful benchmarks and updates registration accordingly. |
| diskann-benchmark-simd/src/lib.rs | Updates SIMD regression benchmarks to be instance-based and adjusts kernel execution to pass arch at call time. |
| diskann-benchmark-runner/src/test/typed.rs | Updates typed test benchmarks/regressions to instance-based implementations (adds constructors). |
| diskann-benchmark-runner/src/test/mod.rs | Updates benchmark registration in test harness to pass benchmark instances. |
| diskann-benchmark-runner/src/test/dim.rs | Updates dim test benchmarks/regression to the new `&self` trait signatures. |
| diskann-benchmark-runner/src/registry.rs | Changes registry APIs to accept benchmark instances and stores them in the wrapper. |
| diskann-benchmark-runner/src/benchmark.rs | Changes core traits to `&self` methods and updates internal type-erased wrapper and regression plumbing. |

codecov-commenter · 2026-04-29T02:14:06Z

Codecov Report

❌ Patch coverage is 79.55801% with 74 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.50%. Comparing base (767879f) to head (74995cc).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| diskann-benchmark/src/backend/index/benchmarks.rs | 45.21% | 63 Missing ⚠️ |
| diskann-benchmark-simd/src/lib.rs | 95.65% | 5 Missing ⚠️ |
| diskann-benchmark/src/backend/filters/benchmark.rs | 91.93% | 5 Missing ⚠️ |
| diskann-benchmark/src/utils/mod.rs | 75.00% | 1 Missing ⚠️ |

❌ Your patch status has failed because the patch coverage (79.55%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.
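For readers who want to check the numbers, the following sketch reproduces the arithmetic behind the patch-coverage figure. The per-file missing counts come from the table in this report. The total of 362 changed lines is not stated anywhere in the report: it is an assumption inferred from 74 missing lines at 79.55801% coverage. The function names are made up for this example.

```rust
/// Per-file missing-line counts, copied from the Codecov table in this report.
const MISSING_PER_FILE: [u32; 4] = [63, 5, 5, 1];

/// Assumed number of changed lines in the patch. This value is inferred
/// (74 missing lines at 79.55801% coverage), not quoted from the report.
const ASSUMED_CHANGED_LINES: u32 = 362;

/// Sum of the missing lines across all listed files.
fn total_missing() -> u32 {
    MISSING_PER_FILE.iter().sum()
}

/// Patch coverage in percent: covered changed lines over all changed lines.
/// Returns `None` when there are no changed lines or the inputs are inconsistent.
fn patch_coverage(missing: u32, changed: u32) -> Option<f64> {
    if changed == 0 || missing > changed {
        return None;
    }
    Some(f64::from(changed - missing) / f64::from(changed) * 100.0)
}

/// Whether the patch meets a coverage target given in percent.
fn meets_target(coverage_percent: f64, target_percent: f64) -> bool {
    coverage_percent >= target_percent
}
```

Under that assumption, `patch_coverage(total_missing(), ASSUMED_CHANGED_LINES)` comes out at roughly 79.558%, which is below the 90.00% target and is consistent with the failed patch status. Most of the gap comes from the 63 missing lines in diskann-benchmark/src/backend/index/benchmarks.rs.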

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #995      +/-   ##
==========================================
+ Coverage   89.48%   89.50%   +0.02%     
==========================================
  Files         448      448              
  Lines       84095    84167      +72     
==========================================
+ Hits        75250    75333      +83     
+ Misses       8845     8834      -11

| Flag | Coverage Δ | |
| --- | --- | --- |
| miri | `89.50% <79.55%> (+0.02%)` | ⬆️ |
| unittests | `89.34% <79.55%> (+0.02%)` | ⬆️ |

| Files with missing lines | Coverage Δ | |
| --- | --- | --- |
| diskann-benchmark-runner/src/benchmark.rs | `89.21% <100.00%> (-0.60%)` | ⬇️ |
| diskann-benchmark-runner/src/registry.rs | `88.26% <100.00%> (+0.26%)` | ⬆️ |
| diskann-benchmark-runner/src/test/dim.rs | `89.21% <100.00%> (+1.30%)` | ⬆️ |
| diskann-benchmark-runner/src/test/mod.rs | `100.00% <100.00%> (ø)` | |
| diskann-benchmark-runner/src/test/typed.rs | `96.47% <100.00%> (+0.51%)` | ⬆️ |
| diskann-benchmark/src/backend/exhaustive/minmax.rs | `100.00% <ø> (ø)` | |
| ...iskann-benchmark/src/backend/exhaustive/product.rs | `100.00% <ø> (ø)` | |
| ...kann-benchmark/src/backend/exhaustive/spherical.rs | `100.00% <ø> (ø)` | |
| diskann-benchmark/src/backend/index/product.rs | `100.00% <ø> (ø)` | |
| diskann-benchmark/src/backend/index/scalar.rs | `100.00% <ø> (ø)` | |

... and 5 more

... and 1 file with indirect coverage changes

Copilot

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.

arkrishn94

Thanks Mark, looks good to me.

JordanMaples

lgtm

Bump version to 0.51.0 to propagate changes to downstream consumers

## Breaking API changes (AI Generated)

- **`ObjectPool` moved** (#975): now lives in `diskann-utils`. Update imports from `diskann::...::ObjectPool` → `diskann_utils::ObjectPool`.
- **`AlignedSlice` removed** (#994): the `AlignedSlice` abstraction in `diskann-vector` is gone. Code that converted between vector representations through `AlignedSlice` should now use the `Poly` / `CastFromSlice` polymorphic interfaces directly (see `diskann-vector::conversion` and `diskann-quantization::alloc::poly`). Storage that previously held `AlignedSlice` values should hold `Poly<T, A>` instead.
- **`AsThreadPool` generic removed** (#967): functions that previously took `pool: impl AsThreadPool` now take `pool: &RayonThreadPool`. Pass a borrow of an existing pool; remove the generic parameter from your call sites.
- **`sgemm()` returns `Result`** (#997): in `diskann-linalg`, the new signature is:

  ```rust
  pub fn sgemm(
      atranspose: Transpose,
      btranspose: Transpose,
      m: usize,
      n: usize,
      k: usize,
      alpha: f32,
      a: &[f32],
      b: &[f32],
      beta: Option<f32>,
      c: &mut [f32],
  ) -> Result<(), SgemmError>
  ```

  `SgemmError` has variants `InvalidMatrixDimensions { matrix_name, expected_rows, expected_cols, actual_len }` and `DimensionOverflow { matrix_name, rows, cols }`. Replace previous panic-on-bad-input assumptions with explicit handling.
- **Benchmarks are stateful** (#995): the `Benchmark` impls in `diskann-benchmark` are no longer stateless unit structs. Each benchmark type now has a `::new()` constructor (often holding `PhantomData<T>` or plugin state), and registration uses an instance:

  ```rust
  // before
  benchmarks.register("name", MyBench);
  // after
  benchmarks.register("name", MyBench::<T>::new());
  ```

  If you wrote a custom benchmark, give it a `new()` and register an instance. Combined with #996, search-side benchmarks now compose `Plugins<Provider, Phase, Strategy>` and expose builder methods like `.search(plugin)` to register search plugins on the instance.
- **`diskann-benchmark`: `async` → `graph-index`** (#1009): the benchmark category previously named `async` was renamed to `graph-index`. JSON config `type` values and example file names changed accordingly:
  - `async-build` → `graph-index-build`
  - `async-dynamic-run` → `graph-index-dynamic-run`
  - and the same prefix swap for `*-pq`, `*-sq`, `*-spherical-quantization`, etc.

  Update any benchmark config files, scripts, or CI that reference the old `async-*` names.
- **`diskann-disk` buffer alignment decoupled from `block_size`** (#984): code that assumed I/O buffer alignment equals the disk block size should now configure alignment explicitly.

## Non-breaking

- New cache-aware block-transposed Chamfer/MaxSim distance for f32/f16 (#863).
- A/A benchmark documentation (#974); CI publish workflow improvements (#755, #1017); openssl bump (#973); `compute_closest_centers` allocation reduction (#980).
- **`DistanceComputer` `'static` bound relaxed** (#1007) and **redundant `DistanceFunction` impls removed** (#1008)

**Full Changelog**: v0.50.1...v0.51.0
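The "Benchmarks are stateful" entry in the notes above mentions giving a generic benchmark a `new()` constructor that holds `PhantomData<T>`. A minimal sketch of that pattern follows. `MyBench` echoes the placeholder name used in the notes, while the one-method `Benchmark` trait and the `register` helper are hypothetical stand-ins, not the runner's actual API.

```rust
use std::marker::PhantomData;

// Hypothetical stand-in for the runner's trait; the real one has more methods.
trait Benchmark {
    fn description(&self) -> String;
}

/// A generic benchmark whose only "state" is a zero-sized marker for `T`.
struct MyBench<T> {
    _element: PhantomData<T>,
}

impl<T> MyBench<T> {
    /// Constructor so callers can register an instance rather than a bare type.
    fn new() -> Self {
        Self { _element: PhantomData }
    }
}

impl<T> Benchmark for MyBench<T> {
    fn description(&self) -> String {
        format!("benchmark over {}-byte elements", std::mem::size_of::<T>())
    }
}

/// Hypothetical registration helper: takes the instance by value and type-erases it.
fn register(registry: &mut Vec<Box<dyn Benchmark>>, benchmark: impl Benchmark + 'static) {
    registry.push(Box::new(benchmark));
}
```

Because `PhantomData<T>` is zero-sized, `MyBench::<f32>::new()` carries no runtime data, yet it is still a value that can be passed to `register`. A benchmark that later needs real state, such as plugins, can add fields without changing how it is registered.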

Make benchmarks stateful.

dcdc377

hildebrandmw mentioned this pull request Apr 29, 2026

[benchmark/filtered-search prep] Search Plugins #996

Merged

Merge branch 'main' into mhildebr/stateful-benchmarks

8a918f5

hildebrandmw marked this pull request as ready for review April 29, 2026 01:59

hildebrandmw requested review from a team and Copilot April 29, 2026 01:59

Copilot started reviewing on behalf of hildebrandmw April 29, 2026 01:59

Copilot AI reviewed Apr 29, 2026

Comment thread diskann-benchmark-runner/src/benchmark.rs

Comment thread diskann-benchmark/src/backend/exhaustive/product.rs Outdated

Comment thread diskann-benchmark/src/backend/index/scalar.rs Outdated

Apply suggestions from code review

66d599a

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

hildebrandmw requested a review from Copilot April 29, 2026 17:04

Copilot AI reviewed Apr 29, 2026

Comment thread diskann-benchmark-runner/src/benchmark.rs

Comment thread diskann-benchmark/src/backend/index/benchmarks.rs

Comment thread diskann-benchmark-simd/src/lib.rs

Copilot started reviewing on behalf of hildebrandmw April 29, 2026 17:29

arkrishn94 approved these changes Apr 30, 2026

Comment thread diskann-benchmark-runner/src/benchmark.rs

Merge branch 'main' into mhildebr/stateful-benchmarks

74995cc

JordanMaples approved these changes May 1, 2026

harsha-simhadri approved these changes May 1, 2026

hildebrandmw enabled auto-merge (squash) May 1, 2026 17:47

hildebrandmw merged commit e4cd2f3 into main May 1, 2026
26 checks passed

hildebrandmw deleted the mhildebr/stateful-benchmarks branch May 1, 2026 17:52

arkrishn94 mentioned this pull request May 4, 2026

[v0.51.0] Bump version to 0.51.0 #1013

Merged

hildebrandmw merged 4 commits into main from mhildebr/stateful-benchmarks


6 participants
