Faster true count using AVX2 and AVX512 instructions #6931
CodSpeed HQ / CodSpeed Performance Analysis
succeeded
Mar 14, 2026 in 0s
Performance Gate Passed
⚡ 8 improved benchmarks
✅ 1001 untouched benchmarks
⏩ 1515 skipped benchmarks [1]
Performance Changes
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ⚡ | Simulation | null_count_run_end[(100000, 1024, 0.1)] | 65.6 µs | 57.8 µs | +13.37% |
| ⚡ | Simulation | null_count_run_end[(100000, 1024, 0.5)] | 65.9 µs | 58.2 µs | +13.31% |
| ⚡ | Simulation | null_count_run_end[(100000, 256, 0.5)] | 75.2 µs | 67.4 µs | +11.59% |
| ⚡ | Simulation | null_count_run_end[(100000, 256, 0.01)] | 72.8 µs | 64.9 µs | +12.14% |
| ⚡ | Simulation | null_count_run_end[(100000, 256, 0.1)] | 73.1 µs | 65.3 µs | +12.06% |
| ⚡ | Simulation | true_count_vortex_buffer[16384] | 3.6 µs | 2.4 µs | +52.36% |
| ⚡ | Simulation | true_count_vortex_buffer[65536] | 11.8 µs | 6.6 µs | +79.68% |
| ⚡ | Simulation | true_count_vortex_buffer[128] | 1,013.9 ns | 538.6 ns | +88.24% |
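
The Efficiency column appears to express how much faster HEAD is relative to BASE. A minimal sketch of that calculation follows, assuming the value is `BASE / HEAD - 1` expressed as a percentage. The report does not state the formula, and the function name `efficiency_gain_percent` is hypothetical, chosen for illustration only.

```rust
/// Relative speedup of HEAD over BASE, in percent.
/// Assumption: the report's "Efficiency" column is BASE / HEAD - 1;
/// the report itself does not state the formula.
/// Both times must be given in the same unit (e.g. both in ns).
fn efficiency_gain_percent(base: f64, head: f64) -> f64 {
    (base / head - 1.0) * 100.0
}
```

For `true_count_vortex_buffer[128]`, `efficiency_gain_percent(1013.9, 538.6)` gives roughly 88.25, close to the reported +88.24%. Rows with coarser displayed times deviate more (3.6 µs and 2.4 µs give 50%, against a reported +52.36%), presumably because the report computes the ratio from unrounded measurements.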
Comparing rk/truecount (d3c062d) with develop (fc4d111)
Footnotes
1. 1515 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, they can be archived to remove them from the performance reports.