
perf[buffer]: iteration for fallible operations with validity #8120

Open
joseph-isaacs wants to merge 21 commits into develop from ji/fast-iter-valid


@joseph-isaacs joseph-isaacs commented May 27, 2026

Currently we (and Arrow) handle fallible operations with scalar (non-SIMD) code.

This PR adds a trait and methods for fast SIMD checked operations (including casts). I have verified elsewhere that checked_add also benefits.
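
As a rough illustration of the idea (not the trait or methods this PR adds), here is a minimal sketch of a checked narrowing cast and a checked add that respect validity. The function names `try_narrow_u64_to_u32` and `try_add_u32` and the `&[bool]` validity representation are hypothetical, chosen only to keep the example self-contained. The assumption is that folding the failure into a flag, instead of returning early per element, keeps the hot loop branch-free so the compiler has a chance to auto-vectorize it; invalid (null) lanes never cause an error.

```rust
/// Hypothetical helper, not this PR's API: narrow `u64` values to `u32`.
/// Fails with the index of the first *valid* lane that does not fit.
fn try_narrow_u64_to_u32(values: &[u64], validity: &[bool]) -> Result<Vec<u32>, usize> {
    assert_eq!(values.len(), validity.len());
    let mut out = vec![0u32; values.len()];
    // Accumulate failure in a flag so the loop body has no early exit.
    let mut any_overflow = false;
    for ((o, &v), &valid) in out.iter_mut().zip(values).zip(validity) {
        *o = v as u32; // truncating cast; checked via the flag below
        any_overflow |= valid & (v > u32::MAX as u64);
    }
    if any_overflow {
        // Cold path: rescan to report the first offending valid lane.
        let idx = values
            .iter()
            .zip(validity)
            .position(|(&v, &valid)| valid && v > u32::MAX as u64)
            .unwrap();
        return Err(idx);
    }
    Ok(out)
}

/// Hypothetical helper, not this PR's API: element-wise checked add.
/// Overflow in an invalid lane is ignored; the lane holds the wrapped sum.
fn try_add_u32(lhs: &[u32], rhs: &[u32], validity: &[bool]) -> Option<Vec<u32>> {
    assert!(lhs.len() == rhs.len() && lhs.len() == validity.len());
    let mut out = vec![0u32; lhs.len()];
    let mut any_overflow = false;
    for (((o, &a), &b), &valid) in out.iter_mut().zip(lhs).zip(rhs).zip(validity) {
        let (sum, overflowed) = a.overflowing_add(b);
        *o = sum;
        any_overflow |= valid & overflowed;
    }
    (!any_overflow).then_some(out)
}
```

With these sketches, narrowing `[1, u64::MAX, 3]` succeeds when the middle lane is marked invalid and fails with index 1 when it is marked valid. Whether a loop like this actually vectorizes depends on the compiler and target, which is presumably why the PR adds benchmarks.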


codspeed-hq Bot commented May 27, 2026

Merging this PR will improve performance by 16.14%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 6 improved benchmarks
✅ 1259 untouched benchmarks
🆕 10 new benchmarks
⏩ 1 skipped benchmark [1]

Performance Changes

| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| 🆕 | Simulation | cast_i32_to_u32[65536] | N/A | 832.9 µs | N/A |
| 🆕 | Simulation | cast_u32_to_u8[65536] | N/A | 250.5 µs | N/A |
| 🆕 | Simulation | cast_u16_to_u32[65536] | N/A | 210.6 µs | N/A |
| | Simulation | patched_take_10k_dispersed | 316.3 µs | 286 µs | +10.61% |
| | Simulation | patched_take_10k_first_chunk_only | 302.6 µs | 272.3 µs | +11.14% |
| | Simulation | patched_take_10k_adversarial | 257.2 µs | 226.9 µs | +13.37% |
| | Simulation | take_10k_dispersed | 284.8 µs | 239.8 µs | +18.76% |
| | Simulation | take_10k_first_chunk_only | 271.1 µs | 226.2 µs | +19.86% |
| 🆕 | Simulation | map_with_mask_widen_u16_u32[65536] | N/A | 189.6 µs | N/A |
| 🆕 | Simulation | try_map_masked_into_widen_u16_u32[65536] | N/A | 190 µs | N/A |
| 🆕 | Simulation | try_map_into_narrow_u64_u32[65536] | N/A | 424.1 µs | N/A |
| 🆕 | Simulation | try_map_masked_into_narrow_i32_u32[65536] | N/A | 292.3 µs | N/A |
| 🆕 | Simulation | try_map_masked_in_place_narrow_i32_u32[65536] | N/A | 172.7 µs | N/A |
| 🆕 | Simulation | map_with_mask_narrow_u64_u32[65536] | N/A | 387.1 µs | N/A |
| 🆕 | Simulation | lanezip_checked_add_u32[65536] | N/A | 452.7 µs | N/A |
| | Simulation | bitwise_not_vortex_buffer_mut[128] | 304.4 ns | 246.1 ns | +23.7% |

Tip

Curious why this is faster? Comment @codspeedbot explain why this is faster on this PR, or directly use the CodSpeed MCP with your agent.


Comparing ji/fast-iter-valid (fc9b5e8) with develop (a2323f1)


Footnotes

  1. 1 benchmark was skipped, so the baseline result was used instead. If it was deleted from the codebase, click here and archive it to remove it from the performance reports.

Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
f
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
f
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
f
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
f
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
f
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
f
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
f
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
f
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
f
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
f
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
@joseph-isaacs joseph-isaacs changed the title from "faster iteration infra" to "perf[buffer]: iteration for fallible operations with validity" May 27, 2026
@joseph-isaacs joseph-isaacs marked this pull request as ready for review May 27, 2026 15:13
@joseph-isaacs
Contributor Author

Open question: where should this code live?

f
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
f
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
f
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
f
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
f
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
f
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
f
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
f
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
@joseph-isaacs joseph-isaacs added the changelog/performance A performance improvement label May 27, 2026
@robert3005
Contributor

Sounds like we want a crate in between the array and vortex-buffer, or this could be a feature flag in vortex-buffer.

