Knowledge Base: Streaming Hypergraph Multi-Pass Evaluation (Entry 035)
New community-contributed knowledge base entry from an AAE session optimizing FREIGHT streaming hypergraph partitioning (pybind11 bindings in CHSZLabLib).
Key techniques (1.82x speedup):
- Per-net bit vectors with `__builtin_popcountll`, replacing the `vector<vector<int64_t>>` reverse mapping + `std::set`-per-net evaluation
- Objective-specific evaluation: direct `CUT_NET` counting for cut-net mode (no bit vectors needed)
- Incremental bit-setting during the main partitioning loop
- Copy elimination via `memcpy`, last-pass snapshot skip, pre-allocation
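To illustrate the first three techniques, here is a minimal sketch and not the code from the entry itself. The names `NetBlockBits`, `CutNetCounter`, `mark`, `blocks_touched`, and `connectivity` are hypothetical, and the layout (one bit per net/block pair, packed into 64-bit words) is an assumption about how per-net bit vectors might be organized. `__builtin_popcountll` is a GCC/Clang builtin.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container: one bit per (net, block) pair, packed into
// 64-bit words. Replaces a std::set of block ids per net.
struct NetBlockBits {
    std::size_t words_per_net;
    std::vector<std::uint64_t> bits;  // pre-allocated once, reused across passes

    NetBlockBits(std::size_t num_nets, std::size_t num_blocks)
        : words_per_net((num_blocks + 63) / 64),
          bits(num_nets * words_per_net, 0) {}

    // Incremental bit-setting: call when a pin of `net` is assigned to
    // `block` inside the main partitioning loop.
    void mark(std::size_t net, std::size_t block) {
        bits[net * words_per_net + block / 64] |= std::uint64_t{1} << (block % 64);
    }

    // Number of distinct blocks that `net` touches.
    int blocks_touched(std::size_t net) const {
        int count = 0;
        for (std::size_t w = 0; w < words_per_net; ++w)
            count += __builtin_popcountll(bits[net * words_per_net + w]);
        return count;
    }
};

// Connectivity objective: sum over nets of (blocks touched - 1).
std::int64_t connectivity(const NetBlockBits& nb, std::size_t num_nets) {
    std::int64_t total = 0;
    for (std::size_t e = 0; e < num_nets; ++e) {
        int lambda = nb.blocks_touched(e);
        if (lambda > 1) total += lambda - 1;
    }
    return total;
}

// Cut-net objective without bit vectors (assumed scheme): remember the
// first block seen per net and count a net once when a second block appears.
struct CutNetCounter {
    std::vector<std::int32_t> first_block;  // -1 = unseen, -2 = already cut
    std::int64_t cut_nets = 0;

    explicit CutNetCounter(std::size_t num_nets) : first_block(num_nets, -1) {}

    void mark(std::size_t net, std::int32_t block) {
        std::int32_t& fb = first_block[net];
        if (fb == -1) {
            fb = block;
        } else if (fb >= 0 && fb != block) {
            fb = -2;  // net spans at least two blocks; count it exactly once
            ++cut_nets;
        }
    }
};
```

In this sketch, evaluating a net costs a few popcount instructions over contiguous memory instead of a walk over a node-based `std::set`, and the cut-net path needs only one 32-bit slot per net.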
Includes a detailed "what didn't work" section covering loop fusion (hurt ILP), LTO (icache regression), software prefetching (ineffective at low average degree), code path duplication (icache pressure), and raw pointers in place of pybind11 unchecked proxies (the pybind11 proxies generated better code).
Full experiment log with 15 iterations (8 kept, 7 discarded).