Wasm loop aware rpo compute last block in loop quickly#125101
Wasm loop aware rpo compute last block in loop quickly#125101AndyAyersMS wants to merge 2 commits intodotnet:mainfrom
Conversation
…Flow The LaRPO guarantees loop bodies are contiguous, so the end position of a loop starting at cursor is simply cursor + NumLoopBlocks(). This replaces the O(n) while loop that scanned forward calling ContainsBlock per block. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
@jakobbotsch PTAL Verified the fix in a local crossgen run. |
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
There was a problem hiding this comment.
Pull request overview
This PR optimizes WebAssembly control-flow interval construction by leveraging the invariant that Loop-Aware RPO (LaRPO) keeps loop bodies contiguous, enabling O(1) computation of a loop’s end position in the LaRPO order.
Changes:
- Compute loop interval end position as
headerIndex + NumLoopBlocks()instead of scanning forward through the layout. - Add a DEBUG-only verification that the new computation matches the previous scan-based behavior.
- Update the TODO list to remove items that are no longer applicable.
| // | ||
| unsigned endCursor = cursor; | ||
| while ((endCursor < numBlocks) && loop->ContainsBlock(initialLayout[endCursor])) | ||
| unsigned endCursor = cursor + loop->NumLoopBlocks(); |
There was a problem hiding this comment.
Would VisitLoopBlocksPostOrder work too and be even faster? NumLoopBlocks still requires iterating the full set.
There was a problem hiding this comment.
Maybe not since the first block in the post order may belong to a nested loop that was visited early in the loop aware RPO. Hard to be sure at least.
There was a problem hiding this comment.
Not sure I follow. NumLoopBlocks just does a bit vector count so ought to be fairly fast.
There was a problem hiding this comment.
I was proposing implementing this as:
BasicBlock* lastBlock = nullptr;
loop->VisitLoopBlocksPostOrder([&](BasicBlock* block) { lastBlock = block; return BasicBlockVisit::Abort; });
It accesses the last block in constant time. But I am not convinced that it would be correct in all cases since it may belong to a nested loop.
There was a problem hiding this comment.
Aha: PostOrder, not ReversePostOrder.
There was a problem hiding this comment.
Returning a block in a nested loop would be ok. We just need to find the last block that might try and branch back to the loop head.
But I think I'll just stick with this version, it should be fast enough.
Since Loop Aware RPO keeps loops compact, we can quickly figure out the last block in the loop in the order.