Replaced WFE instruction with ISB in mi_atomic_yield on ARM64 #1215
akaStiX wants to merge 1 commit into microsoft:dev
… Windows as it significantly improves performance
Interesting PR. Mmm, it would be best if we didn't need atomic yield at all :-( Anyway, in v3 there is (almost) no use of atomic yield, so that improves matters. In the other cases, the atomic yield is really used to signify that another thread needs to make progress for the current thread to advance. As such, ISB (essentially a memory barrier) doesn't quite do that? WFE (wait for an event from another thread) seems more appropriate... but then, you are seeing improved performance? Is that on v2?
I tested v3, and it was worse than v2 in both performance and memory use for my use case.
Using ISB is actually a recommended way to do this on ARM. ARM has nothing that acts like the x64 PAUSE instruction, and the code that uses this yield was designed with x64 in mind. As I said before, on one closed-source ARM platform that I ported mimalloc to, WFE resulted in an app freeze since no event was raised. I don't remember all the details about WFE off the top of my head, but I think one must also raise an event explicitly to make sure WFE doesn't wait far too long, as the lock might have been released a long time ago.
Yes, I am seeing a huge perf improvement with ISB on v2.
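To illustrate the point about WFE needing an explicit event, here is a minimal sketch of a spin lock whose unlock path raises one. All names (`demo_lock_t`, `demo_wait_for_event`, `demo_send_event`) are hypothetical and are not taken from mimalloc; the sketch assumes GCC or Clang inline assembly, and on non-ARM64 targets the two helpers compile to nothing, so the loop degrades to a plain spin.

```c
#include <stdatomic.h>
#include <stdbool.h>

// Hypothetical lock type for illustration only (not a mimalloc type).
typedef struct { atomic_flag locked; } demo_lock_t;

// Waiter side: on ARM64, WFE may suspend the core until an event arrives.
static inline void demo_wait_for_event(void) {
#if defined(__aarch64__)
  __asm__ volatile("wfe" ::: "memory");
#endif
}

// Releaser side: DSB makes the preceding store visible before SEV
// signals an event to the other cores.
static inline void demo_send_event(void) {
#if defined(__aarch64__)
  __asm__ volatile("dsb ish\n\tsev" ::: "memory");
#endif
}

// Returns true if the lock was acquired by this call.
static inline bool demo_try_lock(demo_lock_t* l) {
  return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

static inline void demo_lock(demo_lock_t* l) {
  while (!demo_try_lock(l)) {
    demo_wait_for_event();  // without a matching event this can wait a long time
  }
}

static inline void demo_unlock(demo_lock_t* l) {
  atomic_flag_clear_explicit(&l->locked, memory_order_release);
  demo_send_event();        // wake any waiter parked in WFE
}
```

If the releasing code never raises an event, as in a lock written with x64 in mind, a waiter parked in WFE depends on some unrelated event or interrupt to wake up, which would be consistent with the freeze described above.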
Replaced the WFE instruction with ISB in mi_atomic_yield on ARM64 for non-Windows platforms, as it significantly improves performance.
Also, on one closed platform I ported mimalloc to, WFE resulted in an application freeze because no event was generated to wake the threads that were waiting with WFE.
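As a rough sketch of the kind of change described, the following shows a yield hint that uses ISB on non-Windows ARM64 and PAUSE on x86. The names `demo_atomic_yield` and `demo_spin_until_set` are hypothetical and this is not mimalloc's actual definition of `mi_atomic_yield`; it assumes GCC or Clang inline assembly.

```c
#include <stdatomic.h>
#include <stdbool.h>

// Hypothetical stand-in for a spin-wait hint; not mimalloc's real code.
static inline void demo_atomic_yield(void) {
#if defined(__aarch64__) && !defined(_WIN32)
  // ISB flushes the pipeline, giving a short delay that needs no wake-up event.
  __asm__ volatile("isb" ::: "memory");
#elif defined(__x86_64__) || defined(__i386__)
  // x86 spin-wait hint.
  __asm__ volatile("pause" ::: "memory");
#else
  // No CPU hint available: only stop the compiler from reordering across the spin.
  atomic_signal_fence(memory_order_seq_cst);
#endif
}

// Spins until *flag is true or max_spins iterations have passed.
// Returns the number of yields performed, or -1 if the flag was never seen set.
static inline long demo_spin_until_set(atomic_bool* flag, long max_spins) {
  for (long i = 0; i < max_spins; i++) {
    if (atomic_load_explicit(flag, memory_order_acquire)) return i;
    demo_atomic_yield();
  }
  return -1;
}
```

Unlike WFE, neither hint depends on another thread raising an event, so the loop always returns to re-check the flag after a short delay.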