SME2/SVE/NEON heuristic - ACL#1294
Open
damdoo01-arm wants to merge 3 commits into
…count Signed-off-by: Damien Dooley <damien.dooley@arm.com>
Title:
Allow runtime masking of SVE/SVE2 CPU feature exposure
Description:
This PR adds runtime control over whether ACL exposes SVE/SVE2 capabilities through arm_compute::CPUInfo. For full context, please refer to the associated ArmNN PR at: ARM-software/armnn#820
Problem statement:
ArmNN needs a way to steer ACL away from SME/SME2 and, in some cases, SVE/SVE2 kernel families when graph-level shape heuristics indicate that those paths regress performance. The regression is most visible on SME2-capable hardware at high thread counts, where resource pressure around SME2 packing can outweigh the expected matmul speedup for some Geekbench AI shapes.
High-level approach:
ACL already had runtime masking for SME/SME2 via set_sme_allowed(). This PR adds equivalent SVE/SVE2 masking:
void CPUInfo::set_sve_allowed(bool is_allowed);
When disabled, ACL reports SVE/SVE2 and related features as unavailable through:
has_sve()
has_sve2()
has_svebf16()
has_svei8mm()
has_svef32mm()
get_isa()
This lets ArmNN apply its graph-level policy while keeping ACL’s existing kernel selection mechanisms intact.
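To illustrate the masking behaviour described above, here is a minimal sketch using a toy stand-in class. `ToyCpuInfo` and `HwFeatures` are hypothetical names, not ACL types, and the real `get_isa()` returns ACL's own ISA structure rather than this plain struct. The sketch only assumes that each query is gated on a single runtime flag, which may not match how the PR implements it.

```cpp
#include <cassert>

// Hypothetical record of what the hardware reports (not an ACL type).
struct HwFeatures
{
    bool sve;
    bool sve2;
    bool svebf16;
    bool svei8mm;
    bool svef32mm;
};

// Toy stand-in for arm_compute::CPUInfo, for illustration only.
class ToyCpuInfo
{
public:
    explicit ToyCpuInfo(HwFeatures hw) : _hw(hw) {}

    // Same signature as the setter this PR adds to CPUInfo.
    void set_sve_allowed(bool is_allowed) { _sve_allowed = is_allowed; }

    // Every SVE-family query is the hardware bit ANDed with the runtime flag.
    bool has_sve() const { return _sve_allowed && _hw.sve; }
    bool has_sve2() const { return _sve_allowed && _hw.sve2; }
    bool has_svebf16() const { return _sve_allowed && _hw.svebf16; }
    bool has_svei8mm() const { return _sve_allowed && _hw.svei8mm; }
    bool has_svef32mm() const { return _sve_allowed && _hw.svef32mm; }

    // Simplified: returns a masked copy of the feature struct.
    HwFeatures get_isa() const
    {
        return HwFeatures{has_sve(), has_sve2(), has_svebf16(), has_svei8mm(), has_svef32mm()};
    }

private:
    HwFeatures _hw;
    bool       _sve_allowed{true};
};

// Usage: masking hides the features, unmasking restores the hardware view.
inline void toy_cpu_info_demo()
{
    ToyCpuInfo info(HwFeatures{true, true, true, true, false});
    assert(info.has_sve2());
    info.set_sve_allowed(false);
    assert(!info.has_sve() && !info.get_isa().svei8mm);
    info.set_sve_allowed(true);
    assert(info.has_sve() && !info.has_svef32mm());
}
```

Because the hardware bits are stored unmodified and only masked on read, re-enabling SVE restores the original view without re-probing the CPU.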
Relationship to ArmNN PR:
The ArmNN PR emits CpuAcc options such as:
SmeEnabled=false
SveEnabled=true
or:
SmeEnabled=false
SveEnabled=false
The ACL PR provides the underlying mechanism that makes those options affect runtime kernel selection.
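As a rough sketch of how such options could be turned into the two masking calls, the snippet below parses `Key=value` strings into a small policy struct. `IsaPolicy` and `parse_isa_policy` are hypothetical helpers, and the string form is an assumption made for illustration; the actual ArmNN backend options are likely passed as typed values rather than text.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical policy holder; both families default to allowed.
struct IsaPolicy
{
    bool sme_allowed = true;
    bool sve_allowed = true;
};

// Parse options such as "SmeEnabled=false" into a policy.
// Unknown keys and malformed entries are ignored.
inline IsaPolicy parse_isa_policy(const std::vector<std::string> &options)
{
    IsaPolicy policy;
    for (const std::string &opt : options)
    {
        const std::string::size_type eq = opt.find('=');
        if (eq == std::string::npos)
        {
            continue;
        }
        const std::string key   = opt.substr(0, eq);
        const bool        value = (opt.substr(eq + 1) == "true");
        if (key == "SmeEnabled")
        {
            policy.sme_allowed = value;
        }
        else if (key == "SveEnabled")
        {
            policy.sve_allowed = value;
        }
    }
    return policy;
}

// Usage: the second option set from the description disables both families.
inline void isa_policy_demo()
{
    const IsaPolicy p = parse_isa_policy({"SmeEnabled=false", "SveEnabled=false"});
    assert(!p.sme_allowed && !p.sve_allowed);
}
```

The resulting flags would then be forwarded to `set_sme_allowed()` and `set_sve_allowed()` on the `CPUInfo` instance before kernels are selected.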