[SM6.10][Bugfix] MultiplyAdd - Convert Bias vector to output vector type#8394

Open
hekota wants to merge 5 commits into microsoft:main from hekota:muladd-convert-in-header
Conversation

@hekota
Member

@hekota hekota commented Apr 21, 2026

If the Bias vector type/interpretation on MultiplyAdd differs from the output vector type, convert it to the output vector type before passing it to the __builtin_LinAlg_MatrixVectorMultiplyAdd built-in function (which is translated to the dx.op.linAlgMatVecMulAdd op). The bias vector interpretation on the call is always the same as the output vector interpretation.

To accommodate this, each MultiplyAdd now has two SFINAE-constrained variants:

  1. If the bias interpretation matches the output vector interpretation, no conversion is necessary and the function calls built-in/dxil op directly.
  2. If types don't match, the bias vector is first converted to the output vector type before passing it into the built-in/dxil op.
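In plain C++ terms, the two SFINAE-constrained variants follow a pattern roughly like this (a simplified sketch with hypothetical stand-in types; the real header operates on HLSL vectors and the builtin, not scalars and `+`):

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the component-type mapping in the header.
enum class ComponentType { F32, I32 };

template <typename T> struct TypeTraits;
template <> struct TypeTraits<float> {
  static constexpr ComponentType Value = ComponentType::F32;
};
template <> struct TypeTraits<int32_t> {
  static constexpr ComponentType Value = ComponentType::I32;
};

// Variant 1: bias component type matches the output type;
// call through directly, no conversion.
template <typename OutT, typename BiasT>
typename std::enable_if<TypeTraits<BiasT>::Value == TypeTraits<OutT>::Value,
                        OutT>::type
MultiplyAdd(OutT Acc, BiasT Bias) {
  return Acc + Bias; // stands in for the builtin/dxil op
}

// Variant 2: types differ; convert the bias first, then call the op.
template <typename OutT, typename BiasT>
typename std::enable_if<TypeTraits<BiasT>::Value != TypeTraits<OutT>::Value,
                        OutT>::type
MultiplyAdd(OutT Acc, BiasT Bias) {
  OutT Converted = static_cast<OutT>(Bias); // stands in for the Convert step
  return Acc + Converted;
}
```

Overload resolution picks exactly one variant at compile time, so the matching-type path never emits a conversion.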

Adds an hlsl::__detail::TypeTraits struct that maps HLSL scalar types to component type enum values.
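A minimal sketch of such a trait, assuming an illustrative component-type enum (the names and values here are hypothetical, not the actual DXIL encoding):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical component-type enum for illustration only.
enum class CompType : uint8_t { I16, U16, I32, U32, F16, F32 };

// Primary template left undefined: unsupported scalar types fail to compile.
template <typename T> struct TypeTraits;

template <> struct TypeTraits<int16_t>  { static constexpr CompType Value = CompType::I16; };
template <> struct TypeTraits<uint16_t> { static constexpr CompType Value = CompType::U16; };
template <> struct TypeTraits<int32_t>  { static constexpr CompType Value = CompType::I32; };
template <> struct TypeTraits<uint32_t> { static constexpr CompType Value = CompType::U32; };
template <> struct TypeTraits<float>    { static constexpr CompType Value = CompType::F32; };

static_assert(TypeTraits<float>::Value == CompType::F32,
              "float maps to the F32 component type");
```

Because `Value` is a compile-time constant, it can be used directly in SFINAE conditions or template arguments.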

Fixes #8390

@github-actions
Contributor

github-actions Bot commented Apr 21, 2026

✅ With the latest revision this PR passed the C/C++ code formatter.

@hekota hekota requested a review from llvm-beanz April 21, 2026 02:26
Collaborator

@llvm-beanz llvm-beanz left a comment


A couple comments, but none that need to block this.

Comment thread: tools/clang/lib/Headers/hlsl/dx/linalg.h (Outdated)
MultiplyAdd(Matrix<MatrixDT, M, K, MatrixUse::A, MatrixScope::Thread> MatrixA,
InterpretedVector<InputElTy, VecK, InputInterp> InterpVec,
vector<BiasElTy, M> Bias) {
vector<OutputElTy, M> Result;
Collaborator


I think the approach of having two overloads is fine, but we could use less SFINAE if we had just one overload with the convert under an if statement. The if condition would be what you have in the SFINAE now, so it would always be a constant-evaluated expression and the branch would always get optimized away.

Either approach is fine.
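The single-overload alternative described above can be sketched in plain C++ like this (hypothetical simplified signature; in C++17 `if constexpr` makes the dead-branch elimination explicit, while a plain `if` on a constant would rely on the optimizer):

```cpp
#include <cassert>
#include <type_traits>

template <typename OutT, typename BiasT>
OutT MultiplyAdd(OutT Acc, BiasT Bias) {
  // The condition is known at compile time, so only one branch survives.
  if constexpr (!std::is_same_v<OutT, BiasT>) {
    // Bias type differs from the output type: convert first.
    return Acc + static_cast<OutT>(Bias);
  } else {
    // Types already match: call through directly.
    return Acc + Bias;
  }
}
```

Note that with a plain (non-constexpr) `if`, both branches are still type-checked, which is what later triggers the vector-truncation warning mentioned below.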

Contributor

@tex3d tex3d Apr 28, 2026


I was just thinking: what if the Convert template function had either a) a do-nothing overload for when CompType is the same (no conversion necessary), or b) an if around the builtin operation?

Then you could use the templated Convert in these methods instead of the builtin directly, which would auto-resolve to a no-op when the type matches, instead of adding the logic to each of these methods.

I slightly prefer a), the separate overload, just because it makes the initial IR gen cleaner and you can test for that. When it's just Convert, a redundant Convert in the no-op case is less ugly than doubling all the overloads that use Convert.
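Option (a) can be sketched like this (hypothetical simplified names; the real Convert works on HLSL vectors via a builtin, not `static_cast`):

```cpp
#include <cassert>
#include <type_traits>

// No-op overload: destination and source types match, so Convert
// resolves to a plain pass-through and emits no conversion.
template <typename DestT, typename SrcT>
typename std::enable_if<std::is_same<DestT, SrcT>::value, DestT>::type
Convert(SrcT V) {
  return V;
}

// Converting overload: types differ, perform the actual conversion
// (stands in for the builtin conversion in the header).
template <typename DestT, typename SrcT>
typename std::enable_if<!std::is_same<DestT, SrcT>::value, DestT>::type
Convert(SrcT V) {
  return static_cast<DestT>(V);
}
```

Callers then write `Convert<DestTy>(V)` unconditionally and let overload resolution decide whether any conversion happens.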

Member Author


I had a version with the convert under an if statement, but for packed data types it reported a warning about vector truncation. I could eliminate the warning by changing the __builtin_LinAlg_MatrixVectorMultiplyAdd signature to allow vectors of three different sizes, but that is not correct either - the bias count and result count must always be the same. That's why I split all of the MultiplyAdd functions.

I will try the suggestion of using the templated Convert and having do-nothing Convert overload for no conversion path. I think that should work and look nicer.

Member Author


I pushed an update that adds a no-op Convert version for the case DestTy == OriginTy and removes the doubled versions of MultiplyAdd.

I did, however, find an issue with Convert: it does not support packed-type vector sizes that are not a multiple of the number of elements per scalar. I used a workaround to make it work and filed #8418 to fix it.

Contributor

@tex3d tex3d left a comment


I think this looks good, but wouldn't it be even cleaner to overload Convert to do nothing when the types match, then just always call that?

hekota added 2 commits May 1, 2026 11:45
Removed doubled versions of MultiplyAdd; these now call the Convert
function, which resolves to a no-op when the bias component type
matches the output type.

Found an issue that Convert does not support packed-type vector
sizes that are not a multiple of the number of elements per scalar.
Implemented a workaround and filed issue microsoft#8418 to fix this.


Development

Successfully merging this pull request may close these issues.

[SM6.10] MultiplyAdd should convert Bias to output type
