Fix XPU dequantize ops for non-contiguous tensors #1911

Merged
matthewdouglas merged 1 commit into bitsandbytes-foundation:main from jiqing-feng:contiguous on Apr 3, 2026

@jiqing-feng
Contributor

What does this PR do?

Adds A.contiguous() to _dequantize_blockwise_impl and _dequantize_4bit_impl in the XPU backend, matching the existing CUDA backend behavior. The XPU SYCL kernels require a contiguous memory layout; passing a non-contiguous tensor silently produces incorrect results (99% element mismatch).
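To illustrate why the layout matters, here is a minimal pure-Python sketch. It is not the bitsandbytes code and does not touch XPU or SYCL; `StridedView` and `naive_kernel` are hypothetical names invented for this example. It models a tensor as flat storage plus shape and strides, and a kernel that walks the storage linearly on the assumption that it is contiguous.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StridedView:
    """Hypothetical stand-in for a 2-D tensor: flat storage plus shape and strides."""
    storage: List[int]
    shape: Tuple[int, int]
    strides: Tuple[int, int]  # strides in elements, not bytes

    def is_contiguous(self) -> bool:
        # Row-major contiguous means strides == (number of columns, 1).
        return self.strides == (self.shape[1], 1)

    def transpose(self) -> "StridedView":
        # A transpose only swaps metadata; the storage is shared and unchanged.
        return StridedView(self.storage, self.shape[::-1], self.strides[::-1])

    def contiguous(self) -> "StridedView":
        # Copy elements into fresh row-major storage only when needed.
        if self.is_contiguous():
            return self
        rows, cols = self.shape
        s0, s1 = self.strides
        data = [self.storage[r * s0 + c * s1]
                for r in range(rows) for c in range(cols)]
        return StridedView(data, self.shape, (cols, 1))


def naive_kernel(a: StridedView) -> List[int]:
    """Doubles each element, reading storage linearly and ignoring strides."""
    rows, cols = a.shape
    return [a.storage[i] * 2 for i in range(rows * cols)]


a = StridedView(list(range(6)), (2, 3), (3, 1))
t = a.transpose()                       # non-contiguous view of the same storage
expected = [2 * v for v in t.contiguous().storage]
wrong = naive_kernel(t)                 # strides ignored: elements in wrong order
right = naive_kernel(t.contiguous())    # layout fixed before the kernel runs
mismatches = sum(w != e for w, e in zip(wrong, expected))
```

In this toy case the kernel raises no error on the transposed view; it simply returns values in the wrong order, which is the kind of silent failure the PR describes. Calling `contiguous()` first costs a copy only when the input is actually non-contiguous.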

Hi @matthewdouglas. Please review this PR. Thanks!

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
@jiqing-feng jiqing-feng marked this pull request as ready for review April 3, 2026 02:08
@matthewdouglas matthewdouglas added this to the v0.50.0 milestone Apr 3, 2026
@github-actions
github-actions bot commented Apr 3, 2026

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@matthewdouglas matthewdouglas merged commit 61c0375 into bitsandbytes-foundation:main Apr 3, 2026
91 checks passed
2 participants