[Cadence: Vision] ResNet18 & ResNet50: Optimized, DMA-enabled, functional by cad-rlc · Pull Request #19111 · pytorch/executorch

cad-rlc · 2026-04-24T15:18:38Z

Summary

Optimized Cadence Vision DSP operators for ResNet18 and ResNet50 inference. All operators are DMA-enabled with ping-pong tiling and functionally verified (int8 quantized, NCHW layout).

Operators

Conv2d (`quantized_conv2d_nchw`)

Kernel variants: 7x7j2, 3x3j1, 3x3j2, 1x1j1, 1x1j2
Modes: DMA ping-pong tiling (with iDMA) and cache-only (no DMA)
Dispatch: Automatic kernel selection based on layer config (kernel size, stride, dilation)
Quantization: int8 asymmetric input × symmetric weights, per-tensor output scaling
Bias correction: 24-bit clamped kernel bias with post-kernel residual correction
Config generator: Python tool to generate per-DRAM-size layer config headers
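
The dispatch rule above can be sketched as a lookup on the layer config. This is a minimal illustration, not the PR's code: it assumes the `j` suffix in the variant names denotes the stride, that the optimized variants only cover dilation 1, and the name `select_conv_kernel` is hypothetical.

```python
# Hypothetical sketch of kernel selection; names are illustrative, not the PR's API.
# Keys are (kernel_h, kernel_w, stride), assuming "jN" in a variant name means stride N.
SUPPORTED_VARIANTS = {
    (7, 7, 2): "7x7j2",
    (3, 3, 1): "3x3j1",
    (3, 3, 2): "3x3j2",
    (1, 1, 1): "1x1j1",
    (1, 1, 2): "1x1j2",
}


def select_conv_kernel(kernel_h, kernel_w, stride, dilation=1):
    """Return the variant name for a layer config, or None if no variant matches."""
    # Assumption: the optimized variants only handle dilation 1.
    if dilation != 1:
        return None
    return SUPPORTED_VARIANTS.get((kernel_h, kernel_w, stride))
```

A caller would fall back to a generic implementation whenever the function returns `None`.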

MaxPool2d (`maxpool_exec_mxnj2`)

Kernel: Arbitrary MxN kernel size, stride-2
Modes: DMA tiled and cache-only (no DMA)
Layout: NCHW float32
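
For reference, the arithmetic of an MxN, stride-2 max pool on one channel plane can be written as below. This is a plain-Python sketch of the math only, assuming no padding; it does not model the DMA tiling, and `maxpool2d_stride2` is a hypothetical name.

```python
def maxpool2d_stride2(plane, kernel_h, kernel_w):
    """Max pool one H x W plane (list of rows) with an MxN window and stride 2.

    Assumption: no padding, so only windows fully inside the plane are used.
    """
    stride = 2
    in_h, in_w = len(plane), len(plane[0])
    out_h = (in_h - kernel_h) // stride + 1
    out_w = (in_w - kernel_w) // stride + 1
    out = []
    for oy in range(out_h):
        row = []
        for ox in range(out_w):
            y0, x0 = oy * stride, ox * stride
            # Maximum over the kernel_h x kernel_w window anchored at (y0, x0).
            row.append(max(plane[y][x]
                           for y in range(y0, y0 + kernel_h)
                           for x in range(x0, x0 + kernel_w)))
        out.append(row)
    return out
```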

Mean / AdaptiveAvgPool (`mean_exec_dma`)

Kernel: SIMD-optimized channel-wise mean with DMA tiling
Layout: NCHW float32, reduces spatial dims to 1x1
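
The reduction itself is a per-channel average over H and W. A scalar reference sketch, assuming nested lists in NCHW order (the SIMD and DMA aspects are not modeled, and `channel_mean_nchw` is a hypothetical name):

```python
def channel_mean_nchw(x):
    """Average each channel plane; x is indexed [N][C][H][W], result is [N][C][1][1]."""
    out = []
    for sample in x:
        out_sample = []
        for plane in sample:
            total = sum(sum(row) for row in plane)
            count = len(plane) * len(plane[0])
            # Keep the 1x1 spatial dims so the result is still a 4-D NCHW tensor.
            out_sample.append([[total / count]])
        out.append(out_sample)
    return out
```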

Quantize / Dequantize (`quantize_per_tensor`, `dequantize_per_tensor`)

Modes: DMA ping-pong and HW-optimized (no DMA)
Types: int8 asymmetric (asym8s)
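
As a reminder of what per-tensor asymmetric int8 quantization computes, here is a scalar sketch. It assumes the usual affine mapping `q = round(x / scale) + zero_point` clamped to [-128, 127]; the function names mirror the operator names but the signatures are illustrative only.

```python
def quantize_per_tensor(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to int8 with one scale and zero point for the whole tensor."""
    out = []
    for v in values:
        # Python's round() rounds halves to even.
        q = round(v / scale) + zero_point
        out.append(max(qmin, min(qmax, q)))
    return out


def dequantize_per_tensor(qvalues, scale, zero_point):
    """Inverse mapping back to floats (exact only for values that were not clamped)."""
    return [(q - zero_point) * scale for q in qvalues]
```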

Quantized ReLU (`quantized_relu`)

Modes: DMA ping-pong and HW-optimized (no DMA)
Type: int8 clamp
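
In the quantized domain, ReLU reduces to a clamp at the zero point, since the zero point is the integer that represents real 0. A sketch, assuming input and output share the same scale and zero point (the PR does not state this):

```python
def quantized_relu(qvalues, zero_point, qmax=127):
    """Clamp int8 values to [zero_point, qmax]; zero_point encodes real 0.0."""
    return [min(qmax, max(q, zero_point)) for q in qvalues]
```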

Quantized Linear (`quantized_linear_out`)

Mode: SIMD with DMA tiling
Type: int8 input × int8 weights, int32 bias
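
The integer arithmetic behind an int8 linear layer can be sketched as follows. This is a scalar reference under stated assumptions: accumulation in a wide integer with the int32 bias added before requantization, and requantization via a single float multiplier. All parameter names are hypothetical and do not reflect the `quantized_linear_out` signature.

```python
def quantized_linear(x, weights, bias, x_zero_point, w_zero_point,
                     out_multiplier, out_zero_point):
    """Compute one int8 output per weight row from an int8 input vector."""
    out = []
    for w_row, b in zip(weights, bias):
        acc = b  # int32 bias seeds the accumulator
        for xv, wv in zip(x, w_row):
            # Subtract zero points so the product is in the real-valued domain.
            acc += (xv - x_zero_point) * (wv - w_zero_point)
        # Assumption: out_multiplier = (x_scale * w_scale) / out_scale.
        q = round(acc * out_multiplier) + out_zero_point
        out.append(max(-128, min(127, q)))
    return out
```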

Add (`op_add`)

Mode: DMA ping-pong element-wise float32 add
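
The ping-pong schedule can be illustrated with a sequential simulation: while one local buffer is being processed, the next tile is loaded into the other. In the sketch below the `load` helper stands in for a DMA transfer and runs synchronously, so it shows only the buffer alternation, not real overlap; `pingpong_add` and `tile_size` are hypothetical names.

```python
def pingpong_add(a, b, tile_size):
    """Element-wise add of two equal-length lists, processed tile by tile."""
    n = len(a)
    out = [0.0] * n
    starts = list(range(0, n, tile_size))
    buffers = [None, None]  # ping and pong local buffers

    def load(start):
        # Stand-in for a DMA transfer from system memory into local memory.
        end = min(start + tile_size, n)
        return a[start:end], b[start:end]

    if starts:
        buffers[0] = load(starts[0])  # prime the first buffer
    for i, start in enumerate(starts):
        cur = i % 2
        # Prefetch the next tile into the other buffer before computing.
        if i + 1 < len(starts):
            buffers[1 - cur] = load(starts[i + 1])
        tile_a, tile_b = buffers[cur]
        for j, (va, vb) in enumerate(zip(tile_a, tile_b)):
            out[start + j] = va + vb
    return out
```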

Softmax (`op_softmax`)

Mode: HW-optimized softmax

Build Configuration

Configurable DRAM buffer sizes
Automatic DMA vs cache-only dispatch based on DRAM availability
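
One plausible form of this dispatch is a size check: use the DMA path only when the configured buffer can hold two copies (ping and pong) of every tile that must be resident. The PR does not spell out the actual criterion, so the rule, the function name, and the parameters below are all assumptions.

```python
def choose_mode(dram_bytes_available, tile_bytes, num_operand_tiles):
    """Pick "dma" when double-buffered tiles fit in the buffer, else "cache_only"."""
    # Ping-pong tiling keeps two copies of each resident tile.
    required = 2 * tile_bytes * num_operand_tiles
    return "dma" if dram_bytes_available >= required else "cache_only"
```

For example, with 8 KiB tiles and three resident operands (two inputs, one output), a 64 KiB buffer selects the DMA path while a 32 KiB buffer falls back to cache-only.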

cc @mcremon-meta @hsharma35 @zonglinpengmeta

pytorch-bot · 2026-04-24T15:18:42Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19111

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 2 Active SEVs

There are 2 currently active SEVs. If your PR is affected, please view them below:

⚠️ 11 Awaiting Approval

As of commit 93271c8 with merge base 513a4ea:

AWAITING APPROVAL - The following workflows need approval before CI can run:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot · 2026-04-24T15:18:57Z

The following ciflow label(s) have been added but CI has not been triggered yet because the workflows are awaiting approval:

ciflow/trunk

Once a maintainer approves the workflows (scroll to the bottom of the PR page), the corresponding CI jobs will be triggered automatically. Please ping one of the reviewers if you do not have access to approve and run workflows.

github-actions · 2026-04-24T15:19:32Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with `release notes:`. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

cad-rlc · 2026-05-08T10:33:42Z

@mcremon-meta @hsharma35 @zonglinpengmeta
This is the final PR for the ResNet18 and ResNet50 models.

mcremon-meta

Will continue the review later, but can we clean the set of files first? I don't quite understand why we have so many files checked in, including CMakeFiles etc.

mcremon-meta · 2026-05-08T20:21:06Z

@@ -0,0 +1,25 @@
+Collecting matplotlib


not sure what this file is?

mcremon-meta · 2026-05-08T20:21:32Z

      kernel_name: impl::generic::quantized_matmul_asym8uxasym8u_asym8u_out

- func: cadence::im2row.out(Tensor input, int[2] kernel_size, int[2] dilation, int[2] padding, int[2] stride, Tensor in_zero_point, bool channel_last=False, *, Tensor(a!) out) -> Tensor(a!)
+- func: cadence::im2row.out(Tensor input, int[2] kernel_size, int[2] dilation, int[2] padding, int[2] stride, Tensor in_zero_point, bool channel_last, *, Tensor(a!) out) -> Tensor(a!)


why is this needed?

cad-rlc · 2026-05-15T12:51:25Z

@mcremon-meta A few stale files were accidentally committed in this pull request. We are addressing the issue and will submit a new PR shortly.

linux-foundation-easycla · 2026-05-28T13:42:26Z

❌ The email address for the commit (3dd5559, 4491310, 6a46467, 9991681, bd7fb9f, c2a48d2, f1693c2) is not linked to the GitHub account, preventing the EasyCLA check. Consult this Help Article and GitHub Help to resolve. (To view the commit's email address, add .patch at the end of this PR page's URL.) For further assistance with EasyCLA, please visit our EasyCLA portal and chat with our support bot.
❌ - login: @cad-rlc / name: cad-rlc. The commit (93271c8) is not authorized under a signed CLA. Please click here to be authorized. For further assistance with EasyCLA, please visit our EasyCLA portal and chat with our support bot.

…onal
- Add DMA-optimized operators: conv2d (1x1/3x3/7x7), maxpool, quantize/dequantize, relu, add, mean, softmax, linear
- Add new operators: embedding, full, im2row, quantized_fully_connected, quantized_layer_norm, quantized_matmul, requantize, view_copy
- Add vision/kernels library and quantized_ops.h header
- Add config generator for DMA buffer sizing
- Update functions_vision.yaml and CMakeLists.txt
- Add third-party XAI libraries (libxai, libxai_common, libxa_nnlib)
- FACTO submodule update

cad-rlc requested review from GregoryComer, JacobSzwejbka, digantdesai, kimishpatel, kirklandsign, larryliu0820, lucylq, manuelcandales and mergennachin as code owners April 24, 2026 15:18

meta-cla Bot added the CLA Signed label Apr 24, 2026

github-actions Bot added the ciflow/trunk and module: arm labels Apr 24, 2026

mcremon-meta requested changes May 8, 2026


Suraj Raut added 7 commits May 28, 2026 07:16

Reset non-Cadence files to upstream/main

c2a48d2

Remove accidental files

9991681

Sync submodule pointers with upstream/main

3dd5559

Sync remaining files with upstream/main

6a46467

Reset backends/cadence/aot/ to upstream (keep functions_vision.yaml)

4491310

Reset submodule pointers to upstream/main

f1693c2

cad-rlc force-pushed the main branch from 589ae7b to f1693c2 May 29, 2026 08:26

cad-rlc requested review from psiddh and robert-kalmar as code owners May 29, 2026 08:26

Merge branch 'pytorch:main' into main

93271c8

cad-rlc wants to merge 8 commits into pytorch:main from cad-rlc:main


2 participants
