Pull requests: ROCm/TransformerEngine

Pull requests list

Optimized rocm specific multicast transpose kernel (label: ci-level 3)
#586 opened May 14, 2026 by alextmagro Contributor
Add a custom multi_tensor_apply kernels (L2norm, Adam) (label: ci-level 1)
#585 opened May 13, 2026 by matthiasdiener Contributor Draft
1 of 13 tasks
Harden claude-pr-action.yml
#584 opened May 12, 2026 by Micky774 Contributor
13 tasks
RMS Norm Optimization
#583 opened May 12, 2026 by aris134 Contributor
1 of 13 tasks
CK JIT integration (label: ci-level 1)
#582 opened May 11, 2026 by ipanfilo Collaborator
1 of 13 tasks
Add Tealite: pure-Python TransformerEngine for ROCm/AMD GPUs
#581 opened May 7, 2026 by jayfurmanek Contributor
7 of 8 tasks
CK Tile MXFP8 Group GEMM gfx1250 (label: ci-level 1)
#578 opened May 6, 2026 by aris134 Contributor
1 of 13 tasks
CK Tile Group GEMM gfx1250 (label: ci-level 1)
#576 opened May 6, 2026 by aris134 Contributor
1 of 13 tasks
ck_tile grouped gemm: more padding
#574 opened May 5, 2026 by matthiasdiener Contributor Draft
1 of 13 tasks
[ROCm] Allow bf16/bf16/fp32 in nvte_multi_tensor_gemm dispatcher (label: ci-level 1)
#573 opened May 4, 2026 by lizamd
13 tasks
[No Merge][No Review] testing aiter auto trigger on gh action (label: ci-level 2)
#570 opened May 1, 2026 by VeeraRajasekhar Contributor Draft
13 tasks
add MXFP8 pre-swizzling for gfx1250 GEMM (label: ci-level 1)
#568 opened Apr 29, 2026 by matthiasdiener Contributor
13 tasks
HipKittens MXFP8 GEMM Support (label: ci-level 3)
#566 opened Apr 28, 2026 by alextmagro Contributor
Update QoLA reducing [compile time, kernel count, lib size] by ~2x (Diet QoLA) (label: ci-level 3)
#563 opened Apr 27, 2026 by Micky774 Contributor
1 of 13 tasks
[WIP] TDM porting
#558 opened Apr 22, 2026 by wangye805 Collaborator Draft
13 tasks
IFU v2.14.dev0 (label: ci-level 3)
#557 opened Apr 21, 2026 by ipanfilo Collaborator
5 of 13 tasks
Enable CI lint gh action on ROCm (label: ci-level 3)
#547 opened Apr 17, 2026 by VeeraRajasekhar Contributor
13 tasks
CI: auto-trigger AITER prebuilt upload when 3rdparty/aiter updates on dev
#543 opened Apr 15, 2026 by VeeraRajasekhar Contributor
8 of 13 tasks
Integrate AITER fused RoPE kernels with fallback to TE native
#541 opened Apr 15, 2026 by suachong Contributor
7 tasks done
NV upstream release 2.12 merge (label: ci-level 3)
#538 opened Apr 13, 2026 by Micky774 Contributor
13 tasks
NVFP4: hadamard_transform_cast_fusion_columnwise (label: ci-level 1)
#515 opened Apr 1, 2026 by matthiasdiener Contributor Draft
1 of 13 tasks
Add fsdp2 fp8 unit tests TE 2.10 (label: ci-level 3)
#492 opened Mar 17, 2026 by sudhu2k Contributor
8 of 13 tasks
Add AITER fused RoPE dispatch to FusedRoPEFunc
#489 opened Mar 17, 2026 by sarthak-amd Contributor
Microbenchmark suite
#487 opened Mar 16, 2026 by Micky774 Contributor
1 of 13 tasks