Pull requests: NVIDIA/TensorRT-LLM

Pull requests list

[None][chore] gitignore NFS system temporary files
#14211 opened May 17, 2026 by zhenhuaw-me Member
1 task done
[None][chore] Early emission of responses with overlap scheduling
#14208 opened May 17, 2026 by brb-nv Collaborator Draft
1 task done
[None][chore] Refactor handle_responses() to smaller functions
#14207 opened May 16, 2026 by brb-nv Collaborator Draft
1 task done
[None][chore] log KV cache utilization and context tokens per iter
#14206 opened May 16, 2026 by pcicotti Collaborator
1 task done
[None][test] Waive 4 failed cases for main in QA CI
#14205 opened May 16, 2026 by xinhe-nv Collaborator Draft
@coderabbitai
#14203 opened May 16, 2026 by hnover-nv Collaborator Draft
1 task done
[#13816][feat] AutoDeploy: Optimize gpt-oss-120b perf
#14202 opened May 16, 2026 by taylor-yb-lee Collaborator Draft
1 task
[None][feat] Enable sliding window attention for eagle3 (label: Community want to contribute)
#14200 opened May 16, 2026 by murphymatt
1 task done
[None][bug] NVFP4 MoE: requantize w1/w3 when global scales differ (label: Community want to contribute)
#14199 opened May 16, 2026 by johnheo
3 tasks
(DO NOT SUBMIT) WideEP FT MVP prorotype
#14198 opened May 16, 2026 by chienchunhung Collaborator Draft
1 task
[None][doc] Update spec dec support matrices
#14195 opened May 15, 2026 by mikeiovine Collaborator
1 task done
[#12702][feat] Autodeploy deprecate the legacy triton attention
#14194 opened May 15, 2026 by nvchenghaoz Collaborator
1 task done
[None][fix] Add SPDX Apache-2.0 headers to auto_deploy test files
#14193 opened May 15, 2026 by bmarimuthu-nv Collaborator
1 task done
[https://nvbugs/6133201][fix] Bump GEN max_num_tokens in disagg perf YAMLs
#14191 opened May 15, 2026 by xwang233 Collaborator
1 task done
[None][test] Add CUTLASS variant to V4-Flash EPLB accuracy tests
#14190 opened May 15, 2026 by Tabrizian Member
1 task done
Beam search logits processor v2 (label: Community want to contribute)
#14189 opened May 15, 2026 by kyurious-george
1 task
[TRTLLMINF-76][feat] Delegate runKubernetesPodWithInfraRetry to shared lib
#14186 opened May 15, 2026 by dpitman-nvda Collaborator
1 task
[https://nvbugs/6162128][tests] Skip nano v3 E2E tests entirely on G/B300
#14185 opened May 15, 2026 by 2ez4bz Collaborator
1 task done