Pull requests: NVIDIA/TensorRT-LLM

Pull requests list

[None][feat] Update the logic of FMHA JIT path
#14291 opened May 19, 2026 by heyuhhh Collaborator
1 task
[TRTLLM-12758][feat] honor named function in tool_choice for non-harmony models
#14290 opened May 19, 2026 by JunyiXu-nv Collaborator
1 task done
[https://nvbugs/6115562][fix] defer worker registration until HTTP server is accepting
#14289 opened May 19, 2026 by reasonsolo Collaborator
1 task done
[None][fix] handle ADP dummy allocation failure
#14286 opened May 19, 2026 by qiaoxj07 Collaborator Draft
[None][test] Waive 5 failed cases for main in QA CI
#14283 opened May 19, 2026 by xinhe-nv Collaborator
[https://nvbugs/6095421][fix] Update resolve_moe_backend
#14282 opened May 19, 2026 by heyuhhh Collaborator
1 task
[None][fix] Update the OSS headers in derived FLA ops and AD modeling code
#14281 opened May 19, 2026 by bmarimuthu-nv Collaborator
1 task done
[TRTLLM-12347][feat] enable VSA in VisualGen
#14280 opened May 19, 2026 by o-stoner Collaborator Draft
1 task done
[None][refactor] Add derived properties for the thop.attention call site
#14279 opened May 19, 2026 by yuxianq Collaborator
1 task done
[TRTLLM-12154][test] Add Qwen3-32B FP8 disagg stress test
#14278 opened May 18, 2026 by brnguyen2 Collaborator Draft
3 tasks
[None][feat] add Visual Gen Auto path for Diffusers transformers
#14277 opened May 18, 2026 by karljang Collaborator Draft
4 tasks done
[None][fix] Handle unset attention_dp_relax in ADP routers
#14276 opened May 18, 2026 by peihu-nv Collaborator
1 task done
[None][chore] Drop sink_token_length from PyTorch attention surface
#14275 opened May 18, 2026 by yuxianq Collaborator
1 task done
[None][cleanup] MistralSmall related cleanups
#14271 opened May 18, 2026 by 2ez4bz Collaborator
1 task
[https://nvbugs/6127669][fix] faster test_performance_alignment
#14270 opened May 18, 2026 by tburt-nv Collaborator
1 task
[None][fix] Prevent SLURM dispatcher retry duplicate-upload error
#14269 opened May 18, 2026 by dpitman-nvda Collaborator
1 task done
[None][feat] Indexer TopK: opt-in multi-pass radix + fused split-work paths
#14268 opened May 18, 2026 by dcampora Collaborator
4 tasks
[None][fix] ADP router crashes on serve when scheduling_params.attent…
#14267 opened May 18, 2026 by nv-guomingz Collaborator
1 task done
[TRTLLM-12719][cbts] Add core code related rule
#14266 opened May 18, 2026 by crazydemo Collaborator
1 task done
[None][feat] Pre-allocate multimodal encoder attention workspace (label: api-compatible, "Accepted LLM API contract change that is backwards-compatible")
#14264 opened May 18, 2026 by yechank-nvidia Collaborator Draft