Issues: NVIDIA/TensorRT-LLM

Issues list

Are multimodal models supported by trtllm-serve?
Labels: OpenAI API, triaged (Issue has been triaged by maintainers)
#2714 opened Jan 23, 2025 by xiaoyuzju

how to compile deepseekv3 ?
Labels: Installation, triaged
#2711 opened Jan 22, 2025 by zmtttt

Support for Blackwell and Thor
Labels: triaged
#2710 opened Jan 21, 2025 by phantaurus

NVILA support (Qwen2)
Labels: bug (Something isn't working)
#2707 opened Jan 21, 2025 by danigarciaoca

convert NVILA with 0.16.0
Labels: bug, Investigating, LLM API/Workflow, triaged
#2706 opened Jan 20, 2025 by dzy130120
Tasks: 2 of 4

Support for int2/int3 quantization
Labels: Investigating, Low Precision (Issue about lower bit quantization, including int8, int4, fp8), triaged
#2704 opened Jan 20, 2025 by ZHITENGLI

quantized model using AWQ and lora weights
Labels: Investigating, Low Precision, triaged
#2703 opened Jan 17, 2025 by shuyuan-wang

Wrong outputs with FP8 kv_cache reuse
Labels: bug, Investigating, KV-Cache Management, triaged
#2699 opened Jan 16, 2025 by lishicheng1996
Tasks: 2 of 4

What is execution context memory?
Labels: triaged
#2698 opened Jan 16, 2025 by wxsms

Custom allreduce performance improvement
Labels: Customized Kernels, Investigating, triaged
#2696 opened Jan 16, 2025 by yizhang2077

Failed TensorRT-LLM Benchmark
Labels: bug
#2694 opened Jan 15, 2025 by maulikmadhavi
Tasks: 1 of 4

0.16.0 Qwen2-72B-Struct SQ error
Labels: bug
#2693 opened Jan 15, 2025 by gy0514020329
Tasks: 4 tasks

NotImplementedError: Cannot copy out of meta tensor; no data!
Labels: bug
#2692 opened Jan 15, 2025 by chilljudaoren
Tasks: 2 of 4

(Memory leak) trtllm-build gets OOM without GPTAttentionPlugin
Labels: bug
#2690 opened Jan 14, 2025 by idantene
Tasks: 2 of 4

trtllm-build llama3.1-8b failed
Labels: Investigating, LLM API/Workflow, triaged
#2688 opened Jan 14, 2025 by 765500005

internvl-2.5
Labels: triaged
#2686 opened Jan 13, 2025 by ChenJian7578

Inference error encountered while using the draft target model.
Labels: bug
#2684 opened Jan 13, 2025 by pimang62
Tasks: 2 of 4

Deepseek-v3 int4 weight only inference outputs garbage words with TP 8 on nvidia H20 GPU
Labels: Investigating, Low Precision, triaged
#2683 opened Jan 13, 2025 by handoku