Issues: NVIDIA/TensorRT-LLM

Issues list

speculative decoding not work
#2804 opened Feb 20, 2025 by biaochen
Possible bug in Qwen convert_checkpoint.py (labels: Investigating, Low Precision, triaged)
#2803 opened Feb 19, 2025 by mathijshenquet
run.py gets stuck when running Speculative Decoding with Draft model (labels: bug)
#2798 opened Feb 19, 2025 by ValeGian
2 of 4 tasks
DeepSeek-R1-Distill-Qwen-1.5B inference bias (labels: bug)
#2797 opened Feb 19, 2025 by Mitty-ZH
2 of 4 tasks
Multimodal Cross-attention incorrect results in (labels: bug)
#2796 opened Feb 19, 2025 by mutkach
2 of 4 tasks
Deepseek-v3 running on 2xH100 nodes getting poor performance (labels: bug, triaged)
#2786 opened Feb 14, 2025 by zymy-chen
2 of 4 tasks
The performance of Qwen1.5-7B based on the trtllm-bench test was very poor (labels: bug)
#2785 opened Feb 14, 2025 by ruru5697
3 of 4 tasks
Bug when loading an engine using LoRA through LLM API (labels: bug, Investigating, LLM API/Workflow, triaged)
#2782 opened Feb 13, 2025 by pei0033
2 of 4 tasks
GPU Utilization drops gradually over time using Executor API (labels: bug)
#2778 opened Feb 12, 2025 by MahmoudAshraf97
3 of 4 tasks
Inconsistent Batch Index Order in Decoupled Mode with trt-llm (labels: bug)
#2777 opened Feb 12, 2025 by Oldpan
2 of 4 tasks
DeepSeek-V3 fp8 tp32 failed to convert checkpoint (labels: bug, triaged)
#2776 opened Feb 12, 2025 by MtFitzRoy
2 of 4 tasks
Limit max GPU memory used
#2773 opened Feb 11, 2025 by bri25yu
Cannot create checkpoint for llama-3.2 (1B, 3B) (labels: bug, triaged)
#2772 opened Feb 11, 2025 by falkbene
3 of 4 tasks