Issues: triton-inference-server/tensorrtllm_backend
#692 Mllama ignores input image when deployed in triton [bug], opened Feb 5, 2025 by mutkach
#689 Performance of triton+trtllm on llava-onevision compared to vllm and sglang, opened Feb 3, 2025 by alexemme
#686 Unable to build from source for tag v0.16.0 [bug], opened Jan 30, 2025 by jingzhaoou
#685 DeepSeek-R1-Distill-Qwen-32B FP16 model does not work with Triton server + tensorrtllm_backend (but it works with just TensorRT-LLM) [bug], opened Jan 30, 2025 by kelkarn
#684 What is the purpose of shm-region-prefix-name, and what are the prefix0_ files used for?, opened Jan 28, 2025 by sugam-nexusflow
#683 "error": "Unable to parse 'inputs': attempt to access non-existing object member 'inputs'", opened Jan 28, 2025 by adityarap
#682 Beam search diversity lost with in-flight batching [bug], opened Jan 24, 2025 by Grace-YingHuang
#679 Assertion failed: sizeof(T) <= remaining_buffer_size [bug], opened Jan 14, 2025 by gawain000000
#678 Inference error encountered while using the draft target model [bug], opened Jan 13, 2025 by pimang62
#676 Why doesn't the tensorrt_llm_bls backend support speculative decoding with streaming or batch size > 1?, opened Jan 9, 2025 by meowcoder22
#672 Whisper - Missing parameters for triton deployment using tensorrt_llm backend [bug], opened Jan 2, 2025 by eleapttn
#669 Mllama example does not run properly for v0.15 when using the tensorrt_llm_bls endpoint, opened Dec 24, 2024 by here4dadata
#667 Inflight Batching not working with OpenAI-Compatible Frontend [bug], opened Dec 22, 2024 by frosk1
#661 triton server dynamic_batching does not work with multiple requests [bug], opened Dec 13, 2024 by kazyun
#659 Does the end-to-end multimodal workflow support InternVL2?, opened Dec 13, 2024 by ChenJian7578