Issues: triton-inference-server/tensorrtllm_backend
#692 Mllama ignores input image when deployed in triton [bug], opened Feb 5, 2025 by mutkach
#689 Performance of triton+trtllm on llava-onevision compared to vllm and sglang, opened Feb 3, 2025 by alexemme
#686 Unable to build from source for tag v0.16.0 [bug], opened Jan 30, 2025 by jingzhaoou
#685 DeepSeek-R1-Distill-Qwen-32B FP16 model does not work with Triton server + tensorrtllm_backend (but it works with just TensorRT-LLM) [bug], opened Jan 30, 2025 by kelkarn
#684 What is the purpose of shm-region-prefix-name, and what are the prefix0_ files used for?, opened Jan 28, 2025 by sugam-nexusflow
#683 "error": "Unable to parse 'inputs': attempt to access non-existing object member 'inputs'", opened Jan 28, 2025 by adityarap
#682 Beam search diversity lost with in-flight batching [bug], opened Jan 24, 2025 by Grace-YingHuang
#679 Assertion failed: sizeof(T) <= remaining_buffer_size [bug], opened Jan 14, 2025 by gawain000000
#678 Inference error encountered while using the draft target model [bug], opened Jan 13, 2025 by pimang62
#676 Why doesn't the tensorrt_llm_bls backend support speculative decoding with streaming or batch size > 1?, opened Jan 9, 2025 by meowcoder22
#672 Whisper - Missing parameters for triton deployment using tensorrt_llm backend [bug], opened Jan 2, 2025 by eleapttn
#669 Mllama example does not run properly for v0.15 when using the tensorrt_llm_bls endpoint, opened Dec 24, 2024 by here4dadata
#667 Inflight Batching not working with OpenAI-Compatible Frontend [bug], opened Dec 22, 2024 by frosk1
#661 triton server dynamic_batching does not work with multiple requests [bug], opened Dec 13, 2024 by kazyun
#659 Does the end-to-end multimodal workflow support InternVL2?, opened Dec 13, 2024 by ChenJian7578