Disabling dynamic batching and enabling TRTLLM's continuous batching #570
Unanswered · dhruvmullick asked this question in Q&A
In the file: https://github.com/triton-inference-server/tensorrtllm_backend/blob/main/all_models/inflight_batcher_llm/tensorrt_llm/config.pbtxt

To enable TRT-LLM's continuous batching and disable Triton's dynamic batching, do I only need to:
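For context, a minimal sketch of the two knobs usually involved, based on the linked config.pbtxt (field and parameter names come from the tensorrtllm_backend repo; exact values can differ between releases, so treat this as an assumption to verify against your version):

```
# Sketch only -- not the full config.pbtxt.

# 1. Omit any `dynamic_batching { }` block: Triton's own dynamic
#    batcher is active only when that block is present, so leaving it
#    out keeps Triton-side batching disabled.

# 2. Select the backend's in-flight (continuous) batcher via the
#    `gpt_model_type` parameter:
parameters: {
  key: "gpt_model_type"
  value: {
    string_value: "inflight_fused_batching"
  }
}
```

With this setup, request batching is handled entirely by TRT-LLM's in-flight batching rather than by Triton's scheduler.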