Flan-T5 recently gained support in the C++ Triton backend (ref). Does this mean that features like rolling_batch are now available for T5?
As per this line, TRT-LLM does not support in-flight batching for T5. Does that still hold true?