System Info
v0.15
Built engines with the docker image 24.11-trtllm-python-py3
Who can help?
No response
Information
The official example scripts
My own modified scripts
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)
Reproduction
Built the engines with the docker image mentioned above and started the Triton server.
Expected behavior
Input: <|im_start|>You are a helpful assistant. Your name is Clemens<|im_end|><|im_start|>user\nWhat is your name?<|im_end|> <|im_start|>assistant
Output: Hey, my name is Clemens. What is your name?
(or something similar)
Actual behavior
Input: <|im_start|>You are a helpful assistant. Your name is Clemens<|im_end|><|im_start|>user\nWhat is your name?<|im_end|> <|im_start|>assistant
Output: Clemens Clemens Clemens Clemens Clemens Clemens Clemens Clemens Clemens Clemens Clemens Clemens Clemens Clemens Clemens Clemens Clemens Clemens Clemens
Additional notes
Used Qwen 2.5 14B with 8-bit quantization. The same model works perfectly fine with LlamaCpp. Used the default settings from TensorRT.
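For anyone trying to reproduce this, here is a minimal sketch of how a ChatML-style prompt could be assembled before sending it to the server. It assumes the usual ChatML layout, where each turn is `<|im_start|>{role}\n{content}<|im_end|>` followed by a newline; the helper name `build_chatml_prompt` is hypothetical and not part of TensorRT-LLM or Triton. The prompt quoted above differs slightly from this layout (it has no `system` role marker after the first `<|im_start|>`), so comparing both variants might help narrow things down.

```python
def build_chatml_prompt(messages):
    """Render a list of {"role", "content"} dicts as a ChatML-style prompt.

    Assumption: each turn is "<|im_start|>{role}\n{content}<|im_end|>\n".
    This is an illustrative helper, not an API of any library.
    """
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant. Your name is Clemens"},
    {"role": "user", "content": "What is your name?"},
]

prompt = build_chatml_prompt(messages)
print(prompt)
```

Sending the resulting string and the original prompt through the same engine would show whether the repetition depends on the prompt layout or occurs either way.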