benchmark for chatglm2-6b failed #138
Comments
Thanks @elinx for reporting this issue.
I see. Is there a quick way to fix it? Have you finalized the release date for the upcoming version, and would it be possible for me to know it? Thanks.
Hi @elinx, As explained in #55, we plan to have two branches: the stable and the dev branches. We will update the dev branch soon with a bunch of fixes. The goal is to have a push to the dev branch this Friday (Oct. 27th). To be transparent, we might have to slip the schedule and do it only on Monday (Oct. 30th) but, in both cases, it's coming soon :). The fix will be included in that update of the dev branch. Thanks,
This issue is similar to #93
Has this bug been fixed on main yet?
Yes, based on my testing.
Thank you for the verification, @elinx. Closing this bug. Feel free to reopen it if needed.
Excuse me, I have got a similar error when using it in Triton.
Can you try the latest main branch? It should be fixed.
I have tried the main branch and I get the same error when I try chatglm3-6b-chat with tensorrt_llm_backend release/0.5.0:

```
[TensorRT-LLM][ERROR] Encountered an error in forward function: [TensorRT-LLM][ERROR] Assertion failed: position_ids: expected 2 dims, provided 3 dims (/app/tensorrt_llm/cpp/tensorrt_llm/runtime/tllmRuntime.cpp:139)
```
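For context, the assertion above means the runtime expected a rank-2 `position_ids` tensor but was handed a rank-3 one. A minimal NumPy sketch of the shape mismatch, with hypothetical shapes (ChatGLM-style models historically use 2D positional encodings, so a client can easily end up building a `[batch, 2, seq_len]` tensor where a flattened `[batch, 2 * seq_len]` layout is expected):

```python
import numpy as np

batch, seq_len = 1, 32

# Rank 3: the layout that triggers the assertion ("provided 3 dims").
position_ids_3d = np.zeros((batch, 2, seq_len), dtype=np.int32)

# Rank 2: the flattened layout the runtime expects ("expected 2 dims").
position_ids_2d = position_ids_3d.reshape(batch, -1)

print(position_ids_3d.ndim)   # 3
print(position_ids_2d.shape)  # (1, 64)
```

This is only an illustration of the dimensionality check, not the actual TensorRT-LLM runtime code; the exact expected layout is version-dependent, which is why the maintainers keep pointing at the latest main branch.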
I'm trying the main branch with chatglm2-6b; I'll sync my result with you later.
I tried the latest main branch and encountered the same error. @byshiue
Need your help: @Nisoka and I both hit this error when using chatglm2 or chatglm3, on the latest main branch.
We cannot reproduce your issue. These are our scripts:

```shell
cd $TRTLLM_PATH
cd examples/chatglm/
python build.py -m chatglm2_6b
cd ../../cpp/build
make gptSessionBenchmark
chmod +x ./benchmarks/gptSessionBenchmark
./benchmarks/gptSessionBenchmark \
    --duration 30 --model chatglm2_6b --engine_dir ../../examples/chatglm/trtModel \
    --batch_size 1 --input_output_len 32,1
```

and the result is:

```
Benchmarking done. Iteration: 1509, duration: 30.02 sec.
[BENCHMARK] batch_size 1 input_length 32 output_length 1 latency(ms) 19.89 tokensPerSec 50.27
```

Can you make sure you are using the latest main branch (the commit is 6755a3f)?
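To double-check which commit a checkout is actually on (e.g. against the 6755a3f mentioned above), a quick sketch of the usual git commands, run from inside the TensorRT-LLM working tree:

```shell
# Print the short hash of the current HEAD; compare it against the
# expected commit. The fallback keeps the command harmless if run
# outside a git checkout.
git rev-parse --short HEAD 2>/dev/null || echo "not inside a git repository"

# Print the current branch name to confirm you are on main.
git rev-parse --abbrev-ref HEAD 2>/dev/null || true
```

Note that with the Triton backend there are two checkouts to verify: the tensorrt_llm_backend repository and the TensorRT-LLM sources it builds against.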
I have double-confirmed my branch is main at the latest commit, but the error is triggered in Triton. Is there some difference between the two?
You also need to make sure TensorRT-LLM itself is on the latest main branch. They are different but related repositories.
I converted the chatglm2-6b model and it runs fine with the build command:
```shell
python3 build.py --model_dir=${model_dir} \
    --dtype float16 \
    --use_gpt_attention_plugin float16 \
    --use_gemm_plugin float16
```
but benchmark failed with the following command:
error message: