
benchmark for chatglm2-6b failed #138

Closed
elinx opened this issue Oct 26, 2023 · 16 comments
Assignees
Labels
bug Something isn't working triaged Issue has been triaged by maintainers

Comments


elinx commented Oct 26, 2023

I converted the chatglm2-6b model and it runs fine with the build command:

python3 build.py --model_dir=${model_dir} \
                 --dtype float16 \
                 --use_gpt_attention_plugin float16 \
                 --use_gemm_plugin float16

but the benchmark fails with the following command:

../../cpp/build/benchmarks/gptSessionBenchmark --duration 30 --model chatglm2-6b --engine_dir /code/tensorrt_llm/examples/chatglm2-6b/trtModel --batch_size 1 --input_output_len 32,1

error message:

[TensorRT-LLM][ERROR] [TensorRT-LLM][ERROR] Assertion failed: position_ids: expected 2 dims, provided 3 dims (/code/tensorrt_llm/cpp/tensorrt_llm/runtime/tllmRuntime.cpp:139)
1       0x561e1c97c6ee tensorrt_llm::common::throwRuntimeError(char const*, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 100
2       0x7f6be25bd53b tensorrt_llm::runtime::TllmRuntime::setInputTensors(int, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<tensorrt_llm::runtime::ITensor>, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<tensorrt_llm::runtime::ITensor> > > > const&) + 1867
3       0x7f6be2587453 tensorrt_llm::runtime::GptSession::generateSingleBatch(tensorrt_llm::runtime::GenerationOutput&, tensorrt_llm::runtime::GenerationInput const&, tensorrt_llm::runtime::SamplingConfig const&) + 2211
4       0x561e1c980537 ../../cpp/build/benchmarks/gptSessionBenchmark(+0x17537) [0x561e1c980537]
5       0x7f6ba4edcd90 /usr/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f6ba4edcd90]
6       0x7f6ba4edce40 __libc_start_main + 128
7       0x561e1c981fe5 ../../cpp/build/benchmarks/gptSessionBenchmark(+0x18fe5) [0x561e1c981fe5]
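The assertion means the runtime expects a rank-2 `position_ids` tensor (one position id per token, shape `[batch_size, seq_len]`) but received a rank-3 tensor, such as the `[batch_size, 2, seq_len]` layout used by ChatGLM-style 2D positional encoding. A minimal NumPy sketch of the mismatch (the shapes and the `check_rank` helper are illustrative assumptions, not the actual TensorRT-LLM API):

```python
import numpy as np

batch_size, seq_len = 1, 32

# What the runtime asserts on: rank-2 position ids, one id per token.
expected = np.arange(seq_len, dtype=np.int32).reshape(batch_size, seq_len)  # 2 dims

# ChatGLM-style 2D positional encoding packs two id rows per sequence,
# which yields a rank-3 tensor and trips the "expected 2 dims, provided 3 dims" check.
provided = np.stack(
    [np.arange(seq_len), np.zeros(seq_len)], axis=0
)[np.newaxis].astype(np.int32)  # shape (1, 2, 32) -> 3 dims

def check_rank(name, tensor, expected_dims):
    """Mimic the rank assertion reported from tllmRuntime.cpp."""
    if tensor.ndim != expected_dims:
        raise RuntimeError(
            f"Assertion failed: {name}: expected {expected_dims} dims, "
            f"provided {tensor.ndim} dims")

check_rank("position_ids", expected, 2)        # passes
try:
    check_rank("position_ids", provided, 2)    # raises: 3 dims
except RuntimeError as e:
    print(e)
```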
@nv-guomingz
Collaborator

Thanks @elinx for reporting this issue.
We've fixed it internally and will publish the patch soon.

@nv-guomingz nv-guomingz added bug Something isn't working triaged Issue has been triaged by maintainers labels Oct 26, 2023
Author

elinx commented Oct 26, 2023

Thanks @elinx for reporting this issue. We've fixed it internally and will publish the patch soon.

I see. Is there any quick way to fix it?

Also, have you finalized the release date for the upcoming version? Would it be possible for me to know it? Thanks.

@jdemouth-nvidia
Collaborator

Hi @elinx,

As explained in #55, we plan to have two branches: the stable and the dev branches. We will update the dev branch soon with a bunch of fixes. The goal is to have a push to the dev branch this Friday (Oct. 27th). To be transparent, we might have to slip the schedule and do it only on Monday (Oct. 30th) but, in both cases, it's coming soon :). The fix will be included in that update of the dev branch.

Thanks,
Julien

Collaborator

byshiue commented Oct 27, 2023

This issue is similar to #93

@byshiue byshiue self-assigned this Oct 27, 2023

calico-niko commented Nov 6, 2023

Has this bug been fixed on main yet?

Author

elinx commented Nov 10, 2023

Has this bug been fixed on main yet?

Yes, based on my testing.

Collaborator

byshiue commented Nov 13, 2023

Thank you for the verification, @elinx. Closing this bug. Feel free to reopen it if needed.

@byshiue byshiue closed this as completed Nov 13, 2023
@chaos318

Hi @elinx,
As explained in #55, we plan to have two branches: the stable and the dev branches. [...] The fix will be included in that update of the dev branch.
Thanks, Julien

Excuse me, I got a similar error when using it in Triton:
[TensorRT-LLM][ERROR] Encountered an error in forward function: [TensorRT-LLM][ERROR] Assertion failed: position_ids: expected 2 dims, provided 3 dims (/app/tensorrt_llm/cpp/tensorrt_llm/runtime/tllmRuntime.cpp:139)
Is it the same root cause, and has it been fixed? Version: origin/release/0.5.0.
I'm looking forward to your answer. Thank you.

Collaborator

byshiue commented Nov 13, 2023

Hi @elinx,
As explained in #55, we plan to have two branches: the stable and the dev branches. [...] The fix will be included in that update of the dev branch.
Thanks, Julien

Excuse me, I got a similar error when using it in Triton: [TensorRT-LLM][ERROR] Encountered an error in forward function: [TensorRT-LLM][ERROR] Assertion failed: position_ids: expected 2 dims, provided 3 dims (/app/tensorrt_llm/cpp/tensorrt_llm/runtime/tllmRuntime.cpp:139) Is it the same root cause, and has it been fixed? Version: origin/release/0.5.0.

Can you try the latest main branch? It should be fixed.


Nisoka commented Nov 17, 2023

I have tried the main branch, and I get the same error when I try chatglm3-6b-chat.

tensorrt_llm_backend: release/0.5.0
tensorrt_llm: main branch

[TensorRT-LLM][ERROR] Encountered an error in forward function: [TensorRT-LLM][ERROR] Assertion failed: position_ids: expected 2 dims, provided 3 dims (/app/tensorrt_llm/cpp/tensorrt_llm/runtime/tllmRuntime.cpp:139)
1 0x7f6b6e7ff645 /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x17645) [0x7f6b6e7ff645]
2 0x7f6b6e8d5ef8 /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0xedef8) [0x7f6b6e8d5ef8]
3 0x7f6b6e8a137b /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0xb937b) [0x7f6b6e8a137b]
4 0x7f6b6e86004f /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x7804f) [0x7f6b6e86004f]
5 0x7f6b6e83b241 /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x53241) [0x7f6b6e83b241]
6 0x7f6b6e83c38a /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x5438a) [0x7f6b6e83c38a]
7 0x7f6bf5064253 /lib/x86_64-linux-gnu/libstdc++.so.6(+0xdc253) [0x7f6bf5064253]
8 0x7f6bf4df4ac3 /lib/x86_64-linux-gnu/libc.so.6(+0x94ac3) [0x7f6bf4df4ac3]
9 0x7f6bf4e85bf4 clone + 68
[TensorRT-LLM][ERROR] Encountered error for requestId 1804289384: Encountered an error in forward function: [TensorRT-LLM][ERROR] Assertion failed: position_ids: expected 2 dims, provided 3 dims (/app/tensorrt_llm/cpp/tensorrt_llm/runtime/tllmRuntime.cpp:139)
1 0x7f6b6e7ff645 /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x17645) [0x7f6b6e7ff645]
2 0x7f6b6e8d5ef8 /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0xedef8) [0x7f6b6e8d5ef8]
3 0x7f6b6e8a137b /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0xb937b) [0x7f6b6e8a137b]
4 0x7f6b6e86004f /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x7804f) [0x7f6b6e86004f]
5 0x7f6b6e83b241 /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x53241) [0x7f6b6e83b241]
6 0x7f6b6e83c38a /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x5438a) [0x7f6b6e83c38a]
7 0x7f6bf5064253 /lib/x86_64-linux-gnu/libstdc++.so.6(+0xdc253) [0x7f6bf5064253]
8 0x7f6bf4df4ac3 /lib/x86_64-linux-gnu/libc.so.6(+0x94ac3) [0x7f6bf4df4ac3]
9 0x7f6bf4e85bf4 clone + 68
[TensorRT-LLM][WARNING] Step function failed, continuing.
^C

@chaos318

I have tried the main branch, and I get the same error when I try chatglm3-6b-chat.

tensorrt_llm_backend: release/0.5.0; tensorrt_llm: main branch

[TensorRT-LLM][ERROR] Encountered an error in forward function: [TensorRT-LLM][ERROR] Assertion failed: position_ids: expected 2 dims, provided 3 dims (/app/tensorrt_llm/cpp/tensorrt_llm/runtime/tllmRuntime.cpp:139) [...]

I'm trying the main branch with chatglm2-6b; I'll sync you my result later.

@chaos318

I have tried the main branch, and I get the same error when I try chatglm3-6b-chat.

tensorrt_llm_backend: release/0.5.0; tensorrt_llm: main branch

[TensorRT-LLM][ERROR] Encountered an error in forward function: [TensorRT-LLM][ERROR] Assertion failed: position_ids: expected 2 dims, provided 3 dims (/app/tensorrt_llm/cpp/tensorrt_llm/runtime/tllmRuntime.cpp:139) [...]

I tried the latest main branch and encountered the same error. @byshiue


chaos318 commented Nov 20, 2023

Hi @elinx,
As explained in #55, we plan to have two branches: the stable and the dev branches. [...] The fix will be included in that update of the dev branch.
Thanks, Julien

Excuse me, I got a similar error when using it in Triton: [TensorRT-LLM][ERROR] Encountered an error in forward function: [TensorRT-LLM][ERROR] Assertion failed: position_ids: expected 2 dims, provided 3 dims [...] Version: origin/release/0.5.0.

Can you try the latest main branch? It should be fixed.

We need your help: @Nisoka and I both hit this error when using chatglm2 or chatglm3, on the latest main branch. Could you tell us how to resolve it? Thank you.

Collaborator

byshiue commented Nov 21, 2023

We cannot reproduce your issue. These are our scripts:

cd $TRTLLM_PATH

cd examples/chatglm/

python build.py -m chatglm2_6b

cd ../../cpp/build

make gptSessionBenchmark

chmod +x ./benchmarks/gptSessionBenchmark

./benchmarks/gptSessionBenchmark \
    --duration 30 --model chatglm2_6b --engine_dir ../../examples/chatglm/trtModel --batch_size 1 --input_output_len 32,1

and the result is

Benchmarking done. Iteration: 1509, duration: 30.02 sec.
[BENCHMARK] batch_size 1 input_length 32 output_length 1 latency(ms) 19.89 tokensPerSec 50.27

Can you make sure you are using the latest main branch (the commit is 6755a3f)?

@chaos318

We cannot reproduce your issue. These are our scripts:

cd $TRTLLM_PATH
cd examples/chatglm/
python build.py -m chatglm2_6b
cd ../../cpp/build
make gptSessionBenchmark
chmod +x ./benchmarks/gptSessionBenchmark
./benchmarks/gptSessionBenchmark \
    --duration 30 --model chatglm2_6b --engine_dir ../../examples/chatglm/trtModel --batch_size 1 --input_output_len 32,1

[...] Can you make sure you are using the latest main branch (the commit is 6755a3f)?

I have double-checked that my branch is main at the latest commit, but the error is triggered in Triton. Is there some difference between them?

Collaborator

byshiue commented Nov 21, 2023

You also need to make sure that TensorRT-LLM itself is on the latest main branch. The two repositories are different but related.
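One quick way to confirm that both checkouts are in sync is to compare their HEAD commits. A small sketch (the checkout paths are hypothetical; adjust them to your environment):

```python
import subprocess

def head_commit(repo_path: str) -> str:
    """Return the short HEAD commit hash of a git checkout."""
    return subprocess.check_output(
        ["git", "-C", repo_path, "rev-parse", "--short", "HEAD"],
        text=True).strip()

# Hypothetical checkout locations; adjust to your setup.
# print(head_commit("/code/tensorrt_llm"))           # TensorRT-LLM itself
# print(head_commit("/code/tensorrt_llm_backend"))   # the Triton backend
```

Comparing both hashes against the tip of each repository's main branch makes it easy to spot when only one of the two was updated.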
