benchmark for chatglm2-6b failed #138
Comments
Thanks @elinx for reporting this issue.
I see. Is there a quick way to fix it? Have you finalized the release date for the upcoming version, and would it be possible for me to know it? Thanks.
Hi @elinx, As explained in #55, we plan to have two branches: the stable and the dev branches. We will update the dev branch soon with a bunch of fixes. The goal is to have a push to the dev branch this Friday (Oct. 27th). To be transparent, we might have to slip the schedule and do it only on Monday (Oct. 30th) but, in both cases, it's coming soon :). The fix will be included in that update of the dev branch. Thanks,
This issue is similar to #93
Has this bug been fixed on main yet?
Yes, based on my testing.
Thank you for the verification, @elinx. Closing this bug. Feel free to reopen it if needed.
Excuse me, I have got a similar error when using it in Triton.
Can you try the latest main branch? It should be fixed.
I have tried the main branch and I get the same error when I try chatglm3-6b-chat with tensorrt_llm_backend release/0.5.0:

```
[TensorRT-LLM][ERROR] Encountered an error in forward function: [TensorRT-LLM][ERROR] Assertion failed: position_ids: expected 2 dims, provided 3 dims (/app/tensorrt_llm/cpp/tensorrt_llm/runtime/tllmRuntime.cpp:139)
```
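For context, the assertion above means the runtime expected a rank-2 `position_ids` tensor but was handed a rank-3 one. A minimal NumPy sketch of the shape mismatch, with hypothetical shapes (ChatGLM-style models historically use 2D positional encodings, so a client can easily end up building a `[batch, 2, seq_len]` tensor where a flattened `[batch, 2 * seq_len]` layout is expected):

```python
import numpy as np

batch, seq_len = 1, 32

# Rank 3: the layout that triggers the assertion ("provided 3 dims").
position_ids_3d = np.zeros((batch, 2, seq_len), dtype=np.int32)

# Rank 2: the flattened layout the runtime expects ("expected 2 dims").
position_ids_2d = position_ids_3d.reshape(batch, -1)

print(position_ids_3d.ndim)   # 3
print(position_ids_2d.shape)  # (1, 64)
```

This is only an illustration of the dimensionality check, not the actual TensorRT-LLM runtime code; the exact expected layout is version-dependent, which is why the maintainers keep pointing at the latest main branch.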
I'm trying the main branch with chatglm2-6b; I'll sync my result with you later.
I tried the latest main branch and encountered the same error. @byshiue
Need your help: @Nisoka and I both hit this error when using chatglm2 or chatglm3, on the latest main branch.
We cannot reproduce your issue. These are our scripts:

```shell
cd $TRTLLM_PATH
cd examples/chatglm/
python build.py -m chatglm2_6b
cd ../../cpp/build
make gptSessionBenchmark
chmod +x ./benchmarks/gptSessionBenchmark
./benchmarks/gptSessionBenchmark \
    --duration 30 --model chatglm2_6b --engine_dir ../../examples/chatglm/trtModel \
    --batch_size 1 --input_output_len 32,1
```

and the result is:

```
Benchmarking done. Iteration: 1509, duration: 30.02 sec.
[BENCHMARK] batch_size 1 input_length 32 output_length 1 latency(ms) 19.89 tokensPerSec 50.27
```

Can you make sure you are using the latest main branch (the commit is 6755a3f)?
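To double-check which commit a checkout is actually on (e.g. against the 6755a3f mentioned above), a quick sketch of the usual git commands, run from inside the TensorRT-LLM working tree:

```shell
# Print the short hash of the current HEAD; compare it against the
# expected commit. The fallback keeps the command harmless if run
# outside a git checkout.
git rev-parse --short HEAD 2>/dev/null || echo "not inside a git repository"

# Print the current branch name to confirm you are on main.
git rev-parse --abbrev-ref HEAD 2>/dev/null || true
```

Note that with the Triton backend there are two checkouts to verify: the tensorrt_llm_backend repository and the TensorRT-LLM sources it builds against.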
I have double-confirmed my branch is main at the latest commit, but the error is triggered in Triton. Is there some difference between the two?
You also need to make sure TensorRT-LLM itself is on the latest main branch. They are different but related repositories.
I converted the chatglm2-6b model and it runs fine with the build command:
```shell
python3 build.py --model_dir=${model_dir} \
    --dtype float16 \
    --use_gpt_attention_plugin float16 \
    --use_gemm_plugin float16
```
but benchmark failed with the following command:
error message: