Qwen-72B-chat-GPTQ TP=4 ERROR #1344
Comments
Do you test your code before releasing it? |
Please fill in the info to help reproduce the issue. We do run tests before releasing. |
I followed the official documentation:
python3 examples/qwen/convert_checkpoint.py \
--model_dir ./Qwen-72B-Chat-Int4/ \
--output_dir ./Qwen-72B-Chat-Int4-TRT/tllm_checkpoint_4gpu_tp4_gptq/ \
--dtype float16 \
--use_weight_only \
--weight_only_precision int4_gptq \
--tp_size 4 \
--pp_size 1 \
--per_group
ValueError: You are trying to save a non contiguous tensor: `transformer.layers.0.mlp.gate.weights_scaling_factor` which is not allowed. It either means you are trying to save tensors which are reference of each other in which case it's recommended to save only the full tensors, and reslice at load time, or simply call `.contiguous()` on your tensor to pack it before saving.
I can fix this error by following the modifications below: TensorRT-LLM/examples/qwen/convert_checkpoint.py Lines 250 to 254 in 66ca337
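The patched lines themselves are not shown in this thread, so the following is only a minimal sketch of the remedy the error message suggests: calling `.contiguous()` on every tensor before saving. The helper name `make_contiguous` and the key `demo.weights_scaling_factor` are hypothetical and do not come from `convert_checkpoint.py`.

```python
import torch


def make_contiguous(weights):
    """Return a new dict in which every tensor is packed contiguously in memory."""
    return {name: tensor.contiguous() for name, tensor in weights.items()}


# A transposed view shares storage with its parent and is not contiguous,
# which is the kind of tensor the ValueError above complains about.
full = torch.arange(12, dtype=torch.float16).reshape(3, 4)
weights = {"demo.weights_scaling_factor": full.t()}  # hypothetical key

was_contiguous = weights["demo.weights_scaling_factor"].is_contiguous()
packed = make_contiguous(weights)
now_contiguous = packed["demo.weights_scaling_factor"].is_contiguous()
print(was_contiguous, now_contiguous)
```

Applying such a helper to the weights dict just before it is written would satisfy the check without changing any values, at the cost of one extra copy for each non-contiguous tensor.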
Then I tried to build the engine:
trtllm-build \
--checkpoint_dir ./Qwen-72B-Chat-Int4-TRT/tllm_checkpoint_4gpu_tp4_gptq/ \
--output_dir ./Qwen-72B-Chat-Int4-TRT/trt_engines/int4-gptq/4-gpu/ \
--max_batch_size 1 \
--max_input_len 2048 \
--max_output_len 512 \
--gather_all_token_logits \
--gemm_plugin float16 \
--tp_size 4
RuntimeError: Encounter error 'The value updated is not the same shape as the original. Updated: (8192, 6144), original: (8192, 1536)' for parameter 'transformer.layers.0.attention.qkv.weight'
I don't know how to fix it. |
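The two shapes in this error differ by exactly a factor of 4 in the last dimension, which matches `--tp_size 4`. One reading consistent with those numbers (an assumption; the thread does not confirm the root cause) is that the checkpoint holds the full `qkv` weight while the engine expects one rank's slice of it. The arithmetic can be checked with a short sketch; `column_parallel_shard_shape` is a hypothetical helper, not a TensorRT-LLM function.

```python
def column_parallel_shard_shape(full_shape, tp_size):
    """Shape of one rank's slice when the last dimension is split evenly across ranks."""
    rows, cols = full_shape
    if cols % tp_size != 0:
        raise ValueError(f"{cols} columns cannot be split evenly across {tp_size} ranks")
    return (rows, cols // tp_size)


updated = (8192, 6144)   # "Updated" shape from the error message
original = (8192, 1536)  # "original" shape from the error message
tp_size = 4              # value passed as --tp_size

per_rank = column_parallel_shard_shape(updated, tp_size)
shapes_match_after_split = per_rank == original
print(per_rank, shapes_match_after_split)
```

If the shapes match after the split, as they do here, the mismatch is at least consistent with a missing tensor-parallel split during checkpoint conversion.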
One more point to add, |
Looking forward to your reply @byshiue |
I suffered this problem too. |
I suffered this problem too. |
+1 |
Is the error message you encountered the same as mine? |
Yes. I used the same commands to build the engine under trt-llm 0.9.0.dev2024040200. |
@Hukongtao Sure, I am working on this, I will keep you posted. |
@Hukongtao @HermitSun
|
@adamydwang @ZhangJinxin1 |
It works in my situation. Thank you for your effort! |
Thank you for your work! |
System Info
xx
Who can help?
No response
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
xx
Expected behavior
xx
actual behavior
xx
additional notes
xx