
[Bug]: MiniCPM3 errors out during GPTQ quantization #252

Open
vokkko opened this issue Oct 11, 2024 · 2 comments
Labels: bug (Something isn't working), triage

Comments

vokkko commented Oct 11, 2024

Is there an existing issue?

  • I have searched, and there is no existing issue.

Describe the bug

After installing the auto_gptq dependency from the MiniCPM branch of AutoGPTQ and updating the weight path, running the script fails with:
```
(base) root@autodl-container-aaf946aa7c-bb3ebebc:~/autodl-tmp/MiniCPM/quantize# python gptq_quantize.py
/root/miniconda3/envs/LLMQ/lib/python3.10/site-packages/auto_gptq/nn_modules/triton_utils/kernels.py:411: FutureWarning: torch.cuda.amp.custom_fwd(args...) is deprecated. Please use torch.amp.custom_fwd(args..., device_type='cuda') instead.
  def forward(ctx, input, qweight, scales, qzeros, g_idx, bits, maxq):
/root/miniconda3/envs/LLMQ/lib/python3.10/site-packages/auto_gptq/nn_modules/triton_utils/kernels.py:419: FutureWarning: torch.cuda.amp.custom_bwd(args...) is deprecated. Please use torch.amp.custom_bwd(args..., device_type='cuda') instead.
  def backward(ctx, grad_output):
/root/miniconda3/envs/LLMQ/lib/python3.10/site-packages/auto_gptq/nn_modules/triton_utils/kernels.py:461: FutureWarning: torch.cuda.amp.custom_fwd(args...) is deprecated. Please use torch.amp.custom_fwd(args..., device_type='cuda') instead.
  @custom_fwd(cast_inputs=torch.float16)
Generating train split: 256 examples [00:00, 69732.55 examples/s]
Parameter 'function'=<function load_data.<locals>.tokenize at 0x785813624ca0> of the transform datasets.arrow_dataset.Dataset._map_single couldn't be hashed properly, a random hash was used instead. Make sure your transforms and parameters are serializable with pickle or dill for the dataset fingerprinting and caching to work. If you reuse this transform, the caching mechanism will consider it to be different from the previous calls and recompute everything. This warning is only showed once. Subsequent hashing failures won't be showed.
WARNING:datasets.fingerprint:Parameter 'function'=<function load_data.<locals>.tokenize at 0x785813624ca0> of the transform datasets.arrow_dataset.Dataset._map_single couldn't be hashed properly, a random hash was used instead. Make sure your transforms and parameters are serializable with pickle or dill for the dataset fingerprinting and caching to work. If you reuse this transform, the caching mechanism will consider it to be different from the previous calls and recompute everything. This warning is only showed once. Subsequent hashing failures won't be showed.
Map: 100%|██████████| 256/256 [00:00<00:00, 803.98 examples/s]
Traceback (most recent call last):
  File "/root/autodl-tmp/MiniCPM/quantize/gptq_quantize.py", line 244, in <module>
    main()
  File "/root/autodl-tmp/MiniCPM/quantize/gptq_quantize.py", line 196, in main
    model.quantize(
  File "/root/miniconda3/envs/LLMQ/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/LLMQ/lib/python3.10/site-packages/auto_gptq/modeling/_base.py", line 209, in quantize
    examples = self._prepare_examples_for_quantization(examples, batch_size)
  File "/root/miniconda3/envs/LLMQ/lib/python3.10/site-packages/auto_gptq/modeling/_base.py", line 165, in _prepare_examples_for_quantization
    new_examples = [
  File "/root/miniconda3/envs/LLMQ/lib/python3.10/site-packages/auto_gptq/modeling/_base.py", line 166, in <listcomp>
    collate_data(new_examples[start : start + batch_size], pad_token_id)
  File "/root/miniconda3/envs/LLMQ/lib/python3.10/site-packages/auto_gptq/utils/data_utils.py", line 161, in collate_data
    input_ids_blocks[i] = pad_block(input_ids_blocks[i], torch.ones((block_bsz, pad_num)) * pad_token_id)
TypeError: only integer tensors of a single element can be converted to an index
```
Is this a problem with the calibration dataset format?
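
A hedged reading of the trace: auto_gptq's `collate_data` appears to build its padding block as `torch.ones((block_bsz, pad_num)) * pad_token_id` after unpacking `input_ids.shape` into exactly two values, so a `pad_token_id` that is `None` (or not a plain int), or examples with an unexpected extra nesting level, could both end in this `TypeError`. A minimal pre-flight sketch; the names `tokenizer` and `examples` are assumptions, not taken from gptq_quantize.py:

```python
# Hypothetical pre-flight check before model.quantize(examples); the names
# `tokenizer` and `examples` are assumptions about the script, not its code.
import torch

# collate_data pads with torch.ones((block_bsz, pad_num)) * pad_token_id,
# so pad_token_id should be a plain Python int. If the MiniCPM tokenizer
# does not define one, falling back to eos_token_id is a common choice.
if tokenizer.pad_token_id is None:
    tokenizer.pad_token_id = tokenizer.eos_token_id
assert isinstance(tokenizer.pad_token_id, int), tokenizer.pad_token_id

for ex in examples:
    ids = torch.as_tensor(ex["input_ids"])
    # collate_data unpacks input_ids.shape into (block_bsz, block_inp_len),
    # so anything beyond a 1-D or 2-D tensor per example will break.
    assert ids.dim() <= 2, f"unexpected input_ids shape {tuple(ids.shape)}"
```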

To Reproduce

Printing the contents of new_examples:

```
{'input_ids': [[73441, 1345, 43686, 5, 9468, 1358, 2800, 16694, 1385, 3382, 5498, 1592, 1384, 2144, 1476, 6380, 1753, 5069, 59358, 1496, 4221, 1502, 4757, 1536, 1421, 1807, 2701, 2471, 59354, 73440, 1345, 5, 73441, 1345, 1836, 5, 4221, 59361, 1352, 4757, 1536, 1421, 1807, 2701, 2471, 72, 73440, 1345, 5]], 'attention_mask': [[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]], 'labels': [[73441, 1345, 43686, 5, 9468, 1358, 2800, 16694, 1385, 3382, 5498, 1592, 1384, 2144, 1476, 6380, 1753, 5069, 59358, 1496, 4221, 1502, 4757, 1536, 1421, 1807, 2701, 2471, 59354, 73440, 1345, 5, 73441, 1345, 1836, 5, 4221, 59361, 1352, 4757, 1536, 1421, 1807, 2701, 2471, 72, 73440, 1345, 5]]}
```
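
For comparison, auto_gptq's `quantize()` takes `examples` as a list of dicts, one per calibration sample, with `input_ids`/`attention_mask` as flat lists of ints or (1, seq_len) tensors. A sketch of restructuring the data printed above into that layout; `dataset` and `model` are assumed names, not taken from the script:

```python
import torch

# Illustrative restructuring: `dataset` stands in for the tokenized
# calibration set whose first row is printed above, where each field is
# already nested as [[...]], so torch.tensor(...) yields shape (1, seq_len).
examples = [
    {
        "input_ids": torch.tensor(row["input_ids"], dtype=torch.long),
        "attention_mask": torch.tensor(row["attention_mask"], dtype=torch.long),
    }
    for row in dataset
]

# With batch_size=1 each batch holds a single block, so collate_data should
# never need to pad across samples and pad_token_id is never multiplied in.
model.quantize(examples, batch_size=1)
```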

Expected behavior

No response

Screenshots

No response

Environment

- OS: [e.g. Ubuntu 22.04]
- Pytorch: [e.g. torch 2.4.1]
- CUDA: [e.g. CUDA 12.1]
- Device: [e.g. RTX4090]

Additional context

No response

vokkko added the bug (Something isn't working) and triage labels on Oct 11, 2024
LDLINGLINGLING (Collaborator) commented:
Hi, you can use our pre-quantized model instead; it has already been accuracy-calibrated.
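
For reference, a hedged sketch of loading such a pre-quantized checkpoint with auto_gptq; the repo id used here is an assumption, not given in this thread:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# The repo id below is an assumption about where the calibrated GPTQ
# weights live; substitute the checkpoint the maintainers point to.
repo = "openbmb/MiniCPM3-4B-GPTQ-Int4"

tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
# from_quantized loads the already-quantized weights directly, skipping
# the calibration step that fails above.
model = AutoGPTQForCausalLM.from_quantized(repo, device="cuda:0", trust_remote_code=True)
```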

vokkko (Author) commented Oct 11, 2024:

> Hi, you can use our pre-quantized model instead; it has already been accuracy-calibrated.

I want to fine-tune the 4B model at full precision and then quantize it with GPTQ, but I'm currently running into this issue.
