You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
[10/29/2024-17:08:19] [TRT-LLM] [W] Found pynvml==11.5.3 and cuda driver version 470.182.03. Please use pynvml>=11.5.0 and cuda driver>=526 to get accurate memory usage.
[TensorRT-LLM] TensorRT-LLM version: 0.13.0
0.13.0
229it [00:02, 93.96it/s]
Traceback (most recent call last):
File "/home/tensorrtllm_backend/tensorrt_llm/examples/qwen/convert_checkpoint.py", line 303, in <module>
main()
File "/home/tensorrtllm_backend/tensorrt_llm/examples/qwen/convert_checkpoint.py", line 295, in main
convert_and_save_hf(args)
File "/home/tensorrtllm_backend/tensorrt_llm/examples/qwen/convert_checkpoint.py", line 251, in convert_and_save_hf
execute(args.workers, [convert_and_save_rank] * world_size, args)
File "/home/tensorrtllm_backend/tensorrt_llm/examples/qwen/convert_checkpoint.py", line 258, in execute
f(args, rank)
File "/home/tensorrtllm_backend/tensorrt_llm/examples/qwen/convert_checkpoint.py", line 241, in convert_and_save_rank
qwen = QWenForCausalLM.from_hugging_face(
File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/qwen/model.py", line 427, in from_hugging_face
loader.generate_tllm_weights(model)
File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/model_weights_loader.py", line 357, in generate_tllm_weights
self.load(tllm_key,
File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/model_weights_loader.py", line 278, in load
v = sub_module.postprocess(tllm_key, v, **postprocess_kwargs)
File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/layers/linear.py", line 391, in postprocess
weights = weights.to(str_dtype_to_torch(self.dtype))
AttributeError: 'NoneType' object has no attribute 'to'
Exception ignored in: <function PretrainedModel.__del__ at 0x7f778f992050>
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/modeling_utils.py", line 453, in __del__
self.release()
File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/modeling_utils.py", line 450, in release
release_gc()
File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/_utils.py", line 471, in release_gc
torch.cuda.ipc_collect()
File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 901, in ipc_collect
_lazy_init()
File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 330, in _lazy_init
raise DeferredCudaCallError(msg) from e
torch.cuda.DeferredCudaCallError: CUDA call failed lazily at initialization with error: 'NoneType' object is not iterable
Expected behavior
convert success
actual behavior
convert failed
additional notes
tensorRT-llm 0.13.0
The text was updated successfully, but these errors were encountered:
System Info
Who can help?
No response
Information
Tasks
examples
folder (such as GLUE/SQuAD, ...)Reproduction
I execute convert script in docker:
command:
model file:
exception stack:
Expected behavior
convert success
actual behavior
convert failed
additional notes
tensorRT-llm 0.13.0
The text was updated successfully, but these errors were encountered: