If you upgrade to the new accelerate, 0.33.0, BNB QLoRA training crashes with this stack trace:
```
loading checkpoint file model-00001-of-00030.safetensors
load params into module <class 'llama_pipe.LlamaDecoderLayerPipe'>
Traceback (most recent call last):
  File "/home/alyssa/lm_fun/qlora-pipe/train.py", line 418, in <module>
    pipeline_model, lora_model, lora_config = load_pipeline_model_with_lora(config, model_type)
  File "/home/alyssa/lm_fun/qlora-pipe/train.py", line 279, in load_pipeline_model_with_lora
    pipeline_model = engine.CustomPipelineModule(
  File "/home/alyssa/lm_fun/qlora-pipe/engine.py", line 274, in __init__
    super().__init__(layers, **kwargs)
  File "/home/alyssa/anaconda3/envs/lm_fun/lib/python3.10/site-packages/deepspeed/runtime/pipe/module.py", line 212, in __init__
    self._build()
  File "/home/alyssa/anaconda3/envs/lm_fun/lib/python3.10/site-packages/deepspeed/runtime/pipe/module.py", line 268, in _build
    module = layer.build()
  File "/home/alyssa/lm_fun/qlora-pipe/pipeline_model.py", line 75, in build
    return self.typename(*self.module_args, **self.module_kwargs)
  File "/home/alyssa/lm_fun/qlora-pipe/llama_pipe.py", line 113, in __init__
    loader_util.load_state_dict_into_module(self)
  File "/home/alyssa/lm_fun/qlora-pipe/pipeline_model.py", line 316, in load_state_dict_into_module
    transformers.modeling_utils._load_state_dict_into_meta_model(
  File "/home/alyssa/anaconda3/envs/lm_fun/lib/python3.10/site-packages/transformers/modeling_utils.py", line 961, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/alyssa/anaconda3/envs/lm_fun/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 436, in set_module_tensor_to_device
    new_value = param_cls(new_value, requires_grad=old_value.requires_grad, **kwargs).to(device)
TypeError: Params4bit.__new__() got an unexpected keyword argument 'original_name'
```
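For context, a minimal sketch of the failure mechanism (this is a hypothetical simplification, not the actual accelerate or bitsandbytes code): when a loader rebuilds a parameter by forwarding every collected kwarg into the parameter class, any `Parameter` subclass whose `__new__` has a fixed keyword set will raise exactly this kind of `TypeError` on an unrecognized kwarg such as `original_name`.

```python
class Params4bitLike:
    """Hypothetical stand-in for bitsandbytes' Params4bit: __new__ accepts
    only a fixed set of keyword arguments, so any extra metadata kwarg
    forwarded by a loader raises TypeError."""

    def __new__(cls, data, requires_grad=False, quant_state=None):
        obj = super().__new__(cls)
        obj.data = data
        obj.requires_grad = requires_grad
        obj.quant_state = quant_state
        return obj


# A loader rebuilding the parameter as param_cls(value, requires_grad=..., **kwargs)
# crashes as soon as kwargs carries an attribute the subclass doesn't know about:
kwargs = {"quant_state": None, "original_name": "weight"}  # 'original_name' is the offender
try:
    Params4bitLike([0.0], requires_grad=False, **kwargs)
except TypeError as err:
    print(err)
```

This is only meant to illustrate why the crash surfaces inside `set_module_tensor_to_device` rather than in bitsandbytes itself: the incompatible kwarg originates on the accelerate side and the subclass's `__new__` is simply the first place it is rejected.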
Suspect it's because of this PR:
huggingface/accelerate#2934
This PR might also be relevant:
huggingface/accelerate#2986
Reverting to Accelerate 0.32.0 resolves the crash. Thank you!
This should be fixed as of the latest commits now.