While training the policy I get the following error:
RuntimeError: expected scalar type Float but found BFloat16
Here are the logs:
Traceback (most recent call last):
File "/home/LLaVA-RLHF/RLHF_t/finetune_lora_ppo.py", line 560, in <module>
train()
File "/home/LLaVA-RLHF/RLHF_t/finetune_lora_ppo.py", line 552, in train
trainer.train(
File "/home/LLaVA-RLHF/RLHF_t/models/rl_trainer.py", line 324, in train
self.log_history.append(self.step(infinite_train_dataloader, step_idx))
File "/home/LLaVA-RLHF/RLHF_t/models/rl_trainer.py", line 249, in step
rollouts = self.rollout(queries_batches)
File "/home/miniconda3/envs/rlhf/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/LLaVA-RLHF/RLHF_t/models/ppo_trainer.py", line 245, in rollout
respond_outputs = unwrapped_policy.respond(
File "/home/LLaVA-RLHF/RLHF_t/models/rl_models.py", line 339, in respond
return self.policy.respond(
File "/home/LLaVA-RLHF/RLHF_t/models/rl_models.py", line 74, in respond
self._respond(
File "/home/LLaVA-RLHF/RLHF_t/models/rl_models.py", line 164, in _respond
sequences = self.base_model.generate(
File "/home/miniconda3/envs/rlhf/lib/python3.10/site-packages/peft/peft_model.py", line 977, in generate
outputs = self.base_model.generate(**kwargs)
File "/home/miniconda3/envs/rlhf/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/miniconda3/envs/rlhf/lib/python3.10/site-packages/transformers/generation/utils.py", line 1588, in generate
return self.sample(
File "/home/miniconda3/envs/rlhf/lib/python3.10/site-packages/transformers/generation/utils.py", line 2642, in sample
outputs = self(
File "/home/miniconda3/envs/rlhf/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/miniconda3/envs/rlhf/lib/python3.10/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home/LLaVA/llava/model/language_model/llava_llama.py", line 90, in forward
logits = self.lm_head(hidden_states)
File "/home/miniconda3/envs/rlhf/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/miniconda3/envs/rlhf/lib/python3.10/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home/miniconda3/envs/rlhf/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
PyTorch version: 2.0.1+cu118
CUDA version: 12.6
Looking forward to a response.