Some weights of the model checkpoint at ./ were not used when initializing InternLMForCausalLM: ['lm_head.0.weight']
Some weights of InternLMForCausalLM were not initialized from the model checkpoint at ./ and are newly initialized: ['lm_head.weight']
Some weights of the model checkpoint at ./ were not used when initializing LlamaForCausalLM: ['lm_head.0.weight']
Some weights of LlamaForCausalLM were not initialized from the model checkpoint at ./ and are newly initialized: ['lm_head.weight']
Describe the bug
https://github.com/shibing624/MedicalGPT/blob/main/supervised_finetuning.py#L839
This line of code replaces model.lm_head, changing it from torch.nn.Linear to torch.nn.Sequential.
As a result, when the model is later loaded with AutoModelForCausalLM.from_pretrained() or LlamaForCausalLM.from_pretrained(), the corresponding weight cannot be found (see the warnings above). So once full-parameter SFT is finished, ignoring these warnings when loading the model has a severe negative impact on the results: it is equivalent to the entire lm_head output layer being randomly reinitialized.
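A minimal sketch of the issue, assuming the wrapping follows the common CastOutputToFloat pattern from the PEFT examples (the class name and dimensions here are illustrative, not taken from the repo): wrapping the head in an nn.Sequential shifts the parameter key from lm_head.weight to lm_head.0.weight, which is exactly the mismatch from_pretrained() reports.

```python
import torch
import torch.nn as nn

# Stand-in for a causal LM head; in the real model this is model.lm_head.
lm_head = nn.Linear(4096, 32000, bias=False)

class CastOutputToFloat(nn.Sequential):
    # Training-time wrapper that casts the logits to float32.
    def forward(self, x):
        return super().forward(x).to(torch.float32)

wrapped = CastOutputToFloat(lm_head)

print(list(lm_head.state_dict().keys()))  # ['weight']   -> saved as lm_head.weight
print(list(wrapped.state_dict().keys()))  # ['0.weight'] -> saved as lm_head.0.weight
```

If the model is saved in this state, the checkpoint contains lm_head.0.weight instead of lm_head.weight. A possible workaround (an assumption on my part, not a fix documented by the repo) is to restore the original module before saving, e.g. model.lm_head = model.lm_head[0], or to rename the key in the state dict before loading.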
One more question: why cast the output to float32? Wouldn't fp16 work?