Changing the type of model.lm_head causes weights to be missing at load time #213

Closed
jiangtann opened this issue Sep 19, 2023 · 1 comment
Labels: bug (Something isn't working)

Comments

@jiangtann (Contributor)

Describe the bug

https://github.com/shibing624/MedicalGPT/blob/main/supervised_finetuning.py#L839
This line of code changes model.lm_head from torch.nn.Linear to torch.nn.Sequential.
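
For context, here is a minimal sketch of the wrapping pattern (the exact class body in supervised_finetuning.py may differ); wrapping lm_head in an nn.Sequential subclass renames its parameter in the state_dict:

```python
import torch
import torch.nn as nn

# Sketch of the cast-to-float wrapper pattern; the actual class
# in supervised_finetuning.py may differ in detail.
class CastOutputToFloat(nn.Sequential):
    def forward(self, x):
        return super().forward(x).to(torch.float32)

lm_head = nn.Linear(4096, 32000, bias=False)
print(list(lm_head.state_dict()))    # ['weight']

wrapped = CastOutputToFloat(lm_head)
print(list(wrapped.state_dict()))    # ['0.weight']
# Inside the full model the key becomes 'lm_head.0.weight' instead of
# 'lm_head.weight' -- exactly the mismatch shown in the warnings below.
```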

As a result, when the model is loaded with AutoModelForCausalLM.from_pretrained() or LlamaForCausalLM.from_pretrained(), the corresponding weight cannot be found:

Some weights of the model checkpoint at ./ were not used when initializing InternLMForCausalLM: ['lm_head.0.weight']
Some weights of InternLMForCausalLM were not initialized from the model checkpoint at ./ and are newly initialized: ['lm_head.weight']

Some weights of the model checkpoint at ./ were not used when initializing LlamaForCausalLM: ['lm_head.0.weight']
Some weights of LlamaForCausalLM were not initialized from the model checkpoint at ./ and are newly initialized: ['lm_head.weight']

Therefore, after full-parameter SFT finishes, ignoring these warnings when loading the model has a severe negative impact on results: it is as if the entire lm_head output layer had been randomly re-initialized.
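
One possible workaround (my own suggestion, not necessarily the fix the repo adopted) is to unwrap lm_head back to the inner nn.Linear before saving, so the checkpoint keeps the standard 'lm_head.weight' key:

```python
import torch.nn as nn

def unwrap_lm_head(model):
    # Hypothetical helper: restore the original nn.Linear so that
    # save_pretrained writes the standard 'lm_head.weight' key.
    if isinstance(model.lm_head, nn.Sequential):
        model.lm_head = model.lm_head[0]
    return model

# e.g. unwrap_lm_head(model).save_pretrained(output_dir)
```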

A separate question: why cast the output to float32? Wouldn't fp16 work?

jiangtann added the bug (Something isn't working) label on Sep 19, 2023
@shibing624 (Owner)

Merged. CastOutputToFloat just preserves precision during model training; it is needed for ChatGLM training. For other models it can be left out.
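
To illustrate the precision point (a toy example, not taken from the ChatGLM code): fp16 saturates at 65504, so intermediate exp() values in the loss computation can overflow where fp32 would not:

```python
import torch

# fp16 tops out at 65504, so exp() already overflows for inputs
# around 11.1; fp32 is safe up to roughly 88.
x = torch.tensor([12.0])
print(torch.exp(x.half()))   # tensor([inf], dtype=torch.float16)
print(torch.exp(x))          # tensor([162754.7969])
```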
