Changing the type of model.lm_head causes weights to be missing at load time #213

Closed
jiangtann opened this issue Sep 19, 2023 · 1 comment
Labels: bug (Something isn't working)

Comments

@jiangtann (Contributor)

Describe the bug

https://github.com/shibing624/MedicalGPT/blob/main/supervised_finetuning.py#L839
This line of code changes model.lm_head from torch.nn.Linear to torch.nn.Sequential.
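
For context, here is a minimal sketch of the wrapping pattern (the exact class body in supervised_finetuning.py may differ); wrapping lm_head in an nn.Sequential subclass renames its parameter in the state_dict:

```python
import torch
import torch.nn as nn

# Sketch of the cast-to-float wrapper pattern; the actual class
# in supervised_finetuning.py may differ in detail.
class CastOutputToFloat(nn.Sequential):
    def forward(self, x):
        return super().forward(x).to(torch.float32)

lm_head = nn.Linear(4096, 32000, bias=False)
print(list(lm_head.state_dict()))    # ['weight']

wrapped = CastOutputToFloat(lm_head)
print(list(wrapped.state_dict()))    # ['0.weight']
# Inside the full model the key becomes 'lm_head.0.weight' instead of
# 'lm_head.weight' -- exactly the mismatch shown in the warnings below.
```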

As a result, when the model is loaded with AutoModelForCausalLM.from_pretrained() or LlamaForCausalLM.from_pretrained(), the corresponding weight cannot be found:

Some weights of the model checkpoint at ./ were not used when initializing InternLMForCausalLM: ['lm_head.0.weight']
Some weights of InternLMForCausalLM were not initialized from the model checkpoint at ./ and are newly initialized: ['lm_head.weight']

Some weights of the model checkpoint at ./ were not used when initializing LlamaForCausalLM: ['lm_head.0.weight']
Some weights of LlamaForCausalLM were not initialized from the model checkpoint at ./ and are newly initialized: ['lm_head.weight']

Therefore, after full-parameter SFT finishes, ignoring these warnings when loading the model has a severe negative impact on results: it is as if the entire lm_head output layer had been randomly re-initialized.
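
One possible workaround (my own suggestion, not necessarily the fix the repo adopted) is to unwrap lm_head back to the inner nn.Linear before saving, so the checkpoint keeps the standard 'lm_head.weight' key:

```python
import torch.nn as nn

def unwrap_lm_head(model):
    # Hypothetical helper: restore the original nn.Linear so that
    # save_pretrained writes the standard 'lm_head.weight' key.
    if isinstance(model.lm_head, nn.Sequential):
        model.lm_head = model.lm_head[0]
    return model

# e.g. unwrap_lm_head(model).save_pretrained(output_dir)
```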

A separate question: why cast the output to float32? Wouldn't fp16 work?

jiangtann added the bug (Something isn't working) label on Sep 19, 2023
@shibing624 (Owner)

Merged. CastOutputToFloat just preserves precision during model training; it is needed for ChatGLM training. For other models it can be left out.
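
To illustrate the precision point (a toy example, not taken from the ChatGLM code): fp16 saturates at 65504, so intermediate exp() values in the loss computation can overflow where fp32 would not:

```python
import torch

# fp16 tops out at 65504, so exp() already overflows for inputs
# around 11.1; fp32 is safe up to roughly 88.
x = torch.tensor([12.0])
print(torch.exp(x.half()))   # tensor([inf], dtype=torch.float16)
print(torch.exp(x))          # tensor([162754.7969])
```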
