Thanks for your helpful work. I have a few questions about the model pre-training stage:

1. For biological or medical terms, how should they be added to the existing LLaMA vocabulary? (A rough sketch of what I'm imagining is attached after this list.)
2. Would regenerating the tokens for the new corpus bring a better loss payoff?
3. I tried tokenizing some biomedical terms with the current tokenizer, and they seem to be split into many scattered subword pieces; have you noticed this? (The check I ran is also attached after the list.)
4. I found that the loss only drops a limited amount after 1 epoch of training, and it takes several more epochs for it to really come down; doesn't this add a lot of training time?
5. Roughly what pre-training loss is a reasonable level to reach before moving on to SFT?
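For question 1, this is a rough sketch of the kind of vocabulary extension I was imagining, assuming the Hugging Face transformers API (`add_tokens` plus `resize_token_embeddings`); the checkpoint path and the medical terms are placeholders of my own, and I'm not sure this matches the approach used in this repo:

```python
from transformers import LlamaForCausalLM, LlamaTokenizer

# Hypothetical local checkpoint path, just for illustration.
tokenizer = LlamaTokenizer.from_pretrained("path/to/llama")
model = LlamaForCausalLM.from_pretrained("path/to/llama")

# Placeholder biomedical terms to add as whole tokens.
new_terms = ["acetylcholinesterase", "bronchoscopy", "myocardial infarction"]
num_added = tokenizer.add_tokens(new_terms)

# The embedding matrix must grow to cover the new token ids; the new rows
# are randomly initialized and would still need further pre-training.
model.resize_token_embeddings(len(tokenizer))
print(f"added {num_added} tokens, new vocab size = {len(tokenizer)}")
```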
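And for question 3, this is roughly how I checked the tokenization; the model path and the example terms are again placeholders, not from this repo:

```python
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("path/to/llama")  # hypothetical path

# A few example domain terms; the longer ones come out as many short pieces,
# which is what I meant by "scattered" in question 3.
terms = ["acetylcholinesterase", "myocardial infarction", "肝细胞癌"]
for term in terms:
    pieces = tokenizer.tokenize(term)
    print(f"{term} -> {pieces} ({len(pieces)} tokens)")
```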
Looking forward to your reply. Thanks!