On zero initialization and the position of the expanded layers #28
Comments
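Thanks a lot! On the second point, I read the paper and it makes sense to me, but I still don't understand why down_proj and o_proj in particular are the ones that get zeroed. Would zeroing up_proj also work?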
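We followed the adapter approach and zeroed the weights at the output of each sub-layer. I haven't worked out what happens if you zero up_proj instead, so I'm not sure whether there would still be a gradient. You could give it a try.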
Hello, a question: if down_proj and o_proj are initialized to 0, do o_proj and down_proj still receive gradients?
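Both gradient questions can be checked numerically. The sketch below is not the project's code: it assumes a LLaMA-style gated MLP of the form `down_proj(silu(gate_proj(x)) * up_proj(x))`, uses tiny hand-picked shapes, and estimates gradients with central finite differences so that it needs nothing beyond the standard library. The helper names (`loss`, `max_abs_grads`) and the input and target values are made up for illustration.

```python
import math
import random


def silu(v):
    return v / (1.0 + math.exp(-v))


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def loss(params, x, target):
    # Assumed block structure: y = down_proj(silu(gate_proj(x)) * up_proj(x))
    g = matvec(params["gate_proj"], x)
    u = matvec(params["up_proj"], x)
    h = [silu(gi) * ui for gi, ui in zip(g, u)]
    y = matvec(params["down_proj"], h)
    return 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(y, target))


def max_abs_grads(zeroed, eps=1e-5, seed=0):
    """Largest |dL/dW| per matrix when the matrix named `zeroed` is all zeros."""
    rng = random.Random(seed)
    shapes = {"gate_proj": (3, 2), "up_proj": (3, 2), "down_proj": (2, 3)}
    params = {
        name: [
            [0.0 if name == zeroed else rng.uniform(0.5, 1.5) for _ in range(cols)]
            for _ in range(rows)
        ]
        for name, (rows, cols) in shapes.items()
    }
    x, target = [1.0, 2.0], [-0.3, -0.7]  # arbitrary illustrative values
    result = {}
    for name, W in params.items():
        largest = 0.0
        for row in W:
            for j in range(len(row)):
                original = row[j]
                row[j] = original + eps
                loss_plus = loss(params, x, target)
                row[j] = original - eps
                loss_minus = loss(params, x, target)
                row[j] = original  # restore the weight
                grad = (loss_plus - loss_minus) / (2 * eps)
                largest = max(largest, abs(grad))
        result[name] = largest
    return result


if __name__ == "__main__":
    for name in ("down_proj", "up_proj"):
        print(name, "zeroed ->", max_abs_grads(name))
```

Under these assumptions, the zeroed matrix is the only one with a nonzero gradient at the first step, in both cases. The gradient of a linear layer's weight is the outer product of the upstream gradient and the layer's input, neither of which depends on the weight itself, so a zeroed down_proj still receives a gradient, while gate_proj and up_proj get none until down_proj has moved away from zero. The same argument applies to a zeroed o_proj, whose input is the attention output. Zeroing up_proj instead also leaves the block output at zero and up_proj itself trainable, with down_proj and gate_proj receiving no gradient at first.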
Thanks for the answer! [cupped-fist salute]