Why is the implementation here different from the one in the paper?

    import torch

    def rotate_half(x):
        """Rotates half the hidden dims of the input."""
        x1 = x[..., : x.shape[-1] // 2]   # first half of the last dimension
        x2 = x[..., x.shape[-1] // 2:]    # second half of the last dimension
        return torch.cat((-x2, x1), dim=-1)

The computation in the paper is:
With this implementation, the final result would instead be:
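
For reference, the two forms being compared can be written out roughly as follows. This is a sketch based on the RoFormer paper's formulation and on the `rotate_half` code above. The concatenated-halves layout of the cos/sin vectors in the second equation is an assumption taken from the Hugging Face LLaMA implementation, since only `rotate_half` itself is quoted here. $m$ is the token position, $d$ the head dimension, and $\otimes$ is element-wise multiplication.

$$
\text{paper:}\quad
R_m x =
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ \vdots \\ x_{d-1} \\ x_d \end{pmatrix}
\otimes
\begin{pmatrix} \cos m\theta_1 \\ \cos m\theta_1 \\ \cos m\theta_2 \\ \cos m\theta_2 \\ \vdots \\ \cos m\theta_{d/2} \\ \cos m\theta_{d/2} \end{pmatrix}
+
\begin{pmatrix} -x_2 \\ x_1 \\ -x_4 \\ x_3 \\ \vdots \\ -x_d \\ x_{d-1} \end{pmatrix}
\otimes
\begin{pmatrix} \sin m\theta_1 \\ \sin m\theta_1 \\ \sin m\theta_2 \\ \sin m\theta_2 \\ \vdots \\ \sin m\theta_{d/2} \\ \sin m\theta_{d/2} \end{pmatrix}
$$

$$
\texttt{rotate\_half}:\quad
R'_m x =
\begin{pmatrix} x_1 \\ \vdots \\ x_{d/2} \\ x_{d/2+1} \\ \vdots \\ x_d \end{pmatrix}
\otimes
\begin{pmatrix} \cos m\theta_1 \\ \vdots \\ \cos m\theta_{d/2} \\ \cos m\theta_1 \\ \vdots \\ \cos m\theta_{d/2} \end{pmatrix}
+
\begin{pmatrix} -x_{d/2+1} \\ \vdots \\ -x_d \\ x_1 \\ \vdots \\ x_{d/2} \end{pmatrix}
\otimes
\begin{pmatrix} \sin m\theta_1 \\ \vdots \\ \sin m\theta_{d/2} \\ \sin m\theta_1 \\ \vdots \\ \sin m\theta_{d/2} \end{pmatrix}
$$

In the first form each rotation acts on the adjacent pair $(x_{2i-1}, x_{2i})$. In the second it acts on the pair $(x_i, x_{i+d/2})$.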
I see that Hugging Face does it the same way, so I am curious why this implementation was chosen.
The neurons (dimensions) in an embedding have no inherent order, so you can pick any half of them to rotate.
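
The point about ordering can be checked numerically. Below is a minimal sketch, written with plain Python lists so it runs without dependencies. The function names (`rope_interleaved`, `rope_half_split`, `to_half_layout`) are hypothetical and not from this repository, and the frequencies `base ** (-2 * i / d)` with `base=10000` follow the paper's definition. It suggests that the paper's adjacent-pair rotation and the `rotate_half`-style rotation give the same numbers once the dimensions are permuted so that even-indexed entries come first.

```python
import math

def rope_interleaved(x, m, base=10000.0):
    """Paper-style RoPE: rotate adjacent pairs (x[0], x[1]), (x[2], x[3]), ..."""
    d = len(x)
    out = [0.0] * d
    for i in range(d // 2):
        theta = m * base ** (-2 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[2 * i], x[2 * i + 1]
        out[2 * i] = a * c - b * s
        out[2 * i + 1] = b * c + a * s
    return out

def rope_half_split(x, m, base=10000.0):
    """rotate_half-style RoPE: rotate pairs (x[i], x[i + d/2])."""
    d = len(x)
    half = d // 2
    rotated = [-v for v in x[half:]] + x[:half]  # list version of rotate_half(x)
    # Assumed layout: the d/2 angles are repeated for the second half.
    angles = [m * base ** (-2 * i / d) for i in range(half)] * 2
    return [v * math.cos(t) + r * math.sin(t)
            for v, r, t in zip(x, rotated, angles)]

def to_half_layout(x):
    """Permute dims: even-indexed entries first, then odd-indexed ones."""
    return x[0::2] + x[1::2]

x = [0.5, -1.0, 2.0, 3.0, -0.25, 1.5, 0.75, -2.0]
m = 3  # token position

paper = rope_interleaved(x, m)
half = rope_half_split(to_half_layout(x), m)

# Largest element-wise gap after putting both results in the same layout.
max_diff = max(abs(a - b) for a, b in zip(to_half_layout(paper), half))
print(max_diff < 1e-12)
```

Because the same permutation applies to both queries and keys, their dot product is unchanged, and the permutation can be absorbed into the rows of the query and key projection matrices. The two layouts are therefore equally expressive, but weights trained under one layout cannot be used with the other unless that permutation is applied to them.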